The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.
quick consensus on tech-kern
tech-kern:
- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.
The latter actually functions (due to luck) for calls with one argument,
but will fail badly if more than one is required.
Noticed as an error in the ktrace output by Nicolas Joly, reported on
tech-kern.
With stackgap and CHECK_ALT_xxx removal, some linux32 and netbsd32
syscalls are now identical.
To avoid code duplication, remove the linux32 definition and use the
netbsd32 one (no functional change).
Should fix PR/36939 and make the rlimit code MP safe.
Posted for comment to tech-kern (none received!)
The p_limit field (for a process) is only changed once (on the first
write), and a reference to the old structure is kept (for code paths
that have cached the pointer).
Only p->p_limit is now locked by p->p_mutex and, since the referenced memory
will not go away, the lock is only needed if the pointer is to be changed.
The contents of 'struct plimit' are all locked by pl_mutex, except that the
code doesn't bother to acquire it for reads (which are basically atomic).
Add FORK_SHARELIMIT that causes fork1() to share the limits between parent
and child, use it for the IRIX_PR_SULIMIT.
Fix borked test for both IRIX_PR_SUMASK and IRIX_PR_SDIR being set.
of the corresponding 32bit architecture.
Use it for the 64bit items in netbsd32_statvfs so that the structure
doesn't collect 8byte alignment (and 4 bytes of trailing padding).
This replaces the 'packed' attribute which wasn't architecture specific
and would cause massive overheads accessing every member of sparc64.
Should allow the MIPS64 port to DTRT.
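
A minimal sketch of the alignment trick described above, assuming a GCC or
Clang compiler. The names demo32_uint64 and demo32_statvfs_like are made up
for illustration and are not the real netbsd32 definitions. A typedef with a
reduced alignment keeps the member 64 bits wide but lets it sit on a 4-byte
boundary, so the structure gains no 8-byte alignment and no trailing padding:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical name. The aligned attribute on a typedef may lower the
 * alignment (GCC/Clang extension): still 64 bits wide, 4-byte aligned.
 */
typedef uint64_t demo32_uint64 __attribute__((__aligned__(4)));

/* Hypothetical structure mixing 32-bit and 64-bit items. */
struct demo32_statvfs_like {
	uint32_t	flag;	/* offset 0 */
	demo32_uint64	blocks;	/* offset 4, not rounded up to 8 */
	uint32_t	owner;	/* offset 12, no trailing padding follows */
};
```

Unlike 'packed', this does not drop the members to byte alignment, so the
compiler is not forced into byte-wise accesses on strict-alignment machines.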
that use struct ifreq which have not been explicitly versioned.
If someone feels like fixing it with a list approach, I think below is
a complete list - the one used in the previous version missed a lot of them.
BIOCGETIF
BIOCSETIF
GREDSOCK
GREGADDRD
GREGADDRS
GREGPROTO
GRESADDRD
GRESADDRS
GRESPROTO
GRESSOCK
SIOCADDMULTI
SIOCDELMULTI
SIOCDIFADDR
SIOCDIFADDR_IN6
SIOCDIFPHYADDR
SIOCGDEFIFACE_IN6
SIOCGIFADDR
SIOCGIFADDR_IN6
SIOCGIFAFLAG_IN6
SIOCGIFALIFETIME_IN6
SIOCGIFBRDADDR
SIOCGIFDLT
SIOCGIFDSTADDR
SIOCGIFDSTADDR_IN6
SIOCGIFFLAGS
SIOCGIFGENERIC
SIOCGIFMETRIC
SIOCGIFMTU
SIOCGIFNETMASK
SIOCGIFNETMASK_IN6
SIOCGIFPDSTADDR
SIOCGIFPDSTADDR_IN6
SIOCGIFPSRCADDR
SIOCGIFPSRCADDR_IN6
SIOCGIFSTAT_ICMP6
SIOCGIFSTAT_IN6
SIOCGPVCSIF
SIOCGVH
SIOCIFCREATE
SIOCIFDESTROY
SIOCSDEFIFACE_IN6
SIOCSIFADDR
SIOCSIFADDR_IN6
SIOCSIFALIFETIME_IN6
SIOCSIFBRDADDR
SIOCSIFDSTADDR
SIOCSIFDSTADDR_IN6
SIOCSIFFLAGS
SIOCSIFGENERIC
SIOCSIFMEDIA
SIOCSIFMETRIC
SIOCSIFMTU
SIOCSIFNETMASK
SIOCSIFNETMASK_IN6
SIOCSNDFLUSH_IN6
SIOCSPFXFLUSH_IN6
SIOCSPVCSIF
SIOCSRTRFLUSH_IN6
SIOCSVH
TAPGIFNAME
1 microsecond into the future, the thread could enter an untimed sleep.
- Change the signature of _lwp_park() to accept an lwpid_t and second
hint pointer, but do so in a way that remains compatible with older
pthread libraries. This can be used to wake another thread before the
calling thread goes to sleep, saving at least one syscall + involuntary
context switch. This turns out to be a fairly large win on the condvar
benchmarks that I have tried.
- Mark some more syscalls MP safe.
Possibly the standard nfs code needs teaching how to set the length and
address family in order to support non-netbsd sockaddr.
There are now no active stackgap() calls in the compat tree.
ioctl in order to change the display colours.
Changing the code to not need the stackgap is rather pervasive, and it isn't
at all clear this is useful effort given the suspected bitrottedness
of compat darwin.
cmsg->cmsg_len is 'size_t' not 'socklen_t' - so it is 8 bytes on 64bit
platforms instead of 4. There is also never padding after the header.
Redo linux sendmsg() so that it stands a chance of getting it right.
Redo linux recvmsg() so that it processes control data directly from the mbuf
list, allowing it to hack the data without using the stackgap.
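
A rough sketch of the header size difference noted above. Both structures
and the helper demo_cmsg_to_native() are hypothetical illustrations, not the
kernel's definitions. The sketch only rewrites the length field; it ignores
payload alignment and any translation of level/type values that real code
would need:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical Linux-style control header: the length is a size_t. */
struct demo_linux_cmsghdr {
	size_t	cmsg_len;
	int	cmsg_level;
	int	cmsg_type;
};

/* Hypothetical native-style header: the length is a 32-bit value. */
struct demo_native_cmsghdr {
	uint32_t cmsg_len;
	int	 cmsg_level;
	int	 cmsg_type;
};

/*
 * Convert one header. The header size is subtracted and the native one
 * added back, so the payload length is preserved. Returns -1 if the
 * length is shorter than a header or does not fit in 32 bits.
 */
static int
demo_cmsg_to_native(const struct demo_linux_cmsghdr *lc,
    struct demo_native_cmsghdr *nc)
{
	size_t payload;

	if (lc->cmsg_len < sizeof(*lc))
		return -1;
	payload = lc->cmsg_len - sizeof(*lc);
	if (payload > UINT32_MAX - sizeof(*nc))
		return -1;
	nc->cmsg_len = (uint32_t)(sizeof(*nc) + payload);
	nc->cmsg_level = lc->cmsg_level;
	nc->cmsg_type = lc->cmsg_type;
	return 0;
}
```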
so that it isn't necessary to copy data to/from the stackgap.
Given the nature of the code in this file, it is now probably slightly
more broken than previously, but nothing serious should be worse!
in order to avoid the stackgap (etc).
Note that since changing the darwin socket address is simply a matter of
translating the address family and adding sa_len, it can easily be done
on the mbuf resident address before/after copying to/from userspace.
Simplify the conversion of AF_LOCAL addresses by using the user-supplied
buffer length instead of doing an unbounded strlen().
Untested - did this work before?
If sa_len is zero, believing the size passed to bind/connect seems
better than trying to strlen something that might run off the mapped kma.
Verify the address family against the array size before indexing.
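
The two defensive checks above can be sketched as follows. All names here
(demo_af_map, demo_translate_af, demo_path_len) are hypothetical and the
table contents are placeholders; this is only an illustration of the
pattern, not the compat code itself:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical translation table indexed by a foreign address family. */
static const int demo_af_map[] = { 0, 1, 2 };
#define DEMO_AF_MAP_LEN	(sizeof(demo_af_map) / sizeof(demo_af_map[0]))

/* Check the family against the array size before indexing. */
static int
demo_translate_af(unsigned int foreign_af)
{
	if (foreign_af >= DEMO_AF_MAP_LEN)
		return -1;
	return demo_af_map[foreign_af];
}

/*
 * Length of a local-socket path held in a buffer of 'buflen' bytes.
 * memchr() never looks past buflen, so a missing NUL terminator cannot
 * make the scan run off the end of the buffer.
 */
static size_t
demo_path_len(const char *path, size_t buflen)
{
	const char *nul = memchr(path, '\0', buflen);

	return nul != NULL ? (size_t)(nul - path) : buflen;
}
```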
the compat_10 and compat_14 functions - makes the code neater, removes
many data copies and also removes the stackgap use.
Also (indirectly) fixes some code paths that forgot to do copyin/out.
values of the SS_ONSTACK and SS_DISABLE constants.
Use it to shorten the source files when this action is replicated.
Actually, given the monstrous complexity of sigaltstack1() there is
probably a much better way to do this...
doesn't obtain the ports, gain and balance related parameters.
Those generally require reading from the hardware and therefore are much
more expensive to obtain. Modify OSS emulation to use the new ioctl
where possible.
This reduces CPU usage of mplayer during mp3 playback with my Thinkpad
from 20% to < 1% and from 50% to 20% during Xvid playback.
Review and comments from jmcneill@
compat_10_netbsd32_sys_semsys() (where the one parameter is already in
kernel space).
Note that the code in compat_10_netbsd32_sys_semsys() has always been wrong,
since it called compat_14_sys___semctl() - which would read 64bit values!
once the 'address' has been copied into an mbuf.
Add extra flags for 'struct msghdr.msg_flags' to indicate that the address
and control are already in mbufs, and that the uio structure is in userspace
for sending data. Rename sendit() to do_sys_sendmsg() to ensure no old code
passes in random flags.
Changes to compat code to use new functions - removing some stackgap use.
Fix a 'use after free' in compat_43_sys_recvmsg.
I ***THINK*** the code that converts 'cmsg' formatted data is borked!
svr4_stream.c ought to be generated from svr4_32_stream.c during the build.
OSIOCGIFADDR -> OOSIOCGIFADDR
OSIOCGIFDSTADDR -> OOSIOCGIFDSTADDR
OSIOCGIFNETMASK -> OOSIOCGIFNETMASK
Also, one instance of needing to include <net/if.h> before
<compat/sys/sockio.h> due to use of IFNAMSIZ in the latter.
Discussed with christos.
compatibility with the older ioctls. This avoids stack smashing and
abuse of "struct sockaddr" when ioctls placed "struct sockaddr_foo's" that
were longer than "struct sockaddr".
XXX: Some of the emulations might be broken; I tried to add code for
them but I did not test them.
- Make linux_sys_rt_sigreturn() return EJUSTRETURN on success.
- Add missing rax to linux_sigcontext structure; and save/restore
its value like other members in linux_sendsig()/linux_sys_rt_sigreturn().
With valuable help from manu.
from doc/BRANCHES:
idle lwp, and some changes depending on it.
1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.
Make the same changes to the svr4 code.
Add some 'missing' simple_unlock(&fp->f_slock) to the svr4_32 version of this
code. These files now compare equal if you feed the svr4_32 copy through:
sed -e 's/4_32/4/g;s/_P32//g'
Note in passing that the code paths that call simple_unlock(&fp->f_slock)
are completely broken.
and 'rusage' without having to copy data to/from stackgap buffers.
The old split (find_stopped_child) could be removed.
amd64 seems to run netbsd32, linux and linux32 emulations. sparc64 compiles.
which can either be copied directly to userspace, or converted then copied.
Saves replicating a lot of code in the compat functions (esp. for
getvfsstat) at a cost of an extra function call in the non-emulated case -
which is unlikely to be measurable given the other costs of the actions
involved (even on vax).
Remove dofhstat() and dofhstatvfs() (and the last caller).
Remove some redundant stackgap_init() calls.
avoid having to allocate space in the 'stackgap'
- which is very LWP unfriendly.
The additional code for non-emulation namei() is trivial, the reduction for
the emulations is massive.
The vnode for a process's emulation root is saved in the cwdi structure
during process exec.
If the emulation root and the TRYEMULROOT flag are set, namei() will do an
initial search for absolute pathnames in the emulation root; if that fails
it will retry from the normal root.
".." at the emulation root will always go to the real root, even in the middle
of paths and when expanding symlinks.
Absolute symlinks found using absolute paths in the emulation root will be
relative to the emulation root (so /usr/lib/xxx.so -> /lib/xxx.so links
inside the emulation root don't need changing).
If the root of the emulation would be returned (for an emulation lookup), then
the real root is returned instead (matching the behaviour of emul_lookup,
but being a cheap comparison here) so that programs that scan "../.."
looking for the root directory don't loop forever.
The target for symbolic links is no longer mangled (it used to get the
CHECK_ALT_xxx() treatment, so could get /emul/xxx prepended).
CHECK_ALT_xxx() are no more. Most of the change is deleting them, and adding
TRYEMULROOT to the flags to NDINIT().
A lot of the emulation system call stubs could now be deleted.
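
A toy sketch of the two-pass lookup described above, working on path strings
rather than vnodes. Everything here is hypothetical: lookup_tryemulroot(),
the path_exists_fn type, and the demo_exists() predicate only stand in for
the real namei() machinery. The ".." and symlink handling is not modelled:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical predicate type: returns non-zero if the path resolves. */
typedef int (*path_exists_fn)(const char *path);

/* Demo predicate: pretends exactly two files exist. */
static int
demo_exists(const char *path)
{
	return strcmp(path, "/emul/linux/lib/libc.so") == 0 ||
	    strcmp(path, "/etc/passwd") == 0;
}

/*
 * Absolute paths are first tried under the emulation root and, if that
 * fails, retried from the real root. Returns 0 and fills 'out' with the
 * path that resolved, or -1 if neither attempt succeeds.
 */
static int
lookup_tryemulroot(const char *emul_root, const char *path,
    path_exists_fn exists, char *out, size_t outlen)
{
	int n;

	if (emul_root != NULL && path[0] == '/') {
		n = snprintf(out, outlen, "%s%s", emul_root, path);
		if (n >= 0 && (size_t)n < outlen && exists(out))
			return 0;
	}
	n = snprintf(out, outlen, "%s", path);
	if (n >= 0 && (size_t)n < outlen && exists(out))
		return 0;
	return -1;
}
```

The caller only passes a flag-like choice (here, a non-NULL emul_root)
instead of rewriting the path into a scratch buffer first, which is the
point of dropping CHECK_ALT_xxx().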
updated path parameter is ever valid - especially when emul_find() fails.
Use the modified path for the access() calls.
Found when compiling with emul_find() hacked to always fail.
pointers to and from 64bit kernel pointers. Instead use the defines
NETBSD32PTR64(p32) to read a 32bit pointer and (the new) NETBSD32PTR32(p32,p64)
to write a 32bit pointer throughout.
The 32bit pointer is now a struct to enforce the above.
amd64 (with linux emul) and sparc64 will both compile (when the arch stuff
goes in soon), and amd64 still runs some i386 binaries.
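
A minimal sketch of how a struct-wrapped 32bit pointer forces all accesses
through the two defines. The macro names come from the text above, but the
bodies, the member name i32 and the type name demo_netbsd32_pointer_t are
assumptions made for illustration, not copied from the tree:

```c
#include <stdint.h>

/*
 * Assumed layout: wrapping the 32-bit value in a struct means it cannot
 * be assigned to or from a native pointer without going via the macros.
 */
typedef struct {
	uint32_t i32;
} demo_netbsd32_pointer_t;

/* Read: widen the stored 32-bit value into a native pointer. */
#define NETBSD32PTR64(p32)	((void *)(uintptr_t)(p32).i32)

/* Write: narrow a native pointer into the 32-bit wrapper. */
#define NETBSD32PTR32(p32, p64)	((p32).i32 = (uint32_t)(uintptr_t)(p64))
```

With this shape, a direct assignment such as 'p32 = kernel_ptr' is a
compile error rather than a silent truncation.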
sys_stat() and friends, instead use do_sys_stat() and do_sys_fstat()
that write the answer into a kernel buffer (on stack) that can be
converted to the correct form and written to userspace.
I've test compiled a few kernels, and tested i386 netbsd1.6 ls.
Given I think I've fixed some bugs, it might be 50-50 with new ones.