1 microsecond into the future, the thread could enter an untimed sleep.
- Change the signature of _lwp_park() to accept an lwpid_t and second
hint pointer, but do so in a way that remains compatible with older
pthread libraries. This can be used to wake another thread before the
calling thread goes asleep, saving at least one syscall + involuntary
context switch. This turns out to be a fairly large win on the condvar
benchmarks that I have tried.
- Mark some more syscalls MP safe.
values of the SS_ONSTACK and SS_DISABLE constants.
Use it to shorten the source files when this action is replicated.
Actually, given the monstrous complexity of sigaltstack1() there is
probably a much better way to do this...
compat_10_netbsd32_sys_semsys() (where the one parameter is already in
kernel space).
Note that the code in compat_10_netbsd32_sys_semsys() has always been wrong,
since it called compat_14_sys___semctl() - which would read 64bit values!
once the 'address' has been copied into an mbuf.
Add extra flags for 'struct msghdr.msg_flags' to indicate that the address
and control are already in mbufs, and that the uio structure is in userspace
for sending data, rename sendit() to do_sys_sendmsg() to ensure no old code
passes in random flags.
Changes to compat code to use new functions - removing some stackgap use.
Fix a 'use after free' in compat_43_sys_recvmsg.
I ***THINK*** the code that converts 'cmsg' formatted data is borked!
svr4_stream.c ought to be generated from svr4_32_stream.c during the build.
and 'rusage' without having to copy data to/from stackgap buffers.
The old split (find_stopped_child) could be removed.
amd64 seems to run netbsd32, linux and linux32 emulations. sparc64 compiles.
which can either be copied directly to userspace, or converted then copied.
Saves replicating a lot of code in the compat functions (esp. for
getvfsstat) at a cast of an extra function call in the non-emulated case -
which is unlikely to be measurable given the other costs of the actions
involved (even on vax).
Remove dofhstat() and dofhstatvfs() (and the last caller).
Remove some redundant stackgap_init() calls.
avoid having to allocate space in the 'stackgap'
- which is very LWP unfriendly.
The additional code for non-emulation namei() is trivial, the reduction for
the emulations is massive.
The vnode for a processes emulation root is saved in the cwdi structure
during process exec.
If the emulation root the TRYEMULROOT flag are set, namei() will do an initial
search for absolute pathnames in the emulation root, if that fails it will
retry from the normal root.
".." at the emulation root will always go to the real root, even in the middle
of paths and when expanding symlinks.
Absolute symlinks found using absolute paths in the emulation root will be
relative to the emulation root (so /usr/lib/xxx.so -> /lib/xxx.so links
inside the emulation root don't need changing).
If the root of the emulation would be returned (for an emulation lookup), then
the real root is returned instead (matching the behaviour of emul_lookup,
but being a cheap comparison here) so that programs that scan "../.."
looking for the root dircetory don't loop forever.
The target for symbolic links is no longer mangled (it used to get the
CHECK_ALT_xxx() treatment, so could get /emul/xxx prepended).
CHECK_ALT_xxx() are no more. Most of the change is deleting them, and adding
TRYEMULROOT to the flags to NDINIT().
A lot of the emulation system call stubs could now be deleted.
pointers to and from 64bit kernel pointers. Instead use the defines
NETBSD32PTR64(p32) to read a 32bit pointer and (the new) NETBSD32PTR32(p32,p64)
to write a 32bit pointer throughout.
The 32bit pointer is now a struct to enforce the above.
amd64 (with linux emul) and sparc64 will both compile (when the arch stuff
goes in soon), and amd64 still runs some i386 binaries.
sys_stat() and friends, instead use do_sys_stat() and do_sys_fstat()
that write the answer into a kernel buffer (on stack) that can be
converted to the correct form and written the userspace.
I've test compiled a few kernels, and tested i386 netbsd1.6 ls.
Given I think I've fixed some bugs, it might be 50-50 with new ones.
netbsd32_lwp.c, and remove remaining traces of SA.
This still needs some MD (and possibly MI, depending on the chosen
solution) changes to actually work.
Patch by Slava Semushin <slava.semushin@gmail.com>
Again, this was tested by comparing obj files from a pristine and a patched
source tree against an i386/ALL kernel, and also for src/sbin/fsck_ffs,
src/sbin/fsdb and src/usr.sbin/makefs. Only changes in assert() line numbers
were detected in 'objdump -d' output.
> It seems that 32bits programs, running under compat_netbsd32, using
> setrlimit force all other programs to have their maximum data size
> fixed at 3GB, where native 64bits apps used 8GB previously.
I tracked this one to the `netbsd32_adjust_limits()' function (called
when creating a new process under compat_netbsd32), where data and
stack limits are set without checking for shared `p_limit' structure
(p_limit->p_refcnt > 1). This explain the side effect where processes
have their limits changed when a compat_netbsd32 (or compat_linux32)
program is run.
The fix is to use `dosetrlimit()' to ensure the needed copy-on-write
behaviour for shared structure.
- Remove useless (thanks to COMPAT_XX behaviour) #ifdefs in
syscalls.master
- Make netbsd32_compat_43.c compiled per COMPAT_LINUX32 because the latter
needs stuff from it.
Fixes Perry's PR#34951.
implies that _UC_CPU must be set in the context passed. Check for this
and return EINVAL if not; this gives a cheap test for corrupted
ucontexts eg on a signal handler stack which would go unnoticed otherwise.
-Don't ckeck for NULL ucontext pointers explicitely. This is an error,
except in the swapcontext() case where it can be easily caught in
userland.
- remove the support of variable-sized filehandle from compat version of
syscalls. (strictly speaking, it breaks abi. i don't think it's a problem
because this feature is short-lived and there are no affected in-tree
filesystems.)
- unify vfs_copyinfh_alloc and vfs_copyinfh_alloc_size.
- vfs_copyinfh_alloc_size: check fhsize strictly.
- reduce code duplication between compat and current syscalls.
While touching all vptofh/fhtovp functions, get rid of VFS_MAXFIDSIZ,
version the getfh(2) syscall and explicitly pass the size available in
the filehandle from userland.
Discussed on tech-kern, with lots of help from yamt (thanks!).
EPROTONOSUPPORT instead of EAFNOSUPPORT.
from pavel@ with a little bit of clean up from myself.
XXX: netbsd32 (and perhaps other emulations) should be able
XXX: to call the standard socket calls for this i think, but
XXX: revisit this at another time.
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html
will change "struct ntptimeval", so some translation would be necessary.
ntp_gettine is considered dispensable, the only userland program known
to use it is "ntptime".
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but do no more.
- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.
code.
- To achieve COMPAT_NETBSD32 compatibility, introduce a parameter to
kevent1 that points to functions that do the actual copyin/copyout
operations. This is similar to what was done in FreeBSD by Paul Saab.
- Add the COMPAT_NETBSD32 definitions and hooks.
discussed in the PR.
- introduce sys/timevar.h to hold kernel-specific stuff relevant to
sys/time.h. Ideally, timevar.h would contain all (or almost) of the
#ifdef _KERNEL part of time.h, but that's a pretty big and tedious
change to make. For now, it will contain only the prototypes I
introduced when working on COMPAT_NETBSD32.
- split copyinout_t into copyin_t and copyout_t, it makes prototypes more
explicit about the meaning of a given argument. Suggested by yamt@.
- move copyinout_t definition in sys/time.h to systm.h as copyin_t and
copyout_t
- make everything uses the new types and include the proper headers at
the proper places.
as an argument a function that will retrieve an element of the pointer
arrays in user space. This allows COMPAT_NETBSD32 to share the code for
the emulated version of execve(2), and fixes various issues that came from
the slow drift between the two implementations.
Note: when splitting up a syscall function, I'll use two different ways
of naming the resulting helper function. If it stills does
copyin/out operations, it will be named <syscall>1(). If it does
not (as it was the case for get/setitimer), it will be named
do<syscall>.
relevant code with the COMPAT_NETBSD32 version, and make the latter use
the new functions.
This fixes netbsd32_setitimer() which had drifted from the native syscall
and did not work properly anymore.