Attempt to fix all the code paths so that the 'fp' returned by fd_getfile()
isn't left locked, and is always unlocked (and ref-counted) before
doing anything that might sleep.
int foo(struct lwp *l, void *v, register_t *retval)
to:
int foo(struct lwp *l, const struct foo_args *uap, register_t *retval)
Fixup compat code to not write into 'uap' and (in some cases) to actually
pass a correctly formatted 'uap' structure with the right name to the
next routine.
A few 'compat' routines that just call standard ones have been deleted.
All the 'compat' code compiles (along with the kernels required to test
build it).
98% done by automated scripts.
Don't open-code queue(3) macros (x = ifnet.tqh_first; y =
x.if_list.tqe_next). Instead, use the macros themselves.
Use IFNET_FOREACH() and IFADDR_FOREACH().
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.
quick consensus on tech-kern
so that it isn't necessary to copy data to/from the stackgap.
Given the nature of the code in this file, it is now probably slightly
more broken than previously. but nothing serious should be worse!
values of the SS_ONSTACK and SS_DISABLE constants.
Use it to shorten the source files when this action is replicated.
Actually, given the monstrous complexity of sigaltstack1() there is
probably a much better way to do this...
once the 'address' has been copied into an mbuf.
Add extra flags for 'struct msghdr.msg_flags' to indicate that the address
and control are already in mbufs, and that the uio structure is in userspace
for sending data, rename sendit() to do_sys_sendmsg() to ensure no old code
passes in random flags.
Changes to compat code to use new functions - removing some stackgap use.
Fix a 'use after free' in compat_43_sys_recvmsg.
I ***THINK*** the code that converts 'cmsg' formatted data is borked!
svr4_stream.c ought to be generated from svr4_32_stream.c during the build.
compatibility with the older ioctls. This avoids stack smashing and
abuse of "struct sockaddr" when ioctls placed "struct sockaddr_foo's" that
were longer than "struct sockaddr".
XXX: Some of the emulations might be broken; I tried to add code for
them but I did not test them.
Make the same changes to the svr4 code.
Add some 'missing' simple_unlock(&fp->f_slock) to the svr4_32 version of this
code. These files now compare if feed the svr4_32 copy though:
sed -e 's/4_32/4/g;s/_P32//g'
Note in passing that the code paths that call simple_unlock(&fp->f_slock)
are completely broken.
and 'rusage' without having to copy data to/from stackgap buffers.
The old split (find_stopped_child) could be removed.
amd64 seems to run netbsd32, linux and linux32 emulations. sparc64 compiles.
avoid having to allocate space in the 'stackgap'
- which is very LWP unfriendly.
The additional code for non-emulation namei() is trivial, the reduction for
the emulations is massive.
The vnode for a processes emulation root is saved in the cwdi structure
during process exec.
If the emulation root the TRYEMULROOT flag are set, namei() will do an initial
search for absolute pathnames in the emulation root, if that fails it will
retry from the normal root.
".." at the emulation root will always go to the real root, even in the middle
of paths and when expanding symlinks.
Absolute symlinks found using absolute paths in the emulation root will be
relative to the emulation root (so /usr/lib/xxx.so -> /lib/xxx.so links
inside the emulation root don't need changing).
If the root of the emulation would be returned (for an emulation lookup), then
the real root is returned instead (matching the behaviour of emul_lookup,
but being a cheap comparison here) so that programs that scan "../.."
looking for the root dircetory don't loop forever.
The target for symbolic links is no longer mangled (it used to get the
CHECK_ALT_xxx() treatment, so could get /emul/xxx prepended).
CHECK_ALT_xxx() are no more. Most of the change is deleting them, and adding
TRYEMULROOT to the flags to NDINIT().
A lot of the emulation system call stubs could now be deleted.
sys_stat() and friends, instead use do_sys_stat() and do_sys_fstat()
that write the answer into a kernel buffer (on stack) that can be
converted to the correct form and written the userspace.
I've test compiled a few kernels, and tested i386 netbsd1.6 ls.
Given I think I've fixed some bugs, it might be 50-50 with new ones.
process was reparented. Change proc_free() to copy the rusage to a buffer
on the stack if required, so it can be passed both to the debugger and
to the real parent process.
Fixes kern/35582 (kernel panics with gdb).
Patch by Slava Semushin <slava.semushin@gmail.com>
Again, this was tested by comparing obj files from a pristine and a patched
source tree against an i386/ALL kernel, and also for src/sbin/fsck_ffs,
src/sbin/fsdb and src/usr.sbin/makefs. Only changes in assert() line numbers
were detected in 'objdump -d' output.
least Linux 2.4.31, Irix 6.5.20 and Solaris 10) use EAFNOSUPPORT.
Only the Linux emulation has been tested.
XXX somebody should audit the other emulations...
EPROTONOSUPPORT instead of EAFNOSUPPORT.
from pavel@ with a little bit of clean up from myself.
XXX: netbsd32 (and perhaps other emulations) should be able
XXX: to call the standard socket calls for this i think, but
XXX: revisit this at another time.
will change "struct ntptimeval", so some translation would be necessary.
ntp_gettine is considered dispensable, the only userland program known
to use it is "ntptime".
- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.
* For sparc64 and amd64, define *SIZ32 VM constants.
* Add a new function pointer to struct emul, pointing at a function
that will return the default VM map address. The default function
is uvm_map_defaultaddr, which just uses the VM_DEFAULT_ADDRESS
macro. This gives emulations control over the default map address,
and allows things to be mapped at the right address (in 32bit range)
for COMPAT_NETBSD32.
* Add code to adjust the data and stack limits when a COMPAT_NETBSD32
or COMPAT_SVR4_32 binary is executed.
* Don't use USRSTACK in kern_resource.c, use p_vmspace->vm_minsaddr
instead (emulations might have set it differently)
* Since this changes struct emul, bump kernel version to 3.99.2
Tested on amd64, compile-tested on sparc64.
1. make fileops const
2. add 2 new negative errno's to `officially' support the cloning hack:
- EDUPFD (used to overload ENODEV)
- EMOVEFD (used to overload ENXIO)
3. Created an fdclone() function to encapsulate the operations needed for
EMOVEFD, and made all cloners use it.
4. Centralize the local noop/badop fileops functions to:
fnullop_fcntl, fnullop_poll, fnullop_kqfilter, fbadop_stat
Add a new explicit `struct proc *p' argument to socreate(), sosend().
Use that argument instead of curproc. Follow-on changes to pass that
argument to socreate(), sosend(), and (*so->so_send)() calls.
These changes reviewed and independently recoded by Matt Thomas.
Changes to soreceive() and (*dom->dom_exernalize() from Matt Thomas:
pass soreceive()'s struct uio* uio->uio_procp to unp_externalize().
Eliminate curproc from unp_externalize. Also, now soreceive() uses
its uio->uio_procp value, pass that same value downward to
((pr->pru_usrreq)() calls for consistency, instead of (struct proc * )0.
Similar changes in sys/nfs to eliminate (most) uses of curproc,
either via the req-> r_procp field of a struct nfsreq *req argument,
or by passing down new explicit struct proc * arguments.
Reviewed by: Matt Thomas, posted to tech-kern.
NB: The (*pr->pru_usrreq)() change should be tested on more (all!) protocols.
hostid of the machine rather than always getting "0". Tested with
hostid(1) from solaris-2.8 and with lmhostid (part of FlexLM) for solaris.
Approved by atatat@.
exec case, as the emulation already has the ability to intercept that
with the e_proc_exec hook. It is the responsability of the emulation to
take appropriaye action about lwp_emuldata in e_proc_exec.
Patch reviewed by Christos.
Gone are the old kern_sysctl(), cpu_sysctl(), hw_sysctl(),
vfs_sysctl(), etc, routines, along with sysctl_int() et al. Now all
nodes are registered with the tree, and nodes can be added (or
removed) easily, and I/O to and from the tree is handled generically.
Since the nodes are registered with the tree, the mapping from name to
number (and back again) can now be discovered, instead of having to be
hard coded. Adding new nodes to the tree is likewise much simpler --
the new infrastructure handles almost all the work for simple types,
and just about anything else can be done with a small helper function.
All existing nodes are where they were before (numerically speaking),
so all existing consumers of sysctl information should notice no
difference.
PS - I'm sorry, but there's a distinct lack of documentation at the
moment. I'm working on sysctl(3/8/9) right now, and I promise to
watch out for buses.
so that a specific emulation has the oportunity to filter out some signals.
if sigfilter returns 0, then no signal is sent by kpsignal2().
There is another place where signals can be generated: trapsignal. Since this
function is already an emulation hook, no call to the sigfilter hook was
introduced in trapsignal.
This is needed to emulate the softsignal feature in COMPAT_DARWIN (signals
sent as Mach exception messages)
of the sibling list so that find_stopped_child can be optimised to avoid
traversing the entire sibling list - helps when a process has a lot of
children.
- Modify locking in pfind() and pgfind() to that the caller can rely on the
result being valid, allow caller to request that zombies be findable.
- Rename pfind() to p_find() to ensure we break binary compatibility.
- Remove svr4_pfind since p_find willnow do the job.
- Modify some of the SMP locking of the proc lists - signals are still stuffed.
Welcome to 1.6ZF
file system.
The function vfs_write_suspend stops all new write operations to a file
system, allows any file system modifying system calls already in progress
to complete, then sync's the file system to disk and returns. The
function vfs_write_resume allows the suspended write operations to
complete.
From FreeBSD with slight modifications.
Approved by: Frank van der Linden <fvdl@netbsd.org>