of the generic (NetBSD specific) sendsig().
We can only work with ...sigcontext for now anyway; the
versioning stuff in sendsig() isn't helpful for osf1 emul.
instead of the compat16_ thing.
This saves 2 pointless copyout/copyin cycles to/from
the "stackgap" buffer, and it gets us a step closer
to a COMPAT_OSF! which works w/o COMPAT_16.
This still doesn't support SA_SIGINFO because the old
COMPAT_16 signal trampoline is used.
1. make fileops const
2. add 2 new negative errno's to `officially' support the cloning hack:
- EDUPFD (used to overload ENODEV)
- EMOVEFD (used to overload ENXIO)
3. Created an fdclone() function to encapsulate the operations needed for
EMOVEFD, and made all cloners use it.
4. Centralize the local noop/badop fileops functions to:
fnullop_fcntl, fnullop_poll, fnullop_kqfilter, fbadop_stat
do { ... } while(/*CONSTCOND*/0)
so that they can be used unadorned in if/else blocks, etc. This means
that you now *have* to put a ; at the end of the "call" to these
macros.
supported options can't get out of sync. This add support for the
linux __WCLONE and __WALL options (NetBSD version: WALTSIG and WALLSIG)
Add a diagnostic check to see if the one unhandled option (__WNOTHREAD) is
specified.
This should prevent linux processes from losing their children and creating
tons of zombie processes.
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.
while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.
segment should succeed even if the segment would be marked removed; use this
to implement the Linux-compatible semantics of shmat(2)
this fixes the old Linux VMware3 graphics problem with local display,
and possibly other local Linux X clients using MIT-SHM
for Linux-compatible shmat() behaviour - shmat() for the removed shared memory
segment must work from all callers, the shared memory id could be passed e.g.
to native X server via MIT-SHM
temporarily remove the functionality, the Linux-compatible semantics
will be reimplemented differently
provide f_frsize. It cannot be actually used to GNU C statvfs() bug
in f_frsize != f_bsize case, so just keep pretending we don't support it.
Update comments and explain the situation in detail there.
explicit size types - the structure definition is actually identical
on currently support COMPAT_LINUX archs, so no point to have 6 copies of it
in the tree
- filesystem size is expressed in number of fragments, not blocks;
this fixes computed filesystem sizes for Linux df(1) and other Linux
binaries using statfs(2) for filesystems, which use different value
for frament and block, such as FFS
- use FS f_namemax instead of always using MAXNAMLEN
- print the socketcall type
- special case socket(2) call, it's also the only one with first argument
not being a socket descriptor
- only dump the relevant part of linux_socketcall_dummy_args, instead
of always the whole structure
grow-down auto extend segment) by allocating segment sized at
current stack size limit, and offsetting requested/returned address
as required
due to how normal virtual memory management work, allocating the
full sized stack memory segment up-front actually requires exactly same
amount of VA space and physical memory as the Linux 'grow' scheme and the
'grow' scheme is quite a lot more difficult to use in applications correctly,
so it's not very apparent why Linux introduced this feature at all
this fixes Thomas Klausner's Heroes3 crash, and might also
fix PR 26687 by Jan Schaumann
sys/kern/exec_aout.c back in *1995*, apparently the line from my
license notice:
* must display the following acknowledgement:
was accidentally dropped. This mistake was propagated into
hpux_exec_aout.c when it was split out of hpux_exec.c.
(Thanks to hubertf for noticing!)
* Rather than using mnt_maxsymlinklen to indicate that a file systems returns
d_type fields(!), add a new internal flag, IMNT_DTYPE.
Add 3 new elements to ufsmount:
* um_maxsymlinklen, replaces mnt_maxsymlinklen (which never should have existed
in the first place).
* um_dirblksiz, which tracks the current directory block size, eliminating the
FS-specific checks littered throughout the code. This may be used later to
make the block size variable.
* um_maxfilesize, which is the maximum file size, possibly adjusted lower due
to implementation issues.
Sync some bug fixes from FFS into ext2fs, particularly:
* ffs_lookup.c 1.21, 1.28, 1.33, 1.48
* ffs_inode.c 1.43, 1.44, 1.45, 1.66, 1.67
* ffs_vnops.c 1.84, 1.85, 1.86
Clean up some crappy pointer frobnication.
rather than EPERM; to emulate this properly, translate the error to EISDIR
if the target patch exists and points to a directory
this fixes the 'ant clean' problem reported by Marc Recht on current-users@
with SuSE 9.1 libraries
share same 'break' value used for brk()/sbrk(), otherwise application SIGSEGVs
quickly once different threads try to adjust data segment size
this fixes linux Mozilla crashes with SuSE 9.1 libraries, and possibly
other linux applications using real threads
connect madvise(2) and mincore(2) - apparently the newer Linux libs
don't stub it anymore, so allow the application to take advantage
of them
the Linux calls appear to be compatible in the flag values and semantics,
so a wrapper is not necessary
don't stub it anymore, so allow the application to take advantage
of them
the Linux calls appear to be compatible in the flag values and semantics,
so a wrapper is not necessary
itself needs to be translated. It means that we must translate it in
every system call using it: recvfrom, sendto, connect, accept, bind,
getpeername, getsockname.
library call). Programs such as ifconfig or XFree86 4.4 XDarwin use it.
The emulation is not complete, as ifconfig is not able to display inet6
addresses correctly.
This areas is called the comm pages. It is used to provide fast access to
several data and functions.
The comm pages are mapped starting at 0xffff800 (address chosed so that
absolute branch can be used, so it can be accessed even when dynamic linking
is not ready). NetBSD has the user stack here, so we need to provide a
Darwin-specific stack setup routine which sets the top of the stack at
0xbfff0000.
This implementation is not complete but it does enough to get MacOS X.3
starting again (static binaries run, dynamic binaries still have an issue).
in the comm pages functions, we only implement bcopy, pthread_self and
memcpy.
TODO:
- clean up the powerpc specific code from MD parts
- for now we map only one page to avoid a crash, we want two pages.
- write all the comm functions.
fhstatvfs1 and getvfsstat)
o Move the statfs family out of netbsd32_fs.c and netbsd32_netbsd.c to
netbsd_compat_20.c, compiled with COMPAT_20
Reviewed by christos@.
native version does non executable mappings on the stack. This is a
showstopper for Linux binaries.
To fix that we supply a copy f the native stack setup function for Linux
binaries, with the executable bit set.
otherwise, linux_syscall() returns garbage, at least on i386.
(it returns native_to_linux_errno[EPASSTHROUGH] where EPASSTHROUGH == -4.)
i choose EINVAL rather than ENOTTY, because linux's pipe returns it
and i think that it's a common case.
- Not enabled by default. Needs kernel option FFS_SNAPSHOT.
- Change parameters of ffs_blkfree.
- Let the copy-on-write functions return an error so spec_strategy
may fail if the copy-on-write fails.
- Change genfs_*lock*() to use vp->v_vnlock instead of &vp->v_lock.
- Add flag B_METAONLY to VOP_BALLOC to return indirect block buffer.
- Add a function ffs_checkfreefile needed for snapshot creation.
- Add special handling of snapshot files:
Snapshots may not be opened for writing and the attributes are read-only.
Use the mtime as the time this snapshot was taken.
Deny mtime updates for snapshot files.
- Add function transferlockers to transfer any waiting processes from
one lock to another.
- Add vfsop VFS_SNAPSHOT to take a snapshot and make it accessible through
a vnode.
- Add snapshot support to ls, fsck_ffs and dump.
Welcome to 2.0F.
Approved by: Jason R. Thorpe <thorpej@netbsd.org>
Add a new explicit `struct proc *p' argument to socreate(), sosend().
Use that argument instead of curproc. Follow-on changes to pass that
argument to socreate(), sosend(), and (*so->so_send)() calls.
These changes reviewed and independently recoded by Matt Thomas.
Changes to soreceive() and (*dom->dom_exernalize() from Matt Thomas:
pass soreceive()'s struct uio* uio->uio_procp to unp_externalize().
Eliminate curproc from unp_externalize. Also, now soreceive() uses
its uio->uio_procp value, pass that same value downward to
((pr->pru_usrreq)() calls for consistency, instead of (struct proc * )0.
Similar changes in sys/nfs to eliminate (most) uses of curproc,
either via the req-> r_procp field of a struct nfsreq *req argument,
or by passing down new explicit struct proc * arguments.
Reviewed by: Matt Thomas, posted to tech-kern.
NB: The (*pr->pru_usrreq)() change should be tested on more (all!) protocols.
and tweak lkminit_*.c (where applicable) to call them, and to call
sysctl_teardown() when being unloaded.
This consists of (1) making setup functions not be static when being
compiled as lkms (change to sys/sysctl.h), (2) making prototypes
visible for the various setup functions in header files (changes to
various header files), and (3) making simple "load" and "unload"
functions in the actual lkminit stuff.
linux_sysctl.c also needs its root exposed (ie, made not static) for
this (when built as an lkm).
hostid of the machine rather than always getting "0". Tested with
hostid(1) from solaris-2.8 and with lmhostid (part of FlexLM) for solaris.
Approved by atatat@.
from here: set p_execsw to the new thing, and call
the new emulation's syscall_intern()
XXX there are more differences to kern_exec.c, sa/ras
related afaics, this is harmliss for now since
netbsd32 doesn't support multithreaded programs yet --
one day one execve() implementation should be shared
by native and netbsd32 code.
PR#23470, with minor updates by me. This is only the syscall support
from that PR, for now.
Changes: port over fix from FreeBSD for multicast address generation.
Changed bcopy to memcpy. For now, #ifdef notyet the portions of
kern_uuid.c that are meant to be used by (currently nonexistent) other
things in the kernel. Added syscall to COMPAT_FREEBSD as well, though
that's currently not useful, as any program new enough to use this call
also uses other syscalls we don't (yet) emulate.
- delete ktrsyscall32()
- add a check #ifdef _LP64 to do the conversion if P_32 is set to the
standard ktrsyscall()
- add a couple of similar _LP64/P_32 checks to the systrace code.
this should get systrace working for 32 bit apps as well as complete
ktrace support for "trace_enter/trace_exit" using platforms such as amd64.
XXX: systrace isn't supported on sparc64 currently... (it doesn't use
trace_enter/trace_exit, or have it's own calls to systrace_xxx()...)
the main purpose of this function is to adjust the "argsize" value of
the ktrace syscall record, otherwise userland will see N/2 (rounded
down) arguments instead of N.
raised the exception, don't release the lock, this causes a crash (the lock
shall be released by the process that took it). Wakeup the thread instead,
it will release the lock itself.
emul.darwin.init.pid instead of emul.darwin.init_pid, and so on.
This breaks backward compatibility with the pre-dynamic sysctl(8) for
emul.darwin, but it has never been available in a formal release, so
it should be alright.
remote process. This new implementation also passes all the test programs
I've written so far.
- When exceptions come from traps, no UNIX signal should evet be sent.
- Add a lock to ensure a debugger handles only one exception at a time
- Use a structure to hold flavor and behavior in exception ports, instead
of stuffing the two argument into an int.
- Implement new Mach services: thread_suspend, thread_resume and thread_abort
- Implement Darwin's ptrace PT_ATTACHEXC and PT_THUPDATE commands
- Handle NULL second argument correctly in sigprocmask.
- One mistake in the last commit (darwin_tracesig prototype)
fit what it does.
The softsignal feature is used in Darwin to trace processes. When the
traced process gets a signal, this raises an exception. The debugger will
receive the exception message, use ptrace with PT_THUPDATE to pass the
signal to the child or discard it, and then it will send a reply to the
exception message, to resume the child.
With the hook at the beginnng of kpsignal2, we are in the context of the
signal sender, which can be the kill(1) command, for instance. We cannot
afford to sleep until the debugger tells us if the signal should be
delivered or not.
Therefore, the hook to generate the Mach exception must be in the traced
process context. That was we can sleep awaiting for the debugger opinion
about the signal, this is not a problem. The hook is hence located into
issignal, at the place where normally SIGCHILD is sent to the debugger,
whereas the traced process is stopped. If the hook returns 0, we bypass
thoses operations, the Mach exception mecanism will take care of notifying
the debugger (through a Mach exception), and stop the faulting thread.
exec case, as the emulation already has the ability to intercept that
with the e_proc_exec hook. It is the responsability of the emulation to
take appropriaye action about lwp_emuldata in e_proc_exec.
Patch reviewed by Christos.
argument, large sigset), and the older sigprocset (no old set argument,
small sigset). It feature old set argument and small sigset.
We now emulates this correctly.
Add some methods to IOFramebuffer (DARWIN_IOFBSETBOUNDS,
DARWIN_IOFBSETCURSORVISIBLE) and to IOHIDSystem (DARWIN_IOHIDPOSTEVENT),
all are unimplemented empty shells.
receiver namespace.
While we are there, refactor mach_msg_overwrite by splitting it into
several smaller functions. It had grown too big to be easily maintainable.
Gone are the old kern_sysctl(), cpu_sysctl(), hw_sysctl(),
vfs_sysctl(), etc, routines, along with sysctl_int() et al. Now all
nodes are registered with the tree, and nodes can be added (or
removed) easily, and I/O to and from the tree is handled generically.
Since the nodes are registered with the tree, the mapping from name to
number (and back again) can now be discovered, instead of having to be
hard coded. Adding new nodes to the tree is likewise much simpler --
the new infrastructure handles almost all the work for simple types,
and just about anything else can be done with a small helper function.
All existing nodes are where they were before (numerically speaking),
so all existing consumers of sysctl information should notice no
difference.
PS - I'm sorry, but there's a distinct lack of documentation at the
moment. I'm working on sysctl(3/8/9) right now, and I promise to
watch out for buses.
so that a specific emulation has the oportunity to filter out some signals.
if sigfilter returns 0, then no signal is sent by kpsignal2().
There is another place where signals can be generated: trapsignal. Since this
function is already an emulation hook, no call to the sigfilter hook was
introduced in trapsignal.
This is needed to emulate the softsignal feature in COMPAT_DARWIN (signals
sent as Mach exception messages)
Exceptions coming from a trap are generated from darwin_trapsignal()
softsignals are from darwin_sigfilter(), a function that is called
from darwin_trapsignal() and from kpsignal2() [the latter from a
emulation specific hook which is not yet committed]
Make some sanity checks to avoid sending data to a port with no receiver.
See http://mail-index.netbsd.org/tech-kern/2003/12/01/0008.html and
follow-ups for details.
wrong on the semantic front; the spurious wakeup confuses Darwin's gdb.
Allow vm, task and thread operations on remote processes. The code to pick up
the remote process is in mach_sys_msg_trap(), so that any Mach service can
use it.
rename FPBASE to _FPBASE, so that we avoid polluting the user's
name space when e.g. <sys/ptrace.h> is included. Previously, the
PC symbol in mips/regnum.h would conflict with the declaration of
the external variable by the same name in termcap.h, as discovered
by the ``okheaders'' regression test.
blocked in the kernel. The task that catched the exception may unblock
it by sending a reply to the exception message (Of course it will have
to change something so that the exception is not immediatly raised again).
Handling of this reply is a bit complicated, as the kernel acts as the
client instead of the server. In this situation, we receive a message
but we will not send any reply (the message we receive is already a reply).
I have not found anything better than a special case in
mach_msg_overwrite_trap() to handle this.
A surprise: exceptions ports are preserved accross forks.
While we are there, use appropriate 64 bit types for make_memory_entry_64.
may turn into exceptions on Mach: a small message sent by the kernel to
the task that requested the exception.
On Darwin, when an exception is sent, no signal can be delivered.
TODO: more exceptions: arithmetic, bad instructions, emulation, s
software, and syscalls (plain and Mach). There is also RPC alert, but
I have no idea about what it is.
While we are there, remove some user ktrace in notification code, and add
a NODEF qualifier in mach_services.master: it will be used for notifications
and exceptions, where the kernel is always client and never server: we
don't want the message to be displayed as "unimplemented xxx" in kdump (thus
UNIMPL is not good), but we don't want to generate the server prototype
(therefore, STD is not good either). NODEF will declare it normally in the
name tables without creating the prototype.
will have unimplemented services showing their names in ktrace
Add a new generated file with only service id and name, which will
be included by kdump to display services names.
This removes the need for using the user ktrace facility for services names.
1) make sure Mach servers will not work on data beyond the end of the
request message buffer.
2) make sure that on copying out the reply message buffer, we will not
leak kernel data located after the buffer.
3) make sure that the server will not overwrite memory beyond the end
of the reply message buffer. That check is the responsability of the
server, there is just a DIAGNOSTIC test to check everything is in
good shape. All currently implemented servers in NetBSD have been
modified to check for this condition
While we are here, build the mach services table (formerly in mach_namemap.c)
and the services prototypes automatically from mach_services.master, just
as this is done for system calls.
The next step would be to fold the message formats in the mach_services.master
file, but this tends to be difficult, as some messages are quite long and
complex.
copyin() or copyout().
uvm_useracc() tells us whether the mapping permissions allow access to
the desired part of an address space, and many callers assume that
this is the same as knowing whether an attempt to access that part of
the address space will succeed. however, access to user space can
fail for reasons other than insufficient permission, most notably that
paging in any non-resident data can fail due to i/o errors. most of
the callers of uvm_useracc() make the above incorrect assumption. the
rest are all misguided optimizations, which optimize for the case
where an operation will fail. we'd rather optimize for operations
succeeding, in which case we should just attempt the access and handle
failures due to insufficient permissions the same way we handle i/o
errors. since there appear to be no good uses of uvm_useracc(), we'll
just remove it.
of the sibling list so that find_stopped_child can be optimised to avoid
traversing the entire sibling list - helps when a process has a lot of
children.
- Modify locking in pfind() and pgfind() to that the caller can rely on the
result being valid, allow caller to request that zombies be findable.
- Rename pfind() to p_find() to ensure we break binary compatibility.
- Remove svr4_pfind since p_find willnow do the job.
- Modify some of the SMP locking of the proc lists - signals are still stuffed.
Welcome to 1.6ZF
While we are here, try to tag machine dependent functions in header files.
also transformed darwin_ppc_*_state into mach_ppc_*_state, as this is
what they really are (COMPAT_DARWIN is on the top of COMPAT_MACH, not the
other way around)
While we are there, resolved another mystery: the unallocated port described
in the comment removed by this commit was in fact allocated by mach_task_pid().