merge the two emul_irix structures; the only difference was
setregs function, which can be handled by exec-specific setregs hook
rename setregs_n32() to irix_n32_setregs(), and make it suitable
as the exec-specific setregs hook
make irix_check_exec() a macro now that just single compare
it checks both the alternative/emul tree, and the non-emul tree.
This makes it possible to run chrooted emulated binaries without need
to setup shadow /emul tree within the chroot hierarchy.
Only tested for COMPAT_LINUX, changes to other compat modules were
mechanical.
Fixes kern/19161 by Christian Groessler.
kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals
kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)
based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe
private to the process within the share group.
There is one bit missing in this implementation: when replicating a change
in a process VM to the other process of the share group, we avoid copying
mappings for private regions in the target process, but we don't prevent
copying private regions from the source process.
it actually fixes a problem:
When /bin/sh gets a SIGSEGV, its signal handler calls brk and the offending
instruction is retried. Usually it gets another SIGSEGV, and things loops
until it pases without the SIGSEGV. This is the normal mode of operation, and
it can be reproduced on IRIX by a 10kB shell script starting by echo /*
However... the signal handler checks for BADVADDR in the saved registers
in struct sigcontext. If it does not find it, it gives up and exit instead
of retrying. Filling the field enables us to carry on normal operation
(which is to get dozens of SIGSEGV) instead of getting a failure at the
first SIGSEGV.
memory fault handler. IRIX uses irix_vm_fault, and all other emulation
use NULL, which means to use uvm_fault.
- While we are there, explicitely set to NULL the uninitialized fields in
struct emul: e_fault and e_sysctl on most ports
- e_fault is used by the trap handler, for now only on mips. In order to avoid
intrusive modifications in UVM, the function pointed by e_fault does not
has exactly the same protoype as uvm_fault:
int uvm_fault __P((struct vm_map *, vaddr_t, vm_fault_t, vm_prot_t));
int e_fault __P((struct proc *, vaddr_t, vm_fault_t, vm_prot_t));
- In IRIX share groups, all the VM space is shared, except one page.
This bounds us to have different VM spaces and synchronize modifications
to the VM space accross share group members. We need an IRIX specific hook
to the page fault handler in order to propagate VM space modifications
caused by page faults.
This merge changes the device switch tables from static array to
dynamically generated by config(8).
- All device switches is defined as a constant structure in device drivers.
- The new grammer ``device-major'' is introduced to ``files''.
device-major <prefix> char <num> [block <num>] [<rules>]
- All device major numbers must be listed up in port dependent majors.<arch>
by using this grammer.
- Added the new naming convention.
The name of the device switch must be <prefix>_[bc]devsw for auto-generation
of device switch tables.
- The backward compatibility of loading block/character device
switch by LKM framework is broken. This is necessary to convert
from block/character device major to device name in runtime and vice versa.
- The restriction to assign device major by LKM is completely removed.
We don't need to reserve LKM entries for dynamic loading of device switch.
- In compile time, device major numbers list is packed into the kernel and
the LKM framework will refer it to assign device major number dynamically.
usync_cntl() system calls.
- when usync_cntl is used and the process is aborted (eg: by kill -9)
libc does not call usync_cntl() to unblock things. We have to cleanup
data allocated in the kernel. This is now done through the emulation
specific exit hook
- IRIX initialize some data in the system part of the PRDA: the pid and
a prid (PRDA ID?). We initialize both to pid.
- Move back struct irix_share_group from irix_exec.h to irix_prctl.h, it
is more revelant here.
- fix a few typos
* struct sigacts gets a new sigact_sigdesc structure, which has the
sigaction and the trampoline/version. Version 0 means "legacy kernel
provided trampoline". Other versions are coordinated with machine-
dependent code in libc.
* sigaction1() grows two more arguments -- the trampoline pointer and
the trampoline version.
* A new __sigaction_sigtramp() system call is provided to register a
trampoline along with a signal handler.
* The handler is no longer passed to sensig() functions. Instead,
sendsig() looks up the handler by peeking in the sigacts for the
process getting the signal (since it has to look in there for the
trampoline anyway).
* Native sendsig() functions now select the appropriate trampoline and
its arguments based on the trampoline version in the sigacts.
Changes to libc to use the new facility will be checked in later. Kernel
version not bumped; we will ride the 1.6C bump made recently.
private area called PRDA that remains unshared. We implement this by using
different vmspace for each share group member, and keeping the memory
appings in sync on each mmap/munmap/mprotect/break...
We use irix_saddr_sync_vmcmd and irix_saddr_sync_syscall to apply a
vmcmd or a syscall to all share group member, this makes the job a bit
easier.
Also implements {get|set}rlimit{64}.
- First implementation of procblk(). THis is supposed to suspend processes.
We emulate this by sending a SIGSTOP, which is not very accurate since
on IRIX, sending a SIGCONT to a process suspended by procblk() will not
resume it.
- support for shared groups
poll will return true until the semaphore is blocked again, but before the
semaphore is blocked, poll returns false.
We do this by maintaining another queue of "released" processes in
struct irix_usema_rec. Unblocking causes the waiting process record to be
moved to the released queue, and poll check for the process in this released
queue.
a SIGSEGV when sigaction(2) is used before a fork(2) and a signal is received
in the child.
- we now nearly correctly emulate PR_TERMCHILD in prctl(2). (the perfect
emulation would not send a SIGHUP if the parent is killed)
return the number of processes waiting on the semaphore. We now maintiain
a count of waiting processes.
- Blocked processes are unblocked "first in, first out". We now have a
queue of waiting processes on a asemaphores, so that we can wakeup the
first blocked process.
Problems:
- We now have a lot of dynamic memory allocation, it may be a bit slow.
- Nothing is SMP safe for now. We need to add locks.
- On close, we forget about a semaphore, which is incorrect. One process
can close its fd attached on a semaphore, but other processes would carry
on using it. Since any process can join a shared arena, this is not an
easy thing to solve.
- A lot of usema/usync functionnalities are still to be discovered.
successfully emulates a few test program that use poll semaphores,
including the attach-to-file-descriptor-and-select feature.
There are a few issues:
1) at least one ioctl need to set retval. We handle this in irix_sys_ioctl()
by replacing the data argument by a pointer to a strucutre in the stackgap
that carries the real data and retval. The underlying ioctl methods can
therefore retreive both data and retval.
2) usemaclone is a cloning device: each time it is open, it creates a new
context, and ioctl operation on each open file descriptor will lead to
different behavior. This functionnality is available in NetBSD through the
devvp branch. This first implementation does not use devvp yet, but this
should be done later. Currently, we create a new vnode, and we provide our
own vnode operations. Some operation are applied to the cloned vnode, others
are applied to the original vnode. The v_data field is used to hold a
reference to the original vnode so that we can work on it.
3) at least the setattr vnode operation needs some customisation: IRIX
libc relies on the fact that fchmod on /dev/usema will return 0 in case
of failure.