Commit Graph

5327 Commits

Author SHA1 Message Date
ad
057666ad0c setrunnable: adjust to slightly different locking strategy post yamt-idlewlp.
Should fix kern/36398. Untested due to connectivity issues.
2007-05-31 22:06:09 +00:00
rmind
59085afd2c Make AIO initialization MP-safe.
Actually, lwp_exit() with (l != curlwp) will not work.
This fix might be pulled up from vmlocking branch.
2007-05-31 06:24:23 +00:00
rmind
0a227b1913 - Make aio_worker to handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.
2007-05-31 05:29:43 +00:00
dyoung
8c4b63fb77 Do not run ctags on sys/dev/usb/usb_port.h. Its #defines shadow
NetBSD symbols, such as clalloc(), that one might one to tag to.
2007-05-26 05:34:04 +00:00
tnn
6380d93405 When renaming, copy the new name into the designated memory area.
Tested by martti@
2007-05-22 10:39:10 +00:00
dsl
b113bdbde9 Fix logic inversion - probably PR kern/36284 2007-05-21 18:30:35 +00:00
christos
c61eed39a8 rename si_sigval -> si_value to match POSIX RTS. 2007-05-21 15:35:47 +00:00
skrll
5492d86688 Correct comment. 2007-05-21 11:56:35 +00:00
christos
09a50be501 - remove pathname_ interface.
- use macros to deal with pathnames in userspace, when veriexec is used.
- reorder the veriexec_ call arguments for consistency.
With help from elad@ finding the last bug.
2007-05-19 22:11:22 +00:00
yamt
f03010953f merge yamt-idlelwp branch. asked by core@. some ports still needs work.
from doc/BRANCHES:

	idle lwp, and some changes depending on it.

	1. separate context switching and thread scheduling.
	   (cf. gmcgarry_ctxsw)
	2. implement idle lwp.
	3. clean up related MD/MI interfaces.
	4. make scheduler(s) modular.
2007-05-17 14:51:11 +00:00
hannken
64b7e5637e Fstrans_start() always returns zero, so change its type to void. 2007-05-17 07:26:21 +00:00
christos
50ab9d6934 - since mknod now can create regular files, make sure veriexec allows it.
Done in a way to minimize ifdefs. Per discussions with elad.
2007-05-17 00:46:30 +00:00
hannken
0453160a52 Use rwlock for fmi_shared_lock and fmi_lazy_lock.
Ok: Andrew Doran <ad@netbsd.org>
2007-05-16 16:11:56 +00:00
elad
6700cfccd6 Some Veriexec stuff that's been rotting in my tree for months.
Bug fixes:
  - Fix crash reported by Scott Ellis on current-users@.

  - Fix race conditions in enforcing the Veriexec rename and remove
    policies. These are NOT security issues.

  - Fix memory leak in rename handling when overwriting a monitored
    file.

  - Fix table deletion logic.

  - Don't prevent query requests if not in learning mode.


KPI updates:
  - fileassoc_table_run() now takes a cookie to pass to the callback.

  - veriexec_table_add() was removed, it is now done internally. As a
    result, there's no longer a need for VERIEXEC_TABLESIZE.

  - veriexec_report() was removed, it is now internal.

  - Perform sanity checks on the entry type, and enforce default type
    in veriexec_file_add() rather than in veriexecctl.

  - Add veriexec_flush(), used to delete all Veriexec tables, and
    veriexec_dump(), used to fill an array with all Veriexec entries.


New features:
  - Add a '-k' flag to veriexecctl, to keep the filenames in the kernel
    database. This allows Veriexec to produce slightly more accurate
    logs under certain circumstances. In the future, this can be either
    replaced by vnode->pathname translation, or combined with it.

  - Add a VERIEXEC_DUMP ioctl, to dump the entire Veriexec database.
    This can be used to recover a database if the file was lost.
    Example usage:

        # veriexecctl dump > /etc/signatures

    Note that only entries with the filename kept (that is, were loaded
    with the '-k' flag) will be dumped.

    Idea from Brett Lymn.

  - Add a VERIEXEC_FLUSH ioctl, to delete all Veriexec entries. Sample
    usage:

        # veriexecctl flush

  - Add a 'veriexec_flags' rc(8) variable, and make its default have
    the '-k' flag. On systems using the default signatures file
    (generaetd from running 'veriexecgen' with no arguments), this will
    use additional 32kb of kernel memory on average.

  - Add a '-e' flag to veriexecctl, to evaluate the fingerprint during
    load. This is done automatically for files marked as 'untrusted'.


Misc. stuff:
  - The code for veriexecctl was massively simplified as a result of
    eliminating the need for VERIEXEC_TABLESIZE, and now uses a single
    pass of the signatures file, making the loading somewhat faster.

  - Lots of minor fixes found using the (still under development)
    Veriexec regression testsuite.

  - Some of the messages Veriexec prints were improved.

  - Various documentation fixes.


All relevant man-pages were updated to reflect the above changes.

Binary compatibility with existing veriexecctl binaries is maintained.
2007-05-15 19:47:43 +00:00
dsl
2e12e4f4e1 Fallout from caddr_t deletion - remove a load of redundant (void *) casts. 2007-05-13 20:24:21 +00:00
dsl
9bdbb03424 nanosleep1() shouldn't try to get the current time into a NULL address. 2007-05-13 19:51:35 +00:00
dsl
f23edc42dd Instead of the #define versions of tc_getfrequency() and nanouptime(), use
the function ones in kern_kern_clock.c (adding tc_getfrequency).
Adjust includes so this builds.
2007-05-13 14:43:52 +00:00
dsl
88e6c5604d Add a #define for nanouptime() in the !__HAVE_TIMECOUNTERS case. 2007-05-13 10:58:50 +00:00
dsl
1c85a3efd8 Split sys_nanosleep(). 2007-05-13 10:34:25 +00:00
dsl
701496b5c6 Split the fcntl locking code out from its copyin/out.
Use to avoid all the stackgap stuff in compat code.
2007-05-12 23:02:49 +00:00
dsl
ef3fdc4a07 Change interface to settimeofday1() so that it can also be used from
compat code in order to avoid the stackgap.
2007-05-12 20:27:13 +00:00
dsl
c83f8a10ad Change the compat sys_[fl]utime code to not use the stackgap. 2007-05-12 17:28:19 +00:00
dsl
f56bfb975c Add the child 'rusage' of an exiting process to its own 'rusage' exactly
once, and prior to passing it to the caller of sys_wait4() and at the same
time as adding it to the parent.
Commands like:
time sh -c 'i=0; while [ $i -lt 1000 ]; do i=$(expr $i + 1); done'
now give same output.
2007-05-08 20:10:14 +00:00
manu
31b57f40ff Add the TFTPROOT kernel option for TFTP'ing root RAMdisk at root mount time.
This allows working around situations where a kernel with embedded RAMdisk
cannot be booted by the bootloader because the RAMdisk is too big.
2007-05-08 06:10:27 +00:00
rmind
10c3d35ca0 Rename vfs_aio.c to sys_aio.c as decided in <tech-kern>.
Please note, that <tech-kern> people should note about
file names before commit. Otherwise, function may fail
with errno set to EDIRTY, and return -1. ;)
2007-05-07 22:22:20 +00:00
dsl
1844147fa9 Split sys_wait4() so that compat code can fiddle with the returned 'status'
and 'rusage' without having to copy data to/from stackgap buffers.
The old split (find_stopped_child) could be removed.
amd64 seems to run netbsd32, linux and linux32 emulations. sparc64 compiles.
2007-05-07 16:53:17 +00:00
dsl
832ca390e2 Add child rusage values to exiting process in 'find_stopped_child'
so that it is (correctlly) available to the caller of wait4().
The self and child rusage values remain split for zombies.
2007-05-07 09:30:14 +00:00
dyoung
e1d4e2922e In AppleTalk, IPv4, and IPv6 routing domains, help sockaddr_cmp()
avoid an indirect function call by comparing the family, length,
and bytes [dom->dom_sa_cmpofs, dom->dom_sa_cmpofs + dom->dom_sa_cmplen),
corresponding to the the sockaddrs' "address" members.

For ISO, actually use sockaddr_iso_cmp, for a change.  Thanks to
yamt@ for pointing out my error.
2007-05-06 02:56:37 +00:00
ad
501930d97e aio_init: limit wmesg strings to 8 characters. 2007-05-05 20:38:43 +00:00
yamt
c9ba84ac33 aio_worker: exit properly. 2007-05-04 14:28:40 +00:00
rmind
29cb26a639 - Make aio_listio_max and aio_max changeable via sysctl.
- Set a lower priority for AIO-worker thread, because current could cause
  interactivity problems (eg. with qemu - thanks <xtraeme> for testing).
  Mark it as XXX for now - after priority model change, this should
  be reconsidered anyway.
- Do not copyout() with lock held in sys_aio_cancel().
- Fix a leak of the lock in aio_process().
- Check for any error of cv_wait_sig().
- Cache p->p_aio in aio_exit().

Thanks <ad> for catching the issues!
2007-05-03 22:03:40 +00:00
dyoung
72f0a6dfb0 Eliminate address family-specific route caches (struct route, struct
route_in6, struct route_iso), replacing all caches with a struct
route.

The principle benefit of this change is that all of the protocol
families can benefit from route cache-invalidation, which is
necessary for correct routing.  Route-cache invalidation fixes an
ancient PR, kern/3508, at long last; it fixes various other PRs,
also.

Discussions with and ideas from Joerg Sonnenberger influenced this
work tremendously.  Of course, all design oversights and bugs are
mine.

DETAILS

1 I added to each address family a pool of sockaddrs.  I have
  introduced routines for allocating, copying, and duplicating,
  and freeing sockaddrs:

        struct sockaddr *sockaddr_alloc(sa_family_t af, int flags);
        struct sockaddr *sockaddr_copy(struct sockaddr *dst,
                                       const struct sockaddr *src);
        struct sockaddr *sockaddr_dup(const struct sockaddr *src, int flags);
        void sockaddr_free(struct sockaddr *sa);

  sockaddr_alloc() returns either a sockaddr from the pool belonging
  to the specified family, or NULL if the pool is exhausted.  The
  returned sockaddr has the right size for that family; sa_family
  and sa_len fields are initialized to the family and sockaddr
  length---e.g., sa_family = AF_INET and sa_len = sizeof(struct
  sockaddr_in).  sockaddr_free() puts the given sockaddr back into
  its family's pool.

  sockaddr_dup() and sockaddr_copy() work analogously to strdup()
  and strcpy(), respectively.  sockaddr_copy() KASSERTs that the
  family of the destination and source sockaddrs are alike.

  The 'flags' argumet for sockaddr_alloc() and sockaddr_dup() is
  passed directly to pool_get(9).

2 I added routines for initializing sockaddrs in each address
  family, sockaddr_in_init(), sockaddr_in6_init(), sockaddr_iso_init(),
  etc.  They are fairly self-explanatory.

3 structs route_in6 and route_iso are no more.  All protocol families
  use struct route.  I have changed the route cache, 'struct route',
  so that it does not contain storage space for a sockaddr.  Instead,
  struct route points to a sockaddr coming from the pool the sockaddr
  belongs to.  I added a new method to struct route, rtcache_setdst(),
  for setting the cache destination:

        int rtcache_setdst(struct route *, const struct sockaddr *);

  rtcache_setdst() returns 0 on success, or ENOMEM if no memory is
  available to create the sockaddr storage.

  It is now possible for rtcache_getdst() to return NULL if, say,
  rtcache_setdst() failed.  I check the return value for NULL
  everywhere in the kernel.

4 Each routing domain (struct domain) has a list of live route
  caches, dom_rtcache.  rtflushall(sa_family_t af) looks up the
  domain indicated by 'af', walks the domain's list of route caches
  and invalidates each one.
2007-05-02 20:40:22 +00:00
yamt
6bcb315f7d lockdebug_abort: s/int/u_int/ for lock id as the rest of code. 2007-05-02 14:07:02 +00:00
rmind
0994dd0691 - Create sysctl nodes for AIO.
- Add POSIX defined system variables and constants of AIO_LISTIO_MAX and
  AIO_MAX values.  Both with _POSIX_ASYNCHRONOUS_IO, provide them in
  sysconf(3) and getconf(1) interfaces.
- Clean up sysconf(3) for handling sysctl nodes dynamically.
2007-05-01 01:01:25 +00:00
dsl
e6918d8f47 Remove proc->p_ru and the 'rusage' pool.
I think it existed to cache the numbers in kernel memory of a zombie when
proc->p_stats was part of the 'u' area - so got freed earlier and wouldn't
(easily) be accessible from a separate process.  However since both the
p_ru and p_stats fields are freed at the same time it is no longer needed.
Ride the recent 4.99.19 version change.
2007-04-30 20:11:41 +00:00
rmind
9c025db4ef Regen syscalls for AIO. 2007-04-30 14:47:32 +00:00
rmind
67d703cf25 Import of POSIX Asynchronous I/O.
Seems to be quite stable. Some work still left to do.

Please note, that syscalls are not yet MP-safe, because
of the file and vnode subsystems.

Reviewed by: <tech-kern>, <ad>
2007-04-30 14:44:28 +00:00
dsl
0df00dcf55 Split the statvfs functions so that the 'work' is done to a kernel buffer
which can either be copied directly to userspace, or converted then copied.
Saves replicating a lot of code in the compat functions (esp. for
getvfsstat) at a cast of an extra function call in the non-emulated case -
which is unlikely to be measurable given the other costs of the actions
involved (even on vax).
Remove dofhstat() and dofhstatvfs() (and the last caller).
Remove some redundant stackgap_init() calls.
2007-04-30 08:32:14 +00:00
msaitoh
8ce1f4fff2 fix typos 2007-04-29 20:23:34 +00:00
isaki
e7c552f22e Fix format of the combination of 'F\B\L' and ':\V' in
bitmask_snprintf(9).
2007-04-28 13:11:53 +00:00
dsl
e9a2689558 Since ktrace/systrace can sleep, move the VREF(dp) to before them. 2007-04-26 21:21:44 +00:00
dsl
41bef1b523 Be a little less over-zelous about converting ".." at the emulation root
to the real root.  Rather that do the check inside lookup() - where it
applies to to every ".." in a pathname, explicitly check the start of
the caller-supplied buffers and any absolute symbolic links.
Note that in the latter case the re-search from the real root is supressed.
Should fix PR kern/36225
2007-04-26 20:58:37 +00:00
dsl
9f6d43522e Pass the emulation root string into namei() from emul_find_interp() so that
the ktrace entries for lookups done during exec can have the full filename.
This is rather a hack :-)
2007-04-26 20:06:55 +00:00
dsl
7a81c4d42e Move the ktrace (and systrace) in namei() inside the retry loop for
emulation lookups.
If doing a lookup relative to the emulation root, prepend the emulation root
to the traced filename.
While here pass the filename length through to the ktrace code since namei()
knows the length and ktr_namei() would have to call strlen().
Note: that if namei() is being called during execve processing, the emulation
root name isn't available and "/emul/???" is used.  Also namei() has to use
strlen() to get the lenght on the emulatoon root - even though it is a
compile-time constant string.
2007-04-26 16:27:32 +00:00
dsl
47799dd2af Move the place where we convert the return value of emulation lookups that
would return the emulation-root to the real root to the main exit path.
Means that lookups of both "/" and "/." get converted from "/emul/xxx" to "/".
2007-04-25 20:41:42 +00:00
dsl
0182ef09da When we return the real root instead of the emulated root, we may
not have the parent vnode for the emulated root - so dont vput() it.
May fix PR kern/36197.
2007-04-23 07:04:30 +00:00
dsl
2ad47f228f I'm not sure why I decided that cwdinit() shouldn't copy cwd_edir.
Since this is called in fork() it does rather need to give the child
process the parent's emulation root.
This means that (for example) an emulated shell will, by default, run
programs from the emulation root.
2007-04-22 18:41:49 +00:00
dsl
b8fbaf8c4b Change the way that emulations locate files within the emulation root to
avoid having to allocate space in the 'stackgap'
  - which is very LWP unfriendly.
The additional code for non-emulation namei() is trivial, the reduction for
  the emulations is massive.
The vnode for a processes emulation root is saved in the cwdi structure
  during process exec.
If the emulation root the TRYEMULROOT flag are set, namei() will do an initial
  search for absolute pathnames in the emulation root, if that fails it will
  retry from the normal root.
".." at the emulation root will always go to the real root, even in the middle
  of paths and when expanding symlinks.
Absolute symlinks found using absolute paths in the emulation root will be
  relative to the emulation root (so /usr/lib/xxx.so -> /lib/xxx.so links
  inside the emulation root don't need changing).
If the root of the emulation would be returned (for an emulation lookup), then
  the real root is returned instead (matching the behaviour of emul_lookup,
  but being a cheap comparison here) so that programs that scan "../.."
  looking for the root dircetory don't loop forever.
The target for symbolic links is no longer mangled (it used to get the
  CHECK_ALT_xxx() treatment, so could get /emul/xxx prepended).
CHECK_ALT_xxx() are no more. Most of the change is deleting them, and adding
  TRYEMULROOT to the flags to NDINIT().
A lot of the emulation system call stubs could now be deleted.
2007-04-22 08:29:55 +00:00
ad
b0c22204d2 process_stoptrace: after setting a pending stop on curproc, call issignal
once to have it do the needful. PR kern/36161.
2007-04-19 22:42:10 +00:00
yamt
3829d825af malloc: fix a deadlock. 2007-04-19 11:03:44 +00:00