Commit Graph

5345 Commits

Author SHA1 Message Date
dsl 0699e1c8bf Move the point at which sys_readv and sys_preadv (and writev) get merged
so that the same common code can be used with a kernel-resident 'iov'
array from the 32-bit compat code (which currently has its own copy
of these routines).
2007-06-16 20:48:03 +00:00
ad a2884738ea Nuke __HAVE_SPLBIGLOCK. 2007-06-15 20:59:38 +00:00
ad 029f4f9cd7 splstatclock, spllock -> splhigh 2007-06-15 20:17:07 +00:00
ad bd5831ff06 proc_free: avoid a potential race where we could free struct proc before
the last LWP in the process is off the CPU. Noted by yamt@.
2007-06-15 18:29:53 +00:00
ad 71d19c248a - ksem_proc_dtor: fix a use-after-free
- LOCK_ASSERT -> KASSERT
- Use kmem(9)
2007-06-15 18:27:13 +00:00
dyoung db12d3f8a6 #include sys/bootblocks.h for its MBR #definitions. 2007-06-14 17:18:40 +00:00
yamt 3aa0b315cd proc_drainrefs: fix the case of exec failure. 2007-06-14 14:29:50 +00:00
yamt b1cae5b7e6 exit_lwps: fix a deadlock. 2007-06-13 12:14:10 +00:00
christos 7754b3471a remove an unneeded cast and merge one more switch case. 2007-06-08 17:51:41 +00:00
christos 19a2c6c6d2 - only unlock if we're dealing with a process.
- use the right mutex.
2007-06-08 17:49:13 +00:00
hannken 6087f7cc14 Dounmount(): rearrange mountlist_slock. vfs_allocate_syncvnode() may sleep
getting a new vnode so it must not be called with this simple_lock taken.

Fixes PR #36395
2007-06-07 10:03:12 +00:00
yamt da51d139a4 improve post-ubc file overwrite performance in common cases.
ie. when it's safe, actually overwrite blocks rather than doing
read-modify-write.

also fixes PR/33152 and PR/36303.
2007-06-05 12:31:30 +00:00
dsl 7ba299c5d4 Split sys__lwp_park() so that the compat/netbsd32 code can copyin and convert
its timeout then call the standard function.
2007-06-03 09:50:12 +00:00
dsl b38af594ea Move the #if at the top of trace_enter/exit back above the declaration of 'p'
(where it used to be in rev 1.147) so that this code compiles when none of
the trace options are in use.
Fixes PR kern/36431
2007-06-03 07:47:50 +00:00
dsl 21d1d4f346 Instead of unconditionally initialising the ktriov and conditionally
copying in aiov, just unconditionally copy in aiov.
Probably saves a mispredicted branch and a data cache miss - as well as
removing some code.
2007-06-02 13:38:31 +00:00
enami d35ef328a7 - Fix obvious typos so that sendto(2) works.
- Wrap lines again.
2007-06-02 01:24:34 +00:00
dsl d7f93c5c67 Split sys_bind() and sys_connect() so that compat code can use common code
once the 'address' has been copied into an mbuf.
Add extra flags for 'struct msghdr.msg_flags' to indicate that the address
  and control are already in mbufs, and that the uio structure is in userspace
  for sending data, rename sendit() to do_sys_sendmsg() to ensure no old code
  passes in random flags.
Changes to compat code to use new functions - removing some stackgap use.
Fix a 'use after free' in compat_43_sys_recvmsg.
I ***THINK*** the code that converts 'cmsg' formatted data is borked!
svr4_stream.c ought to be generated from svr4_32_stream.c during the build.
2007-06-01 22:53:52 +00:00
dsl d23c3b01a0 Add a ktrkuser() function that can be used to generate a KTR_USER trace
entry from kernel-resident data.
Mainly so I can (ab)use the KTR_USER entry for extra info.
2007-06-01 20:24:21 +00:00
ad 057666ad0c setrunnable: adjust to slightly different locking strategy post yamt-idlelwp.
Should fix kern/36398. Untested due to connectivity issues.
2007-05-31 22:06:09 +00:00
rmind 59085afd2c Make AIO initialization MP-safe.
Actually, lwp_exit() with (l != curlwp) will not work.
This fix might be pulled up from vmlocking branch.
2007-05-31 06:24:23 +00:00
rmind 0a227b1913 - Make aio_worker handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.
2007-05-31 05:29:43 +00:00
dyoung 8c4b63fb77 Do not run ctags on sys/dev/usb/usb_port.h. Its #defines shadow
NetBSD symbols, such as clalloc(), that one might want to tag.
2007-05-26 05:34:04 +00:00
tnn 6380d93405 When renaming, copy the new name into the designated memory area.
Tested by martti@
2007-05-22 10:39:10 +00:00
dsl b113bdbde9 Fix logic inversion - probably PR kern/36284 2007-05-21 18:30:35 +00:00
christos c61eed39a8 rename si_sigval -> si_value to match POSIX RTS. 2007-05-21 15:35:47 +00:00
skrll 5492d86688 Correct comment. 2007-05-21 11:56:35 +00:00
christos 09a50be501 - remove pathname_ interface.
- use macros to deal with pathnames in userspace, when veriexec is used.
- reorder the veriexec_ call arguments for consistency.
With help from elad@ finding the last bug.
2007-05-19 22:11:22 +00:00
yamt f03010953f merge yamt-idlelwp branch. asked by core@. some ports still need work.
from doc/BRANCHES:

	idle lwp, and some changes depending on it.

	1. separate context switching and thread scheduling.
	   (cf. gmcgarry_ctxsw)
	2. implement idle lwp.
	3. clean up related MD/MI interfaces.
	4. make scheduler(s) modular.
2007-05-17 14:51:11 +00:00
hannken 64b7e5637e Fstrans_start() always returns zero, so change its type to void. 2007-05-17 07:26:21 +00:00
christos 50ab9d6934 - since mknod now can create regular files, make sure veriexec allows it.
Done in a way to minimize ifdefs. Per discussions with elad.
2007-05-17 00:46:30 +00:00
hannken 0453160a52 Use rwlock for fmi_shared_lock and fmi_lazy_lock.
Ok: Andrew Doran <ad@netbsd.org>
2007-05-16 16:11:56 +00:00
elad 6700cfccd6 Some Veriexec stuff that's been rotting in my tree for months.
Bug fixes:
  - Fix crash reported by Scott Ellis on current-users@.

  - Fix race conditions in enforcing the Veriexec rename and remove
    policies. These are NOT security issues.

  - Fix memory leak in rename handling when overwriting a monitored
    file.

  - Fix table deletion logic.

  - Don't prevent query requests if not in learning mode.


KPI updates:
  - fileassoc_table_run() now takes a cookie to pass to the callback.

  - veriexec_table_add() was removed, it is now done internally. As a
    result, there's no longer a need for VERIEXEC_TABLESIZE.

  - veriexec_report() was removed, it is now internal.

  - Perform sanity checks on the entry type, and enforce default type
    in veriexec_file_add() rather than in veriexecctl.

  - Add veriexec_flush(), used to delete all Veriexec tables, and
    veriexec_dump(), used to fill an array with all Veriexec entries.


New features:
  - Add a '-k' flag to veriexecctl, to keep the filenames in the kernel
    database. This allows Veriexec to produce slightly more accurate
    logs under certain circumstances. In the future, this can be either
    replaced by vnode->pathname translation, or combined with it.

  - Add a VERIEXEC_DUMP ioctl, to dump the entire Veriexec database.
    This can be used to recover a database if the file was lost.
    Example usage:

        # veriexecctl dump > /etc/signatures

    Note that only entries with the filename kept (that is, were loaded
    with the '-k' flag) will be dumped.

    Idea from Brett Lymn.

  - Add a VERIEXEC_FLUSH ioctl, to delete all Veriexec entries. Sample
    usage:

        # veriexecctl flush

  - Add a 'veriexec_flags' rc(8) variable, and make its default have
    the '-k' flag. On systems using the default signatures file
    (generated from running 'veriexecgen' with no arguments), this will
    use an additional 32kb of kernel memory on average.

  - Add a '-e' flag to veriexecctl, to evaluate the fingerprint during
    load. This is done automatically for files marked as 'untrusted'.


Misc. stuff:
  - The code for veriexecctl was massively simplified as a result of
    eliminating the need for VERIEXEC_TABLESIZE, and now uses a single
    pass of the signatures file, making the loading somewhat faster.

  - Lots of minor fixes found using the (still under development)
    Veriexec regression testsuite.

  - Some of the messages Veriexec prints were improved.

  - Various documentation fixes.


All relevant man-pages were updated to reflect the above changes.

Binary compatibility with existing veriexecctl binaries is maintained.
2007-05-15 19:47:43 +00:00
dsl 2e12e4f4e1 Fallout from caddr_t deletion - remove a load of redundant (void *) casts. 2007-05-13 20:24:21 +00:00
dsl 9bdbb03424 nanosleep1() shouldn't try to get the current time into a NULL address. 2007-05-13 19:51:35 +00:00
dsl f23edc42dd Instead of the #define versions of tc_getfrequency() and nanouptime(), use
the function ones in kern_kern_clock.c (adding tc_getfrequency).
Adjust includes so this builds.
2007-05-13 14:43:52 +00:00
dsl 88e6c5604d Add a #define for nanouptime() in the !__HAVE_TIMECOUNTERS case. 2007-05-13 10:58:50 +00:00
dsl 1c85a3efd8 Split sys_nanosleep(). 2007-05-13 10:34:25 +00:00
dsl 701496b5c6 Split the fcntl locking code out from its copyin/out.
Use to avoid all the stackgap stuff in compat code.
2007-05-12 23:02:49 +00:00
dsl ef3fdc4a07 Change interface to settimeofday1() so that it can also be used from
compat code in order to avoid the stackgap.
2007-05-12 20:27:13 +00:00
dsl c83f8a10ad Change the compat sys_[fl]utime code to not use the stackgap. 2007-05-12 17:28:19 +00:00
dsl f56bfb975c Add the child 'rusage' of an exiting process to its own 'rusage' exactly
once, and prior to passing it to the caller of sys_wait4() and at the same
time as adding it to the parent.
Commands like:
time sh -c 'i=0; while [ $i -lt 1000 ]; do i=$(expr $i + 1); done'
now give same output.
2007-05-08 20:10:14 +00:00
manu 31b57f40ff Add the TFTPROOT kernel option for TFTP'ing root RAMdisk at root mount time.
This allows working around situations where a kernel with embedded RAMdisk
cannot be booted by the bootloader because the RAMdisk is too big.
2007-05-08 06:10:27 +00:00
rmind 10c3d35ca0 Rename vfs_aio.c to sys_aio.c as decided in <tech-kern>.
Please note, that <tech-kern> people should note about
file names before commit. Otherwise, function may fail
with errno set to EDIRTY, and return -1. ;)
2007-05-07 22:22:20 +00:00
dsl 1844147fa9 Split sys_wait4() so that compat code can fiddle with the returned 'status'
and 'rusage' without having to copy data to/from stackgap buffers.
The old split (find_stopped_child) could be removed.
amd64 seems to run netbsd32, linux and linux32 emulations. sparc64 compiles.
2007-05-07 16:53:17 +00:00
dsl 832ca390e2 Add child rusage values to exiting process in 'find_stopped_child'
so that it is (correctly) available to the caller of wait4().
The self and child rusage values remain split for zombies.
2007-05-07 09:30:14 +00:00
dyoung e1d4e2922e In AppleTalk, IPv4, and IPv6 routing domains, help sockaddr_cmp()
avoid an indirect function call by comparing the family, length,
and bytes [dom->dom_sa_cmpofs, dom->dom_sa_cmpofs + dom->dom_sa_cmplen),
corresponding to the sockaddrs' "address" members.

For ISO, actually use sockaddr_iso_cmp, for a change.  Thanks to
yamt@ for pointing out my error.
2007-05-06 02:56:37 +00:00
ad 501930d97e aio_init: limit wmesg strings to 8 characters. 2007-05-05 20:38:43 +00:00
yamt c9ba84ac33 aio_worker: exit properly. 2007-05-04 14:28:40 +00:00
rmind 29cb26a639 - Make aio_listio_max and aio_max changeable via sysctl.
- Set a lower priority for AIO-worker thread, because current could cause
  interactivity problems (eg. with qemu - thanks <xtraeme> for testing).
  Mark it as XXX for now - after priority model change, this should
  be reconsidered anyway.
- Do not copyout() with lock held in sys_aio_cancel().
- Fix a leak of the lock in aio_process().
- Check for any error of cv_wait_sig().
- Cache p->p_aio in aio_exit().

Thanks <ad> for catching the issues!
2007-05-03 22:03:40 +00:00
dyoung 72f0a6dfb0 Eliminate address family-specific route caches (struct route, struct
route_in6, struct route_iso), replacing all caches with a struct
route.

The principal benefit of this change is that all of the protocol
families can benefit from route cache-invalidation, which is
necessary for correct routing.  Route-cache invalidation fixes an
ancient PR, kern/3508, at long last; it fixes various other PRs,
also.

Discussions with and ideas from Joerg Sonnenberger influenced this
work tremendously.  Of course, all design oversights and bugs are
mine.

DETAILS

1 I added to each address family a pool of sockaddrs.  I have
  introduced routines for allocating, copying, duplicating, and
  freeing sockaddrs:

        struct sockaddr *sockaddr_alloc(sa_family_t af, int flags);
        struct sockaddr *sockaddr_copy(struct sockaddr *dst,
                                       const struct sockaddr *src);
        struct sockaddr *sockaddr_dup(const struct sockaddr *src, int flags);
        void sockaddr_free(struct sockaddr *sa);

  sockaddr_alloc() returns either a sockaddr from the pool belonging
  to the specified family, or NULL if the pool is exhausted.  The
  returned sockaddr has the right size for that family; sa_family
  and sa_len fields are initialized to the family and sockaddr
  length---e.g., sa_family = AF_INET and sa_len = sizeof(struct
  sockaddr_in).  sockaddr_free() puts the given sockaddr back into
  its family's pool.

  sockaddr_dup() and sockaddr_copy() work analogously to strdup()
  and strcpy(), respectively.  sockaddr_copy() KASSERTs that the
  family of the destination and source sockaddrs are alike.

  The 'flags' argument for sockaddr_alloc() and sockaddr_dup() is
  passed directly to pool_get(9).

2 I added routines for initializing sockaddrs in each address
  family, sockaddr_in_init(), sockaddr_in6_init(), sockaddr_iso_init(),
  etc.  They are fairly self-explanatory.

3 structs route_in6 and route_iso are no more.  All protocol families
  use struct route.  I have changed the route cache, 'struct route',
  so that it does not contain storage space for a sockaddr.  Instead,
  struct route points to a sockaddr coming from the pool the sockaddr
  belongs to.  I added a new method to struct route, rtcache_setdst(),
  for setting the cache destination:

        int rtcache_setdst(struct route *, const struct sockaddr *);

  rtcache_setdst() returns 0 on success, or ENOMEM if no memory is
  available to create the sockaddr storage.

  It is now possible for rtcache_getdst() to return NULL if, say,
  rtcache_setdst() failed.  I check the return value for NULL
  everywhere in the kernel.

4 Each routing domain (struct domain) has a list of live route
  caches, dom_rtcache.  rtflushall(sa_family_t af) looks up the
  domain indicated by 'af', walks the domain's list of route caches
  and invalidates each one.
2007-05-02 20:40:22 +00:00