Commit Graph

3737 Commits

Author SHA1 Message Date
yamt 1e18e59746 - borrow vmspace0 in uvm_proc_exit instead of uvmspace_free.
the latter is not a appropriate place to do so and it broke vfork.
- deactivate pmap before calling cpu_exit() to keep a balance of
  pmap_activate/deactivate.
2004-02-09 13:11:21 +00:00
yamt fa47baddee lwp_exit2: grab kernel_lock to preserve locking order. 2004-02-09 13:02:48 +00:00
yamt a45adbd9c7 don't deactivate pmap in exit1 because we'll touch the pmap later.
instead, borrow vmspace0 immediately before destroying the pmap
in uvmspace_free.
2004-02-07 10:05:52 +00:00
christos 13057976a6 include <uvm/uvm_object.h> for the benefit of ports that don't include
it in <machine/pmap.h>
2004-02-06 13:46:27 +00:00
junyoung 48d5030e12 ANSIfy & zap some blank lines. 2004-02-06 08:08:46 +00:00
junyoung 9a410f9ed0 Rename es_check in struct execsw to es_makecmds. 2004-02-06 08:02:58 +00:00
pk f092315b50 pg_delete: re-arrange SESSRELE() calls to allow for better code generation. 2004-02-06 06:59:33 +00:00
pk 7026ce08c8 ioctl TIOCSCTTY: re-arrange SESSHOLD() calls to allow for better code generation. 2004-02-06 06:58:21 +00:00
christos 0283dcd8a6 - Don't use uao_ functions directly; use them through the pgops methods.
- Fix missing reference leak in the error path of shmat() mentioned in
  Full-Disclosure.
2004-02-05 22:28:33 +00:00
christos 6b1b54b981 Don't use uao_reference, directly use the pgops instead. XXX: we should
prolly make all the uao_ functions used in pgops static.
2004-02-05 22:26:52 +00:00
tls aeaf748ff2 Buffer cache fixes to avoid thrashing between high and low water marks
and uncontrolled growth.

The key fix is from Dan Carasone, who noticed that buf_canfree() was
counting in _bytes_ but freeing in _buffers_, which caused the instant
drop to lowater observed by some users.

We now control the rate of growth; the probability of getting a new
allocation is inversely proportional to the current size of the
cache.  This idea is from a long-ago conversation with Kirk McKusick
and, if memory serves, was used for the file-system cache in some
other BSD variant at some point in history.

With growth and shrinkage more or less dealt with, we return the
default maximum cache size to 15%.  The default _minimum_ cache size
is raised from 1/16 of the maximum cache size to 1/8, since 1/16 was
chosen when the maximum size was 30% of memory.

Finally, after observing the behaviour of the pagedaemon and the
buffer cache drainer under pathological workloads (e.g. a benchmark
that steps through 75% of available memory backwards) I have moved
the call to buf_drain() to the beginning of the pagedaemon from the
end; if the pagedaemon bogs down, it still won't get run as often
as it should, but at least this way it will see the state of the
free count and free target _before_ the scan step does its thing.
2004-01-30 11:32:16 +00:00
tsarna 72489e1ea0 uuidgen(2) syscall. Originally from FreeBSD, ported by John Franklin in
PR#23470, with minor updates by me. This is only the syscall support
from that PR, for now.

Changes: port over fix from FreeBSD for multicast address generation.
Changed bcopy to memcpy.  For now, #ifdef notyet the portions of
kern_uuid.c that are meant to be used by (currently nonexistent) other
things in the kernel.  Added syscall to COMPAT_FREEBSD as well, though
that's currently not useful, as any program new enough to use this call
also uses other syscalls we don't (yet) emulate.
2004-01-29 02:00:02 +00:00
dan c6ba3edf9d Reduce the default BUFCACHE to 10% for now. Too many users are
tripping over this getting too large, and suffering other performance
problems due to the lack of good backpressure shrinking the bufcache
when other memory is required.  Again, this tunable should be
revisited when the backpressure mechanism has been improved.

sysctl vm.bufcache can be used to manually tune those rare machines
that might need more than this.

See comments in rev 1.106 for more detail.
2004-01-27 11:35:23 +00:00
hannken 3db4e2acd8 Make VOP_STRATEGY(bp) a real VOP as discussed on tech-kern.
VOP_STRATEGY(bp) is replaced by one of two new functions:

- VOP_STRATEGY(vp, bp)  Call the strategy routine of vp for bp.
- DEV_STRATEGY(bp)      Call the d_strategy routine of bp->b_dev for bp.

DEV_STRATEGY(bp) is used only for block-to-block device situations.
2004-01-25 18:06:48 +00:00
hannken d7f6cbf8bc Make VOP_STRATEGY(bp) a real VOP as discussed on tech-kern. 2004-01-25 18:02:04 +00:00
hannken b1cb363c11 Make VOP_STRATEGY(bp) a real VOP as discussed on tech-kern.
VOP_STRATEGY(bp) is replaced by one of two new functions:

- VOP_STRATEGY(vp, bp)  Call the strategy routine of vp for bp.
- DEV_STRATEGY(bp)      Call the d_strategy routine of bp->b_dev for bp.

DEV_STRATEGY(bp) is used only for block-to-block device situations.
2004-01-25 18:02:03 +00:00
wiz f62661e104 Add semicolons after variable declarations; closes PR 24201. 2004-01-24 01:40:57 +00:00
simonb 2763a4b916 Fix NTP PPSAPI support (enabled with "options PPS_SYNC"):
From PR kern/13702 from Charles Carvalho.  Tested on alpha and
i386 with a Laipac TF10 PPS-capable GPS.  The com.c change was
copied wholesale from Charles' z8530tty.c patch.
2004-01-23 05:01:19 +00:00
atatat 4fe5b245f9 Fix the kern.mbuf tunables. 2004-01-21 02:11:20 +00:00
yamt ce0a402d3c bufpool_page_alloc: for no-wait allocations, specify UVM_KMF_TRYLOCK as well. 2004-01-19 11:57:42 +00:00
atatat 47768f08f2 In sysctl_locate(), use "rnode" like everywhere else, don't call it
"rv".

In sysctl_destroyv(), deal with deleting alias nodes, and pass a token
size_t to sysctl_destroy().

In sysctl_free(), check that "node" has not reached "rnode", not that
"pnode" has.

In sysctl_realloc(), don't bother setting sysctl_clen...the value is
unchanged.
2004-01-17 04:01:14 +00:00
atatat 5d3b89e2f4 Avoid dereferencing l...it might be NULL 2004-01-17 03:33:24 +00:00
yamt 047fc0b378 - fix locking order problem. (pa_slock -> pr_slock)
- protect pr_phtree with pr_slock.
- add some LOCK_ASSERTs.
2004-01-16 12:47:37 +00:00
mrg 23884e8622 clean up a little:
- delete ktrsyscall32()
- add a check #ifdef _LP64 to do the conversion if P_32 is set to the
standard ktrsyscall()
- add a couple of similar _LP64/P_32 checks to the systrace code.

this should get systrace working for 32 bit apps as well as complete
ktrace support for "trace_enter/trace_exit" using platforms such as amd64.

XXX: systrace isn't supported on sparc64 currently... (it doesn't use
trace_enter/trace_exit, or have it's own calls to systrace_xxx()...)
2004-01-16 05:03:02 +00:00
mrg 4c2a3c644a export ktrinitheader() and ktrwrite() for ktrsyscall32(), which is used
to write 32 bit syscall arguments in a 64 bit format.
2004-01-15 14:29:20 +00:00
enami 9e2ac76ac4 Obviously, sizeof(u_int) is not enough to copy struct buf.
Prevents ``sysctl -a'' from dumping core.
2004-01-15 09:03:26 +00:00
yamt 7f20b0c529 bump vnode hold count for page cache as well
to resolve unfairness between page cache and traditional buffer cache.
pointed by enami tsugutomo on current-users@.
2004-01-14 11:28:04 +00:00
jdolecek 475a5858bf g/c process state SDEAD - it's not used anymore after 'reaper' removal 2004-01-11 19:39:48 +00:00
jdolecek a1090edbd2 fix assertion - non-alive processes are in SZOMB state now
fixes PR kern/24033 by Martin Husemann
2004-01-11 18:51:15 +00:00
hannken ed68c4e34c Allow vfs_write_suspend() to wait if the file system is already
suspending.

Move vfs_write_suspend() and vfs_write_resume() from kern/vfs_vnops.c
to kern/vfs_subr.c.

Change vnode write gating in ufs/ffs/ffs_softdep.c (from FreeBSD).

When vnodes are throttled in softdep_trackbufs() check for
file system suspension every 10 msecs to avoid a deadlock.
2004-01-10 17:16:38 +00:00
yamt a3b2d1879c add a new bufq strategy, BUFQ_PRIOCSCAN (per-priority CSCAN).
discussed on tech-kern@
2004-01-10 14:49:44 +00:00
yamt 8c55727694 reset i/o priority in geteblk() as well. 2004-01-10 14:43:05 +00:00
yamt 7266a95907 store a i/o priority hint in struct buf for buffer queue discipline. 2004-01-10 14:39:50 +00:00
thorpej 4aeba6790d Initialize buffer pools with PR_IMMEDRELEASE. Don't use pool_reclaim()
on those pools; it is no longer necessary.
2004-01-09 19:01:01 +00:00
thorpej 7f125220f4 Add a new pool initialization flag, PR_IMMEDRELEASE. This flag causes
idle pool pages to be returned to the system immediately upon becoming
de-fragmented.

Also, in pool_do_put(), don't free back an idle page unless we are over
our minimum page claim.
2004-01-09 19:00:16 +00:00
tls e4758a97ae Change BUFCACHE (default hard limit on physmem consumption by metadata
cache) from 30% to 20%.  This seems to significantly smooth the oscillation
between "almost no memory available" and "UVM free target available" caused
by the current sudden, heavy backpressure on the metadata cache.  We should
revisit this again once the backpressure mechanism is better tuned; ideally,
the hard limit should almost never come into play, because the metadata
cache should gradually give back pages as buffers hit the AGE list and as
the page cache demands them, rather than giving back a big slug of pages
all at once when UVM decides it's in a hurry and fires off the page daemon.

Just how well this adjustment works is likely to vary significantly from
machine to machine depending on I/O mix, filesystem frag size, and total
memory.  However, 20% seems to be quite a bit better than 30% on several
systems I've tested and is, coincidentally, more than enough to cache
the entire metadata working set of the AnonCVS server with 100 clients,
which is a useful worst-case stake in the ground...
2004-01-09 06:26:15 +00:00
tls 0d6723b09f Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels).  This
brings the hit rate on my machines from below 70% to above 90%.  We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes.  Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.
2004-01-09 00:04:53 +00:00
tls 28364b01be Add pool_reclaim() on pool to which we just pool_put() a buffer in
buf_mrelease().  Without this, though the pages are returned to the
relevant *pool*, they are never available for any other use in the
system.

Now the backpressure on the physical size of the buffer cache through
the buf_drain() call in the pagedaemon works correctly.  If anything,
it may be a bit more aggressive than intended.  On my 256MB system,
with vm.bufcache set to the default 30% of physmem, a kernel with this
fix can do 5 simultaneous config/makedep/builds of different NetBSD
kernels in 1313 seconds; with the "traditional" buffer cache code it
requires 1320 seconds.  Running "find / -type d -exec ls -l {}" while
the build is going demonstrates that the backpressure is working
correctly: free memory oscillates slowly between close to none and
the UVM target free, and vmstat -m shows a large number of releases
for the buffer pools.

For future work: how is "bufpl" memory returned to the system?  This
is not obvious to me (I must be looking in the wrong place).  Also,
buf_mrelease() is also called from brelse() in some cases.  Would it
be better to add a pool flag causing automatic release of full pages
as they become available (not fragmented)?  Jason Thorpe proposed this
and it seems more elegant than cleaning the _entire_ pool only upon
memory pressure.

Greg Oster did a lot of the work of figuring this out.  Jason proposed
the use of pool_reclaim as a way to fix it.
2004-01-08 23:41:14 +00:00
cube 3bf5e4c13b If ksyms have not been initialized, return ENXIO in ksymsopen instead of
ksymsread, because ksyms client test availability with open() and not
read().
2004-01-08 22:48:26 +00:00
thorpej d76fa360ef Back out >2 PT_LOAD changes from rev 1.96. They cause older GCC3-compiled
PowerPC binaries to fail.  The compiler has since been fixed, but
compatibility with older binaries needs to be maintained.

PR kern/23758.
2004-01-07 16:42:53 +00:00
jdolecek 26767eb2ae fix F_MAXFD fcntl - it returned the value as errno instead
of return value from the syscall
from mouss <usebsd at free dot fr>
2004-01-07 09:26:29 +00:00
atatat 5efc584023 Expose the buf_map symbol so that pmap(1) can find it.
Split the sysctl setup routine into two routines, one for each
"subtree".  Perhaps it's a little pedantic, but it's cleaner.  Also,
assert that the "kern" and "vm" nodes exist.
2004-01-06 13:51:09 +00:00
lukem 7bb9d6c875 Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate  const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.
2004-01-05 03:33:06 +00:00
christos b76a454b90 Ad F_CLOSEM, F_MAXFD from Matt Thomas. 2004-01-05 00:36:49 +00:00
pk 90cc172b86 bufpool_page_free: pass `buf_map' to uvm_km_free(). 2004-01-04 16:17:13 +00:00
kleink 1b16e3f0a3 ; may be a comment character in assembly, use \n as a separator instead. 2004-01-04 13:27:53 +00:00
jdolecek 089abdad44 Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
  as FPU state), and is the last potentially blocking operation;
  all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediatelly marked as zombie and made available for pickup
  by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
  for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediatelly after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread
2004-01-04 11:33:29 +00:00
jdolecek 6ea538748e constify a bit 2004-01-03 20:10:01 +00:00
jdolecek 1d86b1f39f fix some comments, use NULL instead of 0 for pointer comparison 2004-01-03 19:43:55 +00:00
cl d5645dec8e regen 2004-01-02 18:53:45 +00:00