further deprecate struct timezone usage by changing `tzp' argument to
gettimeofday() to void *; align utimes(2) declaration by changing `times`
argument from struct timeval * to struct timeval[2]. From Murray
Armfield in PR standards/25331.
In due curse, reflect these changes in futimes(2), lutimes(2), and
settimeofday(2).
to pool_init. Untouched pools are ones that either in arch-specific
code, or aren't initialiased during initial system startup.
Convert struct session, ucred and lockf to pools.
list of file system types currently supported by the kernel.
Previously there wasn't an easy way to determine this.
(Code shamelessly cribbed from subr_disk.c::sysctl_hw_disknames().)
Use LIST_FOREACH() appropriately.
and unkillable processes.
1. Introduce new SBSIZE resource limit from FreeBSD to limit socket buffer
size resource.
2. make sokvareserve interruptible, so processes ltsleeping on it can be
killed.
would be good) mostly copied from sysctl(3). This takes care of the
top-level, most of kern.* and hw.* (modulo the ath and bge stuff), and
all of proc.*.
If you don't want the added rodata in your kernel, use "options
SYSCTL_NO_DESCR" in your kernel config.
setup function to set the description, even if the node has been
instantiated elsewhere. Or not, depending on the other that the setup
functions are called.
namely, allow following operations.
- purge only an entry specified by a component name.
- purge only child entries.
- purge only parent entries.
no objections on tech-kern@.
Add an initializer for them: KSI_INIT_EMPTY
Add a predicate for them: KSI_EMPTY_P
Don't bother storing empty ksiginfo_t's since they have no information.
Change uses of KSI_INIT to KSI_INIT_EMPTY where no other information other
than the signo is being filled in.
not blocked. Otherwise (it if it blocked or the hanlder is set to SIG_IGN)
reset the signal back to its default settings so that a coredump can be
generated.
other semantics from an earlier incarnation.
Call kcont_init() from init_main before device autoconfiguration,
so kcont is availble to device drivers if required.
Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. For now, use a non-null timeout
to poll the queue periodically. Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.)
- Fix a 32-bit overflow that could erroneously return true even if the
currently allocated buffer memory was greater than the high water mark.
- Add an early check for bufmem > hiwater to avoid a needless call to
random().
- Sprinkle some comments.
Add a vm.bufmem sysctl so the current bufmem value can be easily queried
from userland.
Reviewed by Thor Simon.
size 0.
This way, individual ports can circumvent sigcode mapping
by setting sigcode/esigcode.
(would be better to clean up the __HAVE_SIGINFO/COMPAT_XX
stuff, but it is not a good moment now)
- factor out common code.
- don't stop searching before the target.
- touch the correct object.
- validate the argument before the loop otherwise we need to roll back.
- move per VP data into struct sadata_vp referenced from l->l_savp
* VP id
* lock on VP data
* LWP on VP
* recently blocked LWP on VP
* queue of LWPs woken which ran on this VP before sleep
* faultaddr
* LWP cache for upcalls
* upcall queue
- add current concurrency and requested concurrency variables
- make process exit run LWP on all VPs
- make signal delivery consider all VPs
- make timer events consider all VPs
- add sa_newsavp to allocate new sadata_vp structure
- add sa_increaseconcurrency to prepare new VP
- make sys_sa_setconcurrency request new VP or wakeup idle VP
- make sa_yield lower current concurrency
- set sa_cpu = VP id in upcalls
- maintain cached LWPs per VP
be updated. (This is needed to be compatible with how pre-SIGINFO signals
operated. If you siglongjmp out of a signal handler, the SS_ONSTACK state
needs to be cleared. This commit restores that functionality).
re-used by another cpu immediately. in that case, lwp_exit2() will
access freed memory. to fix this:
- remove curlwp from p_lwps in exit1() rather than letting lwp_exit2() do so.
- add assertions to ensure freed proc has no lwps.
kern/24329 from me and kern/24574 from Havard Eidnes.
In addition to current one (i.e., don't wast so large part of the page),
- if the header fitsin the page without wasting any items, put it there.
- don't put the header in the page if it may consume rather big item.
For example, on i386, header is now allocated in the page for the pools
like fdescpl or sigapl, and allocated off the page for the pools like
buf1k or buf2k.
kern.proc, kern.proc2, kern.lwp, and kern.buf.
Define more MIB for kern.buf so that specific buffers can be selected
(only all/all is supported right now), and use a 32/64 bit agnostic
structure for communcating buffer information to userland.
Convert systat to the new kern.buf method.
Clean up the vm.buf* handling a little. There's no actual need to
record the dynamically assigned OIDs, since sysctl_data can tell us
what we're looking at.
Oh, and fix a typo in a comment.
and then allocate it on demand. Rename some common symbols (__bss_start,
_edata, _end, __start_link_set_*, __stop_link_set_*) so that ".<module>"
is appended to them. This shrinks an amd64 kernel by 20KB of BSS.
of using on-stack memory, so that this wouldn't eventually cause kernel
panic if the process get swapped out and another process runs kqueue_scan()
problem pointed out in kern/24220 by Stephan Uphoff
called with every buffer written through spec_strategy().
Used by fss(4). Future file-system-internal snapshots will need them too.
Welcome to 1.6ZK
Approved by: Jason R. Thorpe <thorpej@netbsd.org>
so the last change has us comparing pages to bytes instead of pages
to buffers! The consequence was to try to free radically less memory
than UVM wanted us to -- though always at least one buffer, which is
probably why the results weren't dire.
This does suggest that buf_canrelease() could be a *lot* more
conservative about how much to release than "2 * page deficit". In
fact, serious trouble seems to ensue if it's not -- when anything
else on the system demands enough pages, we slam down to the low
water mark nd stay there. I've adjusted it to use min(page defecit,
buffer memory / 16), which still isn't quite right but seems better.
Another change: consider the case of an infinite loop that does
"tar xzf pkgsrc.tar.gz ; rm -rf pkgsrc". Each time the rm runs,
all the dead metadata will go on the AGE list -- and, until we hit
the high-water mark, stay there, at which point it may be slowly
recycled. Two adjustments seem to solve this: 1) whack buf_lotsfree()
to return 0 if there's anything on the AGE list; 2) whack buf_canrelease()
to count the memory used by the AGE list and always return at least
that much.
This basically turns the AGE list into a "delayed free" list, since we
can't entirely eliminate it as we can't free pool items from interrupt
context (e.g. from biodone()).
To consider: with the bookkeeping corrected, should buf_drain() move
back to the _end_ of the pagedaemon, and should the calculation then
try to give back at least the current defecit?
the latter is not a appropriate place to do so and it broke vfork.
- deactivate pmap before calling cpu_exit() to keep a balance of
pmap_activate/deactivate.
and uncontrolled growth.
The key fix is from Dan Carasone, who noticed that buf_canfree() was
counting in _bytes_ but freeing in _buffers_, which caused the instant
drop to lowater observed by some users.
We now control the rate of growth; the probability of getting a new
allocation is inversely proportional to the current size of the
cache. This idea is from a long-ago conversation with Kirk McKusick
and, if memory serves, was used for the file-system cache in some
other BSD variant at some point in history.
With growth and shrinkage more or less dealt with, we return the
default maximum cache size to 15%. The default _minimum_ cache size
is raised from 1/16 of the maximum cache size to 1/8, since 1/16 was
chosen when the maximum size was 30% of memory.
Finally, after observing the behaviour of the pagedaemon and the
buffer cache drainer under pathological workloads (e.g. a benchmark
that steps through 75% of available memory backwards) I have moved
the call to buf_drain() to the beginning of the pagedaemon from the
end; if the pagedaemon bogs down, it still won't get run as often
as it should, but at least this way it will see the state of the
free count and free target _before_ the scan step does its thing.
PR#23470, with minor updates by me. This is only the syscall support
from that PR, for now.
Changes: port over fix from FreeBSD for multicast address generation.
Changed bcopy to memcpy. For now, #ifdef notyet the portions of
kern_uuid.c that are meant to be used by (currently nonexistent) other
things in the kernel. Added syscall to COMPAT_FREEBSD as well, though
that's currently not useful, as any program new enough to use this call
also uses other syscalls we don't (yet) emulate.
tripping over this getting too large, and suffering other performance
problems due to the lack of good backpressure shrinking the bufcache
when other memory is required. Again, this tunable should be
revisited when the backpressure mechanism has been improved.
sysctl vm.bufcache can be used to manually tune those rare machines
that might need more than this.
See comments in rev 1.106 for more detail.
VOP_STRATEGY(bp) is replaced by one of two new functions:
- VOP_STRATEGY(vp, bp) Call the strategy routine of vp for bp.
- DEV_STRATEGY(bp) Call the d_strategy routine of bp->b_dev for bp.
DEV_STRATEGY(bp) is used only for block-to-block device situations.
VOP_STRATEGY(bp) is replaced by one of two new functions:
- VOP_STRATEGY(vp, bp) Call the strategy routine of vp for bp.
- DEV_STRATEGY(bp) Call the d_strategy routine of bp->b_dev for bp.
DEV_STRATEGY(bp) is used only for block-to-block device situations.