Commit Graph

76 Commits

Author SHA1 Message Date
ad
1d7848ad43 Process concurrent page faults on individual uvm_objects / vm_amaps in
parallel, where the relevant pages are already in-core.  Proposed on
tech-kern.

Temporarily disabled on MP architectures with __HAVE_UNLOCKED_PMAP until
adjustments are made to their pmaps.
2020-03-22 18:32:41 +00:00
ad
d305fcb599 sysctl_vm_uvmexp2(): some counters were needlessly truncated. 2020-03-19 20:23:19 +00:00
ad
05a3457e85 Merge from yamt-pagecache (after much testing):
- Reduce unnecessary page scan in putpages esp. when an object has a ton of
  pages cached but only a few of them are dirty.

- Reduce the number of pmap operations by tracking page dirtiness more
  precisely in uvm layer.
2020-01-15 17:55:43 +00:00
ad
5c06357c90 Rename uvm_free() -> uvm_availmem(). 2019-12-31 13:07:09 +00:00
ad
92ec96a550 Counter tweaks:
"zeroaborts" + "free" don't need to be per-CPU counters, and "bucketmiss"
wasn't used.  Remove those and cluster by usage.
2019-12-21 14:33:18 +00:00
ad
ddd3a0be1e uvmexp.free -> uvm_free() 2019-12-21 13:00:20 +00:00
ad
a98966d3dc - Extend the per-CPU counters matt@ did to include all of the hot counters
in UVM, excluding uvmexp.free, which needs special treatment and will be
  done with a separate commit.  Cuts system time for a build by 20-25% on
  a 48 CPU machine w/DIAGNOSTIC.

- Avoid 64-bit integer divide on every fault (for rnd_add_uint32).
2019-12-16 22:47:54 +00:00
jdolecek
a679821368 add sysctl to easily set ubc_direct
PR kern/53124
2019-01-07 22:48:01 +00:00
riastradh
d1579b2d70 Rename min/max -> uimin/uimax for better honesty.
These functions are defined on unsigned int.  The generic name
min/max should not silently truncate to 32 bits on 64-bit systems.
This is purely a name change -- no functional change intended.

HOWEVER!  Some subsystems have

	#define min(a, b)	((a) < (b) ? (a) : (b))
	#define max(a, b)	((a) > (b) ? (a) : (b))

even though our standard name for that is MIN/MAX.  Although these
may invite multiple evaluation bugs, these do _not_ cause integer
truncation.

To avoid `fixing' these cases, I first changed the name in libkern,
and then compile-tested every file where min/max occurred in order to
confirm that it failed -- and thus confirm that nothing shadowed
min/max -- before changing it.

I have left a handful of bootloaders that are too annoying to
compile-test, and some dead code:

cobalt ews4800mips hp300 hppa ia64 luna68k vax
acorn32/if_ie.c (not included in any kernels)
macppc/if_gm.c (superseded by gem(4))

It should be easy to fix the fallout once identified -- this way of
doing things fails safe, and the goal here, after all, is to _avoid_
silent integer truncations, not introduce them.

Maybe one day we can reintroduce min/max as type-generic things that
never silently truncate.  But we should avoid doing that for a while,
so that existing code has a chance to be detected by the compiler for
conversion to uimin/uimax without changing the semantics until we can
properly audit it all.  (Who knows, maybe in some cases integer
truncation is actually intended!)
2018-09-03 16:29:22 +00:00
mrg
277fd3d5f5 add two new members to uvmexp_sysctl{}: bootpages and poolpages.
bootpages is set to the pages allocated via uvm_pageboot_alloc().
poolpages is calculated from the list of pools nr_pages members.

this brings us closer to having a valid total of pages known by
the system, vs actual pages originally managed.

XXX: poolpages needs some handling for PR_RECURSIVE pools still.
2017-12-02 08:15:42 +00:00
joerg
5f391f4ae2 Export the guard size of the main thread via vm.guard_size. Add a
complementary writable sysctl for the initial guard size of threads
created via pthread_create. Let the existing attribut accessors do the
right thing. Raise the default guard size for threads to 64KB.
2017-07-02 16:41:32 +00:00
msaitoh
db917302c6 Sort in uvmexp_sysctl's order for readability. No functional change. 2014-12-01 04:11:14 +00:00
msaitoh
571ed845ad Fix a bug that "vmstat -s" print uvmexp.ncolors incorrectly. 2014-12-01 04:02:40 +00:00
martin
87d65da062 Fix copy & pasto 2014-02-26 20:33:53 +00:00
matt
b007730cbb Add vm.min_address and vm.max_address which return VM_MIN_ADDRESS and
VM_MAXUSER_ADDRESS.
2014-02-26 16:11:59 +00:00
pooka
4f6fb3bf35 Ensure that the top level sysctl nodes (kern, vfs, net, ...) exist before
the sysctl link sets are processed, and remove redundancy.

Shaves >13kB off of an amd64 GENERIC, not to mention >1k duplicate
lines of code.
2014-02-25 18:30:08 +00:00
dsl
e21a34c25e Add some pre-processor magic to verify that the type of the data item
passed to sysctl_createv() actually matches the declared type for
  the item itself.
In the places where the caller specifies a function and a structure
  address (typically the 'softc') an explicit (void *) cast is now needed.
Fixes bugs in sys/dev/acpi/asus_acpi.c sys/dev/bluetooth/bcsp.c
  sys/kern/vfs_bio.c sys/miscfs/syncfs/sync_subr.c and setting
  AcpiGbl_EnableAmlDebugObject.
(mostly passing the address of a uint64_t when typed as CTLTYPE_INT).
I've test built quite a few kernels, but there may be some unfixed MD
  fallout. Most likely passing &char[] to char *.
Also add CTLFLAG_UNSIGNED for unsiged decimals - not set yet.
2012-06-02 21:36:41 +00:00
para
e62ee4d475 extending vmem(9) to be able to allocated resources for it's own needs.
simplifying uvm_map handling (no special kernel entries anymore no relocking)
make malloc(9) a thin wrapper around kmem(9)
(with private interface for interrupt safety reasons)

releng@ acknowledged
2012-01-27 19:48:38 +00:00
christos
4df396b653 prevent kernel from writing more than userland passed. 2011-12-30 19:01:07 +00:00
christos
cff1dac372 if you are going to dereference a variable, check the variable itself, not
it cousin.
2011-11-13 02:10:40 +00:00
chuck
40ec801a13 udpate license clauses on my code to match the new-style BSD licenses.
based on second diff that rmind@ sent me.

no functional change with this commit.
2011-02-02 15:25:27 +00:00
matt
6a66466f0c Move counting of faults, traps, intrs, soft[intr]s, syscalls, and nswtch
from uvmexp to per-cpu cpu_data and move them to 64bits.  Remove unneeded
includes of <uvm/uvm_extern.h> and/or <uvm/uvm.h>.
2010-12-20 00:25:23 +00:00
enami
2c376812ef Nowadays, comparing priority against PZERO doesn't make any sense.
Instead, see if a process waits uninterruptibly like ps does,
so that the second column (`b') of default vmstat output prints
some useful value (-t is still broken though).
2010-11-16 03:49:53 +00:00
uebayasi
8e3c57e6cc Include uvm/uvm.h because this is part of UVM. 2010-11-06 12:18:17 +00:00
rmind
5f0ac9a4fa - Merge sched_pstats() and uvm_meter()/uvm_loadav(). Avoids double loop
through all LWPs and duplicate locking overhead.

- Move sched_pstats() from soft-interrupt context to process 0 main loop.
  Avoids blocking effect on real-time threads.  Mostly fixes PR/38792.

Note: it might be worth to move the loop above PRI_PGDAEMON.  Also,
sched_pstats() might be cleaned-up slightly.
2010-04-16 03:21:49 +00:00
mrg
665bf35b1e now that CTLTYPE_BOOL actually works, use it to export vm_page_zero_enable
as vm.idlezero in a way that actually works on big endian systems.
2010-04-11 01:53:03 +00:00
rmind
40cf6f3659 Remove uarea swap-out functionality:
- Addresses the issue described in PR/38828.
- Some simplification in threading and sleepq subsystems.
- Eliminates pmap_collect() and, as a side note, allows pmap optimisations.
- Eliminates XS_CTL_DATA_ONSTACK in scsipi code.
- Avoids few scans on LWP list and thus potentially long holds of proc_lock.
- Cuts ~1.5k lines of code.  Reduces amd64 kernel size by ~4k.
- Removes __SWAP_BROKEN cases.

Tested on x86, mips, acorn32 (thanks <mpumford>) and partly tested on
acorn26 (thanks to <bjh21>).

Discussed on <tech-kern>, reviewed by <ad>.
2009-10-21 21:11:57 +00:00
ad
cbbf514e2c - vm_page: put listq, pageq into a union alongside a LIST_ENTRY, so we can
use both types of list.

- Make page coloring and idle zero state per-CPU.

- Maintain per-CPU page freelists. When freeing, put pages onto the local
  CPU's lists and the global lists. When allocating, prefer to take pages
  from the local CPU. If none are available take from the global list as
  done now. Proposed on tech-kern@.
2008-06-04 12:45:28 +00:00
ad
6d70f903e6 Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
  be sent from a hardware interrupt handler. Signal activity must be
  deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
  and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.
2008-04-24 15:35:27 +00:00
yamt
e781af39bd implement priority inheritance. 2007-02-26 09:20:52 +00:00
pavel
934634a18c Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.
2007-02-17 22:31:36 +00:00
ad
2a34d8467a Fix load average calculation:
- Don't consider kernel threads when calculating the load average. Their
  priorities are no longer adjusted by the scheduler, and their level of
  activity is dependent upon running user processes.
- Change the (l->l_priority > PZERO) check in uvm_meter() to (l->l_flag &
  L_SINTR). I think this check was originally intended to weed out
  processes sleeping interruptably.
2007-02-15 20:22:43 +00:00
ad
b07ec3fc38 Merge newlock2 to head. 2007-02-09 21:55:00 +00:00
yamt
1a7bc55dcc remove some __unused from function parameters. 2006-11-01 10:17:58 +00:00
dogcow
e0909532c0 even more __unused. 2006-10-12 20:18:19 +00:00
yamt
9d3e3eab23 merge yamt-pdpolicy branch.
- separate page replacement policy from the rest of kernel
	- implement an alternative replacement policy
2006-09-15 15:51:12 +00:00
kardel
de4337ab21 merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
  time.tv_sec -> time_second
- struct timeval mono_time is gone
  mono_time.tv_sec -> time_uptime
- access to time via
	{get,}{micro,nano,bin}time()
	get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
  Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
  NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html
2006-06-07 22:33:33 +00:00
yamt
ca08761347 whitespace in SYSCTL_DESCR. 2005-12-21 12:23:44 +00:00
yamt
4fce5d6f5e make length of inactive queue tunable by sysctl. (vm.inactivepct) 2005-12-21 12:19:04 +00:00
christos
95e1ffb156 merge ktrace-lwp. 2005-12-11 12:16:03 +00:00
simonb
155c8bd666 Whitespace nit. 2005-11-09 12:47:39 +00:00
thorpej
e569facced Use ANSI function decls. 2005-06-27 02:19:48 +00:00
yamt
627b0d5099 remove anon related statistics which are no longer used. 2005-05-15 08:01:06 +00:00
yamt
56a653490c expose vm_page_zero_enable as vm.idlezero sysctl.
XXX assuming boolean_t == int.
2004-10-10 09:57:31 +00:00
atatat
db2f5beb7d Sysctl descriptions under vm subtree 2004-05-25 04:31:17 +00:00
atatat
19af35fd0d Tango on sysctl_createv() and flags. The flags have all been renamed,
and sysctl_createv() now uses more arguments.
2004-03-24 15:34:46 +00:00
junyoung
1e2b269ded - Nuke __P().
- Drop trailing spaces.
2004-03-24 07:50:48 +00:00
yamt
0cad61498f sysctl_vm_updateminmax: fix swapped filemin and execmin.
the problem reported by Vesbula on current-users@.
2004-01-11 18:42:25 +00:00
tsutsui
082caf94ca Allow sysctl(8) to update vm.{anon,exec,file}{min,max}.
XXX needs sysctl(9) man page to confirm this change..
2003-12-07 00:40:43 +00:00
atatat
13f8d2ce5f Dynamic sysctl.
Gone are the old kern_sysctl(), cpu_sysctl(), hw_sysctl(),
vfs_sysctl(), etc, routines, along with sysctl_int() et al.  Now all
nodes are registered with the tree, and nodes can be added (or
removed) easily, and I/O to and from the tree is handled generically.

Since the nodes are registered with the tree, the mapping from name to
number (and back again) can now be discovered, instead of having to be
hard coded.  Adding new nodes to the tree is likewise much simpler --
the new infrastructure handles almost all the work for simple types,
and just about anything else can be done with a small helper function.

All existing nodes are where they were before (numerically speaking),
so all existing consumers of sysctl information should notice no
difference.

PS - I'm sorry, but there's a distinct lack of documentation at the
moment.  I'm working on sysctl(3/8/9) right now, and I promise to
watch out for buses.
2003-12-04 19:38:21 +00:00