Commit Graph

1871 Commits

Author SHA1 Message Date
chs
c40f790c72 add missing boilerplate for UVMHIST. 2018-06-02 15:24:55 +00:00
chs
0656708806 allow tmpfs files to be larger than 4GB. 2018-05-28 21:04:35 +00:00
jdolecek
91c2b8613a uvm_pageactivate() needs to be called _after_ code is done with the page, no reason
to bother pdaemon with PG_BUSY pages; also clear the PG_FAKE and PG_CLEAN after
we are done with the write

this does not make any difference on my machine, but maybe it might fix
the machine check panic on Martin's alpha

while here remove UBC_PARTIALOK handling from ubc_zeropage_direct(), just to be sure
it works exactly the same as the non-direct one
2018-05-26 18:57:35 +00:00
jdolecek
546a023e11 add the KASSERT() for loan_count wrap-around to all places which increase it 2018-05-25 20:11:03 +00:00
jdolecek
9b785b5267 adjust heuristics for read-ahead to skip the full read-ahead when last page of
the range is already cached; this speeds up I/O from cache, since it avoids
the lookup and allocation overhead

on my system I observed 4.5% - 15% improvement for cached I/O - from 2.2 GB/s to
2.3 GB/s for cached reads using non-direct UBC, and from 5.6 GB/s to 6.5 GB/s
for UBC using direct map

part of PR kern/53124
2018-05-19 15:18:02 +00:00
jdolecek
6a369df7db change code to take advantage of direct map when available, avoiding the need
to map pages into kernel

this improves performance of UBC-based (read(2)/write(2)) I/O especially
for cached block I/O - sequential read on my NVMe goes from 1.7 GB/s to 1.9 GB/s
for non-cached, and from 2.2 GB/s to 5.6 GB/s for cached read

the new code is conditional and off for now, so that it can be tested further;
it can be turned on by setting the ubc_direct variable to true

part of fix for PR kern/53124
2018-05-19 15:13:26 +00:00
jdolecek
62a2bd8b5b add experimental new function uvm_direct_process(), to allow reads/writes
of contents of uvm pages without mapping them into kernel, using
direct map or moral equivalent; pmaps supporting the interface need
to provide pmap_direct_process() and define PMAP_DIRECT

implement the new interface for amd64; I hear alpha and mips might be relatively
easy to add too, but I lack the knowledge

part of resolution for PR kern/53124
2018-05-19 15:03:26 +00:00
jdolecek
482e5d893a Remove emap support. Unfortunately it never got to a state where it would be
used and usable, due to reliability and limited & complicated MD support.

Going forward, we need to concentrate on interfaces which do not map anything
into kernel in the first place (such as direct map or KVA-less I/O), rather
than making those mappings cheaper to do.
2018-05-19 11:39:37 +00:00
jdolecek
1f4255bff3 detect wraparound when bumping page wire_count and loan_count 2018-05-19 11:02:33 +00:00
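The wraparound checks described in 1f4255bff3 and 546a023e11 can be illustrated with a minimal user-space sketch. Everything here is an assumption made for illustration: `struct page_sketch`, the 16-bit counter width, and the helper names are hypothetical, and `assert()` stands in for the kernel's `KASSERT()`. It is not the code from the commits.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for a page structure; widths are assumed. */
struct page_sketch {
	unsigned short wire_count;
	unsigned short loan_count;
};

/* True when one more increment would wrap the counter back to zero. */
static bool
counter_would_wrap(unsigned short count)
{
	return count == USHRT_MAX;
}

/* Bump loan_count, checking for wraparound before the increment. */
static void
page_loan_sketch(struct page_sketch *pg)
{
	/* In the kernel this would be a KASSERT(); assert() stands in. */
	assert(!counter_would_wrap(pg->loan_count));
	pg->loan_count++;
}
```

The check runs before the increment, so a counter that is already at its maximum is reported instead of silently becoming zero.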
christos
87287ec175 don't store the rssmax in the lwp rusage, it is a per proc property. Instead
utilize an unused field in the vmspace struct to store it. Also conditionalize
on platforms that have pmap statistics available.
2018-05-08 19:33:57 +00:00
christos
61768d92a9 update maxrss (used to always be 0). Patterned after the OpenBSD changes. 2018-05-07 21:00:14 +00:00
jakllsch
0e522444a9 In uvm_page_recolor(), kmem_free() old size rather than new size.
From Yaniv Abraham-Rabinovitch in PR kern/53208.
2018-04-24 16:35:53 +00:00
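The class of bug fixed in 0e522444a9 applies to any allocator whose free routine takes the allocation size. The sketch below is a user-space illustration under stated assumptions: `sized_alloc()`, `sized_free()`, and `resize_buffer()` are hypothetical helpers built on `malloc()`/`free()`, and the `outstanding` counter only models the accounting a sized allocator might keep. It is not the `uvm_page_recolor()` code.

```c
#include <assert.h>
#include <stdlib.h>

/* Bytes currently allocated through the sized interface (illustrative). */
static size_t outstanding;

static void *
sized_alloc(size_t size)
{
	void *p = malloc(size);

	if (p != NULL)
		outstanding += size;
	return p;
}

/* Like a sized free: the caller must pass the size used at allocation. */
static void
sized_free(void *p, size_t size)
{
	assert(size <= outstanding);
	outstanding -= size;
	free(p);
}

/* Replace a buffer with a new one of a different size. */
static void *
resize_buffer(void *oldbuf, size_t oldsize, size_t newsize)
{
	void *newbuf = sized_alloc(newsize);

	if (newbuf == NULL)
		return NULL;
	/* Free with the OLD size; passing newsize here is the bug pattern. */
	sized_free(oldbuf, oldsize);
	return newbuf;
}
```

Passing the new size to the free routine would leave the accounting off by the difference between the two sizes.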
jdolecek
d24fed3cfe add prot parameter for uvm_emap_enter(), so that it's possible to
enter also read/write mappings
2018-04-20 19:02:18 +00:00
jdolecek
b6ce67bcb3 make ubc_alloc() and ubc_release() static, they should not be used
outside of ubc_uiomove()/ubc_zeropage(); for now mark as noinline
to keep them available as breakpoints
2018-04-20 18:58:10 +00:00
christos
87fd18f8e5 s/static inline/static __inline/g for consistency. 2018-04-19 21:50:06 +00:00
jdolecek
554e919d8c fix typo in comment 2018-04-02 18:25:41 +00:00
mlelstv
c69d6c55e1 Increase UVM read ahead window limit a bit to match concurrency of reading
from the raw device.
2018-03-30 07:22:59 +00:00
jdolecek
af45d25bfb mark ubc_winshift and ubc_winsize as __read_mostly, they are used often
so might benefit from cache placement
2018-03-26 21:43:30 +00:00
christos
f31ca13583 finish moving the compat code out. 2018-03-15 03:21:58 +00:00
christos
33ff5e3b54 Untangle the swapctl compat code mess. Welcome to lucky 13. 2018-03-15 00:48:13 +00:00
jdolecek
f0e5de0264 fix the DIAGNOSTIC function pmap_tlb_asid_count() to not expect
that TLBINFO_ASID_INUSE_P() returns just 0 or 1; the underlying
__BITMAP_ISSET() actually returns the matching bit nowadays, which
caused miscounting

fixes PR kern/53054 by Sevan Janiyan
2018-02-25 21:43:03 +00:00
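The miscounting described in f0e5de0264 can be reproduced with a small sketch. `SKETCH_ISSET()` below is a hypothetical macro that, like the behavior the commit attributes to `__BITMAP_ISSET()`, evaluates to the masked bit rather than to 0 or 1. The function names are also assumptions for illustration; this is not the `pmap_tlb_asid_count()` code.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define SKETCH_BITS_PER_WORD	(sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical: yields the matching bit itself, not a 0/1 truth value. */
#define SKETCH_ISSET(map, n) \
	((map)[(n) / SKETCH_BITS_PER_WORD] & \
	    (1UL << ((n) % SKETCH_BITS_PER_WORD)))

/* Buggy pattern: adds the macro result, assuming it is 0 or 1. */
static size_t
count_buggy(const unsigned long *map, size_t nbits)
{
	size_t count = 0;

	assert(map != NULL);
	for (size_t i = 0; i < nbits; i++)
		count += SKETCH_ISSET(map, i);
	return count;
}

/* Fixed pattern: treats any non-zero result as "bit is set". */
static size_t
count_fixed(const unsigned long *map, size_t nbits)
{
	size_t count = 0;

	assert(map != NULL);
	for (size_t i = 0; i < nbits; i++) {
		if (SKETCH_ISSET(map, i) != 0)
			count++;
	}
	return count;
}
```

With bits 1 and 2 set, the buggy version sums the bit values 2 and 4, while the fixed version counts two set bits.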
jdolecek
51cdc2d5b6 adjust KASSERT() triggered in PR port-cobalt/53054 to provide more info 2018-02-25 16:44:31 +00:00
jdolecek
f82a61429e KERNEL_PID is > 0 on powerpc/ibm4xx, need to mask all bits <0,
KERNEL_PID> to avoid triggering KASSERT() checking allocated asid
is bigger than KERNEL_PID; adjust also TLBINFO_ASID_INITIAL_FREE()
accordingly

discussed with Nick
2018-02-21 21:53:54 +00:00
jdolecek
f90211bb4c convert to use actual __BITMAP_*() macros from <sys/bitops.h>, and make
it possible to override the ASID bitmap length; default to 256 ASIDs as before

XXX NFCI; compile tested only on evbppc and evbmips, unfortunately didn't
find any combination of port using the MI pmap_tlb.c and working in QEMU
2018-02-19 22:01:15 +00:00
jdolecek
d80d16cd1e a bit of DRY - add macro for initial free ASID count 2018-02-19 21:40:45 +00:00
jdolecek
4248f17bba make it possible to not use the icache evcnts 2018-02-19 21:20:33 +00:00
maxv
d4848b04b5 Use UVM_PROT_RW instead of UVM_PROT_ALL. This doesn't change anything,
since the protection code is not applied: the pages are manually kentered
as RW.

But fix it anyway, so that "pmap 0" does not say the map is executable.
2018-02-09 09:07:13 +00:00
mrg
70223d7695 uvm_map_extract() has an indentation issue. 2018-02-06 09:20:29 +00:00
christos
4af99c2770 CID-1427737: Pacify coverity using KASSERT 2018-01-21 17:58:43 +00:00
kamil
f98f70a745 Revert vadvise(2) removal
This system call was used in legacy Lisp code, that was inherited to modern
age and still compiled against supported compat layers (e.g. in clisp,
oaklisp, Franz Lisp).

It used to instruct the kernel about paging policy (G/C aware, flush etc).

Newly compiled code (assuming that it will detect vadvise()) will use the
libc stub for vadvise(). The headers for this interface are gone.

vadvise(2) could be marked as COMPAT_80, but as long as we support ultrix,
sunos or aout68k ABI, don't bother with this.

Requested by <mrg>
2018-01-06 16:41:23 +00:00
kamil
102875f88e Drop SYS_vadvise
The (o)vadvise syscall is dummy since the beginning of NetBSD.

It is an obsolete remnant from the old UNIX.

Sponsored by <The NetBSD Foundation>
2017-12-19 19:40:03 +00:00
kamil
885229d011 Drop SYS_sbrk
sbrk - change data segment size

This syscall is dummy since the inception of the project.

Sponsored by <The NetBSD Foundation>
2017-12-19 18:34:47 +00:00
kamil
438b670525 Drop the sstk(2) syscall stub
sstk - change stack section size

This functionality has never been implemented and is a remnant from 16-bit
UNIX. This stub appeared with the first NetBSD commit.

Sponsored by <The NetBSD Foundation>
2017-12-19 08:48:19 +00:00
maya
76bbcdbd5b Match locking notes with reality.
misc_lock is used to protect vm_refcnt.

ok chuq
2017-12-15 16:03:29 +00:00
mrg
277fd3d5f5 add two new members to uvmexp_sysctl{}: bootpages and poolpages.
bootpages is set to the pages allocated via uvm_pageboot_alloc().
poolpages is calculated from the list of pools nr_pages members.

this brings us closer to having a valid total of pages known by
the system, vs actual pages originally managed.

XXX: poolpages needs some handling for PR_RECURSIVE pools still.
2017-12-02 08:15:42 +00:00
chs
fdfeaafd1a In uvm_fault_upper_enter(), if pmap_enter(PMAP_CANFAIL) fails, assert that
the pmap did not leave around a now-stale pmap mapping for an old page.
If such a pmap mapping still existed after we unlocked the vm_map,
the UVM code would not know later that it would need to lock the
lower layer object while calling the pmap to remove or replace that
stale pmap mapping.  See PR 52706 for further details.
2017-11-20 21:06:54 +00:00
mrg
2dc04f8ca1 remove duplicate prototype. 2017-11-14 06:43:23 +00:00
pgoyette
8c42a6afbc Remove unneeded casts to (uintptr_t). This is already taken care of in
the xxxHIST_LOG() macros.

No need to pull-up to -8 - the extra cast really won't hurt anything.
2017-10-30 03:25:14 +00:00
pgoyette
7fb159dd6a And replace an instance of "%p" conversion with "%#jx" 2017-10-30 01:19:46 +00:00
kre
cf610eb9fd Remove a stray '"' (obvious typo) and add a couple of casts that are
probably needed.
2017-10-30 00:55:42 +00:00
pgoyette
cb32a134a5 Update the kernhist(9) kernel history code to address issues identified
in PR kern/52639, as well as some general cleaning-up...

(As proposed on tech-kern@ with additional changes and enhancements.)

Details of changes:

* All history arguments are now stored as uintmax_t values[1], both in
  the kernel and in the structures used for exporting the history data
  to userland via sysctl(9).  This avoids problems on some architectures
  where passing a 64-bit (or larger) value to printf(3) can cause it to
  process the value as multiple arguments.  (This can be particularly
  problematic when printf()'s format string is not a literal, since in
  that case the compiler cannot know how large each argument should be.)

* Update the data structures used for exporting kernel history data to
  include a version number as well as the length of history arguments.

* All [2] existing users of kernhist(9) have had their format strings
  updated.  Each format specifier now includes an explicit length
  modifier 'j' to refer to numeric values of the size of uintmax_t.

* All [2] existing users of kernhist(9) have had their format strings
  updated to replace uses of "%p" with "%#jx", and the pointer
  arguments are now cast to (uintptr_t) before being subsequently cast
  to (uintmax_t).  This is needed to avoid compiler warnings about
  casting "pointer to integer of a different size."

* All [2] existing users of kernhist(9) have had instances of "%s" or
  "%c" format strings replaced with numeric formats; several instances
  of mis-match between format string and argument list have been fixed.

* vmstat(1) has been modified to handle the new size of arguments in the
  history data as exported by sysctl(9).

* vmstat(1) now provides a warning message if the history requested with
  the -u option does not exist (previously, this condition was silently
  ignored, with only a single blank line being printed).

* vmstat(1) now checks the version and argument length included in the
  data exported via sysctl(9) and exits if they do not match the values
  with which vmstat was built.

* The kernhist(9) man-page has been updated to note the additional
  requirements imposed on the format strings, along with several other
  minor changes and enhancements.

[1] It would have been possible to use an explicit length (for example,
    uint64_t) for the history arguments.  But that would require another
    "rototill" of all the users in the future when we add support for an
    architecture that supports a larger size.  Also, the printf(3) format
    specifiers for explicitly-sized values, such as "%"PRIu64, are much
    more verbose (and less aesthetically appealing, IMHO) than simply
    using "%ju".

[2] I've tried very hard to find "all [the] existing users of kernhist(9)"
    but it is possible that I've missed some of them.  I would be glad to
    update any stragglers that anyone identifies.
2017-10-28 00:37:11 +00:00
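The format-string convention described in cb32a134a5 (every argument widened to `uintmax_t`, the `j` length modifier, and `%#jx` in place of `%p`) can be sketched in user space. `format_hist_entry()` is a hypothetical helper and `snprintf()` stands in for the kernel history macros; the sketch only shows the casts and specifiers, not the kernhist(9) implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format one history-style entry.  Both arguments are passed as
 * uintmax_t so every specifier can use the 'j' length modifier.
 */
static int
format_hist_entry(char *buf, size_t len, const void *ptr, uint32_t value)
{
	assert(buf != NULL);
	/*
	 * The pointer goes through uintptr_t first, which avoids the
	 * "pointer to integer of a different size" warning.
	 */
	return snprintf(buf, len, "obj=%#jx val=%ju",
	    (uintmax_t)(uintptr_t)ptr, (uintmax_t)value);
}
```

Because every argument has the same width, the formatter never has to guess how large each one is, which is the problem the commit message describes for non-literal format strings.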
utkarsh009
756731076f [syzkaller] Fix for PR #52658 as suggested by riastradh@
The bug was found by Dmitry Vyukov (dvyukov@google.com)
using syzkaller and was tested by me on a VM running
8.99.5
2017-10-27 12:01:08 +00:00
pgoyette
81e66525e6 Fix user-triggerable kernel crash as reported in PR kern/52573 (from
Bruno Haible).

XXX Pull-up to netbsd-8
2017-10-01 01:45:02 +00:00
skrll
67f418007a There's no need to call pmap_tlb_invalidate_addr if pmap_remove_all was
called and PMAP_DEFERRED_ACTIVATE is set.
2017-09-07 06:29:47 +00:00
christos
5d5f2ef038 PR/52384: make uvm_fault_check() return EFAULT not EACCES, like our man pages
(but not OpenGroup which does not document EFAULT for read/write, and only
documents EACCES for sockets) say for read/write.
2017-07-09 20:53:09 +00:00
joerg
5f391f4ae2 Export the guard size of the main thread via vm.guard_size. Add a
complementary writable sysctl for the initial guard size of threads
created via pthread_create. Let the existing attribute accessors do the
right thing. Raise the default guard size for threads to 64KB.
2017-07-02 16:41:32 +00:00
skrll
e90fc54cd1 Use pte_set 2017-06-24 07:30:17 +00:00
skrll
659c16eaca Trailing whitespace 2017-06-24 05:49:50 +00:00
skrll
8d59a50047 Use __BIT(0) for PV_KENTER. NFC. 2017-06-24 05:39:53 +00:00
skrll
e4bfdd1c38 Whitespace - comment alignment. 2017-06-24 05:34:37 +00:00