Commit Graph

272061 Commits

tkusumi
fd0af97215 dm: Remove misleading comment on linear target arg
The offset arg is mandatory.
Remove code and a comment that make it sound like it's optional.

taken-from: DragonFlyBSD
2019-12-02 16:10:34 +00:00
tkusumi
6e7d723b30 dm: Add a comment on race window on unload
There is a minor race window on unload vs device creation
that can cause a panic.
53a07f3ae7

taken-from: DragonFlyBSD
2019-12-02 15:17:43 +00:00
wiz
784b1c1585 postfix-3.4.8 out. 2019-12-02 14:15:22 +00:00
riastradh
f43739fc00 Use LFENCE/SFENCE/MFENCE in x86 bus_space_barrier.
These are needed for BUS_SPACE_MAP_PREFETCHABLE mappings.  On x86,
these are WC-type memory regions, which means -- unlike normal
WB-type memory regions -- loads can be reordered with loads,
requiring LFENCE, and stores can be reordered with stores, requiring
SFENCE.

Reference: AMD64 Architecture Programmer's Manual, Volume 2: System
Programming, Sec. 7.4.1 `Memory Barrier Interaction with Memory
Types', Table 7-3 `Memory Access Ordering Rules'.
2019-12-02 08:33:52 +00:00
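
[Editor's note: a minimal sketch of what the above means for a driver; the function, register offsets, and return convention are hypothetical, only bus_space_barrier(9) and its BUS_SPACE_BARRIER_* flags are the real interfaces.]

#include <sys/types.h>
#include <sys/bus.h>

/*
 * Hypothetical helper: read a status register and then a data
 * register, in that order, through a WC-type (prefetchable) mapping.
 * Because loads may be reordered with loads there, an explicit read
 * barrier (now an LFENCE on x86) is required between them; ordering
 * two stores would likewise need BUS_SPACE_BARRIER_WRITE (SFENCE).
 */
static uint32_t
example_ordered_read(bus_space_tag_t bst, bus_space_handle_t bsh,
    bus_size_t size)
{
	uint32_t status, data;

	status = bus_space_read_4(bst, bsh, 0x00);
	bus_space_barrier(bst, bsh, 0, size, BUS_SPACE_BARRIER_READ);
	data = bus_space_read_4(bst, bsh, 0x04);

	return (status & 1) ? data : 0;
}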
riastradh
7a245208a0 Use BUS_SPACE_MAP_PREFETCHABLE only if BAR and driver agree on it.
- A driver that expects prefetchable memory and knows to issue the
  needed bus_space_barrier calls can pass BUS_SPACE_MAP_PREFETCHABLE
  to indicate a desire to map the memory prefetchable if the BAR
  allows it.

  (A driver that _really wants_ BUS_SPACE_MAP_PREFETCHABLE even if
  the BAR claims _not_ to be prefetchable can use pci_mapreg_info and
  bus_space_map explicitly -- this is not different from what we have
  today.)

- For a driver that _does not_ expect prefetchable memory, the
  appearance of the prefetchable bit in the BAR shouldn't cause it to
  use BUS_SPACE_MAP_PREFETCHABLE, because the driver will not issue
  the needed bus_space_barrier calls to get sensible results.

Note: `Prefetchable' here, sometimes called `write-combining', means
reads have no side effects, and writes are idempotent, so it is safe
to issue reads out of order and safe to combine writes.

Mappings with BUS_SPACE_MAP_PREFETCHABLE are often more weakly
ordered than normal memory -- e.g., on x86, in WC-type memory
regions, loads can be reordered with loads and stores can be reordered
with stores, which is not possible with any other type of memory
region.

Discussed on tech-kern a while ago:

https://mail-index.NetBSD.org/tech-kern/2017/03/22/msg021678.html

This is option A, which received the most support.  This should help
unconfuse drivers that do not expect prefetchable mappings, like the
one Yamaguchi-san tripped over recently:

https://mail-index.NetBSD.org/tech-kern/2019/12/02/msg025785.html
2019-12-02 08:33:42 +00:00
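
[Editor's note: a hedged sketch of the driver side of option A above; the function name is made up, but pci_mapreg_map(9), PCI_MAPREG_START, and BUS_SPACE_MAP_PREFETCHABLE are the real interfaces involved.]

#include <dev/pci/pcireg.h>
#include <dev/pci/pcivar.h>

/*
 * Hypothetical attach fragment: this driver knows it will issue the
 * required bus_space_barrier calls, so it asks for a prefetchable
 * mapping of BAR 0.  With this change the flag only takes effect if
 * the BAR advertises prefetchable memory as well.
 */
static int
example_map_bar0(const struct pci_attach_args *pa, bus_space_tag_t *bstp,
    bus_space_handle_t *bshp, bus_size_t *sizep)
{
	return pci_mapreg_map(pa, PCI_MAPREG_START, PCI_MAPREG_TYPE_MEM,
	    BUS_SPACE_MAP_PREFETCHABLE, bstp, bshp, NULL, sizep);
}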
msaitoh
f500185efd Use PCI_MSIX_"TBL"BIR_MASK instead of PCI_MSIX_"PBA"BIR_MASK for MSI-X table.
This is not a real bug because both macros have the same value.
2019-12-02 03:06:51 +00:00
christos
da2b6a5521 Add cfi annotations so that gdb can unwind the stack through signal handlers. 2019-12-02 01:38:54 +00:00
rillig
8f698bca87 Add more tests for variable modifiers in make. 2019-12-02 01:01:08 +00:00
rillig
f5741db816 Fix out-of-bounds read in Str_Match. 2019-12-01 23:53:49 +00:00
uwe
45b09b5d95 Add missing #include <sys/atomic.h> 2019-12-01 23:14:47 +00:00
sevan
b0d24b3c37 SCSI OSD 2019-12-01 23:08:09 +00:00
mlelstv
b9aa28ee06 Don't deregister twice with pmf. 2019-12-01 21:02:09 +00:00
mlelstv
b9245c5465 Reset MCU ready status before resetting the MCU.
Fixes PR kern/54728
2019-12-01 21:01:19 +00:00
jmcneill
aa92e84215 Attempt to load the zfs module even if /etc/zfs/zpool.cache is absent. The
module needs to be loaded to create a pool in the first place, and
autoloading after the fact won't work at securelevel=1.
2019-12-01 21:00:43 +00:00
riastradh
53ecfc3aad Restore xcall(9) fast path using atomic_load/store_*.
While here, fix a bug that was formerly in xcall(9): a missing
acquire operation in the xc_wait fast path so that all memory
operations in the xcall on remote CPUs will happen before any memory
operations on the issuing CPU after xc_wait returns.

All stores of xc->xc_donep are done with atomic_store_release so that
we can safely use atomic_load_acquire to read it outside the lock.
However, this fast path only works on platforms with cheap 64-bit
atomic load/store, so conditionalize it on __HAVE_ATOMIC64_LOADSTORE.
(Under the lock, no need for atomic loads since nobody else will be
issuing stores.)

For review, here's the relevant diff from the old version of the fast
path, from before it was removed and some other things changed in the
file:

diff --git a/sys/kern/subr_xcall.c b/sys/kern/subr_xcall.c
index 45a877aa90e0..b6bfb6455291 100644
--- a/sys/kern/subr_xcall.c
+++ b/sys/kern/subr_xcall.c
@@ -84,6 +84,7 @@ __KERNEL_RCSID(0, "$NetBSD: subr_xcall.c,v 1.27 2019/10/06 15:11:17 uwe Exp $");
 #include <sys/evcnt.h>
 #include <sys/kthread.h>
 #include <sys/cpu.h>
+#include <sys/atomic.h>

 #ifdef _RUMPKERNEL
 #include "rump_private.h"
@@ -334,10 +353,12 @@ xc_wait(uint64_t where)
 		xc = &xc_low_pri;
 	}

+#ifdef __HAVE_ATOMIC64_LOADSTORE
 	/* Fast path, if already done. */
-	if (xc->xc_donep >= where) {
+	if (atomic_load_acquire(&xc->xc_donep) >= where) {
 		return;
 	}
+#endif

 	/* Slow path: block until awoken. */
 	mutex_enter(&xc->xc_lock);
@@ -422,7 +443,11 @@ xc_thread(void *cookie)
 		(*func)(arg1, arg2);

 		mutex_enter(&xc->xc_lock);
+#ifdef __HAVE_ATOMIC64_LOADSTORE
+		atomic_store_release(&xc->xc_donep, xc->xc_donep + 1);
+#else
 		xc->xc_donep++;
+#endif
 	}
 	/* NOTREACHED */
 }
@@ -462,7 +487,6 @@ xc__highpri_intr(void *dummy)
 	 * Lock-less fetch of function and its arguments.
 	 * Safe since it cannot change at this point.
 	 */
-	KASSERT(xc->xc_donep < xc->xc_headp);
 	func = xc->xc_func;
 	arg1 = xc->xc_arg1;
 	arg2 = xc->xc_arg2;
@@ -475,7 +499,13 @@ xc__highpri_intr(void *dummy)
 	 * cross-call has been processed - notify waiters, if any.
 	 */
 	mutex_enter(&xc->xc_lock);
-	if (++xc->xc_donep == xc->xc_headp) {
+	KASSERT(xc->xc_donep < xc->xc_headp);
+#ifdef __HAVE_ATOMIC64_LOADSTORE
+	atomic_store_release(&xc->xc_donep, xc->xc_donep + 1);
+#else
+	xc->xc_donep++;
+#endif
+	if (xc->xc_donep == xc->xc_headp) {
 		cv_broadcast(&xc->xc_busy);
 	}
 	mutex_exit(&xc->xc_lock);
2019-12-01 20:56:39 +00:00
ad
f36df6629d Avoid calling pmap_page_protect() while under uvm_pageqlock. 2019-12-01 20:31:40 +00:00
jmcneill
84d4c1fb40 Enable ZFS support on aarch64 2019-12-01 20:28:25 +00:00
jmcneill
9f2ee97f0a Flush insn / data caches after loading modules 2019-12-01 20:27:26 +00:00
jmcneill
e578db34f0 Need sys/atomic.h on NetBSD 2019-12-01 20:26:31 +00:00
jmcneill
fa74c92e0a Provide a default ptob() implementation 2019-12-01 20:26:05 +00:00
jmcneill
87afc7bc0f Initialize b_dev before passing buf to d_minphys (ldminphys needs this) 2019-12-01 20:25:31 +00:00
jmcneill
2e3c4047ee Build aarch64 modules without fp or simd instructions. 2019-12-01 20:24:47 +00:00
ad
ea045f02e7 Another instance of cpu_onproc to replace. 2019-12-01 19:21:13 +00:00
ad
bcbc56a72a Regen. 2019-12-01 18:32:07 +00:00
ad
2b25ff6d06 Back out previous temporarily - seeing unusual lookup failures. Will
come back to it.
2019-12-01 18:31:19 +00:00
ad
f278a3b979 Add ci_onproc. 2019-12-01 18:29:26 +00:00
ad
64e45337af cpu_onproc -> ci_onproc 2019-12-01 18:12:51 +00:00
kamil
82c05df197 Switch in_interrupt() in KCOV to cpu_intr_p()
This makes KCOV more MI friendly and removes x86-specific in_interrupt()
implementation.
2019-12-01 17:41:11 +00:00
kamil
018b416e9d Disable KCOV instrumentation in x86_machdep.c
This allows cpu_intr_p() to be used directly inside KCOV.
2019-12-01 17:25:47 +00:00
ad
10fb14e25f Init kern_runq and kern_synch before booting secondary CPUs. 2019-12-01 17:08:31 +00:00
ad
ca7481a7dd Back out the fastpath change in xc_wait(). It's going to be done differently. 2019-12-01 17:06:00 +00:00
ad
b33d8c3694 Free pages in batch instead of taking uvm_pageqlock for each one. 2019-12-01 17:02:50 +00:00
ad
5ce257a95c __cacheline_aligned on a lock. 2019-12-01 16:44:11 +00:00
ad
2ed8ce1089 NetBSD 9.99.19 - many kernel data structure changes 2019-12-01 16:36:25 +00:00
tkusumi
c6a1f11f46 dm: Fix race on pdev create
List lookup and insert need to be atomic.
ac816675c8

taken-from: DragonFlyBSD
2019-12-01 16:33:33 +00:00
ad
e398df6f78 Make the fast path in xc_wait() depend on _LP64 for now. Needs 64-bit
load/store.  To be revisited.
2019-12-01 16:32:01 +00:00
riastradh
6dde3af7e3 Mark unreachable branch with __unreachable() to fix i386/ALL build. 2019-12-01 16:22:10 +00:00
ad
57eb66c673 Fix false sharing problems with cpu_info. Identified with tprof(8).
This was a very nice win in my tests on a 48 CPU box.

- Reorganise cpu_data slightly according to usage.
- Put cpu_onproc into struct cpu_info alongside ci_curlwp (now ci_onproc).
- On x86, put some items in their own cache lines according to usage, like
  the IPI bitmask and ci_want_resched.
2019-12-01 15:34:44 +00:00
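
[Editor's note: illustration only, not the actual cpu_data layout. False sharing and the fix look roughly like this; a 64-byte coherency unit is assumed here, whereas the kernel uses __cacheline_aligned/COHERENCY_UNIT rather than a hard-coded constant.]

#include <sys/types.h>

/*
 * Two counters updated by different CPUs.  If they share a cache
 * line, every increment on one CPU steals the line from the other
 * (false sharing).  Aligning each hot member to its own cache line
 * keeps the coherency traffic separate.
 */
struct example_hot_data {
	uint64_t	ehd_counter_a __attribute__((__aligned__(64)));
	uint64_t	ehd_counter_b __attribute__((__aligned__(64)));
};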
riastradh
d6cbc02da6 Adapt <sys/pslist.h> to use atomic_load/store_*.
Changes:

- membar_producer();
  *p = v;

    =>

  atomic_store_release(p, v);

  (Effectively like using membar_exit instead of membar_producer,
  which is what we should have been doing all along so that stores by
  the `reader' can't affect earlier loads by the writer, such as
  KASSERT(p->refcnt == 0) in the writer and atomic_inc(&p->refcnt) in
  the reader.)

- p = *pp;
  if (p != NULL) membar_datadep_consumer();

    =>

  p = atomic_load_consume(pp);

  (Only makes a difference on DEC Alpha.  As long as lists generally
  have at least one element, this is not likely to make a big
  difference, and keeps the code simpler and clearer.)

No other functional change intended.  While here, annotate each
synchronizing load and store with its counterpart in a comment.
2019-12-01 15:28:19 +00:00
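
[Editor's note: the two rewrites above, shown on a hypothetical singly linked node rather than the real pslist internals; a sketch that assumes only the atomic_load/store_* operations named in the commit.]

#include <sys/atomic.h>
#include <sys/null.h>

struct example_node {
	struct example_node	*n_next;
	int			 n_data;
};

/* Writer: initialize the node fully, then publish it with release. */
static void
example_publish(struct example_node **headp, struct example_node *n, int data)
{
	n->n_data = data;
	n->n_next = *headp;
	/* was: membar_producer(); *headp = n; */
	atomic_store_release(headp, n);
}

/* Reader: a consume load makes the node's contents visible afterward. */
static int
example_peek(struct example_node **headp)
{
	/* was: n = *headp; if (n != NULL) membar_datadep_consumer(); */
	struct example_node *n = atomic_load_consume(headp);

	return (n == NULL) ? -1 : n->n_data;
}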
riastradh
3006828963 Rework modified atomic_load/store_* to work on const pointers. 2019-12-01 15:28:02 +00:00
ad
c242783135 Fix a longstanding problem with LWP limits. When changing the user's
LWP count, we must use the process credentials because that's what
the accounting entity is tied to.

Reported-by: syzbot+d193266676f635661c62@syzkaller.appspotmail.com
2019-12-01 15:27:58 +00:00
jmcneill
19487d1aef Remove the pretty much useless 128MB swap partition from the arm images. 2019-12-01 15:07:04 +00:00
ad
7ce773db14 Make cpu_intr_p() safe to use anywhere, i.e. outside assertions:
Don't call kpreempt_disable() / kpreempt_enable() to make sure we're not
preempted while using the value of curcpu().  Instead, observe the value of
l_ncsw before and after the check to see if we have been preempted.  If
we have been preempted, then we need to retry the read.
2019-12-01 14:52:13 +00:00
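
[Editor's note: a sketch of that retry pattern in a hypothetical function; ci_idepth stands in for whatever per-port interrupt nesting counter the real code consults, and the comparison with zero is illustrative.]

#include <sys/types.h>
#include <sys/cpu.h>
#include <sys/lwp.h>

/*
 * Sample a per-CPU value without disabling preemption: remember
 * curlwp's context-switch count, read the value, and retry if a
 * context switch (hence a possible CPU migration) happened meanwhile.
 */
static bool
example_intr_p(void)
{
	lwp_t *l = curlwp;
	u_int ncsw;
	int idepth;

	do {
		ncsw = l->l_ncsw;
		__insn_barrier();
		idepth = curcpu()->ci_idepth;	/* illustrative field */
		__insn_barrier();
	} while (__predict_false(l->l_ncsw != ncsw));

	return idepth > 0;
}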
ad
2fa8dbd876 Minor correction to previous. 2019-12-01 14:43:26 +00:00
ad
221d5f982e - Adjust uvmexp.swpgonly with atomics, and make uvm_swap_data_lock static.
- A bit more __cacheline_aligned on mutexes.
2019-12-01 14:40:31 +00:00
ad
0aaea1d84e Deactivate pages in batch instead of acquiring uvm_pageqlock repeatedly. 2019-12-01 14:30:01 +00:00
ad
6e176d2434 Give each of the page queue locks their own cache line. 2019-12-01 14:28:01 +00:00
ad
24e75c17af Activate pages in batch instead of acquiring uvm_pageqlock a zillion times. 2019-12-01 14:24:43 +00:00
ad
4bc8197e77 If the system is not up and running yet, just run the function locally. 2019-12-01 14:20:00 +00:00
ad
2ca6e3ffb4 Map the video RAM cacheable/prefetchable, it's very slow and this helps a bit. 2019-12-01 14:18:51 +00:00