9654 Commits

Author SHA1 Message Date
hannken
4ac834d923 Fix a logic error introduced with Rev. 1.507: defer setting MNT_RDONLY
only if going from read-write to read-only.

Should fix PR kern/52045 (panic: ffs_sync: rofs mod, fs=/ after fsck)
2017-03-07 11:54:16 +00:00
hannken
6e4272e4af Always use the lowest mount for fstrans and suspend. This way we
enter/leave or suspend/resume the stack of layered file systems as a unit.
2017-03-06 10:11:21 +00:00
hannken
0f10eb2124 Deny unmounting file systems below layered file systems. 2017-03-06 10:10:43 +00:00
hannken
6caedad35c Change vrecycle() and vgone() to lock with LK_RETRY. If this node is
a layerfs node the lower node(s) may already be reclaimed.
2017-03-06 10:07:52 +00:00
mlelstv
ba576b71a7 Enhance disk metrics by calculating a weighted sum that is incremented
by the number of concurrent I/O requests. Also introduce a new disk_wait()
function to measure requests waiting in a bufq.
iostat -y now reports data about waiting and active requests.

So far only drivers using dksubr and dk, ccd, wd and xbd collect data about
waiting requests.
2017-03-05 23:07:12 +00:00
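
The weighted sum described in ba576b71a7 can be pictured as a time integral of the number of outstanding requests. The following is only a minimal sketch of that idea; `struct io_stat`, `io_account()`, `io_start()` and `io_done()` are hypothetical names and are not the kernel's disk statistics code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical accounting state; not the kernel's own structure. */
struct io_stat {
	uint64_t last;      /* timestamp of the last state change */
	uint64_t busy_time; /* total time with at least one request */
	uint64_t weighted;  /* sum over time of (requests in flight) */
	unsigned inflight;  /* current number of concurrent requests */
};

/* Charge the interval since the last change to the counters. */
static void
io_account(struct io_stat *s, uint64_t now)
{
	uint64_t dt = now - s->last;

	if (s->inflight > 0) {
		s->busy_time += dt;
		s->weighted += dt * s->inflight;
	}
	s->last = now;
}

/* A request becomes active (or starts waiting, for a wait counter). */
static void
io_start(struct io_stat *s, uint64_t now)
{
	io_account(s, now);
	s->inflight++;
}

/* A request completes. */
static void
io_done(struct io_stat *s, uint64_t now)
{
	assert(s->inflight > 0);
	io_account(s, now);
	s->inflight--;
}
```

Dividing `weighted` by the elapsed time gives the average number of concurrent requests over that interval. Keeping one such structure for waiting requests and one for active requests would give both figures separately.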
mrg
e99ba17226 add missing sys/evcnt.h include. 2017-03-05 20:45:49 +00:00
jdolecek
54d8d0371a add some event counters, for commits, writes, cache flush 2017-03-05 13:57:29 +00:00
hannken
a57a3961af Add an operation to test a mount for fstrans support and use it for
_fstrans_start(), fstrans_done(), fstrans_is_owner(), vfs_suspend()
and vfs_resume().

Test for fstrans support before ASSERT_SLEEPABLE().
2017-03-02 10:41:27 +00:00
hannken
b038eea5e1 Suspend the mounted file system while updating. 2017-03-01 10:45:24 +00:00
hannken
90ead62d2f Change the protocol to update a mounted file system from read-write
to read-only and vice versa:

- Add an internal flag IMNT_WANTRDONLY.
- Set either IMNT_WANTRDWR or IMNT_WANTRDONLY if going from or to read-only.
- After a successful call to VFS_MOUNT(), set or clear MNT_RDONLY.

Adapt tmpfs and rumpfs to the new protocol.  Other file systems will be
updated when they get the IMNT_CAN_RWTORO property.

Welcome to 7.99.64
2017-03-01 10:44:47 +00:00
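
The three protocol steps listed in 90ead62d2f can be modeled in a few lines. This is a sketch under stated assumptions: the flag values are made up, `struct mount_model` and `mount_update()` are hypothetical, and the function pointer merely stands in for VFS_MOUNT(). Clearing the internal flags at the end is also an assumption of the model.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Illustrative flag values only; the kernel's real values differ. */
#define MNT_RDONLY       0x1u
#define IMNT_WANTRDWR    0x1u
#define IMNT_WANTRDONLY  0x2u

struct mount_model {
	unsigned mnt_flag;   /* public flags */
	unsigned mnt_iflag;  /* internal flags */
};

/* Stand-in for VFS_MOUNT(): returns 0 on success, an errno otherwise. */
typedef int (*vfs_mount_fn)(struct mount_model *);

static int
mount_update(struct mount_model *mp, bool want_rdonly, vfs_mount_fn vfs_mount)
{
	bool is_rdonly;
	int error;

	assert(mp && vfs_mount);
	is_rdonly = (mp->mnt_flag & MNT_RDONLY) != 0;

	/* Record the intent only when the mode actually changes. */
	if (is_rdonly && !want_rdonly)
		mp->mnt_iflag |= IMNT_WANTRDWR;
	else if (!is_rdonly && want_rdonly)
		mp->mnt_iflag |= IMNT_WANTRDONLY;

	error = vfs_mount(mp);

	/* Flip MNT_RDONLY only after the file system has agreed. */
	if (error == 0) {
		if (mp->mnt_iflag & IMNT_WANTRDONLY)
			mp->mnt_flag |= MNT_RDONLY;
		if (mp->mnt_iflag & IMNT_WANTRDWR)
			mp->mnt_flag &= ~MNT_RDONLY;
	}
	mp->mnt_iflag &= ~(IMNT_WANTRDWR | IMNT_WANTRDONLY);
	return error;
}

/* Two trivial file system stand-ins for trying the model out. */
static int vfs_mount_ok(struct mount_model *mp) { (void)mp; return 0; }
static int vfs_mount_fail(struct mount_model *mp) { (void)mp; return EBUSY; }
```

In this model a failed update leaves MNT_RDONLY untouched, which is the point of deferring the flag change until after the file system's own mount routine returns.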
hannken
0d6fbaf0a0 Must always lock the parent -> lock the child -> unlock the parent. 2017-03-01 10:43:37 +00:00
jakllsch
aa28e4fbed pi_bsize must be at least pi_secsize
Allows block device accesses to 4KiB logical sector disks to function on the
vast majority of ports with 2KiB BLKDEV_IOSIZE.
2017-02-28 00:33:36 +00:00
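
The constraint in aa28e4fbed amounts to taking the larger of two sizes. A minimal sketch, with `effective_bsize()` as a hypothetical helper name and the two sizes passed in as plain parameters:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: the block size used for block device I/O must
 * never be smaller than one logical sector.
 */
static uint32_t
effective_bsize(uint32_t blkdev_iosize, uint32_t secsize)
{
	assert(secsize > 0);
	return blkdev_iosize < secsize ? secsize : blkdev_iosize;
}
```

With a 2KiB default and a 4KiB logical sector, this yields 4KiB; with 512-byte sectors the default is kept.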
hannken
d0ef892c64 Test for fstrans support before trying to allocate per-thread info.
PR kern/51996 (kmem_alloc called from intr context in fstrans_get_lwp_info)
2017-02-23 11:23:22 +00:00
kamil
5c4cff4517 Fix build of ports without PT_STEP
Fallout after PT_*DBREGS introduction.

Sponsored by <The NetBSD Foundation>
2017-02-23 04:48:36 +00:00
kamil
988eb7ed71 Introduce PT_GETDBREGS and PT_SETDBREGS in ptrace(2) on i386 and amd64
This interface is modeled after the FreeBSD API and its usage.

This replaces the previous watchpoint API, which was introduced only
recently in NetBSD-current; its remnants are removed without any
backward compatibility.

Design choices for Debug Register accessors:
 - exec() (TRAP_EXEC event) must remove debug registers from LWP
 - debug registers are only per-LWP, not per-process globally
 - debug registers must not be inherited after (v)forking a process
 - debug registers must not be inherited after forking a thread
 - a debugger is responsible for setting global watchpoints/breakpoints with
   the debug registers; the PTRACE_LWP_CREATE/PTRACE_LWP_EXIT event
   monitoring function is designed to be used for this
 - debug register traps must generate SIGTRAP with si_code TRAP_DBREG
 - debugger is responsible to retrieve debug register state to distinguish
   the exact debug register trap (DR6 is Status Register on x86)
 - kernel must not remove debug register traps after triggering a trap event;
   a debugger is responsible for detaching this trap with an appropriate
   PT_SETDBREGS call (DR7 is Control Register on x86)
 - debug registers must not be exposed in mcontext
 - userland must not be allowed to set a trap on the kernel

Implementation notes on i386 and amd64:
 - the initial state of debug register is retrieved on boot and this value is
   stored in a local copy (initdbregs), this value is used to initialize dbreg
   context after PT_GETDBREGS
 - struct dbregs is stored in pcb as a pointer and by default not initialized
 - reserved registers (DR4-DR5, DR9-DR15) are ignored

Further ideas:
 - restrict this interface with securelevel

Tested on real hardware i386 (Intel Pentium IV) and amd64 (Intel i7).

This commit enables 390 debug register ATF tests in kernel/arch/x86.
All tests are passing.

This commit does not cover netbsd32 compat code. Currently another interface,
PT_GET_SIGINFO/PT_SET_SIGINFO, is required in netbsd32 compat code in order to
validate PT_GETDBREGS/PT_SETDBREGS reliably.

This implementation does not cover FreeBSD specific defines in their
<x86/reg.h>: DBREG_DR7_LOCAL_ENABLE, DBREG_DR7_GLOBAL_ENABLE, DBREG_DR7_LEN_1
etc. These values tend to be reinvented by each tracer on its own. GNU
Debugger (GDB) works with NetBSD debug registers after adding this patch:

--- gdb/amd64bsd-nat.c.orig	2016-02-10 03:19:39.000000000 +0000
+++ gdb/amd64bsd-nat.c
@@ -167,6 +167,10 @@ amd64bsd_target (void)

 #ifdef HAVE_PT_GETDBREGS

+#ifndef DBREG_DRX
+#define	DBREG_DRX(d,x)	((d)->dr[(x)])
+#endif
+
 static unsigned long
 amd64bsd_dr_get (ptid_t ptid, int regnum)
 {


Another reason to stop introducing unpopular defines covering machine
specific register macros is that these values vary across generations of
the same CPU family.

GDB demo:
  (gdb) c
  Continuing.

  Watchpoint 2: traceme

  Old value = 0
  New value = 16
  main (argc=1, argv=0x7f7fff79fe30) at test.c:8
  8               printf("traceme=%d\n", traceme);

(Currently the GDB interface is not reliable due to NetBSD support bugs)

Sponsored by <The NetBSD Foundation>
2017-02-23 03:34:22 +00:00
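
Since 988eb7ed71 leaves the DR7 bit definitions to each tracer, a tracer might carry a small helper of its own. The sketch below uses the x86 DR6/DR7 layout (enable bits in the low byte, R/W and LEN fields from bit 16 upwards). All names in it (`dr7_enable_local()`, `dr7_disable()`, `dr6_hit()` and the enum constants) are hypothetical. Fetching and storing the registers with PT_GETDBREGS/PT_SETDBREGS is deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; the encodings follow the x86 DR7 layout. */
enum dr7_rw  { DR7_RW_EXEC = 0, DR7_RW_WRITE = 1, DR7_RW_IO = 2, DR7_RW_RDWR = 3 };
enum dr7_len { DR7_LEN_1 = 0, DR7_LEN_2 = 1, DR7_LEN_8 = 2, DR7_LEN_4 = 3 };

/* Arm debug register slot n (0..3) as a local breakpoint/watchpoint. */
static uint64_t
dr7_enable_local(uint64_t dr7, unsigned n, enum dr7_rw rw, enum dr7_len len)
{
	assert(n < 4);
	/* Clear the old R/W and LEN fields of slot n (4 bits at 16 + 4n). */
	dr7 &= ~((uint64_t)0xf << (16 + 4 * n));
	dr7 |= (uint64_t)rw << (16 + 4 * n);
	dr7 |= (uint64_t)len << (18 + 4 * n);
	/* The local enable bit of slot n is bit 2n. */
	dr7 |= (uint64_t)1 << (2 * n);
	return dr7;
}

/* Disarm slot n: clear both enable bits and the R/W and LEN fields. */
static uint64_t
dr7_disable(uint64_t dr7, unsigned n)
{
	assert(n < 4);
	dr7 &= ~((uint64_t)0x3 << (2 * n));
	dr7 &= ~((uint64_t)0xf << (16 + 4 * n));
	return dr7;
}

/* DR6 bits 0..3 report which slot triggered the trap. */
static int
dr6_hit(uint64_t dr6, unsigned n)
{
	assert(n < 4);
	return (dr6 >> n) & 1;
}
```

A tracer would read DR6 after a SIGTRAP with si_code TRAP_DBREG to find the slot that fired, and write an updated DR7 back when it wants to detach that trap.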
kamil
fb0af2ac33 Improve PT_SET_SIGMASK and PT_GET_SIGMASK API in ptrace(2)
Use proper check for LW_SYSTEM, don't depend on PT_GETREGS/PT_SETREGS.
Don't allow to mask SA_CANTMASK signals with PT_SET_SIGMASK (this covers
SIGSTOP and SIGKILL).

Add new ATF tests:
 - setsigmask5
   Verify that sigmask cannot be set to SIGKILL

 - setsigmask6
   Verify that sigmask cannot be set to SIGSTOP

Sponsored by <The NetBSD Foundation>
2017-02-23 00:50:09 +00:00
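
One way to keep SIGKILL and SIGSTOP out of a requested mask, as fb0af2ac33 requires, is to strip them before the mask is installed. This is a userland sketch using the POSIX sigset functions; `sanitize_sigmask()` is a hypothetical name. The commit message does not say whether the kernel strips these signals silently or rejects the request, so stripping is an assumption here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>

/* Remove the signals that can never be blocked from a requested mask. */
static void
sanitize_sigmask(sigset_t *mask)
{
	int r;

	r = sigdelset(mask, SIGKILL);
	assert(r == 0);
	r = sigdelset(mask, SIGSTOP);
	assert(r == 0);
	(void)r;
}
```

Applied to a full mask, this leaves every other signal blocked while SIGKILL and SIGSTOP stay deliverable.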
kamil
f9b2093d06 Introduce new ptrace(2) API to allow/prevent execution of LWP
Introduce new API for debuggers to allow/prevent execution of the specified
thread.

New ptrace(2) operations:

     PT_RESUME     Allow execution of a specified thread, change its state
                   from suspended to continued.  The addr argument is unused.
                   The data argument specifies the LWP ID.

                   This call is equivalent to _lwp_continue(2) called by a
                   traced process.  This call does not change the general
                   process state from stopped to continued.

     PT_SUSPEND    Prevent execution of a specified thread, change its state
                   from continued to suspended.  The addr argument is unused.
                   The data argument specifies the requested LWP ID.

                   This call is equivalent to _lwp_suspend(2) called by a
                   traced process.  This call does not change the general
                   process state from continued to stopped.

This interface is modeled after FreeBSD, however with NetBSD specific arguments
passed to ptrace(2) -- FreeBSD passes only thread id, NetBSD passes process and
thread id.

Extend PT_LWPINFO operation in ptrace(2) to report suspended threads. In the
ptrace_lwpinfo structure in pl_event next to PL_EVENT_NONE and PL_EVENT_SIGNAL
add new value PL_EVENT_SUSPENDED.

Add new errno(2) value EDEADLK that might be returned by ptrace(2). It prevents
deadlocking when resuming a process or thread that is prevented from
execution. This fixes a bug: the old API was vulnerable to this scenario.

Kernel bump delayed till introduction of PT_GETDBREGS/PT_SETDBREGS soon.

Add new ATF tests:
 - resume1
   Verify that a thread can be suspended by a debugger and later
   resumed by the debugger

 - suspend1
   Verify that a thread can be suspended by a debugger and later
   resumed by a tracee

 - suspend2
   Verify that while the only thread within a process is
   suspended, the whole process cannot be unstopped

Sponsored by <The NetBSD Foundation>
2017-02-22 23:43:43 +00:00
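
The EDEADLK rule from f9b2093d06 can be illustrated with a toy model: continuing a process is refused when no thread is left that could run. This is not the kernel logic. `struct proc_model` and its helpers are hypothetical, and threads are identified by array index rather than by LWP ID.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAXLWP 8

/* Toy process: a fixed set of threads, each either runnable or suspended. */
struct proc_model {
	size_t nlwps;
	bool suspended[MODEL_MAXLWP];
};

/* Model of PT_SUSPEND on one thread. */
static void
lwp_suspend(struct proc_model *p, size_t idx)
{
	assert(idx < p->nlwps);
	p->suspended[idx] = true;
}

/* Model of PT_RESUME on one thread. */
static void
lwp_resume(struct proc_model *p, size_t idx)
{
	assert(idx < p->nlwps);
	p->suspended[idx] = false;
}

/* Model of continuing the process: fails if every thread is suspended. */
static int
proc_continue(const struct proc_model *p)
{
	size_t i;

	assert(p->nlwps <= MODEL_MAXLWP);
	for (i = 0; i < p->nlwps; i++) {
		if (!p->suspended[i])
			return 0;
	}
	return EDEADLK;
}
```

This mirrors what the suspend2 test checks: with the only thread suspended, the attempt to continue the process fails instead of leaving it stuck.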
hannken
a378d58ecb Enable fstrans on all file systems.
Welcome to 7.99.61
2017-02-22 09:50:13 +00:00
hannken
8c2ff4e99d Regen. 2017-02-22 09:47:18 +00:00
hannken
99694efaee Prepare to move fstrans into vnode_if.c, allow "FSTRANS=YES"
and "FSTRANS=NO" in the vop description.
Add fstrans_start()/fstrans_done() to all vops that have FSTRANS=YES
or have the first vnode unlocked.
2017-02-22 09:45:51 +00:00
rin
ede747a0c4 PR kern/51208
Add DISKLABEL_EI (``Endian-Independent'' disklabel) kernel option to machines
that support Master Boot Record (MBR)
2017-02-19 07:43:42 +00:00
chs
006dc29ca6 obey the executable's ELF alignment constraints for PIE.
this fixes gdb of PIE binaries on mac68k (and other platforms
which use an ELF alignment that is larger than PAGE_SIZE).
2017-02-18 01:29:09 +00:00
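
Obeying an ELF alignment that is larger than the page size, as 006dc29ca6 describes, comes down to rounding the chosen load address up to the stricter of the two alignments. A sketch with hypothetical helper names (`align_up()`, `pie_base()`), assuming both alignments are powers of two:

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next multiple of align (a power of two). */
static uintptr_t
align_up(uintptr_t addr, uintptr_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr + align - 1) & ~(align - 1);
}

/*
 * Pick a load address for a PIE: start from a hint and honor whichever
 * of the page size and the ELF segment alignment is larger.
 */
static uintptr_t
pie_base(uintptr_t hint, uintptr_t page_size, uintptr_t elf_align)
{
	uintptr_t align = elf_align > page_size ? elf_align : page_size;

	return align_up(hint, align);
}
```

Rounding only to the page size would give a base that violates the segment's alignment whenever the ELF alignment is the larger one.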
hannken
7599fb1f37 Bring back vrele_flush() to flush deferred vrele() on a suspended file system. 2017-02-17 08:30:00 +00:00
hannken
4f18a321ca Make sure vcache_reclaim() will complete before file system suspension. 2017-02-17 08:27:58 +00:00
hannken
90afdff6e3 Take fstrans_start before syncing a file system. 2017-02-17 08:26:07 +00:00
hannken
a863cd745e Let syncer try fstrans_start() before running VFS_SYNC() to get rid
of the syncer lock/unlock from vfs_suspend().
2017-02-17 08:25:15 +00:00
hannken
b62f0c07fe Protect attaching and detaching lwp_info to mount with a mutex. 2017-02-17 08:24:07 +00:00
zafer
6914423cda fix number of arguments of kmem_alloc and kmem_zalloc macro. ok skrll. 2017-02-13 16:53:41 +00:00
uwe
1159401280 netbsd_elf_signature - look at note segments (phdrs) not note
sections.  They point to the same data in the file, but sections are
for linkers and are not necessarily present in an executable.

The original switch from phdrs to shdrs seems to be just a cop-out to
avoid parsing multiple notes per segment, which doesn't really avoid
the problem b/c sections also can contain multiple notes.
2017-02-12 21:52:46 +00:00
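
Commit 1159401280 turns on the fact that a note segment can hold several notes, so the lookup has to walk all of them. The sketch below walks an in-memory copy of such a segment. It assumes 4-byte note alignment and host byte order, and `find_note()` and `struct note_hdr` are hypothetical names rather than the kernel's netbsd_elf_signature() code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fixed header that precedes every ELF note. */
struct note_hdr {
	uint32_t namesz;  /* length of the name, including the NUL */
	uint32_t descsz;  /* length of the descriptor */
	uint32_t type;
};

/* Name and descriptor are each padded to a 4-byte boundary. */
#define NOTE_ALIGN(x) (((size_t)(x) + 3) & ~(size_t)3)

/* Return the descriptor of the first note matching name and type. */
static const unsigned char *
find_note(const unsigned char *buf, size_t len, const char *name,
    uint32_t type, uint32_t *descsz)
{
	size_t off = 0;

	assert(buf != NULL && name != NULL && descsz != NULL);
	while (len - off >= sizeof(struct note_hdr)) {
		struct note_hdr nh;
		size_t namelen, desclen;

		memcpy(&nh, buf + off, sizeof(nh));
		off += sizeof(nh);
		/* Stop on a note that claims more bytes than remain. */
		if (nh.namesz > len - off || nh.descsz > len - off)
			break;
		namelen = NOTE_ALIGN(nh.namesz);
		desclen = NOTE_ALIGN(nh.descsz);
		if (namelen > len - off || desclen > len - off - namelen)
			break;
		if (nh.type == type && nh.namesz == strlen(name) + 1 &&
		    memcmp(buf + off, name, nh.namesz) == 0) {
			*descsz = nh.descsz;
			return buf + off + namelen;
		}
		/* Not the note we want: skip to the next one. */
		off += namelen + desclen;
	}
	return NULL;
}
```

The same loop works whether the bytes came from a program header or a section header, which is why the choice between the two does not remove the need to handle multiple notes.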
maxv
8fdaa9399d Add a KASSERT, otherwise it looks like a NULL deref; from Mootja. 2017-02-12 18:43:56 +00:00
kamil
61aff29627 Introduce new interface in ptrace(2) - PT_GET_SIGMASK and PT_SET_SIGMASK
Add new interface to add ability to get/set signal mask of a tracee.
It has been inspired by Linux PTRACE_GETSIGMASK and PTRACE_SETSIGMASK, but
adapted for NetBSD API.

This interface is used for checkpointing software to set/restore context
of a process including signal mask like criu or just to track this property
in reverse-execution software like Record and Replay Framework (rr).


Add new ATF tests for this interface
====================================
getsigmask1:
    Verify that plain PT_GET_SIGMASK can be called

getsigmask2:
    Verify that PT_GET_SIGMASK reports correct mask from tracee

setsigmask1:
    Verify that plain PT_SET_SIGMASK can be called with empty mask

setsigmask2:
    Verify that sigmask is preserved between PT_GET_SIGMASK and
    PT_SET_SIGMASK

setsigmask3:
    Verify that sigmask is preserved between PT_GET_SIGMASK, process
    resumed and PT_SET_SIGMASK

setsigmask4:
    Verify that new sigmask is visible in tracee


Kernel ABI bump delayed as there are more interfaces to come in ptrace(2).

Sponsored by <The NetBSD Foundation>
2017-02-12 06:09:52 +00:00
kamil
9a6383f067 Be paranoid about PT_SET_SIGINFO and PT_GET_SIGINFO in ptrace(2)
Currently a tracer is prohibited from reading and writing memory of a tracee.
Prohibit reading and faking signal information.

Sponsored by <The NetBSD Foundation>
2017-02-11 19:32:41 +00:00
christos
b4abffbdeb expose sendmsg_so and recvmsg_so. 2017-02-03 16:06:45 +00:00
christos
8c06ed2feb expose copyout_sockname_sb 2017-02-02 15:37:42 +00:00
maya
1aa8013394 restore r1.118 2017-02-01 01:51:07 +00:00
christos
a4ac56487b We need to define COMPAT_NETBSD32 before we include other files;
otherwise things like ucontext32_t will be missing.
2017-01-28 16:43:59 +00:00
hannken
748bb65685 Vrecycle() cannot wait for the vnode lock. On a leaf file system this lock
will always succeed as we hold the last reference and prevent further
references.  On layered file systems waiting for the lock would open a can of
deadlocks as the lower vnodes may have other active references.
2017-01-27 10:50:10 +00:00
hannken
8e09b56de2 When called with WRITECLOSE vflush() must sync the vnode and take
care of unlinked but open vnodes.

PR kern/30525 remounting ffs read-only (mount -ur) does not sync metadata.
2017-01-27 10:46:18 +00:00
christos
eb4e2ff6fe rump does not have ucontext32_t 2017-01-27 03:53:01 +00:00
christos
914b3cbf1a use __HAVE_COMPAT_NETBSD32 2017-01-26 15:54:31 +00:00
martin
ee2ea00dd5 Restrict the forcing of COMPAT_NETBSD32 to _LP64 kernels - this is probably
not the right thing to do, but unbreaks the build for now.
2017-01-26 08:09:27 +00:00
martin
d57a70bd13 No COMPAT_NETBSD32 for rump 2017-01-26 07:54:05 +00:00
christos
9be065fb89 For LOCKDEBUG:
Always provide the location of the caller of the lock as __func__, __LINE__.
2017-01-26 04:11:56 +00:00
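
Recording the caller's location, as 9be065fb89 does for LOCKDEBUG, is usually done with a wrapper macro so that `__func__` and `__LINE__` expand at the call site. A minimal sketch with a toy lock; `struct lock_model`, `lock_enter_loc()` and `lock_enter()` are hypothetical names, not the kernel's lock interfaces.

```c
#include <assert.h>
#include <stddef.h>

/* Toy lock that remembers where it was last taken. */
struct lock_model {
	int held;
	const char *func;  /* function that took the lock */
	int line;          /* source line of the acquisition */
};

static void
lock_enter_loc(struct lock_model *l, const char *func, int line)
{
	/* A debugging check could report func/line on failure here. */
	assert(!l->held);
	l->held = 1;
	l->func = func;
	l->line = line;
}

/* The macro expands at the call site, so __func__ names the caller. */
#define lock_enter(l) lock_enter_loc((l), __func__, __LINE__)

static void
example_caller(struct lock_model *l)
{
	lock_enter(l);
}
```

After `example_caller()` runs, the lock records `example_caller` and the line of the `lock_enter()` call, not the location inside `lock_enter_loc()`.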
christos
655a10972a always compile in the COMPAT32 code; it is tiny and if we don't it breaks
the modules.
2017-01-26 03:54:54 +00:00
christos
46149e83cc don't return early holding a lock! 2017-01-26 03:54:01 +00:00
christos
4705defbf3 es_arglen is already in bytes... 2017-01-25 17:57:14 +00:00
christos
44c43f62df The argument length is in bytes; don't use howmany() 2017-01-25 17:56:45 +00:00
christos
908d408b7e PR/51916: Kamil Rytarowski: Don't multiply es_arglen with ptrsz since it is
already in bytes and contains the maximum possible size:
	ELF_AUX_ENTRIES * sizeof(auxv) + MAXPATHLEN + ALIGN
2017-01-25 17:55:47 +00:00
skrll
c8226a8b4f Fix build 2017-01-20 09:45:13 +00:00
skrll
fd0caf00f0 Simplify getiobuf. buf_init already does bp->b_objlock == &buffer_lock 2017-01-20 08:16:31 +00:00