syscall/trap entry, eliminating a test+branch on every syscall/trap.
This wasn't possible in the 3.99.x timeframe when l->l_cred came about
because there wasn't a reliable/timely way to force an ONPROC LWP running on
a remote CPU into the kernel (which is just about the only new thing in
this scheme).
No functional change, but it does get rid of a bunch of assumptions about
how mi_userret() works, making it easier to adjust in the future, and it
works as a kind of documentation too.
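For illustration, the sort of per-syscall test+branch this eliminates,
as a hypothetical sketch rather than the actual mi_userret() code:

    /* refresh the LWP's cached credentials on every return to user */
    if (__predict_false(l->l_cred != l->l_proc->p_cred)) {
        refresh_lwp_cred(l);    /* hypothetical helper */
    }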
- Rather than setting it up at each site where we block, make it a property of
syncobj_t. Then, do not hang onto the priority boost until userret(),
drop it as soon as the LWP is out of the run queue and onto a CPU.
Holding onto it longer is of questionable benefit.
- This allows two members of lwp_t to be deleted, and mi_userret() to be
simplified a lot (next step: trim it down to a single conditional).
- While here, constify syncobj_t and de-inline a bunch of small functions
like lwp_lock() which turn out not to be small after all (I don't know
why, but atomic_*_relaxed() seem to provoke a compiler shitfit above and
beyond what volatile does).
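As a concrete example of the de-inlining in the last item, lwp_lock() is
now out of line; a simplified sketch of its shape (details may differ
from kern_lwp.c):

    void
    lwp_lock(lwp_t *l)
    {
        kmutex_t *cur = atomic_load_consume(&l->l_mutex);

        /*
         * l_mutex can change while we block on it, so re-check and
         * retry until we hold the LWP's current mutex.
         */
        mutex_spin_enter(cur);
        while (__predict_false(atomic_load_relaxed(&l->l_mutex) != cur)) {
            mutex_spin_exit(cur);
            cur = atomic_load_consume(&l->l_mutex);
            mutex_spin_enter(cur);
        }
    }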
- Do away with separate pool_cache for some kernel objects that have no special
requirements and use the general purpose allocator instead (see the sketch
after this list). On one of my test systems this makes for a small (~1%) but
repeatable reduction in system time during builds, presumably because it
decreases the kernel's cache / memory bandwidth footprint a little.
- vfs_lockf: cache a pointer to the uidinfo and put mutex in the data segment.
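The sketch mentioned above, for a hypothetical foo object: the dedicated
pool_cache(9) is replaced by the general purpose kmem(9) allocator.

    /* Before: a dedicated cache just for foo objects. */
    foo = pool_cache_get(foo_cache, PR_WAITOK);
    /* ... */
    pool_cache_put(foo_cache, foo);

    /* After: the general purpose allocator. */
    foo = kmem_alloc(sizeof(*foo), KM_SLEEP);
    /* ... */
    kmem_free(foo, sizeof(*foo));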
And make them bind to the CPU as a side effect, instead of requiring
the caller to have already done so.
This lets us eliminate the assertions so we can use them in ddb even
when things are going haywire and we just want to get diagnostics.
XXX kernel revbump -- struct cpu_info change
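Roughly the shape of the change, with illustrative names rather than the
real functions: bind to the CPU here instead of asserting that the
caller already did.

    void
    frobnitz_suspend(void)                  /* illustrative name */
    {
        /* was: KASSERT(kpreempt_disabled() || cold); */
        kpreempt_disable();
        curcpu()->ci_frobnitz = true;       /* illustrative cpu_info member */
        kpreempt_enable();
    }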
We can't return early from defibrillate because the IPI may have yet
to run -- we can't return until the other CPU is definitely done
using the ipi_msg_t we created on the stack.
We should avoid calling panic again on the patient CPU in case it was
already in the middle of a panic, so that we don't re-enter panic
while, e.g., trying to print a stack trace.
Sprinkle some comments.
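For reference, the constraint looks like this with the ipi(9) API (the
handler name is illustrative):

    ipi_msg_t msg = { .func = defib_handler, .arg = ci };   /* on our stack */

    ipi_unicast(&msg, ci);
    /*
     * We must not leave this stack frame until the remote CPU is done
     * with msg; ipi_wait() is what provides that guarantee.
     */
    ipi_wait(&msg);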
The timecounter ticks only on the primary CPU, so of course it will
go stale if it's suspended.
(It is, perhaps, a mistake that it only ticks on the primary CPU,
even if the primary CPU is offlined or in a polled-input console
loop, but that's a separate issue.)
This way we can suspend heartbeats on a single CPU while the console
is in polling mode, not just when the CPU is offlined. This should
be rare, so it's not _convenient_, but it should enable us to fix
polling-mode console input when the hardclock timer is still running
on other CPUs.
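The intended usage pattern is roughly this; heartbeat_resume is assumed
as the counterpart, and the real console code paths differ in detail:

    heartbeat_suspend();        /* this CPU only */
    cnpollc(true);              /* enter polled-input mode */
    c = cngetc();               /* may spin for a long time */
    cnpollc(false);
    heartbeat_resume();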
Do this on successful offlining, not on failed offlining.
No functional change right now because heartbeat_suspend is
implemented as a noop -- heartbeat(9) will just check the
SPCF_OFFLINE flag. But if we change it to not be a noop, well, then
we need to call it in the right place.
As soon as the workqueue function has been called, it is forbidden to
touch the struct work passed to it -- the function might free or
reuse the data structure it is embedded in.
So workqueue_wait is forbidden to search the queue for the batch of
running work items. Instead, use a generation number which is odd
while the thread is processing a batch of work and even when not.
There's still a small optimization available with the struct work
pointer to wait for: if we find the work item in one of the per-CPU
_pending_ queues, then after we wait for a batch of work to complete
on that CPU, we don't need to wait for work on any other CPUs.
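A sketch of the generation scheme, with hypothetical lock and field
names standing in for the real ones in subr_workqueue.c:

    /* worker thread, around each batch */
    mutex_enter(&q->q_mutex);
    q->q_gen |= 1;              /* odd: a batch is in flight */
    /* ...move the pending items onto a local list... */
    mutex_exit(&q->q_mutex);
    /* ...run the items; each may be freed as soon as its callback runs... */
    mutex_enter(&q->q_mutex);
    q->q_gen++;                 /* even again: batch done */
    cv_broadcast(&q->q_cv);
    mutex_exit(&q->q_mutex);

    /* workqueue_wait(): wait out any batch that might contain the item */
    /* (the separate check of the pending queues is omitted here) */
    gen = q->q_gen;
    if ((gen & 1) != 0) {
        do {
            cv_wait(&q->q_cv, &q->q_mutex);
        } while (q->q_gen == gen);
    }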
PR kern/57574
XXX pullup-10
XXX pullup-9
XXX pullup-8
entropy_extract may sleep on an adaptive lock, which invalidates
percpu(9) references.
Add a note in the comment over entropy_extract about this.
Discovered by stumbling upon this panic during a test run:
[ 1.0200050] panic: kernel diagnostic assertion "(cprng == percpu_getref(cprng_fast_percpu)) && (percpu_putref(cprng_fast_percpu), true)" failed: file "/home/riastradh/netbsd/current/src/sys/rump/librump/rumpkern/../../../crypto/cprng_fast/cprng_fast.c", line 117
XXX pullup-10
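A simplified sketch of the hazard, in the shape it took in cprng_fast
above (the flag passed to entropy_extract is illustrative):

    uint8_t seed[32];

    /*
     * Wrong: the percpu reference is held across a call that may sleep
     * on an adaptive lock, so the LWP can migrate off the CPU the
     * reference belongs to.
     */
    cprng = percpu_getref(cprng_fast_percpu);
    entropy_extract(seed, sizeof(seed), ENTROPY_WAIT);      /* may sleep */
    /* ... */
    percpu_putref(cprng_fast_percpu);

    /* Right: do the extraction first, or drop the reference before it. */
    entropy_extract(seed, sizeof(seed), ENTROPY_WAIT);      /* may sleep */
    cprng = percpu_getref(cprng_fast_percpu);
    /* ... */
    percpu_putref(cprng_fast_percpu);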
Evidently rump starts threads before it sets cold = 0, and deferring
the call to rump_thread_allow(NULL) in rump.c rump_init_callback
until after setting cold = 0 doesn't work either -- rump kernels just
hang. To be investigated -- for now, let's just stop breaking
thousands of tests.
Softints are forbidden to run while cold. So let's make sure nobody
even tries it -- if they do, they might be delayed indefinitely,
which is going to be much harder to diagnose.
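To make the prohibition concrete, the check amounts to something like
this in the softint scheduling path (the placement is illustrative):

    void
    softint_schedule(void *arg)
    {
        /* Softints cannot fire while cold; catch misuse early. */
        KASSERT(!cold);
        /* ... */
    }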
- Nix the entropy stage (cold, warm, hot). Just use the usual kernel
`cold' (cold: single-core, single-thread; interrupts may happen),
and don't make any three-way distinction about whether interrupts
or threads or other CPUs can be running.
Instead, while cold, use splhigh/splx or forbid paths to come from
interrupt context, and while warm, use mutex or the per-CPU hard
and soft interrupt paths for low latency. This comes at a small
cost to some interrupt latency, since we may stir the pool in
interrupt context -- but only for a very short window early at boot
between configure and configure2, so it's hard to imagine it
matters much.
- Allow rnd_add_uint32 to run in hard interrupt context or with spin
locks held, but defer processing to softint and drop samples on the
floor if the buffer is full. This is mainly used for cheaply tossing
samples from drivers for non-HWRNG devices into the entropy pool,
so it is often used from interrupt context and/or under spin locks.
- New rnd_add_data_intr provides the interrupt-like data entry path
for arbitrary buffers and driver-specified entropy estimates: defer
processing to softint and drop samples on the floor if the buffer is
full.
- Document that rnd_add_data is forbidden under spin locks outside
interrupt context (will crash in LOCKDEBUG), and inadvisable in
interrupt context (but technically permitted just in case there are
compatibility issues for now); later we can forbid it altogether in
interrupt context or under spin locks.
- Audit all uses of rnd_add_data to use rnd_add_data_intr where it
might be used in interrupt context or under a spin lock.
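For the audit in the last item, the resulting pattern in a driver looks
roughly like this (the softc layout is hypothetical, and
rnd_add_data_intr is assumed to take the same arguments as rnd_add_data):

    /* interrupt handler, or any path that may hold a spin lock: */
    rnd_add_data_intr(&sc->sc_rndsource, buf, len, entropybits);

    /* thread context, no spin locks held: */
    rnd_add_data(&sc->sc_rndsource, buf, len, entropybits);

    /* cheap per-event samples from non-HWRNG devices, as before: */
    rnd_add_uint32(&sc->sc_rndsource, datum);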
This fixes a regression from last year when the global entropy lock
was changed from IPL_VM (spin) to IPL_SOFTSERIAL (adaptive). Thought
I'd caught all the problems from that, but another one bit three
different people this week, presumably because of recent changes that
led to more non-HWRNG drivers entering the entropy consolidation
path from rnd_add_uint32.
In my attempt to preserve the rnd(9) API for the (now long-since
abandoned) prospect of pullup to netbsd-9 in my rewrite of the
entropy subsystem in 2020, I didn't introduce a separate entry point
for entering entropy from interrupt context or equivalent, i.e., spin
locks held, and instead made rnd_add_data rely on cpu_intr_p() to
decide whether to process the whole sample under a lock or only take
as much as there's buffer space for before scheduling a softint. In
retrospect, that was a mistake (though perhaps not as much of a
mistake as other entropy API decisions...), a mistake which is
finally getting rectified now by rnd_add_data_intr.
XXX pullup-10
This is used by savecore(8), vmstat(8), and possibly other things.
XXX Should really teach dump and savecore(8) to use an intentionally
designed header, rather than relying on kvm(3) -- and make it work
for saving cores from other kernels so you don't have to boot the
same buggy kernel to get a core dump.
This code was apparently written under the misapprehension that
membar_producer on one side is good enough, but that doesn't
accomplish anything other than making the code unnecessarily obscure.
For semantics, you always always always need memory barriers to come
in pairs, with membar_consumer on the reading side if you want the
membar_producer on the writing side to have any useful effect.
It is unfortunate that this might hurt performance, but that's an
unfortunate consequence of the design made without understanding
memory barriers, not an unfortunate consequence of the memory
barriers.
If it is really critical for the read side to avoid memory barriers,
then the write side needs to issue an IPI or xcall to effect memory
barriers -- at higher cost to the write side, of course.
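The required pairing, as a minimal sketch with hypothetical variables:

    /* writer */
    p->datum = compute();
    membar_producer();          /* order the datum before the flag */
    p->ready = 1;

    /* reader */
    if (p->ready) {
        membar_consumer();      /* order the flag before the datum */
        use(p->datum);
    }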
Only relevant during 32-bit wraparound, so the potential performance
impact of using splhigh here is negligible; indeed, we would have to
go out of our way to exercise this in tests before it will ever
happen in the next century.
These use atomic load on platforms with atomic 64-bit load, and
seqlocks on platforms without.
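On the seqlock side, the reader looks roughly like this; the names are
illustrative, not the actual kern_tc.c symbols:

    /*
     * Illustrative globals; the writer makes the sequence odd, updates
     * the 64-bit count, then makes the sequence even again.
     */
    static volatile unsigned uptime_seq;
    static uint64_t uptime_sec;

    uint64_t
    get_uptime_seconds(void)
    {
        unsigned gen;
        uint64_t t;

        do {
            while ((gen = atomic_load_acquire(&uptime_seq)) & 1)
                continue;       /* writer mid-update; spin */
            t = uptime_sec;     /* non-atomic 64-bit read */
            membar_consumer();  /* finish reading t before rechecking */
        } while (atomic_load_relaxed(&uptime_seq) != gen);

        return t;
    }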
This has the unfortunate side effect of slightly reducing the real
times available on 32-bit platforms, from ending some time in the
year 584942417355 AD, still available on 64-bit platforms, to ending
some time in the year 584942417218 AD. But during that slightly
shorter time, 32-bit platforms can avoid bugs arising from non-atomic
access to time_uptime and time_second.
Note: All platforms still have non-atomic access problems for
bintime, binuptime, nanotime, nanouptime, &c. This can be addressed
by putting a seqlock around timebasebin and possibly some other
variable -- to be done in a later change.
XXX kernel ABI change -- deleting symbols
XXX potential kernel ABI change -- not sure any modules actually use
struct syncobj but it's hard to rule that out because sys/syncobj.h
leaks into sys/lwp.h
No reason for one kthread_join to interfere with another, or to cause
non-cyclic dependencies to get stuck.
Uses struct lwp::l_private for this purpose, which for user threads
stores the tls pointer. I don't think anything in kthread(9) uses
l_private -- generally kernel threads will use lwp specificdata. But
maybe this should use a new member instead, or a union member with an
existing pointer for the purpose.
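For context, the caller-facing pattern is unchanged; only the wait
protocol underneath is affected. The softc and thread function here are
hypothetical:

    /* create a joinable kernel thread */
    error = kthread_create(PRI_NONE, KTHREAD_MUSTJOIN, NULL,
        example_worker, sc, &sc->sc_lwp, "exampleworker");
    if (error)
        return error;
    /* ... */
    /* tear down: waits only for this thread, independent of other joins */
    error = kthread_join(sc->sc_lwp);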