Written by Hiroshi Noguchi, of which an updated version was posted to
port-mac68k in 2001.
Attachments were added to kernel configs for platforms that already had
the Cabletron (se.4) driver added, although other platorms may benefit.
Reviewed on tech-net by Izumi Tsutsui.
This just goes through my recent reference count membar audit and
changes membar_exit to membar_release and membar_enter to
membar_acquire -- this should make everything cheaper on most CPUs
without hurting correctness, because membar_acquire is generally
cheaper than membar_enter.
If two threads are using an object that is freed when the reference
count goes to zero, we need to ensure that all memory operations
related to the object happen before freeing the object.
Using an atomic_dec_uint_nv(&refcnt) == 0 ensures that only one
thread takes responsibility for freeing, but it's not enough to
ensure that the other thread's memory operations happen before the
freeing.
Consider:
Thread A Thread B
obj->foo = 42; obj->baz = 73;
mumble(&obj->bar); grumble(&obj->quux);
/* membar_exit(); */ /* membar_exit(); */
atomic_dec -- not last atomic_dec -- last
/* membar_enter(); */
KASSERT(invariant(obj->foo,
obj->bar));
free_stuff(obj);
The memory barriers ensure that
obj->foo = 42;
mumble(&obj->bar);
in thread A happens before
KASSERT(invariant(obj->foo, obj->bar));
free_stuff(obj);
in thread B. Without them, this ordering is not guaranteed.
So in general it is necessary to do
membar_exit();
if (atomic_dec_uint_nv(&obj->refcnt) != 0)
return;
membar_enter();
to release a reference, for the `last one out hit the lights' style
of reference counting. (This is in contrast to the style where one
thread blocks new references and then waits under a lock for existing
ones to drain with a condvar -- no membar needed thanks to mutex(9).)
I searched for atomic_dec to find all these. Obviously we ought to
have a better abstraction for this because there's so much copypasta.
This is a stop-gap measure to fix actual bugs until we have that. It
would be nice if an abstraction could gracefully handle the different
styles of reference counting in use -- some years ago I drafted an
API for this, but making it cover everything got a little out of hand
(particularly with struct vnode::v_usecount) and I ended up setting
it aside to work on psref/localcount instead for better scalability.
I got bored of adding #ifdef __HAVE_ATOMIC_AS_MEMBAR everywhere, so I
only put it on things that look performance-critical on 5sec review.
We should really adopt membar_enter_preatomic/membar_exit_postatomic
or something (except they are applicable only to atomic r/m/w, not to
atomic_load/store_*, making the naming annoying) and get rid of all
the ifdefs.
This code paths is entered by kthreads marked MP-safe, not just from
autoconf.
I'm not sure this is sufficient -- it's not clear to me whether
anything prevents concurrently scanning the same target. Someone with
a better understanding of scsi(4) locking will have to audit this.
(For example, maybe it is guaranteed only to happen only either (a)
in autoconf, or (b) in a thread that doesn't start until autoconf is
done. But I don't know -- and if it is this, it should be asserted
so we can verify it.)
Fix kernel freeze for wdc(4) variants with ATAC_CAP_NOIRQ:
(1) Change ata_xfer_ops:c_poll from void to int function. When it returns
ATAPOLL_AGAIN, let ata_xfer_start() iterate itself again.
(2) Let wdc_ata_bio_poll() return ATAPOLL_AGAIN until ATA_ITSDONE is
achieved.
A similar change has been made for mvsata(4) (see mvsata_bio_poll()),
and no functional changes for other devices.
This is how the drivers worked before jdolecek-ncq branch was merged.
Note that this changes are less likely to cause infinite recursion:
(1) wdc_ata_bio_intr() called from wdc_ata_bio_poll() asserts ATA_ITSDONE
in its error handling paths via wdc_ata_bio_done().
(2) Return value from c_start (= wdc_ata_bio_start()) is checked in
ata_xfer_start().
Therefore, errors encountered in ata_xfer_ops:c_poll and c_start routines
terminate the recursion for wdc(4). The situation is similar for mvsata(4).
Still, there is a possibility where ata_xfer_start() takes long time to
finish a normal operation. This can result in a delayed response for lower
priority interrupts. But, I've never observed such a situation, even when
heavy thrashing takes place for swap partition in wd(4).
"Go ahead" by jdolecek@.
define a flag FILTEROP_ISFD that has the meaning of the prior f_isfd.
Field and flag name aligned with OpenBSD.
This does not constitute a functional or ABI change, as the field location
and size, and the value placed in that field, are the same as the previous
code, but we're bumping __NetBSD_Version__ so 3rd-party module source code
can adapt, as needed.
NetBSD 9.99.89
Just in case of uninitialized padding which would lead to kernel
stack disclosure. If the compiler can prove the memset redundant
then it can optimize it away; otherwise better safe than sorry.
I think the iwi(4), mcd(4), and ses(4) changes actually plug leaks;
the raidframe(4) change probably doesn't (but doesn't hurt).
These were originally made failable back in 2017 when if_initialize
allocated a softint in every interface for link state changes, so
that it could fail gracefully instead of panicking:
https://mail-index.NetBSD.org/source-changes/2017/10/23/msg089053.html
However, this spawned many seldom- or never-tested error branches,
which are risky to have around. And that softint in every interface
has since been replaced by a single global workqueue, because link
state changes require thread context but not low latency or high
throughput:
https://mail-index.NetBSD.org/source-changes/2020/02/06/msg113759.html
So there is no longer any reason for if_initialize to fail. (The
subroutine if_stats_init can't fail because percpu_alloc can't fail
either.)
There is a snag: the softint_establish in if_percpuq_create could
fail, potentially leading to bad consequences later on trying to use
the softint. This change doesn't introduce any new bugs because of
the snag -- if_percpuq_attach was already broken. However, the snag
can be better addressed without spawning error branches, either by
using a single softint or making softints less scarce.
(Separate commit will change the signatures of if_attach and
if_initialize to return void, scheduled to ride whatever is the next
convenient kernel bump.)
Patch and testing on amd64 and evbmips64-eb by maya@; commit message
soliloquy, and compile-testing on evbppc/i386/earmv7hf, by me.
auto-unloaded. Disable for now.
All of these need to be updated with an appropriate refcount mechanism
to ensure that the code and/or tables aren't unloaded while they are
being used.
Simplify and make extensible the config_search() / config_found() /
config_attach() interfaces: rather than having different variants for
which arguments you want pass along, just have a single call that
takes a variadic list of tag-value arguments.
Adjust all call sites:
- Simplify wherever possible; don't pass along arguments that aren't
actually needed.
- Don't be explicit about what interface attribute is attaching if
the device only has one. (More simplification.)
- Add a config_probe() function to be used in indirect configuiration
situations, making is visibly easier to see when indirect config is
in play, and allowing for future change in semantics. (As of now,
this is just a wrapper around config_match(), but that is an
implementation detail.)
Remove unnecessary or redundant interface attributes where they're not
needed.
There are currently 5 "cfargs" defined:
- CFARG_SUBMATCH (submatch function for direct config)
- CFARG_SEARCH (search function for indirect config)
- CFARG_IATTR (interface attribte)
- CFARG_LOCATORS (locators array)
- CFARG_DEVHANDLE (devhandle_t - wraps OFW, ACPI, etc. handles)
...and a sentinel value CFARG_EOL.
Add some extra sanity checking to ensure that interface attributes
aren't ambiguous.
Use CFARG_DEVHANDLE in MI FDT, OFW, and ACPI code, and macppc and shark
ports to associate those device handles with device_t instance. This
will trickle trough to more places over time (need back-end for pre-OFW
Sun OBP; any others?).