might get an interrupt before re-enabling WP, and be rescheduled as a
result. In practice it never happens, because the previous PSL always
has interrupts disabled too.
is not happy in npf_reassembly, because NPC_IPFRAG is again returned after
the packet was reassembled.
I'm wondering whether it would not be better to just remove the fragment
header in frag6_input directly.
Reuse the attach1's test body for race1.
Add a new test race1:
Assert that await_zombie() in attach1 always finds a single
process and no other error is reported
race1 requires HAVE_PID in wait(2)-like function.
This test is executed in a loop for 5 seconds (16k iterations on Intel i7).
A buggy kernel was asserting an error within this timeframe almost always.
The bug in the kernel is now gone and this test is expected to pass
correctly.
Sponsored by <The NetBSD Foundation>
Add await_zombie_raw() that is the same as await_zombie(), whith an
addition of additional "useconds_t ms" parameter indicating delays between
new polling for a zombie process.
This new function will be used for testing a race condition that has been
observed occassionally crashing a test case -- returning duplicate entries
for KERN_PROC_PID.
Sponsored by <The NetBSD Foundation>
The first mistake was npf_inet.c rev1.37:
"Don't reassemble ipv6 fragments, instead treat the first fragment
as a regular packet (subject to filtering rules), and pass
subsequent fragments in the same group unconditionally."
Doing this was entirely wrong, because then a packet just had to push
the L4 payload in a secondary fragment, and NPF wouldn't apply rules on
it - meaning any IPv6 packet could bypass >=L4 filtering. This mistake
was supposed to be a fix for the second mistake.
The second mistake was that ip6_reass_packet (in npf_reassembly) was
getting called with npc->npc_hlen. But npc_hlen pointed to the last
encountered header in the IPv6 chain, which was not necessarily the
fragment header. So ip6_reass_packet was given garbage, and would fail,
resulting in the packet getting kicked. So basically IPv6 was broken by
NPF.
The first mistake is reverted, and the second one is fixed by doing:
- hlen = sizeof(struct ip6_frag);
+ hlen = 0;
Now the iteration stops on the fragment header, and the call to
ip6_reass_packet is valid.
My npf_inet.c rev1.38 is partially reverted: we don't need to worry
about failing properly to advance; once the packet is reassembled
npf_cache_ip gets called again, and this time the whole chain should be
there.
Tested with a simple UDPv6 server - send a 3000-byte-sized buffer, the
packet gets correctly reassembled by NPF now.
Swap the order of looking into zombie and all process lists, start now
with the zombie one. This prevents a race observed previously that the
same process could be detected on both lists during a single polling call.
While there:
- Short-circuit break for KERN_PROC_PID, once a pid has been detected.
- Removal of redundant "if (kbuf)" and "if (marker)" checks.
- Update of comments regarding potential optimization, explaining why we
don't want to it as of now. Performance gain from lookup call vs
iteration over a list is neglible on a regular system.
- Return ESRCH when no results have been found. This allows more easily
to implement a retry or abandon algorithm.
This corrects races observed in the existing ATF ptrace(2) tests, related
to await_zombie(). This function was expecting to check whether a process
has been transformed into a zombie, however it was causing occasional
crashes as it was overflowing the return buffer, returning the same pid
twice: once from allproc list and the second time from zombieproc one.
Fix suggested by <christos>
Short-circuit break suggested by <kre>
Discussed on tech-kern.
Sponsored by <The NetBSD Foundation>
That is, there was a comment "Ditto" - which once upon a time, was used
to indicate the the previous comment applied here as well. Time passed,
and software mutated, and the previous comment was unfortunately sacrificed.
Poor little Ditto was left all alone.
Noticed while doing some software archaeology.