It didn't work correctly because rumphijack for vmstat didn't work expectedly;
vmstat has the sgid bit for kvm(3) and that prevents rumphijack from working.
Address the issue by cloning a vmstat binary without the sgid bit temporarily
and using it for rumphijack. Note that it's a workaround. vmstat should stop
using kvm(3) for /dev/kmem and drop the sgid bit eventually.
A receiver of an ICMPv6 request packet creates a stale cache entry and it turns
into the delay state on replying the packet. After 5 second, the receiver sends
an NS packet as a reachability confirmation, which disturbs the test and causes
a unexpected result.
Should fix PR misc/54451
make the latter number show the actual number of ICMP packets the test
attempted to send. Thus, the two numbers can now be meaningfully
compared, and their difference indicates the number of packets lost.
KERN_PROC_CWD in sysctl(3)
That is kern.proc.$$.KERN_PROC_CWD (I think - not that it matters here)
The effect is that -lrump now requires -lrumpvfs
This set of changes fixes (I believe) regular dynamic builds,
more might be required for static builds (will be verified soon).
We use IPv6 NAT-T to avoid IPsec slowing down caused by dropping ESP packets
by some Customer Premises Equipments (CPE). I implement ATF to test such
situation.
I think it can also work with nat66, but I have not tested to the fine details.
If we do so, there will remain one route that is of a preceding address, but
that behavior is not documented and may be changed in the future. Tests
shouldn't rely on such a unstable behavior.
of what are left are "race for the bus" type - if we lose, we just
wait for the next one ... slower but still reliable.
There are two exceptions ... when starting more than one rtadvd
(on different routers) we expect to receive an RA from each, but
all that we can check is that we received the (at least) right number
of RAs. It is possible (though unlikely) that one router sent two
before another sent any, in which case we will not have the data we
expect, and a sub-test will fail.
Second, there is no way to know for sure that we have waited long
enough when we're waiting for data to expire - in systems with
correctly working clocks that actually measure time, this should not
be an issue, if data is due to expire in < 5 seconds, and we wait
5 seconds, and the data is still there, then that indicates a
failure, which should be detected. Unfortunately with QEMU testing
time just isn't that reliable. But fortunately, it is generally the
sleep which takes longer, while other timers run correctly, which is
the way that makes us happy...
While here lots of cleanups - everything from white space and
line wrapping, to removing superfluous quotes and adding some
(but probably not enough) that are not (though given the data is
all known here, lack of quotes will rarely hurt.)
Also take note of the fact that current rtadvd *cannot* delete its
pidfile, so waiting for that file to be removed is doomed to failure.
Do things in a way that works, rather than simply resorting to assassination.
Because we do a lot less "sleep and hope it is long enough" and more
"wait until it is observed to happen" the tests generally run in less
elapsed time than before (20% less has been observed.) But because we
"wait until it is observed to happen" rather than just "sleep and hope
it is long enough" sometimes things take longer (and when that happens,
we no longer fail). Up to 7% slower (overall) has been observed.
(Observations on an amd64 DomU, no idea yet as to what QEMU might observe.)
1. get pid of bg process with $! not $?
2. expect a single message from route monitor, not two, after ndp -d
3. run atf_check just once to verify correct output, not once for each string
1. get rid of the "$*" fetish.
2. more consistency (not complete .. yet) with RUMP_SERVER setting
3. white space (esp around pipe ('|') symbols.)
4. drop unnecessary \ line joining.
1. Be assertive when claiming the pid of the background route monitor command,
not polite... (ie: $! will give you the pid, $? is just 0 there).
2. Since "wait 0" simply (always) exits with status 127, immediately (we
know without thinking that we have no child with pid 0) the waits were
ineffective - now (after fix#1) they work .. which requires the
route monitor that watches the arp -d to exit after 1 message, not 2,
as 1 is all it gets. (If there really should be 2, someone needs to
find out why the kernel is sending only 1 - I am not that someone).
3. The file contents need to be read only once, no matter how many patterns
we need to look for, save some work, and do it that way (this is not
really a bug,m but saving time for the ATF tests is always a good thing.)
Not sure if this will stop it randomly failing on bablyon5, but it might.
(The likely cause is that the "route.monitor" has not flushed its stdout
buffers at the time the "grep -A 3" [aside: why that way to read the file??]
is performed, so fails to find its expected output ... the route monitor would
get an extra message once interfaces start being destroyed, I assume, and
would exit then, flushing its buffer, but by then it is too late.
If that is/was the cause, then it should be fixed now.)
detected as invalid, become the "someone" referred to in the
previous commit log, and add tests for 0 and 4095 as well, and
while here, throw in a few more that might elicit bugs.
And if the shell running the tests is able, add tests of a few
random vlan tags between 2 and 4093 (1 and 4094 are always tested)
to check that anything in range works (well, partially check...)