Long ago, the storage representations of srtt and rttvar were changed
from the 4.4BSD scheme, and the comments are out of sync with the
code. This commit rewrites most of the comments that explain the RTO
calculations, and points out some issues in the code.
Joint work with Bev Schwartz of BBN (original analysis and comments),
but I have rewritten and extended them, so errors are mine.
This material is based upon work supported by the Defense Advanced
Research Projects Agency and Space and Naval Warfare Systems Center,
Pacific, under Contract No. N66001-09-C-2073. Approved for Public
Release, Distribution Unlimited.
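
For readers unfamiliar with scaled srtt/rttvar storage, the following is a minimal sketch of the classic Jacobson-style fixed-point update, as described in RFC 6298. It is not the representation used in the code this commit touches (the message above says that representation has diverged from 4.4BSD). The struct, function names and scale factors (srtt scaled by 8, rttvar scaled by 4) are assumptions chosen for illustration only.

```c
/* Hypothetical sketch: textbook fixed-point RTO estimator, not kernel code. */
struct rtt_state {
    int srtt;   /* smoothed RTT in ticks, stored scaled by 8 */
    int rttvar; /* RTT variance in ticks, stored scaled by 4 */
};

/* Fold one RTT measurement (in ticks) into the estimator. */
static void rtt_update(struct rtt_state *s, int rtt)
{
    int delta;

    if (s->srtt == 0) {
        /* First sample: srtt = rtt, rttvar = rtt / 2 (both in scaled form). */
        s->srtt = rtt << 3;
        s->rttvar = rtt << 1;
        return;
    }
    /* srtt = 7/8 * srtt + 1/8 * rtt, done on the scaled value. */
    delta = rtt - (s->srtt >> 3);
    s->srtt += delta;
    /* rttvar = 3/4 * rttvar + 1/4 * |delta|, done on the scaled value. */
    if (delta < 0)
        delta = -delta;
    delta -= s->rttvar >> 2;
    s->rttvar += delta;
}

/* RTO = srtt + 4 * rttvar; the stored rttvar is already scaled by 4. */
static int rtt_rto(const struct rtt_state *s)
{
    return (s->srtt >> 3) + s->rttvar;
}
```

The point of the scaling is that the stored values keep fractional precision without floating point, which is also why comments have to state the scale factor correctly.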
exactly at a page boundary, and the FORCE_SHORT_XFER was set by the
client (which means that an empty descriptor is needed to terminate
the transfer), from Gordon McNutt per PR kern/44883
(fixed a bit differently than the proposed patch for aesthetic
reasons -- this avoids the page pointer moving into an unexpected area earlier)
- driver attaches to xenbus(4) (no compile time option anymore)
- max reservation hypercall is fixed
- sysctl(7) entries are now in KiB (like Xentools) rather than in number
of pages
- some more explanations.
Our driver initializes the Broadcom hardware to perform a TCP and UDP
checksum on only the payload of the TCP or UDP packet, rather than the
entire packet. The FreeBSD, OpenBSD and Linux drivers instruct the
hardware to compute the checksum for the entire packet. I believe the
bug is that some revisions of the BCM hardware, under certain
circumstances, revert while running to the complete checksum
calculation that the FreeBSD, OpenBSD and Linux drivers request. As a
result, when we pull the computed checksum from the hardware and pass it
up to the upper layers, we assume it is the payload-only checksum. The
upper layers then perform their checks on that assumption and, when the
hardware has reverted, reject the packet because the resulting checksum
does not match.
This patch changes the driver to instruct the hardware to perform the
checksum over the entire packet, just as the FreeBSD, OpenBSD and
Linux drivers do, and to notify the upper layers appropriately.
This patch appears to work on all revisions of the hardware that have been
tested. (See the list in the bug report.)
This patch is approved by tls.
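
To illustrate why the two checksum modes cannot be mixed up, here is a minimal sketch of the Internet checksum (RFC 1071) computed over a byte range. It is not driver code: the function name is hypothetical, and the TCP/UDP pseudo-header that a real stack also covers is left out for brevity.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: one's-complement Internet checksum over buf[0..len). */
static uint16_t cksum_sketch(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;

    /* Add the data as big-endian 16-bit words. */
    while (len > 1) {
        sum += ((uint32_t)buf[0] << 8) | buf[1];
        buf += 2;
        len -= 2;
    }
    /* A trailing odd byte is padded with a zero low byte. */
    if (len == 1)
        sum += (uint32_t)buf[0] << 8;
    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}
```

Running this over a whole packet and over only its trailing payload bytes gives different values, so an upper layer that assumes one coverage while the hardware delivered the other will see a mismatch and drop the packet.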
- turns balloon into a driver that attaches to xenbus(4). This allows the
functionality to be disabled either at compile time or at boot time via
userconf(4). The driver can implement detach or pmf(9) hooks if deemed
necessary.
- keeps Cherry's locking model, but simplifies it a bit. There is now
only one target value serialized inside balloon; we do not feed back an
alternative value to Xenstore (clients are not expected to see its value
evolve behind their back, and can't do much about that either)
- implements min threshold; this is an admin-settable value that tells
the driver to "not balloon below this threshold." This can be used by a
domain to keep memory reservations, useful if activity is expected in the
near future.
- in addition to min threshold, the driver internally implements a
safeguard value (uvmexp.freemin + 1MiB), so that the admin cannot
inadvertently set min to a very low value, forcing the domain into heavy
memory pressure and swapping.
- creates the sysctl(8) kern.xen.balloon tree. 4 nodes are present
(values are in KiB):
  - min: (rw) an admin-settable value that prevents ballooning below this
    mark.
  - max: (ro) the maximum size for reservation, as set by xm(1) mem-max.
  - current: (ro) the current reservation for domain.
  - target: (rw) the targeted reservation for domain.
- fixes a few limitations here and there, most notably the max_reservation
hypercall, and KiB vs pages representations at interfaces.
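
The interaction between min, max, the internal safeguard and the KiB units listed above can be sketched as follows. This is an illustration under stated assumptions, not the driver's code: the function names, the 4 KiB page size, and the choice to let max win when the floor exceeds it are all hypothetical.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE     4096u /* assumed page size in bytes */
#define SKETCH_SAFEGUARD_KIB 1024u /* the extra 1MiB on top of freemin */

/* Convert a page count to KiB, the unit used by the sysctl nodes. */
static uint64_t pages_to_kib(uint64_t pages)
{
    return pages * (SKETCH_PAGE_SIZE / 1024u);
}

/*
 * Clamp a requested target (KiB) so that it never drops below the larger
 * of the admin-set min and the safeguard (freemin + 1MiB), and never
 * exceeds max.
 */
static uint64_t clamp_target_kib(uint64_t target, uint64_t min,
    uint64_t max, uint64_t freemin_pages)
{
    uint64_t floor = pages_to_kib(freemin_pages) + SKETCH_SAFEGUARD_KIB;

    if (min > floor)
        floor = min;
    if (target < floor)
        target = floor;
    if (target > max)
        target = max;
    return target;
}
```

With freemin at 512 pages (2048 KiB) the safeguard floor is 3072 KiB, so a min of 0 does not let the target fall below that value.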
The driver is still turned off by default. Enabling it would need more
approval, especially from bouyer@, cherry@ and cegger@.
FWIW: tested it for two days, from an amd64 dom0 (with dom0 ballooning
enabled for xend) and a bunch of domUs. Did not notice anything suspicious.
XXX it still has one big limitation: it cannot hotplug memory pages in
uvm(9) if they were not present beforehand. Example: ballooning above
physmem will give more pages to the domain, but it won't use them to serve
allocations unless we teach uvm(9) how to handle the extra pages.
reversing the sense of the associated test and using the big block I
moved a couple versions back (and didn't reindent on purpose) as the
body of the if statement.
There are now no gotos in namei_oneroot, only normal loop logic.
now be changed to a loop break and another null test and goto outside
the loop. In neither of the other two cases for exiting the loop can
foundobj be null.
has an unconditional loop break at the end, this can be done safely
now that the other loop break has been patched out.
Add a spurious set of braces to preserve the indent for the moment.