b5b89e9bc6
Setting the SO_REUSEADDR option on a socket allows binding to a port
number that is in the TIME_WAIT state. This is usually done on listener
sockets, to enable a server to restart itself without having to wait for
the completion of TIME_WAIT on the port.
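
As a minimal sketch of the listener case described above: Python's socket
module is used here only because it mirrors the BSD sockets API closely, and
`make_listener` is a hypothetical helper, not QEMU code.

```python
import socket


def make_listener(port: int, host: str = "127.0.0.1") -> socket.socket:
    """Create a TCP listener that can rebind its port right after a restart."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Without SO_REUSEADDR, bind() may fail with EADDRINUSE while
    # connections from the previous run are still in TIME_WAIT.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(16)
    return sock
```

Passing port 0 asks the kernel to choose a free port, which is convenient
when trying the helper out.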
It is also possible, but highly unusual, to set it on client sockets. It
is rare to explicitly bind() a client socket, since it is almost always
fine to allow the kernel to auto-bind a client socket to a random free
port. Most systems will have tens of thousands of free ports that
client sockets can be bound to.
e.g. on Linux:

  $ sysctl -a | grep local_port
  net.ipv4.ip_local_port_range = 32768 60999

e.g. on OpenBSD:

  $ sysctl -a | grep net.inet.ip.port
  net.inet.ip.portfirst=1024
  net.inet.ip.portlast=49151
  net.inet.ip.porthifirst=49152
  net.inet.ip.porthilast=65535
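
To put numbers on the ranges shown in the sysctl output, the arithmetic can
be sketched as follows. `port_count` is a hypothetical helper, and the
variable names are illustrative only. The sketch makes no claim about which
OpenBSD range the kernel uses for a given socket.

```python
def port_count(first: int, last: int) -> int:
    """Number of ports in the inclusive range [first, last]."""
    return last - first + 1


# Values copied from the sysctl output quoted in this message.
linux_range = port_count(32768, 60999)         # 28232 ports
openbsd_range = port_count(1024, 49151)        # 48128 ports
openbsd_hi_range = port_count(49152, 65535)    # 16384 ports
```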
A connected socket must have a unique set of values for

  (protocol, localip, localport, remoteip, remoteport)

otherwise it is liable to get EADDRINUSE.
A client connection should trivially avoid EADDRINUSE if letting the
kernel auto-assign the 'localport' value, which QEMU always does.
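
The auto-assignment described above can be observed with a small sketch. It
assumes a TCP listener is already accepting connections at the given
address. `connect_autobind` is a hypothetical helper, not a QEMU function.

```python
import socket


def connect_autobind(remote_ip: str, remote_port: int):
    """Connect without calling bind(); the kernel picks the local address and port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # No explicit bind() and no SO_REUSEADDR: connect() performs the
    # implicit bind to a free local port.
    sock.connect((remote_ip, remote_port))
    local_ip, local_port = sock.getsockname()
    five_tuple = ("tcp", local_ip, local_port, remote_ip, remote_port)
    return sock, five_tuple
```

Two connections opened this way to the same listener, and held open at the
same time, differ in their 'localport' value, which keeps the tuples unique.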
When QEMU sets SO_REUSEADDR on a client socket on OpenBSD, however, it
upsets this situation.
The OpenBSD kernel appears to happily pick a 'localport' that is in the
TIME_WAIT state, even if there are many other local ports available
for use that are not in the TIME_WAIT state.
A test program that just loops opening client sockets will start seeing
EADDRINUSE on OpenBSD when as few as 2000 ports are in TIME_WAIT,
despite tens of thousands of ports still being unused. This contrasts with
Linux, which appears to avoid picking local ports in TIME_WAIT state.
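
The test program itself is not included in this message. A rough sketch of
what such a loop could look like is given here, assuming a listener at the
given address that accepts and then closes each connection. `churn_clients`
and its parameters are hypothetical names, and the result depends on the
operating system it is run on.

```python
import errno
import socket


def churn_clients(remote_ip: str, remote_port: int, count: int,
                  reuseaddr: bool) -> int:
    """Open and close `count` client connections; return how many hit EADDRINUSE."""
    failures = 0
    for _ in range(count):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            if reuseaddr:
                # The unusual client-side setting discussed in this message.
                sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            sock.connect((remote_ip, remote_port))
        except OSError as exc:
            if exc.errno != errno.EADDRINUSE:
                raise
            failures += 1
        finally:
            # Closing first makes this side the active closer, which is the
            # side that ends up in TIME_WAIT once the peer also closes.
            sock.close()
    return failures
```

Running it with a few thousand iterations, once with `reuseaddr=True` and
once with `reuseaddr=False`, is one way to compare the two behaviours on a
given system.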
This problem on OpenBSD exhibits itself periodically with the migration
test failing with a message like [1]:

  qemu-system-ppc64: Failed to connect to '127.0.0.1:24109': Address already in use

While I have not been able to reproduce the OpenBSD failure in my own
testing, given the scope of what QEMU tests do, it is entirely possible
that there could be a lot of ports in TIME_WAIT state when the
migration test runs.
Removing SO_REUSEADDR from the client sockets should not affect normal
QEMU usage, and should improve reliability on OpenBSD.
This use of SO_REUSEADDR on client sockets is highly unusual, and
appears to have been present since the very start of the QEMU socket
helpers in 2008. The original commit has no comment about the use of
SO_REUSEADDR on the client, so it is most likely just a 16-year-old
copy+paste bug.
[1] https://lists.nongnu.org/archive/html/qemu-devel/2024-10/msg03427.html
    https://lists.nongnu.org/archive/html/qemu-devel/2024-02/msg01572.html
Fixes:

aio-posix.c
aio-posix.h
aio-wait.c
aio-win32.c
aiocb.c
async.c
atomic64.c
base64.c
bitmap.c
bitops.c
block-helpers.c
block-helpers.h
buffer.c
bufferiszero.c
cacheflush.c
chardev_open.c
compatfd.c
coroutine-sigaltstack.c
coroutine-ucontext.c
coroutine-windows.c
cpuinfo-aarch64.c
cpuinfo-i386.c
cpuinfo-loongarch.c
cpuinfo-ppc.c
cpuinfo-riscv.c
crc32c.c
crc-ccitt.c
cutils.c
dbus.c
defer-call.c
drm.c
envlist.c
error-report.c
error.c
event_notifier-posix.c
event_notifier-win32.c
fdmon-epoll.c
fdmon-io_uring.c
fdmon-poll.c
fifo8.c
filemonitor-inotify.c
filemonitor-stub.c
getauxval.c
guest-random.c
hbitmap.c
hexdump.c
host-utils.c
id.c
int128.c
interval-tree.c
iov.c
iova-tree.c
keyval.c
lockcnt.c
log.c
main-loop.c
memalign.c
memfd.c
meson.build
mmap-alloc.c
module.c
notify.c
nvdimm-utils.c
osdep.c
oslib-posix.c
oslib-win32.c
path.c
qdist.c
qemu-co-shared-resource.c
qemu-co-timeout.c
qemu-config.c
qemu-coroutine-io.c
qemu-coroutine-lock.c
qemu-coroutine-sleep.c
qemu-coroutine.c
qemu-option.c
qemu-print.c
qemu-progress.c
qemu-sockets.c
qemu-thread-common.h
qemu-thread-posix.c
qemu-thread-win32.c
qemu-timer-common.c
qemu-timer.c
qht.c
qsp.c
qtree.c
range.c
rcu.c
readline.c
reserved-region.c
selfmap.c
stats64.c
sys_membarrier.c
systemd.c
thread-context.c
thread-pool.c
throttle.c
timed-average.c
trace-events
trace.h
transactions.c
unicode.c
userfaultfd.c
uuid.c
vfio-helpers.c
vhost-user-server.c
yank.c