This simplifies error checking.
Cc: Hanna Reitz <hreitz@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20221121085054.683122-7-armbru@redhat.com>
Since we already have bitmap_mutex to protect either the dirty bitmap or
the clear log bitmap, we don't need atomic operations to set/clear/test on
the clear log bitmap. Switching all ops from atomic to non-atomic
versions, meanwhile touch up the comments to show which lock is in charge.
Introduced non-atomic version of bitmap_test_and_clear_atomic(), mostly the
same as the atomic version but simplified a few places, e.g. dropped the
"old_bits" variable, and also the explicit memory barriers.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
If configuring with "--disable-system --disable-user --enable-guest-agent"
the linking currently fails with:
qga/qemu-ga.p/commands.c.o: In function `qmp_command_info':
build/../../home/thuth/devel/qemu/qga/commands.c:70: undefined reference to `qmp_command_name'
build/../../home/thuth/devel/qemu/qga/commands.c:71: undefined reference to `qmp_command_is_enabled'
build/../../home/thuth/devel/qemu/qga/commands.c:72: undefined reference to `qmp_has_success_response'
qga/qemu-ga.p/commands.c.o: In function `qmp_guest_info':
build/../../home/thuth/devel/qemu/qga/commands.c:82: undefined reference to `qmp_for_each_command'
qga/qemu-ga.p/commands.c.o: In function `qmp_guest_exec':
build/../../home/thuth/devel/qemu/qga/commands.c:410: undefined reference to `qbase64_decode'
qga/qemu-ga.p/channel-posix.c.o: In function `ga_channel_open':
build/../../home/thuth/devel/qemu/qga/channel-posix.c:214: undefined reference to `unix_listen'
build/../../home/thuth/devel/qemu/qga/channel-posix.c:228: undefined reference to `socket_parse'
build/../../home/thuth/devel/qemu/qga/channel-posix.c:234: undefined reference to `socket_listen'
qga/qemu-ga.p/commands-posix.c.o: In function `qmp_guest_file_write':
build/../../home/thuth/devel/qemu/qga/commands-posix.c:527: undefined reference to `qbase64_decode'
Let's make sure that we also compile and link the required files if
the system emulators have not been enabled.
Message-Id: <20221110083626.31899-1-thuth@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
This reverts commit 59d1ce4439.
The "zpcii-disable" machine property is redundant with the "interpret"
zPCI device property. Remove it for clarification.
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Message-Id: <20221107161349.1032730-2-clg@kaod.org>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
If QEMU is started with `-D qemu.log.%d` without any `-d` option,
doing `log all` in the monitor fails with:
Filename template with '%d' required for 'tid'
It is confusing since '%d' was actually passed.
This happens because QEMU caches the log file name with %d converted
to getpid() since `tid` wasn't required. This name isn't suitable
for a subsequent enablement of per-thread logs. There's little cause
to change the behavior as `-d tid` is mostly used at user-only startup.
Drop the per-thread from the requested flags in this case : `log all`
will thus enable everything except `tid` instead of failing. This is
preferable over forcing the user to enable each log item individually.
With this patch, `tid` is now truely immutable : it can only be set
or unset from the command line and never changed afterwards.
Fixes: 4e51069d67 ("util/log: Support per-thread log files")
Cc: richard.henderson@linaro.org
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20221104120059.678470-3-groug@kaod.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Per-thread logging was implemented under the assumption that once
enabled, it is not possible to switch back to single file logging.
This isn't enforced though and it is possible to go through the
global file opening sequence in per-thread mode. The code isn't
ready for this and produces unexpected results as detailed below.
Start QEMU in system emulation mode with `-D ./qemu.log.%d -d tid`
and then change the log level from the monitor to something that
doesn't have tid, e.g. `log cpu_reset`. The value of log_flags
is zero and per_thread is set to false : the rest of the code
then assumes it is running in the global log case and opens a
file named `qemu.log.%d`, which is obviously not an expected
behavior.
Enforce the immutability of the flag early in qemu_set_log_internal()
so that its value is correct for all subsequent users.
Fixes: 4e51069d67 ("util/log: Support per-thread log files")
Cc: richard.henderson@linaro.org
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20221104120059.678470-2-groug@kaod.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
improve error handling during module load, by changing:
bool module_load(const char *prefix, const char *lib_name);
void module_load_qom(const char *type);
to:
int module_load(const char *prefix, const char *name, Error **errp);
int module_load_qom(const char *type, Error **errp);
where the return value is:
-1 on module load error, and errp is set with the error
0 on module or one of its dependencies are not installed
1 on module load success
2 on module load success (module already loaded or built-in)
module_load_qom_one has been introduced in:
commit 28457744c3 ("module: qom module support"), which built on top of
module_load_one, but discarded the bool return value. Restore it.
Adapt all callers to emit errors, or ignore them, or fail hard,
as appropriate in each context.
Replace the previous emission of errors via fprintf in _some_ error
conditions with Error and error_report, so as to emit to the appropriate
target.
A memory leak is also fixed as part of the module_load changes.
audio: when attempting to load an audio module, report module load errors.
Note that still for some callers, a single issue may generate multiple
error reports, and this could be improved further.
Regarding the audio code itself, audio_add() seems to ignore errors,
and this should probably be improved.
block: when attempting to load a block module, report module load errors.
For the code paths that already use the Error API, take advantage of those
to report module load errors into the Error parameter.
For the other code paths, we currently emit the error, but this could be
improved further by adding Error parameters to all possible code paths.
console: when attempting to load a display module, report module load errors.
qdev: when creating a new qdev Device object (DeviceState), report load errors.
If a module cannot be loaded to create that device, now abort execution
(if no CONFIG_MODULE) or exit (if CONFIG_MODULE).
qom/object.c: when initializing a QOM object, or looking up class_by_name,
report module load errors.
qtest: when processing the "module_load" qtest command, report errors
in the load of the module.
Signed-off-by: Claudio Fontana <cfontana@suse.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220929093035.4231-4-cfontana@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
mayfail is always passed as false for every invocation throughout the program.
It controls whether to printf or not to printf an error on
g_module_open failure.
Remove this unused argument.
Signed-off-by: Claudio Fontana <cfontana@suse.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20220929093035.4231-2-cfontana@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
WaitForMultipleObjects() can only wait for MAXIMUM_WAIT_OBJECTS
object handles. Correct the event array size in aio_poll() and
add a assert() to ensure it does not cause out of bound access.
Signed-off-by: Bin Meng <bin.meng@windriver.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20221019102015.2441622-3-bmeng.cn@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Fix the logic in qemu_add_wait_object() to avoid adding the same
HANDLE twice, as the behavior is undefined when passing an array
that contains same HANDLEs to WaitForMultipleObjects() API.
Signed-off-by: Bin Meng <bin.meng@windriver.com>
Message-Id: <20221019102015.2441622-2-bmeng.cn@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The maximum number of wait objects for win32 should be
MAXIMUM_WAIT_OBJECTS, not MAXIMUM_WAIT_OBJECTS + 1.
Signed-off-by: Bin Meng <bin.meng@windriver.com>
Message-Id: <20221019102015.2441622-1-bmeng.cn@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* Some more small s390x fixes and maintainer updates
* Make sure to remove all temporary files from qtests
* OpenBSD VM test update to version 7.2
* Add sndio to FreeBSD tests
* More patches to enable the qtests on Windows
-----BEGIN PGP SIGNATURE-----
iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmNb1x8RHHRodXRoQHJl
ZGhhdC5jb20ACgkQLtnXdP5wLbXmcA//TCliiFkhprVxzIqy7zb9uz2Odu+sS4dT
azUSlXvC14fECm/Rb/rd2VLqCu5x2er8CYauxKQ4VhRImzcDta4kvpt/HKIppN2t
sqw5tipJL0DYcWBwYL1llvfutM26M+Oh0igwR8uV7b+W1FjojEZdcOr9IZ6E6V55
wQCE5OHm0VCr61QeI5IBfZTsiPo+DFomUCpj7w66j6i0CVDvmpoe36tCmvGgrcpZ
SP7ep7/Iq+dnGh2YnJyoUOPlXeeiBCxAygOVnIRXptDeniGoliCFn7ksLdKDQ9qY
69pSPR/W7mTZB/HkCRalAbYuYrI9Rcqxdu6c9vcyB8Pr0snQLTf8qThY+BJ2oC4w
JSGgWVniAk5MmrDazwNRkSbgngYLYf+CcT1h5AANuU5Kt50Bdy9Y3TuL5YVmofEp
N4bypV0ICImQyDECz76+i5/iJOcWiRyjMfLT6y00dspeuy983xHakrsHGD8xj0U/
3IVxnF9bDnUSVg6lFhYrgCB3dRG1TNPJoYQOM7raS5MAPRrDtIuSabwtyn84jo4+
9kZRPJBriMBHNsCjGVlJ9CATmaK1SKVAbRcabjgOKoIwhZTpAe6JalykREUJlTys
hB2V//lWWYPaSpzwY+OkvxoOmJIziixEskOmx6hPcoxID5v/bqlR69W15aUlKuLq
VWFb+/yMvaE=
=h0Ep
-----END PGP SIGNATURE-----
Merge tag 'pull-request-2022-10-28' of https://gitlab.com/thuth/qemu into staging
* Fix and test the VISTR instruction on s390x
* Some more small s390x fixes and maintainer updates
* Make sure to remove all temporary files from qtests
* OpenBSD VM test update to version 7.2
* Add sndio to FreeBSD tests
* More patches to enable the qtests on Windows
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmNb1x8RHHRodXRoQHJl
# ZGhhdC5jb20ACgkQLtnXdP5wLbXmcA//TCliiFkhprVxzIqy7zb9uz2Odu+sS4dT
# azUSlXvC14fECm/Rb/rd2VLqCu5x2er8CYauxKQ4VhRImzcDta4kvpt/HKIppN2t
# sqw5tipJL0DYcWBwYL1llvfutM26M+Oh0igwR8uV7b+W1FjojEZdcOr9IZ6E6V55
# wQCE5OHm0VCr61QeI5IBfZTsiPo+DFomUCpj7w66j6i0CVDvmpoe36tCmvGgrcpZ
# SP7ep7/Iq+dnGh2YnJyoUOPlXeeiBCxAygOVnIRXptDeniGoliCFn7ksLdKDQ9qY
# 69pSPR/W7mTZB/HkCRalAbYuYrI9Rcqxdu6c9vcyB8Pr0snQLTf8qThY+BJ2oC4w
# JSGgWVniAk5MmrDazwNRkSbgngYLYf+CcT1h5AANuU5Kt50Bdy9Y3TuL5YVmofEp
# N4bypV0ICImQyDECz76+i5/iJOcWiRyjMfLT6y00dspeuy983xHakrsHGD8xj0U/
# 3IVxnF9bDnUSVg6lFhYrgCB3dRG1TNPJoYQOM7raS5MAPRrDtIuSabwtyn84jo4+
# 9kZRPJBriMBHNsCjGVlJ9CATmaK1SKVAbRcabjgOKoIwhZTpAe6JalykREUJlTys
# hB2V//lWWYPaSpzwY+OkvxoOmJIziixEskOmx6hPcoxID5v/bqlR69W15aUlKuLq
# VWFb+/yMvaE=
# =h0Ep
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 28 Oct 2022 09:20:31 EDT
# gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg: issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg: aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg: aka "Thomas Huth <huth@tuxfamily.org>" [full]
# gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5
* tag 'pull-request-2022-10-28' of https://gitlab.com/thuth/qemu: (21 commits)
tests/qtest: libqtest: Correct the timeout unit of blocking receive calls for win32
tests/qtest: libqos: Do not build virtio-9p unconditionally
tests/qtest: migration-test: Make sure QEMU process "to" exited after migration is canceled
tests/qtest: libqtest: Introduce qtest_wait_qemu()
tests/qtest: Use EXIT_FAILURE instead of magic number
tests/qtest: device-plug-test: Reverse the usage of double/single quotes
tests/qtest: Support libqtest to build and run on Windows
tests/qtest: Use send/recv for socket communication
accel/qtest: Support qtest accelerator for Windows
tests: Add sndio to the FreeBSD CI containers / VM
tests/vm: update openbsd to release 7.2
tests/qtest/libqos/e1000e: Use e1000_regs.h
tests/qtest/cxl-test: Remove temporary directories after testing
tests/qtest/tpm: Clean up remainders of swtpm
MAINTAINERS: target/s390x/: add Ilya as reviewer
tests/tcg/s390x: Add a test for the vistr instruction
target/s390x: Fix emulation of the VISTR instruction
tests/tcg/s390x: Test compiler flags only once, not every time
s390x/tod-kvm: don't save/restore the TOD in PV guests
s390x: step down as general arch maintainer
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
When `-D ${logfile} -d tid` is passed, qemu_log_trylock() creates
a dedicated log file for the current thread and opens it. The
corresponding file descriptor is cached in a __thread variable.
Nothing is done to close the corresponding file descriptor when the
thread terminates though and the file descriptor is leaked.
The issue was found during code inspection and reproduced manually.
Fix that with an atexit notifier.
Fixes: 4e51069d67 ("util/log: Support per-thread log files")
Cc: richard.henderson@linaro.org
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20221021105734.555797-1-groug@kaod.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch adds support for asynchronously tearing down a VM on Linux.
When qemu terminates, either naturally or because of a fatal signal,
the VM is torn down. If the VM is huge, it can take a considerable
amount of time for it to be cleaned up. In case of a protected VM, it
might take even longer than a non-protected VM (this is the case on
s390x, for example).
Some users might want to shut down a VM and restart it immediately,
without having to wait. This is especially true if management
infrastructure like libvirt is used.
This patch implements a simple trick on Linux to allow qemu to return
immediately, with the teardown of the VM being performed
asynchronously.
If the new commandline option -async-teardown is used, a new process is
spawned from qemu at startup, using the clone syscall, in such way that
it will share its address space with qemu.The new process will have the
name "cleanup/<QEMU_PID>". It will wait until qemu terminates
completely, and then it will exit itself.
This allows qemu to terminate quickly, without having to wait for the
whole address space to be torn down. The cleanup process will exit
after qemu, so it will be the last user of the address space, and
therefore it will take care of the actual teardown. The cleanup
process will share the same cgroups as qemu, so both memory usage and
cpu time will be accounted properly.
If possible, close_range will be used in the cleanup process to close
all open file descriptors. If it is not available or if it fails, /proc
will be used to determine which file descriptors to close.
If the cleanup process is forcefully killed with SIGKILL before the
main qemu process has terminated completely, the mechanism is defeated
and the teardown will not be asynchronous.
This feature can already be used with libvirt by adding the following
to the XML domain definition to pass the parameter to qemu directly:
<commandline xmlns="http://libvirt.org/schemas/domain/qemu/1.0">
<arg value='-async-teardown'/>
</commandline>
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com>
Tested-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com>
Message-Id: <20220812133453.82671-1-imbrenda@linux.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* Many LUKS header robustness checks
* Fix TLS PSK error reporting
* Enable LUKS creation on macOS
* Report useful errnos from seccomp
* I/O chanel Windows portability fix
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE2vOm/bJrYpEtDo4/vobrtBUQT98FAmNawAcACgkQvobrtBUQ
T9/pWA/9FXE6kvkv9YQhb/h1rMALO1aLKqUG/jWKP/mzqqLpDKHxxPin/nw8RYff
xyHt5mC7t1g7a8FFMlXxFHw1WE9o46j3tQg2IokWlX2ossYaZQx+BVv4s1zjTxcK
KPVKWoEqN5sfa2T7gUGbfZ+dH9LSZ29DRT+GrO9YEvjdSg0yUKHXPetjw6iw5OVT
GuI22xOVKbuCBf7PW/nvUe/6prxAfc7IavvAusrdkMFXymcys87q7ZCxGYEsDxyC
vUkLdAoB9kcjwvmU+sZl9WhjasRQkUxW8zCToKea4TSS1fp5pgVL0TT4x7yq7ts4
nqnaqiSTBfRda62lF64A9lM91K7hbDqPC33FkCNKWJGsQAYIFvdVJdqJsvZHUr1/
3KyHkXMsyzRfGnT7MHK+GpwcgvTupBP8ceiyYq28CLNAKXpXb6vmJIsIAdF3UaYi
N320ogiU3iRmkqdbbbGTpBB40UQvQvdbmqKTTDmigLdpDL2TLzAqfpu1zepg+7xE
wcXoPM9ZcRSwM7i9QyPMtjharCTeVR/QPlUN9agDGOlzNpUahIC5YrmCVKXNunnE
M259Ytyb6ymaMrsHgshW1gJP3327N/lIOp5yLLHEzgLM1xAGOaDP83FsF8JA/Zsd
f1he75N3KbDPYhgrdfFfitcO8F8zvhK3AqyqNDPCpJKVSeKKqFE=
=qrzm
-----END PGP SIGNATURE-----
Merge tag 'misc-next-pull-request' of https://gitlab.com/berrange/qemu into staging
pull: crypto and io queue
* Many LUKS header robustness checks
* Fix TLS PSK error reporting
* Enable LUKS creation on macOS
* Report useful errnos from seccomp
* I/O chanel Windows portability fix
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE2vOm/bJrYpEtDo4/vobrtBUQT98FAmNawAcACgkQvobrtBUQ
# T9/pWA/9FXE6kvkv9YQhb/h1rMALO1aLKqUG/jWKP/mzqqLpDKHxxPin/nw8RYff
# xyHt5mC7t1g7a8FFMlXxFHw1WE9o46j3tQg2IokWlX2ossYaZQx+BVv4s1zjTxcK
# KPVKWoEqN5sfa2T7gUGbfZ+dH9LSZ29DRT+GrO9YEvjdSg0yUKHXPetjw6iw5OVT
# GuI22xOVKbuCBf7PW/nvUe/6prxAfc7IavvAusrdkMFXymcys87q7ZCxGYEsDxyC
# vUkLdAoB9kcjwvmU+sZl9WhjasRQkUxW8zCToKea4TSS1fp5pgVL0TT4x7yq7ts4
# nqnaqiSTBfRda62lF64A9lM91K7hbDqPC33FkCNKWJGsQAYIFvdVJdqJsvZHUr1/
# 3KyHkXMsyzRfGnT7MHK+GpwcgvTupBP8ceiyYq28CLNAKXpXb6vmJIsIAdF3UaYi
# N320ogiU3iRmkqdbbbGTpBB40UQvQvdbmqKTTDmigLdpDL2TLzAqfpu1zepg+7xE
# wcXoPM9ZcRSwM7i9QyPMtjharCTeVR/QPlUN9agDGOlzNpUahIC5YrmCVKXNunnE
# M259Ytyb6ymaMrsHgshW1gJP3327N/lIOp5yLLHEzgLM1xAGOaDP83FsF8JA/Zsd
# f1he75N3KbDPYhgrdfFfitcO8F8zvhK3AqyqNDPCpJKVSeKKqFE=
# =qrzm
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 27 Oct 2022 13:29:43 EDT
# gpg: using RSA key DAF3A6FDB26B62912D0E8E3FBE86EBB415104FDF
# gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>" [full]
# gpg: aka "Daniel P. Berrange <berrange@redhat.com>" [full]
# Primary key fingerprint: DAF3 A6FD B26B 6291 2D0E 8E3F BE86 EBB4 1510 4FDF
* tag 'misc-next-pull-request' of https://gitlab.com/berrange/qemu:
crypto: add test cases for many malformed LUKS header scenarios
crypto: ensure LUKS tests run with GNUTLS crypto provider
crypto: quote algorithm names in error messages
crypto: split off helpers for converting LUKS header endianess
crypto: split LUKS header definitions off into file
crypto: check that LUKS PBKDF2 iterations count is non-zero
crypto: strengthen the check for key slots overlapping with LUKS header
crypto: validate that LUKS payload doesn't overlap with header
crypto: enforce that key material doesn't overlap with LUKS header
crypto: enforce that LUKS stripes is always a fixed value
crypto: sanity check that LUKS header strings are NUL-terminated
tests: avoid DOS line endings in PSK file
crypto: check for and report errors setting PSK credentials
scripts: check if .git exists before checking submodule status
seccomp: Get actual errno value from failed seccomp functions
io/channel-watch: Fix socket watch on Windows
io/channel-watch: Drop the unnecessary cast
io/channel-watch: Drop a superfluous '#ifdef WIN32'
util/qemu-sockets: Use g_get_tmp_dir() to get the directory for temporary files
crypto/luks: Support creating LUKS image on Darwin
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Socket communication in the libqtest and libqmp codes uses read()
and write() which work on any file descriptor on *nix, and sockets
in *nix are an example of a file descriptor.
However sockets on Windows do not use *nix-style file descriptors,
so read() and write() cannot be used on sockets on Windows.
Switch over to use send() and recv() instead which work on both
Windows and *nix.
Signed-off-by: Xuzhou Cheng <xuzhou.cheng@windriver.com>
Signed-off-by: Bin Meng <bin.meng@windriver.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20221028045736.679903-3-bin.meng@windriver.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
To be consistent with socket_uri(), add 'tcp:' prefix for inet type in
socket_parse(), by default socket_parse() use tcp when no prefix is
provided (format is host:port).
In socket_uri(), use 'vsock:' prefix for vsock type rather than 'tcp:'
because it makes a vsock address look like an inet address with CID
misinterpreted as host.
Goes back to commit 9aca82ba31 "migration: Create socket-address parameter"
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Rename SocketAddress_to_str() to socket_uri() and move it to
util/qemu-sockets.c close to socket_parse().
socket_uri() generates a string from a SocketAddress while
socket_parse() generates a SocketAddress from a string.
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
... and implement it under POSIX. When a ThreadContext is provided,
create new threads via the context such that these new threads obtain a
properly configured CPU affinity.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Message-Id: <20221014134720.168738-6-david@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Let's make it easier to pin threads created via a ThreadContext to
all host CPUs currently belonging to a given set of host NUMA nodes --
which is the common case.
"node-affinity" is simply a shortcut for setting "cpu-affinity" manually
to the list of host CPUs belonging to the set of host nodes. This property
can only be written.
A simple QEMU example to set the CPU affinity to host node 1 on a system
with two nodes, 24 CPUs each, whereby odd-numbered host CPUs belong to
host node 1:
qemu-system-x86_64 -S \
-object thread-context,id=tc1,node-affinity=1
And we can query the cpu-affinity via HMP/QMP:
(qemu) qom-get tc1 cpu-affinity
[
1,
3,
5,
7,
9,
11,
13,
15,
17,
19,
21,
23,
25,
27,
29,
31,
33,
35,
37,
39,
41,
43,
45,
47
]
We cannot query the node-affinity:
(qemu) qom-get tc1 node-affinity
Error: Insufficient permission to perform this operation
But note that due to dynamic library loading this example will not work
before we actually make use of thread_context_create_thread() in QEMU
code, because the type will otherwise not get registered. We'll wire
this up next to make it work.
Note that if the host CPUs for a host node change due do CPU hot(un)plug
CPU onlining/offlining (i.e., lscpu output changes) after the ThreadContext
was started, the CPU affinity will not get updated.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20221014134720.168738-5-david@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Setting the CPU affinity of QEMU threads is a bit problematic, because
QEMU doesn't always have permissions to set the CPU affinity itself,
for example, with seccomp after initialized by QEMU:
-sandbox enable=on,resourcecontrol=deny
General information about CPU affinities can be found in the man page of
taskset:
CPU affinity is a scheduler property that "bonds" a process to a given
set of CPUs on the system. The Linux scheduler will honor the given CPU
affinity and the process will not run on any other CPUs.
While upper layers are already aware of how to handle CPU affinities for
long-lived threads like iothreads or vcpu threads, especially short-lived
threads, as used for memory-backend preallocation, are more involved to
handle. These threads are created on demand and upper layers are not even
able to identify and configure them.
Introduce the concept of a ThreadContext, that is essentially a thread
used for creating new threads. All threads created via that context
thread inherit the configured CPU affinity. Consequently, it's
sufficient to create a ThreadContext and configure it once, and have all
threads created via that ThreadContext inherit the same CPU affinity.
The CPU affinity of a ThreadContext can be configured two ways:
(1) Obtaining the thread id via the "thread-id" property and setting the
CPU affinity manually (e.g., via taskset).
(2) Setting the "cpu-affinity" property and letting QEMU try set the
CPU affinity itself. This will fail if QEMU doesn't have permissions
to do so anymore after seccomp was initialized.
A simple QEMU example to set the CPU affinity to host CPU 0,1,6,7 would be:
qemu-system-x86_64 -S \
-object thread-context,id=tc1,cpu-affinity=0-1,cpu-affinity=6-7
And we can query it via HMP/QMP:
(qemu) qom-get tc1 cpu-affinity
[
0,
1,
6,
7
]
But note that due to dynamic library loading this example will not work
before we actually make use of thread_context_create_thread() in QEMU
code, because the type will otherwise not get registered. We'll wire
this up next to make it work.
In general, the interface behaves like pthread_setaffinity_np(): host
CPU numbers that are currently not available are ignored; only host CPU
numbers that are impossible with the current kernel will fail. If the
list of host CPU numbers does not include a single CPU that is
available, setting the CPU affinity will fail.
A ThreadContext can be reused, simply by reconfiguring the CPU affinity.
Note that the CPU affinity of previously created threads will not get
adjusted.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20221014134720.168738-4-david@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Usually, we let upper layers handle CPU pinning, because
pthread_setaffinity_np() (-> sched_setaffinity()) is blocked via
seccomp when starting QEMU with
-sandbox enable=on,resourcecontrol=deny
However, we want to configure and observe the CPU affinity of threads
from QEMU directly in some cases when the sandbox option is either not
enabled or not active yet.
So let's add a way to configure CPU pinning via
qemu_thread_set_affinity() and obtain CPU affinity via
qemu_thread_get_affinity() and implement them under POSIX using
pthread_setaffinity_np() + pthread_getaffinity_np().
Implementation under Windows is possible using SetProcessAffinityMask()
+ GetProcessAffinityMask(), however, that is left as future work.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Message-Id: <20221014134720.168738-3-david@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Let's
* give the function a "qemu_*" style name
* make sure the parameters in the implementation match the prototype
* rename smp_cpus to max_threads, which makes the semantics of that
parameter clearer
... and add a function documentation.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Message-Id: <20221014134720.168738-2-david@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
When a RAMBlockNotifier is added, ->ram_block_added() is called with all
existing RAMBlocks. There is no equivalent ->ram_block_removed() call
when a RAMBlockNotifier is removed.
The util/vfio-helpers.c code (the sole user of RAMBlockNotifier) is fine
with this asymmetry because it does not rely on RAMBlockNotifier for
cleanup. It walks its internal list of DMA mappings and unmaps them by
itself.
Future users of RAMBlockNotifier may not have an internal data structure
that records added RAMBlocks so they will need ->ram_block_removed()
callbacks.
This patch makes ram_block_notifier_remove() symmetric with respect to
callbacks. Now util/vfio-helpers.c needs to unmap remaining DMA mappings
after ram_block_notifier_remove() has been called. This is necessary
since users like block/nvme.c may create additional DMA mappings that do
not originate from the RAMBlockNotifier.
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20221013185908.1297568-4-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
When a coroutine wakes up it may determine that it must re-queue.
Normally coroutines are pushed onto the back of the CoQueue, but for
fairness it may be necessary to push it onto the front of the CoQueue.
Add a flag to specify that the coroutine should be pushed onto the front
of the CoQueue. A later patch will use this to ensure fairness in the
bounce buffer CoQueue used by the blkio BlockDriver.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20221013185908.1297568-2-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Replace the existing logic to get the directory for temporary files
with g_get_tmp_dir(), which works for win32 too.
Signed-off-by: Bin Meng <bin.meng@windriver.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
According to msdn documentation and Linux man pages, send() should try
to send as much as possible in blocking mode, while recv() may return
earlier with a smaller available amount, we should try to continue
send/recv from there.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20221006113657.2656108-3-marcandre.lureau@redhat.com>
With a pipe or other reasons, read/write may return less than the
requested bytes. This happens with the test-io-channel-command test on
Windows. glib spawn code uses a binary pipe of 4096 bytes, and the first
read returns that much (although more are requested), for some unclear
reason...
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20221006113657.2656108-2-marcandre.lureau@redhat.com>
As described in:
https://learn.microsoft.com/en-us/visualstudio/debugger/how-to-set-a-thread-name-in-native-code?view=vs-2022
SetThreadDescription() is available since Windows 10, version 1607 and
in some versions only by "Run Time Dynamic Linking". Its declaration is
not yet in mingw, so we lookup the function the same way glib does.
Tested with Visual Studio Community 2022 debugger.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Callers of coroutine_fn must be coroutine_fn themselves, or the call
must be within "if (qemu_in_coroutine())". Apply coroutine_fn to
functions where this holds.
Reviewed-by: Alberto Faria <afaria@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20220922084924.201610-23-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
qemu_coroutine_get_aio_context inspects a coroutine, but it does
not have to be called from the coroutine itself (or from any
coroutine).
Reviewed-by: Alberto Faria <afaria@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20220922084924.201610-6-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
qemu_socketpair() will create a pair of connected sockets
with FD_CLOEXEC set
Signed-off-by: Guoyi Tu <tugy@chinatelecom.cn>
Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <17fa1eff729eeabd9a001f4639abccb127ceec81.1661240709.git.tugy@chinatelecom.cn>
The zpcii-disable machine property can be used to force-disable the use
of zPCI interpretation facilities for a VM. By default, this setting
will be off for machine 7.2 and newer.
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Message-Id: <20220902172737.170349-9-mjrosato@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
[thuth: Fix contextual conflict in ccw_machine_7_1_instance_options()]
Signed-off-by: Thomas Huth <thuth@redhat.com>
Version: GnuPG v1
iQEcBAABAgAGBQJjEaMLAAoJEO8Ells5jWIRoRwIAJpwefLgH/+lkd1mtWqxBhuS
KLa0bkcS6nIGnjQzNX/XWipu/5tMbBLzbaKw0myodvoK6Yx0MFog1cWf6gLHuvWH
Jy3ONUrF9umHYuOa9sJJtXv/aP7neNJSB3RW67BaiLCLkaetDj9lLciA/KKMvb/I
JNFtuLVTPibZ5iVTjvifFWmJD/Yk0P8mlrH5yfrA3B2EaaWf1es0GWobGIwwLu9s
ZSqjhMDAhfOW2E1sBh7jFRh4lJX1t1jRhyIGx2bOXevPx2hFHq6FSq+yuJ9OsZvO
wC8mC4DD+fovypDWbv3WLslIejM0+THD8KuBQnZtKX5Mbhc+0cELpIFLUdH95TM=
=eMUT
-----END PGP SIGNATURE-----
Merge tag 'net-pull-request' of https://github.com/jasowang/qemu into staging
# -----BEGIN PGP SIGNATURE-----
# Version: GnuPG v1
#
# iQEcBAABAgAGBQJjEaMLAAoJEO8Ells5jWIRoRwIAJpwefLgH/+lkd1mtWqxBhuS
# KLa0bkcS6nIGnjQzNX/XWipu/5tMbBLzbaKw0myodvoK6Yx0MFog1cWf6gLHuvWH
# Jy3ONUrF9umHYuOa9sJJtXv/aP7neNJSB3RW67BaiLCLkaetDj9lLciA/KKMvb/I
# JNFtuLVTPibZ5iVTjvifFWmJD/Yk0P8mlrH5yfrA3B2EaaWf1es0GWobGIwwLu9s
# ZSqjhMDAhfOW2E1sBh7jFRh4lJX1t1jRhyIGx2bOXevPx2hFHq6FSq+yuJ9OsZvO
# wC8mC4DD+fovypDWbv3WLslIejM0+THD8KuBQnZtKX5Mbhc+0cELpIFLUdH95TM=
# =eMUT
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 02 Sep 2022 02:30:35 EDT
# gpg: using RSA key EF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>" [full]
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211
* tag 'net-pull-request' of https://github.com/jasowang/qemu: (21 commits)
net: tulip: Restrict DMA engine to memories
net/colo.c: Fix the pointer issue reported by Coverity.
vdpa: Delete CVQ migration blocker
vdpa: Add virtio-net mac address via CVQ at start
vhost_net: add NetClientState->load() callback
vdpa: extract vhost_vdpa_net_cvq_add from vhost_vdpa_net_handle_ctrl_avail
vdpa: Move command buffers map to start of net device
vdpa: add net_vhost_vdpa_cvq_info NetClientInfo
vhost_net: Add NetClientInfo stop callback
vhost_net: Add NetClientInfo start callback
vhost: Do not depend on !NULL VirtQueueElement on vhost_svq_flush
vhost: Delete useless read memory barrier
vhost: use SVQ element ndescs instead of opaque data for desc validation
vhost: stop transfer elem ownership in vhost_handle_guest_kick
vdpa: Use ring hwaddr at vhost_vdpa_svq_unmap_ring
vhost: Always store new kick fd on vhost_svq_set_svq_kick_fd
vdpa: Make SVQ vring unmapping return void
vdpa: Remove SVQ vring from iova_tree at shutdown
util: accept iova_tree_remove_parameter by value
vdpa: do not save failed dma maps in SVQ iova tree
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Support for the unix socket has existed both in BSD and Linux for the
longest time, but not on Windows. Since Windows 10 build 17063 [1],
the native support for the unix socket has come to Windows. Starting
this build, two Win32 processes can use the AF_UNIX address family
over Winsock API to communicate with each other.
[1] https://devblogs.microsoft.com/commandline/af_unix-comes-to-windows/
Signed-off-by: Xuzhou Cheng <xuzhou.cheng@windriver.com>
Signed-off-by: Bin Meng <bin.meng@windriver.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20220802075200.907360-3-bmeng.cn@gmail.com>
It's convenient to call iova_tree_remove from a map returned from
iova_tree_find or iova_tree_find_iova. With the current code this is not
possible, since we will free it, and then we will try to search for it
again.
Fix it making accepting the map by value, forcing a copy of the
argument. Not applying a fixes tag, since there is no use like that at
the moment.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The last user of this function has just been removed, so we can
drop this function now, too.
Message-Id: <20220810125720.3849835-4-thuth@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Commit 06680b15b4 moved qemu_*_exec_dir() to cutils but forgot
to move the macOS dyld(3) include, resulting in the following
error (when building with Homebrew GCC on macOS Monterey 12.4):
[313/1197] Compiling C object libqemuutil.a.p/util_cutils.c.o
FAILED: libqemuutil.a.p/util_cutils.c.o
../../util/cutils.c:1039:13: error: implicit declaration of function '_NSGetExecutablePath' [-Werror=implicit-function-declaration]
1039 | if (_NSGetExecutablePath(fpath, &len) == 0) {
| ^~~~~~~~~~~~~~~~~~~~
../../util/cutils.c:1039:13: error: nested extern declaration of '_NSGetExecutablePath' [-Werror=nested-externs]
Fix by moving the include line to cutils.
Fixes: 06680b15b4 ("include: move qemu_*_exec_dir() to cutils")
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20220809222046.30812-1-f4bug@amsat.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
close() is a *nix function. It works on any file descriptor, and
sockets in *nix are an example of a file descriptor.
closesocket() is a Windows-specific function, which works only
specifically with sockets. Sockets on Windows do not use *nix-style
file descriptors, and socket() returns a handle to a kernel object
instead, so it must be closed with closesocket().
In QEMU there is already a logic to handle such platform difference
in os-posix.h and os-win32.h, that:
* closesocket maps to close on POSIX
* closesocket maps to a wrapper that calls the real closesocket()
on Windows
Replace the call to close a socket with closesocket() instead.
Signed-off-by: Bin Meng <bin.meng@windriver.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
A recent commit moved some Haiku-specific code parts from oslib-posix.c
to cutils.c, but failed to move the corresponding header #include
statement, too, so "make vm-build-haiku.x86_64" is currently broken.
Fix it by moving the header #include, too.
Fixes: 06680b15b4 ("include: move qemu_*_exec_dir() to cutils")
Message-Id: <20220718172026.139004-1-thuth@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Before this change, the directory of the executable was being added to
resolve modules in the build tree. However, get_relocated_path() can now
resolve them with the new bundle mechanism.
Signed-off-by: Akihiko Odaki <akihiko.odaki@gmail.com>
Message-Id: <20220624145039.49929-5-akihiko.odaki@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Developers often run QEMU without installing. The bundle mechanism
allows to look up files which should be present in installation even in
such a situation.
It is a general mechanism and can find any files in the installation
tree. The build tree will have a new directory, qemu-bundle, to
represent what files the installation tree would have for reference by
the executables.
Note that it abandons compatibility with Windows older than 8. The
extended support for the prior version, 7 ended more than 2 years ago,
and it is unlikely that someone would like to run the latest QEMU on
such an old system.
Signed-off-by: Akihiko Odaki <akihiko.odaki@gmail.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20220624145039.49929-3-akihiko.odaki@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add new API, to make a time limited call of the coroutine.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
It always returns IOVA_OK so nobody uses it.
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20220427154931.3166388-1-eperezma@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
It seems that aio_wait_kick always required a memory barrier
or atomic operation in the caller, but nobody actually
took care of doing it.
Let's put the barrier in the function instead, and pair it
with another one in AIO_WAIT_WHILE. Read aio_wait_kick()
comment for further explanation.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20220524173054.12651-1-eesposit@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We have too much logic to simply check that bitmaps are of the same
size. Let's just define that hbitmap_merge() and
bdrv_dirty_bitmap_merge_internal() require their argument bitmaps be of
same size, this simplifies things.
Let's look through the callers:
For backup_init_bcs_bitmap() we already assert that merge can't fail.
In bdrv_reclaim_dirty_bitmap_locked() we gracefully handle the error
that can't happen: successor always has same size as its parent, drop
this logic.
In bdrv_merge_dirty_bitmap() we already has assertion and separate
check. Make the check explicit and improve error message.
Signed-off-by: Vladimir Sementsov-Ogievskiy <v.sementsov-og@mail.ru>
Reviewed-by: Nikita Lapshin <nikita.lapshin@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20220517111206.23585-4-v.sementsov-og@mail.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
On linux, the AT_HWCAP bit PPC_FEATURE_ICACHE_SNOOP indicates
that we can use a simplified 3 instruction flush sequence.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20220519141131.29839-1-npiggin@gmail.com>
[rth: update after merging cacheflush.c and cacheinfo.c]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20220621014837.189139-4-richard.henderson@linaro.org>
Merge init_ctr_el0 into arch_cache_info. In flush_idcache_range,
use the pre-computed line sizes from the global variables.
Use CONFIG_DARWIN in preference to __APPLE__.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20220621014837.189139-3-richard.henderson@linaro.org>
Combine the two files into cacheflush.c. There's a couple of bits
that would be helpful to share between the two, and combining them
seems better than exporting the bits.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20220621014837.189139-2-richard.henderson@linaro.org>
This decreases qemu_clock_deadline_ns_all's share from 23.2% to 13% in a
profile of icount-enabled aarch64-softmmu.
Signed-off-by: Idan Horowitz <idan.horowitz@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220114004358.299534-2-idan.horowitz@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Based on already existing QEMU implementation created a signed
256 bit by 128 bit division needed to implement the vector divide
extended signed quadword instruction from PowerISA 3.1
Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220525134954.85056-6-lucas.araujo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Based on already existing QEMU implementation, created an unsigned 256
bit by 128 bit division needed to implement the vector divide extended
unsigned instruction from PowerISA3.1
Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220525134954.85056-5-lucas.araujo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Extract the knowledge of IEC and SI prefixes out of size_to_str and
freq_to_str, so that it can be reused when printing statistics.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
vCPU execution should be suspended when new BH is scheduled.
This is needed to avoid guest timeouts caused by the long cycles
of the execution. In replay mode execution may hang when
vCPU sleeps and block event comes to the queue.
This patch adds notification which wakes up vCPU or interrupts
execution of guest code.
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
--
v2: changed first_cpu to current_cpu (suggested by Richard Henderson)
v4: moved vCPU notification to aio_bh_enqueue (suggested by Paolo Bonzini)
Message-Id: <165364837317.688121.17680519919871405281.stgit@pasha-ThinkPad-X280>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
SHGetFolderPath() is a deprecated API:
https://docs.microsoft.com/en-us/windows/win32/api/shlobj_core/nf-shlobj_core-shgetfolderpatha
It is a wrapper for SHGetKnownFolderPath() and CSIDL_COMMON_PATH is
mapped to FOLDERID_ProgramData:
https://docs.microsoft.com/en-us/windows/win32/shell/csidl
g_get_system_data_dirs() is a suitable replacement, as it will have
FOLDERID_ProgramData in the returned list. However, it follows the XDG
Base Directory Specification, if `XDG_DATA_DIRS` is defined, it will be
returned instead.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Message-Id: <20220525144140.591926-3-marcandre.lureau@redhat.com>
The function is required by get_relocated_path() (already in cutils),
and used by qemu-ga and may be generally useful.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20220525144140.591926-2-marcandre.lureau@redhat.com>
Just setting the max threads to 0 is enough to stop all workers.
Message-Id: <20220514065012.1149539-4-pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Since commit f9fc8932b1 ("thread-posix: remove the posix semaphore
support", 2022-04-06) QemuSemaphore has its own mutex and condition
variable; this adds unnecessary overhead on I/O with small block sizes.
Check the QTAILQ directly instead of adding the indirection of a
semaphore's count. Using a semaphore has not been necessary since
qemu_cond_timedwait was introduced; the new code has to be careful about
spurious wakeups but it is simpler, for example thread_pool_cancel does
not have to worry about synchronizing the semaphore count with the number
of elements of pool->request_list.
Note that the return value of qemu_cond_timedwait (0 for timeout, 1 for
signal or spurious wakeup) is different from that of qemu_sem_timedwait
(-1 for timeout, 0 for success).
Reported-by: Lukáš Doktor <ldoktor@redhat.com>
Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
Message-Id: <20220514065012.1149539-3-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The completion bottom half was scheduled within the pool->lock
critical section. That actually results in worse performance,
because the worker thread can run its own small critical section
and go to sleep before the bottom half starts running.
Note that this simple change does not produce an improvement without
changing the thread pool QemuSemaphore to a condition variable.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
Message-Id: <20220514065012.1149539-2-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
qemu_co_queue_restart_all is basically the same as qemu_co_enter_all
but without a QemuLockable argument. That's perfectly fine, but only as
long as the function is marked coroutine_fn. If used outside coroutine
context, qemu_co_queue_wait will attempt to take the lock and that
is just broken: if you are calling qemu_co_queue_restart_all outside
coroutine context, the lock is going to be a QemuMutex which cannot be
taken twice by the same thread.
The patch adds the marker to qemu_co_queue_restart_all and to its sole
non-coroutine_fn caller; it then reimplements the function in terms of
qemu_co_enter_all_impl, to remove duplicated code and to clarify that the
latter also works in coroutine context.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20220427130830.150180-4-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Because qemu_co_queue_restart_all does not release the lock, it should
be used only in coroutine context. Introduce a new function that,
like qemu_co_enter_next, does release the lock, and use it whenever
qemu_co_queue_restart_all was used outside coroutine context.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20220427130830.150180-3-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
qemu_co_queue_next is basically the same as qemu_co_enter_next but
without a QemuLockable argument. That's perfectly fine, but only
as long as the function is marked coroutine_fn. If used outside
coroutine context, qemu_co_queue_wait will attempt to take the lock
and that is just broken: if you are calling qemu_co_queue_next outside
coroutine context, the lock is going to be a QemuMutex which cannot be
taken twice by the same thread.
The patch adds the marker and reimplements qemu_co_queue_next in terms of
qemu_co_enter_next_impl, to remove duplicated code and to clarify that the
latter also works in coroutine context.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20220427130830.150180-2-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit 4c41c69e changed the way the coroutine pool is sized because for
virtio-blk devices with a large queue size and heavy I/O, it was just
too small and caused coroutines to be deleted and reallocated soon
afterwards. The change made the size dynamic based on the number of
queues and the queue size of virtio-blk devices.
There are two important numbers here: Slightly simplified, when a
coroutine terminates, it is generally stored in the global release pool
up to a certain pool size, and if the pool is full, it is freed.
Conversely, when allocating a new coroutine, the coroutines in the
release pool are reused if the pool already has reached a certain
minimum size (the batch size), otherwise we allocate new coroutines.
The problem after commit 4c41c69e is that it not only increases the
maximum pool size (which is the intended effect), but also the batch
size for reusing coroutines (which is a bug). It means that in cases
with many devices and/or a large queue size (which defaults to the
number of vcpus for virtio-blk-pci), many thousand coroutines could be
sitting in the release pool without being reused.
This is not only a waste of memory and allocations, but it actually
makes the QEMU process likely to hit the vm.max_map_count limit on Linux
because each coroutine requires two mappings (its stack and the guard
page for the stack), causing it to abort() in qemu_alloc_stack() because
when the limit is hit, mprotect() starts to fail with ENOMEM.
In order to fix the problem, change the batch size back to 64 to avoid
uselessly accumulating coroutines in the release pool, but keep the
dynamic maximum pool size so that coroutines aren't freed too early
in heavy I/O scenarios.
Note that this fix doesn't strictly make it impossible to hit the limit,
but this would only happen if most of the coroutines are actually in use
at the same time, not just sitting in a pool. This is the same behaviour
as we already had before commit 4c41c69e. Fully preventing this would
require allowing qemu_coroutine_create() to return an error, but it
doesn't seem to be a scenario that people hit in practice.
Cc: qemu-stable@nongnu.org
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2079938
Fixes: 4c41c69e05
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20220510151020.105528-3-kwolf@redhat.com>
Tested-by: Hiroki Narukawa <hnarukaw@yahoo-corp.jp>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
It's true that these functions currently affect the batch size in which
coroutines are reused (i.e. moved from the global release pool to the
allocation pool of a specific thread), but this is a bug and will be
fixed in a separate patch.
In fact, the comment in the header file already just promises that it
influences the pool size, so reflect this in the name of the functions.
As a nice side effect, the shorter function name makes some line
wrapping unnecessary.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20220510151020.105528-2-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The thread pool regulates itself: when idle, it kills threads until
empty, when in demand, it creates new threads until full. This behaviour
doesn't play well with latency sensitive workloads where the price of
creating a new thread is too high. For example, when paired with qemu's
'-mlock', or using safety features like SafeStack, creating a new thread
has been measured take multiple milliseconds.
In order to mitigate this let's introduce a new 'EventLoopBase'
property to set the thread pool size. The threads will be created during
the pool's initialization or upon updating the property's value, remain
available during its lifetime regardless of demand, and destroyed upon
freeing it. A properly characterized workload will then be able to
configure the pool to avoid any latency spikes.
Signed-off-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Message-id: 20220425075723.20019-4-nsaenzju@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
'event-loop-base' provides basic property handling for all 'AioContext'
based event loops. So let's define a new 'MainLoopClass' that inherits
from it. This will permit tweaking the main loop's properties through
qapi as well as through the command line using the '-object' keyword[1].
Only one instance of 'MainLoopClass' might be created at any time.
'EventLoopBaseClass' learns a new callback, 'can_be_deleted()' so as to
mark 'MainLoop' as non-deletable.
[1] For example:
-object main-loop,id=main-loop,aio-max-batch=<value>
Signed-off-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Message-id: 20220425075723.20019-3-nsaenzju@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Thread-Local Storage variables cannot be used directly from coroutine
code because the compiler may optimize TLS variable accesses across
qemu_coroutine_yield() calls. When the coroutine is re-entered from
another thread the TLS variables from the old thread must no longer be
used.
Use QEMU_DEFINE_STATIC_CO_TLS() for the current and leader variables.
I think coroutine-win32.c could get away with __thread because the
variables are only used in situations where either the stale value is
correct (current) or outside coroutine context (loading leader when
current is NULL). Due to the difficulty of being sure that this is
really safe in all scenarios it seems worth converting it anyway.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20220307153853.602859-4-stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Thread-Local Storage variables cannot be used directly from coroutine
code because the compiler may optimize TLS variable accesses across
qemu_coroutine_yield() calls. When the coroutine is re-entered from
another thread the TLS variables from the old thread must no longer be
used.
Use QEMU_DEFINE_STATIC_CO_TLS() for the current and leader variables.
The alloc_pool QSLIST needs a typedef so the return value of
get_ptr_alloc_pool() can be stored in a local variable.
One example of why this code is necessary: a coroutine that yields
before calling qemu_coroutine_create() to create another coroutine is
affected by the TLS issue.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20220307153853.602859-3-stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Thread-Local Storage variables cannot be used directly from coroutine
code because the compiler may optimize TLS variable accesses across
qemu_coroutine_yield() calls. When the coroutine is re-entered from
another thread the TLS variables from the old thread must no longer be
used.
Use QEMU_DEFINE_STATIC_CO_TLS() for the current and leader variables.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20220307153853.602859-2-stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The qemu_*block() functions are meant to be be used with sockets (the
win32 implementation expects SOCKET)
Over time, those functions where used with Win32 SOCKET or
file-descriptors interchangeably. But for portability, they must only be
used with socket-like file-descriptors. FDs can use
g_unix_set_fd_nonblocking() instead.
Rename the functions with "socket" in the name to prevent bad usages.
This is effectively reverting commit f9e8cacc55 ("oslib-posix:
rename socket_set_nonblock() to qemu_set_nonblock()").
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
GLib g_unix_open_pipe() is essentially like qemu_pipe(), available since
2.30.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
It is only used by block/file-posix.c, move it there.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
API available since glib 2.30. It also preserves errno.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Like -set and -readconfig, it would not really be too hard to
extend -writeconfig to parsing mechanisms other than QemuOpts.
However, the uses of -writeconfig are substantially more
limited, as it is generally easier to write the configuration
by hand in the first place. In addition, -writeconfig does
not even try to detect cases where it prints incorrect
syntax (for example if values have a quote in them, since
qemu_config_parse does not support any kind of escaping.
Just remove it.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20220414145721.326866-1-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The 'g_get_real_time' returns the number of microseconds since January
1, 1970 UTC, but 'g_date_time_new_from_unix_utc' needs the number of
seconds, so it will cause the invalid time input:
(process:279642): GLib-CRITICAL (recursed) **: g_date_time_format: assertion 'datetime != NULL' failed
Call function 'g_date_time_new_now_utc' instead, it has the same result
as 'g_date_time_new_from_unix_utc(g_get_real_time() / G_USEC_PER_SEC)';
Fixes: 73dab893b5 ("error-report: replace deprecated g_get_current_time() with glib >= 2.62")
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20220424105036.291370-1-haiyue.wang@intel.com>
Users requiring FIPS support must build QEMU with either the libgcrypt
or gnutls libraries as the crytography backend.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Simplify the function to only return the directory path. Callers are
adjusted to use the GLib function to build paths, g_build_filename().
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20220420132624.2439741-39-marcandre.lureau@redhat.com>
qemu_open_old(O_CREATE) should be replaced with qemu_create() which
handles Error reporting.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20220420132624.2439741-38-marcandre.lureau@redhat.com>
Mostly for correctness.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20220420132624.2439741-37-marcandre.lureau@redhat.com>
Use qemu_write_full() instead of open-coding a write loop.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20220420132624.2439741-36-marcandre.lureau@redhat.com>
The function is specific to qemu-ga, no need to share it in QEMU.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Message-Id: <20220420132624.2439741-32-marcandre.lureau@redhat.com>
Since it depends on monitor code, and error_vprintf_unless_qmp() is
already there.
This will help to move error-report in a common subproject.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20220420132624.2439741-31-marcandre.lureau@redhat.com>
Do not require the whole option machinery to handle keyval, as it is
used by QAPI alone, without the option API. And match the associated
unit name.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20220420132624.2439741-24-marcandre.lureau@redhat.com>
Move QEMU-specific code to util/osdep.c, so cutils can become a common
subproject.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20220420132624.2439741-22-marcandre.lureau@redhat.com>
The implementation depends on the OS. (and longer-term goal is to move
cutils to a common subproject)
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20220420132624.2439741-21-marcandre.lureau@redhat.com>