Inside riscv_cpu_validate_v() we're always throwing a log message if the
user didn't set a vector version via 'vext_spec'.
We're going to include one case with the 'max' CPU where env->vext_ver
will be set in the cpu_init(). But that alone will not stop the "vector
version is not specified" message from appearing. The usefulness of this
log message is debatable for the generic CPUs, but for a 'max' CPU type,
where we are supposed to deliver a CPU model with all features possible,
it's strange to force users to set 'vext_spec' to get rid of this
message.
Change riscv_cpu_validate_v() to not throw this log message if
env->vext_ver is already set.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Weiwei Li <liweiwei@iscas.ac.cn>
Message-ID: <20230912132423.268494-10-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Use a helper in riscv_cpu_add_kvm_properties() to eliminate some of its
code repetition.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20230912132423.268494-9-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
The code inside riscv_cpu_add_user_properties() became quite repetitive
after recent changes. Add a helper to hide the repetition away.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230912132423.268494-8-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Our goal is to make riscv_cpu_extensions[] hold only ratified,
non-vendor extensions.
Create a new riscv_cpu_vendor_exts[] array for them, changing
riscv_cpu_add_user_properties() and riscv_cpu_add_kvm_properties()
accordingly.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Message-ID: <20230912132423.268494-7-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Create a new riscv_cpu_experimental_exts[] to store the non-ratified
extensions properties. Once they are ratified we'll move them back to
riscv_cpu_extensions[].
riscv_cpu_add_user_properties() and riscv_cpu_add_kvm_properties() are
changed to keep adding non-ratified properties to users.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Message-ID: <20230912132423.268494-6-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Add DEFINE_PROP_END_OF_LIST() and eliminate the ARRAY_SIZE() usage when
iterating in the riscv_cpu_options[] array, making it similar to what
we already do when working with riscv_cpu_extensions[].
We also have a more sophisticated motivation behind this change. In the
future we might need to export riscv_cpu_options[] to other files, and
ARRAY_LIST() doesn't work properly in that case because the array size
isn't exposed to the header file. Here's a future sight of what we would
deal with:
./target/riscv/kvm.c:1057:5: error: nested extern declaration of 'riscv_cpu_add_misa_properties' [-Werror=nested-externs]
n file included from ../target/riscv/kvm.c:19:
home/danielhb/work/qemu/include/qemu/osdep.h:473:31: error: invalid application of 'sizeof' to incomplete type 'const RISCVCPUMultiExtConfig[]'
473 | #define ARRAY_SIZE(x) ((sizeof(x) / sizeof((x)[0])) + \
| ^
./target/riscv/kvm.c:1047:29: note: in expansion of macro 'ARRAY_SIZE'
1047 | for (int i = 0; i < ARRAY_SIZE(_array); i++) { \
| ^~~~~~~~~~
./target/riscv/kvm.c:1059:5: note: in expansion of macro 'ADD_UNAVAIL_KVM_PROP_ARRAY'
1059 | ADD_UNAVAIL_KVM_PROP_ARRAY(obj, riscv_cpu_extensions);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
home/danielhb/work/qemu/include/qemu/osdep.h:473:31: error: invalid application of 'sizeof' to incomplete type 'const RISCVCPUMultiExtConfig[]'
473 | #define ARRAY_SIZE(x) ((sizeof(x) / sizeof((x)[0])) + \
| ^
./target/riscv/kvm.c:1047:29: note: in expansion of macro 'ARRAY_SIZE'
1047 | for (int i = 0; i < ARRAY_SIZE(_array); i++) { \
Homogenize the present and change the future by using
DEFINE_PROP_END_OF_LIST() in riscv_cpu_options[].
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230912132423.268494-5-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Future patches will split the existing Property arrays even further, and
the existing code in riscv_cpu_add_user_properties() will start to scale
bad with it because it's dealing with KVM constraints mixed in with TCG
constraints. We're going to pay a high price to share a couple of common
lines of code between the two.
Create a new kvm_riscv_cpu_add_kvm_properties() helper that will be
forked from riscv_cpu_add_user_properties() if we're running KVM. The
helper includes all properties that a KVM CPU will add. The rest of
riscv_cpu_add_user_properties() body will then be relieved from having
to deal with KVM constraints.
The helper was declared in kvm_stubs.h, while being implemented in
cpu.c, to allow '--enable-debug' builds to work. The compiler won't
remove the kvm_riscv_cpu_add_kvm_properties() reference when
'kvm_enabled()' is false if we end up with an unused function. Even
though being a KVM only helper we can't implement it in kvm.c due to its
many dependencies inside cpu.c, so make it public in kvm_riscv.h and
keep its implementation in cpu.c for now. We'll move it to kvm.c in the
near future.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Message-ID: <20230912132423.268494-4-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
After the introduction of riscv_cpu_options[] all properties in
riscv_cpu_extensions[] are booleans. This check is now obsolete.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Message-ID: <20230912132423.268494-3-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
We'll add a new CPU type that will enable a considerable amount of
extensions. To make it easier for us we'll do a few cleanups in our
existing riscv_cpu_extensions[] array.
Start by splitting all CPU non-boolean options from it. Create a new
riscv_cpu_options[] array for them. Add all these properties in
riscv_cpu_add_user_properties() as it is already being done today.
'mmu' and 'pmp' aren't really extensions in the usual way we think about
RISC-V extensions. These are closer to CPU features/options, so move
both to riscv_cpu_options[] too. In the near future we'll need to match
all extensions with all entries in isa_edata_arr[], and so it happens
that both 'mmu' and 'pmp' do not have a riscv,isa string (thus, no priv
spec version restriction). This further emphasizes the point that these
are more a CPU option than an extension.
No functional changes made.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Weiwei Li <liweiwei@iscas.ac.cn>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Message-ID: <20230912132423.268494-2-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Commit a5e0b33119 ("vl.c: Replace smp global variables
with smp machine properties") removed the last uses of
the smp_cores / smp_threads variables but forgot to
remove their declarations. Do it now.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
If a script is executable and has a shebang line, Meson treats it as
a normal executable, so that this script here is run via the "python3"
binary in the $PATH. However, "python3" might not be in the $PATH at
all, or it might be a wrong version, so we should make sure to run
this script via the Python version that has been chosen for the QEMU
build process. The best way to do this is to remove the executable bit
from the access mode bits. (See also commit 4b424c7571)
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1918
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
python3 may not be the expected python version.
Use PYTHON to invoke python.
Fixes: 22e11539e1 ("edk2: replace build scripts")
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
These files belong to the sbsa-ref machine and thus should
be listed here.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Leif Lindholm <quic_llindhol@quicinc.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Sensor devices depend on some bus, not a particular board.
While merged for a particular board, sensor devices don't
depend on it. They depend on a bus technology, and can be
used by any board exposing such bus.
In order to help merging sensor patches, when they fall out
of a particular board tree, add a section covering overall
sensors, to help out with patch review and merge queue
handling.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The files in there are updated via update-linux-headers.sh.
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The docs/devel/ci* were not covered yet, add them to MAINTAINERS
so that the right people are put on CC: for related patches.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The "Character devices" section only covers hw/char/ but
misses the corresponding include/hw/char/ folder. Add it now.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
There are a bunch of RISC-V files that are currently not covered
by the "get_maintainers.pl" script. Add them to the right sections
in MAINTAINERS to fix this problem.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Acked-by: Christoph Müllner <christoph.muellner@vrull.eu>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Reviewed-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
These devices are only used by the Jazz machine, so they
should be listed in the corresponding section in MAINTAINERS.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Acked-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Python 3.12 has released, so update the test infrastructure to test
against this version. Update the configure script to look for it when an
explicit Python interpreter isn't chosen.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-id: 20231006195243.3131140-5-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-id: 20231006195243.3131140-4-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
This patch is a backport from
e03a3334b6
According to Guido in https://github.com/python/cpython/issues/104344 ,
this call was never meant to wait for the server to shut down - that is
handled synchronously - but instead, this waits for all connections to
close. Or, it would have, if it wasn't broken since it was introduced.
3.12 fixes the bug, which now causes a hang in our code. The fix is just
to remove the wait.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-id: 20231006195243.3131140-3-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
The test bails gracefully if this module isn't installed, but linters
need a little help understanding that. It's enough to just declare the
type in this case.
(Fixes pylint complaining about use of an uninitialized variable because
it isn't wise enough to understand the notrun call is noreturn.)
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-id: 20231006195243.3131140-2-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
By using a socketpair for all of the sockets managed by the VM class and
its extensions, we don't need the sock_dir argument anymore, so remove
it.
We only added this argument so that we could specify a second, shorter
temporary directory for cases where the temp/log dirs were "too long" as
a socket name on macOS. We don't need it for this class now. In one
case, avocado testing takes over responsibility for creating an
appropriate sockdir.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20230928044943.849073-7-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
Like the QMP and console sockets, begin using socketpairs for the qtest
connection, too. After this patch, we'll be able to remove the vestigial
sock_dir argument, but that cleanup is best done in its own patch.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20230928044943.849073-6-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
Create a socketpair for the console output. This should help eliminate
race conditions around console text early in the boot process that might
otherwise have been dropped on the floor before being able to connect to
QEMU under "server,nowait".
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20230928044943.849073-5-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
Useful if we want to use ConsoleSocket() for a socket created by
socketpair().
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20230928044943.849073-4-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
If everything has gone smoothly, we'll already have closed the socket we
gave to the child during post_launch. The other half of the pair that we
gave to the QMP connection should, likewise, be definitively closed by
now.
However, in the cleanup path, it's possible we've created the socketpair
but flubbed the launch and need to clean up resources. These resources
*would* be handled by the garbage collector, but that can happen at
unpredictable times. Nicer to just clean them up synchronously on the
exit path, here.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20230928044943.849073-3-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
This property isn't meant to do much else besides return a list of
strings, so move this setup back out into _pre_launch().
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20230928044943.849073-2-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
- enable more sbsa-ref tests in avocado
- add swtpm to the package lists
- reduce avocado noise in gitlab by limiting tests
- make docker engine choice driven by configure and enable override
- remove unneeded gcc suffix on some cross compilers
- fix some NULL returns in gdbstub
- improve locking in execlog plugin
- introduce the GDBFeature structure
- consistently set gdb_core_xml_file
- use cleaner escaping for gdb xml
- drop ancient gdb_has_xml() test
- disable multi-instruction GUSA emulation when plugins enabled
- fix some coverity issues in plugins
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAmUmVJkACgkQ+9DbCVqe
KkTfNggAiS02FcL3faGjAN9+60xhvEQ3DJjI473hjvFWu0bSkQTjObcQqGc+V7Cw
9yNtnxOOWB6KdAU8At7HlVqiUXeyTCJB7Att5/UgNUZj63j+cs7PXb4p7cVCcJOc
17zni22tnmCBcC8wZaz0yj68jaftL3hz1QNUZOmv6CBt42q0+/4g1WKfaJ+w+SbK
T7cJEiMDObm8qeNAAXpDLB+9v3bRDxMZ8hFJ3p3CatQC8jbDrkuH7RrVPHDWiWQx
w0uXpUHlZEOVX23v6+iIoeb8YQW2bZI9UsfeyIHJlENaVgyL200LHgLvvAE4Qd63
dCtfQUZzj4t9sfoL4XgxaB7G4qtXTg==
=7PLI
-----END PGP SIGNATURE-----
Merge tag 'pull-omnibus-111023-1' of https://gitlab.com/stsquad/qemu into staging
testing, gdbstub and plugin updates
- enable more sbsa-ref tests in avocado
- add swtpm to the package lists
- reduce avocado noise in gitlab by limiting tests
- make docker engine choice driven by configure and enable override
- remove unneeded gcc suffix on some cross compilers
- fix some NULL returns in gdbstub
- improve locking in execlog plugin
- introduce the GDBFeature structure
- consistently set gdb_core_xml_file
- use cleaner escaping for gdb xml
- drop ancient gdb_has_xml() test
- disable multi-instruction GUSA emulation when plugins enabled
- fix some coverity issues in plugins
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAmUmVJkACgkQ+9DbCVqe
# KkTfNggAiS02FcL3faGjAN9+60xhvEQ3DJjI473hjvFWu0bSkQTjObcQqGc+V7Cw
# 9yNtnxOOWB6KdAU8At7HlVqiUXeyTCJB7Att5/UgNUZj63j+cs7PXb4p7cVCcJOc
# 17zni22tnmCBcC8wZaz0yj68jaftL3hz1QNUZOmv6CBt42q0+/4g1WKfaJ+w+SbK
# T7cJEiMDObm8qeNAAXpDLB+9v3bRDxMZ8hFJ3p3CatQC8jbDrkuH7RrVPHDWiWQx
# w0uXpUHlZEOVX23v6+iIoeb8YQW2bZI9UsfeyIHJlENaVgyL200LHgLvvAE4Qd63
# dCtfQUZzj4t9sfoL4XgxaB7G4qtXTg==
# =7PLI
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 11 Oct 2023 03:54:01 EDT
# gpg: using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44
* tag 'pull-omnibus-111023-1' of https://gitlab.com/stsquad/qemu: (25 commits)
contrib/plugins: fix coverity warning in hotblocks
contrib/plugins: fix coverity warning in lockstep
contrib/plugins: fix coverity warning in cache
plugins: Set final instruction count in plugin_gen_tb_end
target/sh4: Disable decode_gusa when plugins enabled
accel/tcg: Add plugin_enabled to DisasContextBase
gdbstub: Replace gdb_regs with an array
gdbstub: Remove gdb_has_xml variable
target/ppc: Remove references to gdb_has_xml
target/arm: Remove references to gdb_has_xml
gdbstub: Use g_markup_printf_escaped()
hw/core/cpu: Return static value with gdb_arch_name()
target/arm: Move the reference to arm-core.xml
gdbstub: Introduce GDBFeature structure
contrib/plugins: Use GRWLock in execlog
plugins: Check if vCPU is realized
gdbstub: Fix target.xml response
gdbstub: Fix target_xml initialization
configure: remove gcc version suffixes
configure: allow user to override docker engine
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
If output is being captured for a guest-exec invocation, the out-data
and err-data fields of guest-exec-status are only populated after the
process is reaped. This is somewhat counter intuitive and too late to
change. Thus, it would be good to document the behavior.
Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Reviewed-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Signed-off-by: Konstantin Kostiuk <kkostiuk@redhat.com>
If capture-output is requested but one of the channels goes unused (eg.
we attempt to capture stderr but the command never writes to stderr), we
can leak memory.
guest_exec_output_watch() is (from what I understand) unconditionally
called for both streams if output capture is requested. The first call
will always pass the `p->size == p->length` check b/c both values are
0. Then GUEST_EXEC_IO_SIZE bytes will be allocated for the stream.
But when we reap the exited process there's a `gei->err.length > 0`
check to actually free the buffer. Which does not get run if the command
doesn't write to the stream.
Fix by making free() unconditional.
Reviewed-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Signed-off-by: Konstantin Kostiuk <kkostiuk@redhat.com>
GUID_DEVINTERFACE_DISK and GUID_DEVINTERFACE_STORAGEPORT are already
defined by MinGW-w64. They are not only unnecessary, but can lead to
duplicate definition errors at link time with some unknown condition.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Signed-off-by: Konstantin Kostiuk <kkostiuk@redhat.com>
The previous links for the white paper and programmer's manual
are no longer available. Replace them with the new ones.
Signed-off-by: Jianlin Li <ljianlin99@gmail.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
It's just a simple wrapper for rp_sem on either wait() or kick(), make it
even clearer on how it is used. Prepared to be used even for other things.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-ID: <20231004220240.167175-8-peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Instead of only relying on the count of rp_sem, make the counter be part of
RAMState so it can be used in both threads to synchronize on the process.
rp_sem will be further reused in follow up patches, as a way to kick the
main thread, e.g., on recovery failures.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231004220240.167175-7-peterx@redhat.com>
There're a lot of cases where we only have an errno set in last_error but
without a detailed error description. When this happens, try to generate
an error contains the errno as a descriptive error.
This will be helpful in cases where one relies on the Error*. E.g.,
migration state only caches Error* in MigrationState.error. With this,
we'll display correct error messages in e.g. query-migrate when the error
was only set by qemu_file_set_error().
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231004220240.167175-6-peterx@redhat.com>
Introduce a helper to detect whether MigrationState.error is set for
whatever reason.
This is preparation work for any thread (e.g. source return path thread) to
setup errors in an unified way to MigrationState, rather than relying on
its own way to set errors (mark_source_rp_bad()).
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231004220240.167175-3-peterx@redhat.com>
Display it as long as being set, irrelevant of FAILED status. E.g., it may
also be applicable to PAUSED stage of postcopy, to provide hint on what has
gone wrong.
The error_mutex seems to be overlooked when referencing the error, add it
to be very safe.
This will change QAPI behavior by showing up error message outside !FAILED
status, but it's intended and doesn't expect to break anyone.
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2018404
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231004220240.167175-2-peterx@redhat.com>
qemu_rdma_dump_id() dumps RDMA device details to stdout.
rdma_start_outgoing_migration() calls it via qemu_rdma_source_init()
and qemu_rdma_resolve_host() to show source device details.
rdma_start_incoming_migration() arranges its call via
rdma_accept_incoming_migration() and qemu_rdma_accept() to show
destination device details.
Two issues:
1. rdma_start_outgoing_migration() can run in HMP context. The
information should arguably go the monitor, not stdout.
2. ibv_query_port() failure is reported as error. Its callers remain
unaware of this failure (qemu_rdma_dump_id() can't fail), so
reporting this to the user as an error is problematic.
Fixable, but the device detail dump is noise, except when
troubleshooting. Tracing is a better fit. Similar function
qemu_rdma_dump_id() was converted to tracing in commit
733252deb8 (Tracify migration/rdma.c).
Convert qemu_rdma_dump_id(), too.
While there, touch up qemu_rdma_dump_gid()'s outdated comment.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-54-armbru@redhat.com>
error_report() obeys -msg, reports the current error location if any,
and reports to the current monitor if any. Reporting to stderr
directly with fprintf() or perror() is wrong, because it loses all
this.
Fix the offenders. Bonus: resolves a FIXME about problematic use of
errno.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-53-armbru@redhat.com>
Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job. When the caller does, the error is reported twice. When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.
qemu_rdma_source_init(), qemu_rdma_connect(),
rdma_start_incoming_migration(), and rdma_start_outgoing_migration()
violate this principle: they call error_report() via
qemu_rdma_cleanup().
Moreover, qemu_rdma_cleanup() can't fail. It is called on error
paths, and QIOChannel close and finalization. Are the conditions it
reports really errors? I doubt it.
Downgrade qemu_rdma_cleanup()'s errors to warnings.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-52-armbru@redhat.com>
Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job. When the caller does, the error is reported twice. When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.
qemu_rdma_write_one() violates this principle: it reports errors to
stderr via qemu_rdma_register_and_get_keys(). I elected not to
investigate how callers handle the error, i.e. precise impact is not
known.
Clean this up: silence qemu_rdma_register_and_get_keys(). I believe
the caller's error reports suffice. If they don't, we need to convert
to Error instead.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-51-armbru@redhat.com>
Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job. When the caller does, the error is reported twice. When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.
qemu_rdma_post_send_control(), qemu_rdma_exchange_get_response(), and
qemu_rdma_write_one() violate this principle: they call
error_report(), fprintf(stderr, ...), and perror() via
qemu_rdma_block_for_wrid(), qemu_rdma_poll(), and
qemu_rdma_wait_comp_channel(). I elected not to investigate how
callers handle the error, i.e. precise impact is not known.
Clean this up by dropping the error reporting from qemu_rdma_poll(),
qemu_rdma_wait_comp_channel(), and qemu_rdma_block_for_wrid(). I
believe the callers' error reports suffice. If they don't, we need to
convert to Error instead.
Bonus: resolves a FIXME about problematic use of errno.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-50-armbru@redhat.com>
When qemu_rdma_wait_comp_channel() receives an event from the
completion channel, it reports an error "receive cm event while wait
comp channel,cm event is T", where T is the numeric event type.
However, the function fails only when T is a disconnect or device
removal. Events other than these two are not actually an error, and
reporting them as an error is wrong. If we need to report them to the
user, we should use something else, and what to use depends on why we
need to report them to the user.
For now, report this error only when the function actually fails.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-49-armbru@redhat.com>
Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job. When the caller does, the error is reported twice. When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.
qemu_rdma_source_init() and qemu_rdma_accept() violate this principle:
they call error_report() via qemu_rdma_reg_control(). I elected not
to investigate how callers handle the error, i.e. precise impact is
not known.
Clean this up by dropping the error reporting from
qemu_rdma_reg_control(). I believe the callers' error reports
suffice. If they don't, we need to convert to Error instead.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-48-armbru@redhat.com>
Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job. When the caller does, the error is reported twice. When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.
qemu_rdma_connect() violates this principle: it calls error_report()
and perror(). I elected not to investigate how callers handle the
error, i.e. precise impact is not known.
Clean this up: replace perror() by changing error_setg() to
error_setg_errno(), and drop error_report(). I believe the callers'
error reports suffice then. If they don't, we need to convert to
Error instead.
Bonus: resolves a FIXME about problematic use of errno.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-47-armbru@redhat.com>