Rename SysBusDevice variable to avoid this warning :
../hw/ppc/spapr_pci.c: In function ‘spapr_phb_realize’:
../hw/ppc/spapr_pci.c:1872:24: warning: declaration of ‘s’ shadows a previous local [-Wshadow=local]
1872 | SpaprPhbState *s;
| ^
../hw/ppc/spapr_pci.c:1829:19: note: shadowed declaration is here
1829 | SysBusDevice *s = SYS_BUS_DEVICE(dev);
| ^
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-ID: <20230918145850.241074-8-clg@kaod.org>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Remove extra 'drc_index' variable to avoid this warning :
../hw/ppc/spapr_drc.c: In function ‘rtas_ibm_configure_connector’:
../hw/ppc/spapr_drc.c:1240:26: warning: declaration of ‘drc_index’ shadows a previous local [-Wshadow=compatible-local]
1240 | uint32_t drc_index = spapr_drc_index(drc);
| ^~~~~~~~~
../hw/ppc/spapr_drc.c:1155:14: note: shadowed declaration is here
1155 | uint32_t drc_index;
| ^~~~~~~~~
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-ID: <20230918145850.241074-7-clg@kaod.org>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Remove extra 'i' variable to fix this warning :
../hw/ppc/spapr.c: In function ‘spapr_init_cpus’:
../hw/ppc/spapr.c:2668:13: warning: declaration of ‘i’ shadows a previous local [-Wshadow=compatible-local]
2668 | int i;
| ^
../hw/ppc/spapr.c:2645:9: note: shadowed declaration is here
2645 | int i;
| ^
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-ID: <20230918145850.241074-5-clg@kaod.org>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Introduce a helper routine defining one CPU device node to fix this
warning :
../hw/ppc/spapr.c: In function ‘spapr_dt_cpus’:
../hw/ppc/spapr.c:812:19: warning: declaration of ‘cs’ shadows a previous local [-Wshadow=compatible-local]
812 | CPUState *cs = rev[i];
| ^~
../hw/ppc/spapr.c:786:15: note: shadowed declaration is here
786 | CPUState *cs;
| ^~
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-ID: <20230918145850.241074-4-clg@kaod.org>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
this fixes numerous warnings of this type :
In file included from ../hw/ppc/spapr_pci.c:43:
../hw/ppc/spapr_pci.c: In function ‘spapr_dt_phb’:
../include/hw/ppc/fdt.h:18:13: warning: declaration of ‘ret’ shadows a previous local [-Wshadow=compatible-local]
18 | int ret = (exp); \
| ^~~
../hw/ppc/spapr_pci.c:2355:5: note: in expansion of macro ‘_FDT’
2355 | _FDT(bus_off = fdt_add_subnode(fdt, 0, phb->dtbusname));
| ^~~~
../hw/ppc/spapr_pci.c:2311:24: note: shadowed declaration is here
2311 | int bus_off, i, j, ret;
| ^~~
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-ID: <20230918145850.241074-2-clg@kaod.org>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Fix:
hw/intc/openpic.c: In function ‘openpic_gbl_write’:
hw/intc/openpic.c:614:17: warning: declaration of ‘idx’ shadows a previous local [-Wshadow=compatible-local]
614 | int idx;
| ^~~
hw/intc/openpic.c:568:9: note: shadowed declaration is here
568 | int idx;
| ^~~
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230904162824.85385-3-philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Fix:
softmmu/memory.c: In function ‘mtree_print_mr’:
softmmu/memory.c:3236:27: warning: declaration of ‘ml’ shadows a previous local [-Wshadow=compatible-local]
3236 | MemoryRegionList *ml;
| ^~
softmmu/memory.c:3213:32: note: shadowed declaration is here
3213 | MemoryRegionList *new_ml, *ml, *next_ml;
| ^~
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230904161235.84651-22-philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Fix:
hw/mips/boston.c:472:5: error: declaration shadows a local variable [-Werror,-Wshadow]
qemu_fdt_setprop_cells(fdt, name, "reg", reg_base, reg_size);
^
include/sysemu/device_tree.h:129:13: note: expanded from macro 'qemu_fdt_setprop_cells'
int i;
^
hw/mips/boston.c:461:9: note: previous declaration is here
int i;
^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230904161235.84651-21-philmd@linaro.org>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Fix:
linux-user/strace.c: In function ‘print_sockaddr’:
linux-user/strace.c:370:17: warning: declaration of ‘i’ shadows a previous local [-Wshadow=compatible-local]
370 | int i;
| ^
linux-user/strace.c:361:9: note: shadowed declaration is here
361 | int i;
| ^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230904161235.84651-20-philmd@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Fix:
util/vhost-user-server.c: In function ‘set_watch’:
util/vhost-user-server.c:274:20: warning: declaration of ‘vu_fd_watch’ shadows a previous local [-Wshadow=compatible-local]
274 | VuFdWatch *vu_fd_watch = g_new0(VuFdWatch, 1);
| ^~~~~~~~~~~
util/vhost-user-server.c:271:16: note: shadowed declaration is here
271 | VuFdWatch *vu_fd_watch = find_vu_fd_watch(server, fd);
| ^~~~~~~~~~~
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230904161235.84651-18-philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Fix:
In file included from crypto/cipher.c:140:
crypto/cipher-gnutls.c.inc: In function ‘qcrypto_gnutls_cipher_encrypt’:
crypto/cipher-gnutls.c.inc:116:17: warning: declaration of ‘err’ shadows a previous local [-Wshadow=compatible-local]
116 | int err = gnutls_cipher_init(&handle, ctx->galg, &gkey, NULL);
| ^~~
crypto/cipher-gnutls.c.inc:94:9: note: shadowed declaration is here
94 | int err;
| ^~~
---
crypto/cipher-gnutls.c.inc: In function ‘qcrypto_gnutls_cipher_decrypt’:
crypto/cipher-gnutls.c.inc:177:17: warning: declaration of ‘err’ shadows a previous local [-Wshadow=compatible-local]
177 | int err = gnutls_cipher_init(&handle, ctx->galg, &gkey, NULL);
| ^~~
crypto/cipher-gnutls.c.inc:154:9: note: shadowed declaration is here
154 | int err;
| ^~~
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230904161235.84651-17-philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Fix:
hw/nios2/10m50_devboard.c: In function ‘nios2_10m50_ghrd_init’:
hw/nios2/10m50_devboard.c:101:22: warning: declaration of ‘dev’ shadows a previous local [-Wshadow=compatible-local]
101 | DeviceState *dev = qdev_new(TYPE_NIOS2_VIC);
| ^~~
hw/nios2/10m50_devboard.c:60:18: note: shadowed declaration is here
60 | DeviceState *dev;
| ^~~
hw/nios2/10m50_devboard.c:110:18: warning: declaration of ‘i’ shadows a previous local [-Wshadow=compatible-local]
110 | for (int i = 0; i < 32; i++) {
| ^
hw/nios2/10m50_devboard.c:67:9: note: shadowed declaration is here
67 | int i;
| ^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230904161235.84651-15-philmd@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Fix:
hw/m68k/virt.c:263:13: error: declaration shadows a local variable [-Werror,-Wshadow]
BOOTINFOSTR(param_ptr, BI_COMMAND_LINE,
^
hw/m68k/bootinfo.h:47:13: note: expanded from macro 'BOOTINFOSTR'
int i; \
^
hw/m68k/virt.c:130:9: note: previous declaration is here
int i;
^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230904161235.84651-13-philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Fix:
hw/arm/allwinner-r40.c:412:14: error: declaration shadows a local variable [-Werror,-Wshadow]
for (int i = 0; i < AW_R40_NUM_MMCS; i++) {
^
hw/arm/allwinner-r40.c:299:14: note: previous declaration is here
unsigned i;
^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-ID: <20230904161235.84651-10-philmd@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Fix:
hw/arm/virt.c:821:22: error: declaration shadows a local variable [-Werror,-Wshadow]
qemu_irq irq = qdev_get_gpio_in(vms->gic,
^
hw/arm/virt.c:803:13: note: previous declaration is here
int irq;
^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-ID: <20230904161235.84651-9-philmd@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Fix:
target/mips/tcg/nanomips_translate.c.inc:4410:33: error: declaration shadows a local variable [-Werror,-Wshadow]
int32_t imm = extract32(ctx->opcode, 1, 13) |
^
target/mips/tcg/nanomips_translate.c.inc:3577:9: note: previous declaration is here
int imm;
^
target/mips/tcg/translate.c:15578:19: error: declaration shadows a local variable [-Werror,-Wshadow]
for (unsigned i = 1; i < 32; i++) {
^
target/mips/tcg/translate.c:15567:9: note: previous declaration is here
int i;
^
target/mips/tcg/msa_helper.c:7478:13: error: declaration shadows a local variable [-Werror,-Wshadow]
MSA_FLOAT_MAXOP(pwx->w[0], min, pws->w[0], pws->w[0], 32);
^
target/mips/tcg/msa_helper.c:7434:23: note: expanded from macro 'MSA_FLOAT_MAXOP'
float_status *status = &env->active_tc.msa_fp_status;
^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230904161235.84651-5-philmd@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Per Peter Maydell analysis [*]:
The hvf_vcpu_exec() function is not documented, but in practice
its caller expects it to return either EXCP_DEBUG (for "this was
a guest debug exception you need to deal with") or something else
(presumably the intention being 0 for OK).
The hvf_sysreg_read() and hvf_sysreg_write() functions are also not
documented, but they return 0 on success, or 1 for a completely
unrecognized sysreg where we've raised the UNDEF exception (but
not if we raised an UNDEF exception for an unrecognized GIC sysreg --
I think this is a bug). We use this return value to decide whether
we need to advance the PC past the insn or not. It's not the same
as the return value we want to return from hvf_vcpu_exec().
Retain the variable as locally scoped but give it a name that
doesn't clash with the other function-scoped variable.
This fixes:
target/arm/hvf/hvf.c:1936:13: error: declaration shadows a local variable [-Werror,-Wshadow]
int ret = 0;
^
target/arm/hvf/hvf.c:1807:9: note: previous declaration is here
int ret;
^
[*] https://lore.kernel.org/qemu-devel/CAFEAcA_e+fU6JKtS+W63wr9cCJ6btu_hT_ydZWOwC0kBkDYYYQ@mail.gmail.com/
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230904161235.84651-4-philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Fix:
tcg/tcg.c:2551:27: error: declaration shadows a local variable [-Werror,-Wshadow]
MemOp op = get_memop(oi);
^
tcg/tcg.c:2437:12: note: previous declaration is here
TCGOp *op;
^
accel/tcg/tb-maint.c:245:18: error: declaration shadows a local variable [-Werror,-Wshadow]
for (int i = 0; i < V_L2_SIZE; i++) {
^
accel/tcg/tb-maint.c:210:9: note: previous declaration is here
int i;
^
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230904161235.84651-2-philmd@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Variables declared in macros can shadow other variables. Much of the
time, this is harmless, e.g.:
#define _FDT(exp) \
do { \
int ret = (exp); \
if (ret < 0) { \
error_report("error creating device tree: %s: %s", \
#exp, fdt_strerror(ret)); \
exit(1); \
} \
} while (0)
Harmless shadowing in h_client_architecture_support():
target_ulong ret;
[...]
ret = do_client_architecture_support(cpu, spapr, vec, fdt_bufsize);
if (ret == H_SUCCESS) {
_FDT((fdt_pack(spapr->fdt_blob)));
[...]
}
return ret;
However, we can get in trouble when the shadowed variable is used in a
macro argument:
#define QOBJECT(obj) ({ \
typeof(obj) o = (obj); \
o ? container_of(&(o)->base, QObject, base) : NULL; \
})
QOBJECT(o) expands into
({
---> typeof(o) o = (o);
o ? container_of(&(o)->base, QObject, base) : NULL;
})
Unintended variable name capture at --->. We'd be saved by
-Winit-self. But I could certainly construct more elaborate death
traps that don't trigger it.
To reduce the risk of trapping ourselves, we use variable names in
macros that no sane person would use elsewhere. Here's our actual
definition of QOBJECT():
#define QOBJECT(obj) ({ \
typeof(obj) _obj = (obj); \
_obj ? container_of(&(_obj)->base, QObject, base) : NULL; \
})
Works well enough until we nest macro calls. For instance, with
#define qobject_ref(obj) ({ \
typeof(obj) _obj = (obj); \
qobject_ref_impl(QOBJECT(_obj)); \
_obj; \
})
the expression qobject_ref(obj) expands into
({
typeof(obj) _obj = (obj);
qobject_ref_impl(
({
---> typeof(_obj) _obj = (_obj);
_obj ? container_of(&(_obj)->base, QObject, base) : NULL;
}));
_obj;
})
Unintended variable name capture at --->.
The only reliable way to prevent unintended variable name capture is
-Wshadow.
One blocker for enabling it is shadowing hiding in function-like
macros like
qdict_put(dict, "name", qobject_ref(...))
qdict_put() wraps its last argument in QOBJECT(), and the last
argument here contains another QOBJECT().
Use dark preprocessor sorcery to make the macros that give us this
problem use different variable names on every call.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-ID: <20230921121312.1301864-8-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Local variables shadowing other local variables or parameters make the
code needlessly hard to understand. Tracked down with -Wshadow=local.
Clean up: delete inner declarations when they are actually redundant,
else rename variables.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230921121312.1301864-7-armbru@redhat.com>
Local variables shadowing other local variables or parameters make the
code needlessly hard to understand. Tracked down with -Wshadow=local.
Clean up: delete inner declarations when they are actually redundant,
else rename variables.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230921121312.1301864-6-armbru@redhat.com>
Local variables shadowing other local variables or parameters make the
code needlessly hard to understand. Tracked down with -Wshadow=local.
Clean up: rename both the pair of parameters and the pair of local
variables. While there, move the local variables to function scope.
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230921121312.1301864-5-armbru@redhat.com>
Local variables shadowing other local variables or parameters make the
code needlessly hard to understand. Tracked down with -Wshadow=local.
Clean up: delete inner declarations when they are actually redundant,
else rename variables.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-ID: <20230921121312.1301864-4-armbru@redhat.com>
Local variables shadowing other local variables or parameters make the
code needlessly hard to understand. Tracked down with -Wshadow=local.
Clean up: delete inner declarations when they are actually redundant,
else rename variables.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Message-ID: <20230921121312.1301864-3-armbru@redhat.com>
qemu_rdma_save_page() reports polling error with error_report(), then
succeeds anyway. This is because the variable holding the polling
status *shadows* the variable the function returns. The latter
remains zero.
Broken since day one, and duplicated more recently.
Fixes: 2da776db48 (rdma: core logic)
Fixes: b390afd8c5 (migration/rdma: Fix out of order wrid)
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Message-ID: <20230921121312.1301864-2-armbru@redhat.com>
Now that the return path thread is allowed to finish during a paused
migration, we can move the cleanup of the QEMUFiles to the main
migration thread.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230918172822.19052-9-farosas@suse.de>
Replace the return path retry logic with finishing and restarting the
thread. This fixes a race when resuming the migration that leads to a
segfault.
Currently when doing postcopy we consider that an IO error on the
return path file could be due to a network intermittency. We then keep
the thread alive but have it do cleanup of the 'from_dst_file' and
wait on the 'postcopy_pause_rp' semaphore. When the user issues a
migrate resume, a new return path is opened and the thread is allowed
to continue.
There's a race condition in the above mechanism. It is possible for
the new return path file to be setup *before* the cleanup code in the
return path thread has had a chance to run, leading to the *new* file
being closed and the pointer set to NULL. When the thread is released
after the resume, it tries to dereference 'from_dst_file' and crashes:
Thread 7 "return path" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffd1dbf700 (LWP 9611)]
0x00005555560e4893 in qemu_file_get_error_obj (f=0x0, errp=0x0) at ../migration/qemu-file.c:154
154 return f->last_error;
(gdb) bt
#0 0x00005555560e4893 in qemu_file_get_error_obj (f=0x0, errp=0x0) at ../migration/qemu-file.c:154
#1 0x00005555560e4983 in qemu_file_get_error (f=0x0) at ../migration/qemu-file.c:206
#2 0x0000555555b9a1df in source_return_path_thread (opaque=0x555556e06000) at ../migration/migration.c:1876
#3 0x000055555602e14f in qemu_thread_start (args=0x55555782e780) at ../util/qemu-thread-posix.c:541
#4 0x00007ffff38d76ea in start_thread (arg=0x7fffd1dbf700) at pthread_create.c:477
#5 0x00007ffff35efa6f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Here's the race (important bit is open_return_path happening before
migration_release_dst_files):
migration | qmp | return path
--------------------------+-----------------------------+---------------------------------
qmp_migrate_pause()
shutdown(ms->to_dst_file)
f->last_error = -EIO
migrate_detect_error()
postcopy_pause()
set_state(PAUSED)
wait(postcopy_pause_sem)
qmp_migrate(resume)
migrate_fd_connect()
resume = state == PAUSED
open_return_path <-- TOO SOON!
set_state(RECOVER)
post(postcopy_pause_sem)
(incoming closes to_src_file)
res = qemu_file_get_error(rp)
migration_release_dst_files()
ms->rp_state.from_dst_file = NULL
post(postcopy_pause_rp_sem)
postcopy_pause_return_path_thread()
wait(postcopy_pause_rp_sem)
rp = ms->rp_state.from_dst_file
goto retry
qemu_file_get_error(rp)
SIGSEGV
-------------------------------------------------------------------------------------------
We can keep the retry logic without having the thread alive and
waiting. The only piece of data used by it is the 'from_dst_file' and
it is only allowed to proceed after a migrate resume is issued and the
semaphore released at migrate_fd_connect().
Move the retry logic to outside the thread by waiting for the thread
to finish before pausing the migration.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230918172822.19052-8-farosas@suse.de>
We'll start calling the await_return_path_close_on_source() function
from other parts of the code, so move all of the related checks and
tracepoints into it.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230918172822.19052-7-farosas@suse.de>
This file is owned by the return path thread which is already doing
cleanup.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230918172822.19052-6-farosas@suse.de>
It's not safe to call qemu_file_shutdown() on the to_dst_file without
first checking for the file's presence under the lock. The cleanup of
this file happens at postcopy_pause() and migrate_fd_cleanup() which
are not necessarily running in the same thread as migrate_fd_cancel().
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230918172822.19052-5-farosas@suse.de>
We cannot call qemu_file_shutdown() on the return path file without
taking the file lock. The return path thread could be running it's
cleanup code and have just cleared the from_dst_file pointer.
Checking ms->to_dst_file for errors could also race with
migrate_fd_cleanup() which clears the to_dst_file pointer.
Protect both accesses by taking the file lock.
This was caught by inspection, it should be rare, but the next patches
will start calling this code from other places, so let's do the
correct thing.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230918172822.19052-4-farosas@suse.de>
We don't need to set the rp_state.error right after a shutdown because
qemu_file_shutdown() always sets the QEMUFile error, so the return
path thread would have seen it and set the rp error itself.
Setting the error outside of the thread is also racy because the
thread could clear it after we set it.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230918172822.19052-3-farosas@suse.de>
We hit intermit CI issue on failing at migration-test over the unit test
preempt/plain:
qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Memory content inconsistency at 5b43000 first_byte = bd last_byte = bc current = 4f hit_edge = 1
**
ERROR:../tests/qtest/migration-test.c:300:check_guests_ram: assertion failed: (bad == 0)
(test program exited with status code -6)
Fabiano debugged into it and found that the preempt thread can quit even
without receiving all the pages, which can cause guest not receiving all
the pages and corrupt the guest memory.
To make sure preempt thread finished receiving all the pages, we can rely
on the page_requested_count being zero because preempt channel will only
receive requested page faults. Note, not all the faulted pages are required
to be sent via the preempt channel/thread; imagine the case when a
requested page is just queued into the background main channel for
migration, the src qemu will just still send it via the background channel.
Here instead of spinning over reading the count, we add a condvar so the
main thread can wait on it if that unusual case happened, without burning
the cpu for no good reason, even if the duration is short; so even if we
spin in this rare case is probably fine. It's just better to not do so.
The condvar is only used when that special case is triggered. Some memory
ordering trick is needed to guarantee it from happening (against the
preempt thread status field), so the main thread will always get a kick
when that triggers correctly.
Closes: https://gitlab.com/qemu-project/qemu/-/issues/1886
Debugged-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230918172822.19052-2-farosas@suse.de>
The marking should be extended transitively to all functions that call
these ones, so that static analysis can be done much more efficiently.
However, this is a start and makes it possible to use vrc's path-based
searches to find potential bugs where coroutine_fns call blocking functions.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>