From ddc7cb30f824a7062974c017154b9c0dc2a059aa Mon Sep 17 00:00:00 2001 From: Miroslav Rezanina Date: Wed, 8 Mar 2023 08:05:57 -0500 Subject: [PATCH 1/4] Fix build without CONFIG_XEN_EMU Upstream commit ddf0fd9ae1 "hw/xen: Support HVM_PARAM_CALLBACK_TYPE_GSI callback" added kvm_xen_maybe_deassert_callback usage to target/i386/kvm/kvm.c file without conditional preprocessing check. This breaks any build not using CONFIG_XEN_EMU. Protect call by conditional preprocessing to allow build without CONFIG_XEN_EMU. Signed-off-by: Miroslav Rezanina Reviewed-by: David Woodhouse Message-Id: <20230308130557.2420-1-mrezanin@redhat.com> Signed-off-by: Paolo Bonzini --- target/i386/kvm/kvm.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c index 1aef54f87e..de531842f6 100644 --- a/target/i386/kvm/kvm.c +++ b/target/i386/kvm/kvm.c @@ -4991,6 +4991,7 @@ MemTxAttrs kvm_arch_post_run(CPUState *cpu, struct kvm_run *run) kvm_rate_limit_on_bus_lock(); } +#ifdef CONFIG_XEN_EMU /* * If the callback is asserted as a GSI (or PCI INTx) then check if * vcpu_info->evtchn_upcall_pending has been cleared, and deassert @@ -5001,6 +5002,7 @@ MemTxAttrs kvm_arch_post_run(CPUState *cpu, struct kvm_run *run) if (x86_cpu->env.xen_callback_asserted) { kvm_xen_maybe_deassert_callback(cpu); } +#endif /* We need to protect the apic state against concurrent accesses from * different threads in case the userspace irqchip is used. */ From 3b31669ea986575625cad4ee3dbe85bb3870b5c6 Mon Sep 17 00:00:00 2001 From: Paolo Bonzini Date: Thu, 9 Mar 2023 12:42:38 +0100 Subject: [PATCH 2/4] docs/devel: clarify further the semantics of RMW operations Signed-off-by: Paolo Bonzini --- docs/devel/atomics.rst | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-) diff --git a/docs/devel/atomics.rst b/docs/devel/atomics.rst index 633df65a97..81ec26be17 100644 --- a/docs/devel/atomics.rst +++ b/docs/devel/atomics.rst @@ -469,13 +469,19 @@ and memory barriers, and the equivalents in QEMU: In QEMU, the second kind is named ``atomic_OP_fetch``. - different atomic read-modify-write operations in Linux imply - a different set of memory barriers; in QEMU, all of them enforce - sequential consistency. + a different set of memory barriers. In QEMU, all of them enforce + sequential consistency: there is a single order in which the + program sees them happen. -- in QEMU, ``qatomic_read()`` and ``qatomic_set()`` do not participate in - the ordering enforced by read-modify-write operations. - This is because QEMU uses the C11 memory model. The following example - is correct in Linux but not in QEMU: +- however, according to the C11 memory model that QEMU uses, this order + does not propagate to other memory accesses on either side of the + read-modify-write operation. As far as those are concerned, the + operation consist of just a load-acquire followed by a store-release. + Stores that precede the RMW operation, and loads that follow it, can + still be reordered and will happen *in the middle* of the read-modify-write + operation! + + Therefore, the following example is correct in Linux but not in QEMU: +----------------------------------+--------------------------------+ | Linux (correct) | QEMU (incorrect) | From 54ad31fb0a0a54d3564f74a0e8dc8ef2b70d5611 Mon Sep 17 00:00:00 2001 From: David Woodhouse Date: Wed, 8 Mar 2023 11:19:50 +0000 Subject: [PATCH 3/4] hw/intc/ioapic: Update KVM routes before redelivering IRQ, on RTE update A Linux guest will perform IRQ migration after the IRQ has happened, updating the RTE to point to the new destination CPU and then unmasking the interrupt. However, when the guest updates the RTE, ioapic_mem_write() calls ioapic_service(), which redelivers the pending level interrupt via kvm_set_irq(), *before* calling ioapic_update_kvm_routes() which sets the new target CPU. Thus, the IRQ which is supposed to go to the new target CPU is instead misdelivered to the previous target. An example where the guest kernel is attempting to migrate from CPU#2 to CPU#0 shows: xenstore_read tx 0 path control/platform-feature-xs_reset_watches ioapic_set_irq vector: 11 level: 1 ioapic_set_remote_irr set remote irr for pin 11 ioapic_service: trigger KVM IRQ 11 [ 0.523627] The affinity mask was 0-3 and the handler is on 2 ioapic_mem_write ioapic mem write addr 0x0 regsel: 0x27 size 0x4 val 0x26 ioapic_update_kvm_routes: update KVM route for IRQ 11: fee02000 8021 ioapic_mem_write ioapic mem write addr 0x10 regsel: 0x26 size 0x4 val 0x18021 xenstore_reset_watches ioapic_set_irq vector: 11 level: 1 ioapic_mem_read ioapic mem read addr 0x10 regsel: 0x26 size 0x4 retval 0x1c021 [ 0.524569] ioapic_ack_level IRQ 11 moveit = 1 ioapic_eoi_broadcast EOI broadcast for vector 33 ioapic_clear_remote_irr clear remote irr for pin 11 vector 33 ioapic_mem_write ioapic mem write addr 0x0 regsel: 0x26 size 0x4 val 0x26 ioapic_mem_read ioapic mem read addr 0x10 regsel: 0x26 size 0x4 retval 0x18021 [ 0.525235] ioapic_finish_move IRQ 11 calls irq_move_masked_irq() [ 0.526147] irq_do_set_affinity for IRQ 11, 0 [ 0.526732] ioapic_set_affinity for IRQ 11, 0 [ 0.527330] ioapic_setup_msg_from_msi for IRQ11 target 0 ioapic_mem_write ioapic mem write addr 0x0 regsel: 0x26 size 0x4 val 0x27 ioapic_mem_write ioapic mem write addr 0x10 regsel: 0x27 size 0x4 val 0x0 ioapic_mem_write ioapic mem write addr 0x0 regsel: 0x27 size 0x4 val 0x26 ioapic_mem_write ioapic mem write addr 0x10 regsel: 0x26 size 0x4 val 0x18021 [ 0.527623] ioapic_set_affinity returns 0 [ 0.527623] ioapic_finish_move IRQ 11 calls unmask_ioapic_irq() ioapic_mem_write ioapic mem write addr 0x0 regsel: 0x26 size 0x4 val 0x26 ioapic_mem_write ioapic mem write addr 0x10 regsel: 0x26 size 0x4 val 0x8021 ioapic_set_remote_irr set remote irr for pin 11 ioapic_service: trigger KVM IRQ 11 ioapic_update_kvm_routes: update KVM route for IRQ 11: fee00000 8021 [ 0.529571] The affinity mask was 0 and the handler is on 2 [ xenstore_watch path memory/target token FFFFFFFF92847D40 There are no other code paths in ioapic_mem_write() which need the KVM IRQ routing table to be updated, so just shift the call from the end of the function to happen right before the call to ioapic_service() and thus deliver the re-enabled IRQ to the right place. Alternative fixes might have been just to remove the part in ioapic_service() which delivers the IRQ via kvm_set_irq() because surely delivering as MSI ought to work just fine anyway in all cases? That code lacks a comment justifying its existence. Or maybe in the specific case shown in the above log, it would have sufficed for ioapic_update_kvm_routes() to update the route *even* when the IRQ is masked. It's not like it's actually going to get triggered unless QEMU deliberately does so, anyway? But that only works because the target CPU happens to be in the high word of the RTE; if something in the *low* word (vector, perhaps) was changed at the same time as the unmask, we'd still trigger with stale data. Fixes: 15eafc2e602f "kvm: x86: add support for KVM_CAP_SPLIT_IRQCHIP" Signed-off-by: David Woodhouse Reviewed-by: Peter Xu Message-Id: <20230308111952.2728440-2-dwmw2@infradead.org> Signed-off-by: Paolo Bonzini --- hw/intc/ioapic.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/hw/intc/ioapic.c b/hw/intc/ioapic.c index 6364ecab1b..716ffc8bbb 100644 --- a/hw/intc/ioapic.c +++ b/hw/intc/ioapic.c @@ -405,6 +405,7 @@ ioapic_mem_write(void *opaque, hwaddr addr, uint64_t val, s->ioredtbl[index] |= ro_bits; s->irq_eoi[index] = 0; ioapic_fix_edge_remote_irr(&s->ioredtbl[index]); + ioapic_update_kvm_routes(s); ioapic_service(s); } } @@ -417,8 +418,6 @@ ioapic_mem_write(void *opaque, hwaddr addr, uint64_t val, ioapic_eoi_broadcast(val); break; } - - ioapic_update_kvm_routes(s); } static const MemoryRegionOps ioapic_io_ops = { From dee2a4d4d2f6adc3c664e37a559d4187deee4818 Mon Sep 17 00:00:00 2001 From: Fiona Ebner Date: Mon, 31 Oct 2022 10:47:16 +0100 Subject: [PATCH 4/4] vl: defuse PID file path resolve error Commit 85c4bf8aa6 ("vl: Unlink absolute PID file path") introduced a critical error when the PID file path cannot be resolved. Before this commit, it was possible to invoke QEMU when the PID file was a file created with mkstemp that was already unlinked at the time of the invocation. There might be other similar scenarios. It should not be a critical error when the PID file unlink notifier can't be registered, because the path can't be resolved. If the file is already gone from QEMU's perspective, silently ignore the error. Otherwise, only print a warning. Fixes: 85c4bf8aa6 ("vl: Unlink absolute PID file path") Reported-by: Dominik Csapak Suggested-by: Thomas Lamprecht Signed-off-by: Fiona Ebner Reviewed-by: Hanna Reitz Message-Id: <20221031094716.39786-1-f.ebner@proxmox.com> Signed-off-by: Paolo Bonzini --- softmmu/vl.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/softmmu/vl.c b/softmmu/vl.c index 3340f63c37..ea20b23e4c 100644 --- a/softmmu/vl.c +++ b/softmmu/vl.c @@ -2465,10 +2465,11 @@ static void qemu_maybe_daemonize(const char *pid_file) pid_file_realpath = g_malloc0(PATH_MAX); if (!realpath(pid_file, pid_file_realpath)) { - error_report("cannot resolve PID file path: %s: %s", - pid_file, strerror(errno)); - unlink(pid_file); - exit(1); + if (errno != ENOENT) { + warn_report("not removing PID file on exit: cannot resolve PID " + "file path: %s: %s", pid_file, strerror(errno)); + } + return; } qemu_unlink_pidfile_notifier = (struct UnlinkPidfileNotifier) {