qemu/include
David Hildenbrand f39b7d2b96 kvm: Atomic memslot updates
If we update an existing memslot (e.g., resize, split), we temporarily
remove the memslot to re-add it immediately afterwards. These updates
are not atomic, especially not for KVM VCPU threads, such that we can
get spurious faults.

Let's inhibit most KVM ioctls while performing relevant updates, such
that we can perform the update just as if it would happen atomically
without additional kernel support.

We capture the add/del changes and apply them in the notifier commit
stage instead. There, we can check for overlaps and perform the ioctl
inhibiting only if really required (-> overlap).

To keep things simple we don't perform additional checks that wouldn't
actually result in an overlap -- such as !RAM memory regions in some
cases (see kvm_set_phys_mem()).

To minimize cache-line bouncing, use a separate indicator
(in_ioctl_lock) per CPU.  Also, make sure to hold the kvm_slots_lock
while performing both actions (removing+re-adding).

We have to wait until all IOCTLs were exited and block new ones from
getting executed.

This approach cannot result in a deadlock as long as the inhibitor does
not hold any locks that might hinder an IOCTL from getting finished and
exited - something fairly unusual. The inhibitor will always hold the BQL.

AFAIKs, one possible candidate would be userfaultfd. If a page cannot be
placed (e.g., during postcopy), because we're waiting for a lock, or if the
userfaultfd thread cannot process a fault, because it is waiting for a
lock, there could be a deadlock. However, the BQL is not applicable here,
because any other guest memory access while holding the BQL would already
result in a deadlock.

Nothing else in the kernel should block forever and wait for userspace
intervention.

Note: pause_all_vcpus()/resume_all_vcpus() or
start_exclusive()/end_exclusive() cannot be used, as they either drop
the BQL or require to be called without the BQL - something inhibitors
cannot handle. We need a low-level locking mechanism that is
deadlock-free even when not releasing the BQL.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Tested-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20221111154758.1372674-4-eesposit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-01-11 09:59:39 +01:00
..
authz Prefer 'on' | 'off' over 'yes' | 'no' for bool options 2021-01-29 17:07:53 +00:00
block block: GRAPH_RDLOCK for functions only called by co_wrappers 2022-12-15 16:08:23 +01:00
chardev chardev: src buffer const for write functions 2022-09-29 14:38:05 +04:00
crypto crypto: Support export akcipher to pkcs8 2022-11-02 06:56:32 -04:00
disas target/loongarch: Add disassembler 2022-06-06 18:09:03 +00:00
exec * s390x header clean-ups from Philippe 2023-01-09 15:54:31 +00:00
fpu fpu: Add rebias bool, value and operation 2022-08-31 14:08:05 -03:00
hw accel: introduce accelerator blocker API 2023-01-11 09:59:39 +01:00
io io: Tidy up fat-fingered parameter name 2022-12-14 16:19:35 +01:00
libdecnumber Replace config-time define HOST_WORDS_BIGENDIAN 2022-04-06 10:50:37 +02:00
migration migration: Remove load_state_old and minimum_version_id_old 2022-03-02 18:20:45 +00:00
monitor pci: Move HMP command from hw/pci/pcie_aer.c to pci-hmp-cmds.c 2022-12-19 16:21:56 +01:00
net virtio-net: add support for configure interrupt 2023-01-08 01:54:22 -05:00
qapi qerror: QERR_PERMISSION_DENIED is no longer used, drop 2022-10-27 07:57:18 +02:00
qemu * s390x header clean-ups from Philippe 2023-01-09 15:54:31 +00:00
qom qom/object: Remove circular include dependency 2022-06-28 10:53:32 +02:00
scsi scsi-disk: add SCSI_DISK_QUIRK_MODE_PAGE_VENDOR_SPECIFIC_APPLE quirk for Macintosh 2022-07-13 16:58:58 +02:00
semihosting semihosting: Allow optional use of semihosting from userspace 2022-09-13 17:18:21 +01:00
standard-headers m68k: rework BI_VIRT_RNG_SEED as BI_RNG_SEED 2022-10-21 20:46:10 +02:00
sysemu kvm: Atomic memslot updates 2023-01-11 09:59:39 +01:00
tcg tcg: Reorg function calls 2023-01-05 11:41:29 -08:00
ui ui/console: Get tab completion working again in the SDL monitor vc 2022-09-23 13:42:09 +02:00
user include: Include headers where needed 2023-01-08 01:54:22 -05:00
elf.h include/elf.h: add s390x note types 2022-10-26 12:54:59 +04:00
glib-compat.h compiler.h: replace QEMU_NORETURN with G_NORETURN 2022-04-21 17:03:51 +04:00
qemu-io.h Include qemu-common.h exactly where needed 2019-06-12 13:20:20 +02:00
qemu-main.h ui/cocoa: Run qemu_init in the main thread 2022-09-23 14:36:33 +02:00