mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Fabiano Rosas	380e49297c	kvm: ppc: Print meaningful message on KVM_CREATE_VM failure PowerPC has two KVM types (HV, PR) that translate into three kernel modules: kvm.ko - common kvm code kvm_hv.ko - kvm running with MSR_HV=1 or MSR_HV\|PR=0 in a nested guest. kvm_pr.ko - kvm running in usermode MSR_PR=1. Since the two KVM types can both be running at the same time, this creates a situation in which it is possible for one or both of the modules to fail to initialize, leaving the generic one behind. This leads QEMU to think it can create a guest, but KVM will fail when calling the type-specific code: ioctl(KVM_CREATE_VM) failed: 22 Invalid argument qemu-kvm: failed to initialize KVM: Invalid argument Ideally this would be solved kernel-side, but it might be a while until we can get rid of one of the modules. So in the meantime this patch tries to make this less confusing for the end user by adding a more elucidative message: ioctl(KVM_CREATE_VM) failed: 22 Invalid argument PPC KVM module is not loaded. Try 'modprobe kvm_hv'. [dwg: Fixed error in #elif which failed compile on !ppc hosts] Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com> Message-Id: <20210722141340.2367905-1-farosas@linux.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-07-29 10:59:49 +10:00
Peter Xu	dcafa24827	KVM: Fix dirty ring mmap incorrect size due to renaming accident Found this when I wanted to try the per-vcpu dirty rate series out, then I found that it's not really working and it can quickly hang death a guest. I found strange errors (e.g. guest crash after migration) happens even without the per-vcpu dirty rate series. When merging dirty ring, probably no one notice that the trivial renaming diff [1] missed two existing references of kvm_dirty_ring_sizes; they do matter since otherwise we'll mmap() a shorter range of memory after the renaming. I think it didn't SIGBUS for me easily simply because some other stuff within qemu mmap()ed right after the dirty rings (e.g. when testing 4096 slots, it aligned with one small page on x86), so when we access the rings we've been reading/writting to random memory elsewhere of qemu. Fix the two sizes when map/unmap the shared dirty gfn memory. [1] https://lore.kernel.org/qemu-devel/dac5f0c6-1bca-3daf-e5d2-6451dbbaca93@redhat.com/ Cc: Hyman Huang <huangy81@chinatelecom.cn> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20210609014355.217110-1-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-06-25 10:54:12 +02:00
Stefano Garzarella	d0fb9657a3	docs: fix references to docs/devel/tracing.rst Commit `e50caf4a5c` ("tracing: convert documentation to rST") converted docs/devel/tracing.txt to docs/devel/tracing.rst. We still have several references to the old file, so let's fix them with the following command: sed -i s/tracing.txt/tracing.rst/ $(git grep -l docs/devel/tracing.txt) Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20210517151702.109066-2-sgarzare@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>	2021-06-02 06:51:09 +02:00
Peter Xu	b4420f198d	KVM: Dirty ring support KVM dirty ring is a new interface to pass over dirty bits from kernel to the userspace. Instead of using a bitmap for each memory region, the dirty ring contains an array of dirtied GPAs to fetch (in the form of offset in slots). For each vcpu there will be one dirty ring that binds to it. kvm_dirty_ring_reap() is the major function to collect dirty rings. It can be called either by a standalone reaper thread that runs in the background, collecting dirty pages for the whole VM. It can also be called directly by any thread that has BQL taken. Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20210506160549.130416-11-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-05-26 14:49:46 +02:00
Peter Xu	a81a592698	KVM: Disable manual dirty log when dirty ring enabled KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 is for KVM_CLEAR_DIRTY_LOG, which is only useful for KVM_GET_DIRTY_LOG. Skip enabling it for kvm dirty ring. More importantly, KVM_DIRTY_LOG_INITIALLY_SET will not wr-protect all the pages initially, which is against how kvm dirty ring is used - there's no way for kvm dirty ring to re-protect a page before it's notified as being written first with a GFN entry in the ring! So when KVM_DIRTY_LOG_INITIALLY_SET is enabled with dirty ring, we'll see silent data loss after migration. Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20210506160549.130416-10-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-05-26 14:49:46 +02:00
Peter Xu	2ea5cb0a47	KVM: Add dirty-ring-size property Add a parameter for dirty gfn count for dirty rings. If zero, dirty ring is disabled. Otherwise dirty ring will be enabled with the per-vcpu gfn count as specified. If dirty ring cannot be enabled due to unsupported kernel or illegal parameter, it'll fallback to dirty logging. By default, dirty ring is not enabled (dirty-gfn-count default to 0). Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20210506160549.130416-9-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-05-26 14:49:45 +02:00
Peter Xu	563d32ba9b	KVM: Cache kvm slot dirty bitmap size Cache it too because we'll reference it more frequently in the future. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20210506160549.130416-8-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-05-26 14:49:45 +02:00
Peter Xu	29b7e8be76	KVM: Simplify dirty log sync in kvm_set_phys_mem kvm_physical_sync_dirty_bitmap() on the whole section is inaccurate, because the section can be a superset of the memslot that we're working on. The result is that if the section covers multiple kvm memslots, we could be doing the synchronization for multiple times for each kvmslot in the section. With the two helpers that we just introduced, it's very easy to do it right now by calling the helpers. Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20210506160549.130416-7-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-05-26 14:49:45 +02:00
Peter Xu	2c20b27eed	KVM: Provide helper to sync dirty bitmap from slot to ramblock kvm_physical_sync_dirty_bitmap() calculates the ramblock offset in an awkward way from the MemoryRegionSection that passed in from the caller. The truth is for each KVMSlot the ramblock offset never change for the lifecycle. Cache the ramblock offset for each KVMSlot into the structure when the KVMSlot is created. With that, we can further simplify kvm_physical_sync_dirty_bitmap() with a helper to sync KVMSlot dirty bitmap to the ramblock dirty bitmap of a specific KVMSlot. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20210506160549.130416-6-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-05-26 14:49:45 +02:00
Peter Xu	e65e5f50db	KVM: Provide helper to get kvm dirty log Provide a helper kvm_slot_get_dirty_log() to make the function kvm_physical_sync_dirty_bitmap() clearer. We can even cache the as_id into KVMSlot when it is created, so that we don't even need to pass it down every time. Since at it, remove return value of kvm_physical_sync_dirty_bitmap() because it should never fail. Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20210506160549.130416-5-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-05-26 14:49:45 +02:00
Peter Xu	ea776d15ad	KVM: Create the KVMSlot dirty bitmap on flag changes Previously we have two places that will create the per KVMSlot dirty bitmap: 1. When a newly created KVMSlot has dirty logging enabled, 2. When the first log_sync() happens for a memory slot. The 2nd case is lazy-init, while the 1st case is not (which is a fix of what the 2nd case missed). To do explicit initialization of dirty bitmaps, what we're missing is to create the dirty bitmap when the slot changed from not-dirty-track to dirty-track. Do that in kvm_slot_update_flags(). With that, we can safely remove the 2nd lazy-init. This change will be needed for kvm dirty ring because kvm dirty ring does not use the log_sync() interface at all. Also move all the pre-checks into kvm_slot_init_dirty_bitmap(). Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20210506160549.130416-4-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-05-26 14:49:45 +02:00
Peter Xu	a2f77862ff	KVM: Use a big lock to replace per-kml slots_lock Per-kml slots_lock will bring some trouble if we want to take all slots_lock of all the KMLs, especially when we're in a context that we could have taken some of the KML slots_lock, then we even need to figure out what we've taken and what we need to take. Make this simple by merging all KML slots_lock into a single slots lock. Per-kml slots_lock isn't anything that helpful anyway - so far only x86 has two address spaces (so, two slots_locks). All the rest archs will be having one address space always, which means there's actually one slots_lock so it will be the same as before. Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20210506160549.130416-3-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-05-26 14:49:45 +02:00
Paolo Bonzini	70cbae429e	KVM: do not allow setting properties at runtime Only allow accelerator properties to be set when the accelerator is being created. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-05-26 14:49:45 +02:00
Thomas Huth	ee86213aa3	Do not include exec/address-spaces.h if it's not really necessary Stop including exec/address-spaces.h in files that don't need it. Signed-off-by: Thomas Huth <thuth@redhat.com> Message-Id: <20210416171314.2074665-5-thuth@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2021-05-02 17:24:51 +02:00
Thomas Huth	4c386f8064	Do not include sysemu/sysemu.h if it's not really necessary Stop including sysemu/sysemu.h in files that don't need it. Signed-off-by: Thomas Huth <thuth@redhat.com> Message-Id: <20210416171314.2074665-2-thuth@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2021-05-02 17:24:50 +02:00
David Edmondson	56567da376	accel: kvm: clarify that extra exit data is hexadecimal When dumping the extra exit data provided by KVM, make it clear that the data is hexadecimal. At the same time, zero-pad the output. Signed-off-by: David Edmondson <david.edmondson@oracle.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20210428142431.266879-1-david.edmondson@oracle.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2021-05-02 17:24:50 +02:00
Paolo Bonzini	26dbec410e	Revert "accel: kvm: Add aligment assert for kvm_log_clear_one_slot" This reverts commit `3920552846`. Thomas Huth reported a failure with CentOS 6 guests: ../../devel/qemu/accel/kvm/kvm-all.c:690: kvm_log_clear_one_slot: Assertion `QEMU_IS_ALIGNED(start \| size, psize)' failed. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-03-16 14:30:30 -04:00
Andrew Jones	516fc0a081	accel: kvm: Fix kvm_type invocation Prior to commit `f2ce39b4f0` a MachineClass kvm_type method only needed to be registered to ensure it would be executed. With commit `f2ce39b4f0` a kvm-type machine property must also be specified. hw/arm/virt relies on the kvm_type method to pass its selected IPA limit to KVM, but this is not exposed as a machine property. Restore the previous functionality of invoking kvm_type when it's present. Fixes: `f2ce39b4f0` ("vl: make qemu_get_machine_opts static") Signed-off-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Message-id: 20210310135218.255205-2-drjones@redhat.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2021-03-12 12:47:11 +00:00
Keqian Zhu	3920552846	accel: kvm: Add aligment assert for kvm_log_clear_one_slot The parameters start and size are transfered from QEMU memory emulation layer. It can promise that they are TARGET_PAGE_SIZE aligned. However, KVM needs they are qemu_real_page_size aligned. Though no caller breaks this aligned requirement currently, we'd better add an explicit assert to avoid future breaking. Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com> Acked-by: Peter Xu <peterx@redhat.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Message-Id: <20201217014941.22872-3-zhukeqian1@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-03-06 11:41:54 +01:00
Keqian Zhu	e0a8f99355	accel: kvm: Fix memory waste under mismatch page size When handle dirty log, we face qemu_real_host_page_size and TARGET_PAGE_SIZE. The first one is the granule of KVM dirty bitmap, and the second one is the granule of QEMU dirty bitmap. As qemu_real_host_page_size >= TARGET_PAGE_SIZE (kvm_init() enforced it), misuse TARGET_PAGE_SIZE to init kvmslot dirty_bmap may waste memory. For example, when qemu_real_host_page_size is 64K and TARGET_PAGE_SIZE is 4K, it wastes 93.75% (15/16) memory. Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Message-Id: <20201217014941.22872-2-zhukeqian1@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-03-06 11:41:54 +01:00
Tom Lendacky	92a5199b29	sev/i386: Don't allow a system reset under an SEV-ES guest An SEV-ES guest does not allow register state to be altered once it has been measured. When an SEV-ES guest issues a reboot command, Qemu will reset the vCPU state and resume the guest. This will cause failures under SEV-ES. Prevent that from occuring by introducing an arch-specific callback that returns a boolean indicating whether vCPUs are resettable. Cc: Peter Maydell <peter.maydell@linaro.org> Cc: Aurelien Jarno <aurelien@aurel32.net> Cc: Jiaxun Yang <jiaxun.yang@flygoat.com> Cc: Aleksandar Rikalo <aleksandar.rikalo@syrmia.com> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: David Hildenbrand <david@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com> Message-Id: <1ac39c441b9a3e970e9556e1cc29d0a0814de6fd.1611682609.git.thomas.lendacky@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-02-16 17:15:39 +01:00
Paolo Bonzini	b2f73a0784	sev/i386: Allow AP booting under SEV-ES When SEV-ES is enabled, it is not possible modify the guests register state after it has been initially created, encrypted and measured. Normally, an INIT-SIPI-SIPI request is used to boot the AP. However, the hypervisor cannot emulate this because it cannot update the AP register state. For the very first boot by an AP, the reset vector CS segment value and the EIP value must be programmed before the register has been encrypted and measured. Search the guest firmware for the guest for a specific GUID that tells Qemu the value of the reset vector to use. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Cc: Richard Henderson <richard.henderson@linaro.org> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Message-Id: <22db2bfb4d6551aed661a9ae95b4fdbef613ca21.1611682609.git.thomas.lendacky@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-02-16 17:15:39 +01:00
Thomas Huth	38e0b7904e	accel/kvm/kvm-all: Fix wrong return code handling in dirty log code The kvm_vm_ioctl() wrapper already returns -errno if the ioctl itself returned -1, so the callers of kvm_vm_ioctl() should not check for -1 but for a value < 0 instead. This problem has been fixed once already in commit `b533f658a9` but that commit missed that the ENOENT error code is not fatal for this ioctl, so the commit has been reverted in commit `50212d6346` since the problem occurred close to a pending release at that point in time. The plan was to fix it properly after the release, but it seems like this has been forgotten. So let's do it now finally instead. Resolves: https://bugs.launchpad.net/qemu/+bug/1294227 Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20210129084354.42928-1-thuth@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-02-08 14:43:55 +01:00
David Gibson	ec78e2cda3	confidential guest support: Move SEV initialization into arch specific code While we've abstracted some (potential) differences between mechanisms for securing guest memory, the initialization is still specific to SEV. Given that, move it into x86's kvm_arch_init() code, rather than the generic kvm_init() code. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Greg Kurz <groug@kaod.org>	2021-02-08 16:57:38 +11:00
David Gibson	c9f5aaa6bc	sev: Add Error ** to sev_kvm_init() This allows failures to be reported richly and idiomatically. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Cornelia Huck <cohuck@redhat.com>	2021-02-08 16:57:38 +11:00
David Gibson	e0292d7c62	confidential guest support: Rework the "memory-encryption" property Currently the "memory-encryption" property is only looked at once we get to kvm_init(). Although protection of guest memory from the hypervisor isn't something that could really ever work with TCG, it's not conceptually tied to the KVM accelerator. In addition, the way the string property is resolved to an object is almost identical to how a QOM link property is handled. So, create a new "confidential-guest-support" link property which sets this QOM interface link directly in the machine. For compatibility we keep the "memory-encryption" property, but now implemented in terms of the new property. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cornelia Huck <cohuck@redhat.com>	2021-02-08 16:57:38 +11:00
David Gibson	aacdb84413	sev: Remove false abstraction of flash encryption When AMD's SEV memory encryption is in use, flash memory banks (which are initialed by pc_system_flash_map()) need to be encrypted with the guest's key, so that the guest can read them. That's abstracted via the kvm_memcrypt_encrypt_data() callback in the KVM state.. except, that it doesn't really abstract much at all. For starters, the only call site is in code specific to the 'pc' family of machine types, so it's obviously specific to those and to x86 to begin with. But it makes a bunch of further assumptions that need not be true about an arbitrary confidential guest system based on memory encryption, let alone one based on other mechanisms: * it assumes that the flash memory is defined to be encrypted with the guest key, rather than being shared with hypervisor * it assumes that that hypervisor has some mechanism to encrypt data into the guest, even though it can't decrypt it out, since that's the whole point * the interface assumes that this encrypt can be done in place, which implies that the hypervisor can write into a confidential guests's memory, even if what it writes isn't meaningful So really, this "abstraction" is actually pretty specific to the way SEV works. So, this patch removes it and instead has the PC flash initialization code call into a SEV specific callback. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Cornelia Huck <cohuck@redhat.com>	2021-02-08 16:57:38 +11:00
Claudio Fontana	b86f59c715	accel: replace struct CpusAccel with AccelOpsClass This will allow us to centralize the registration of the cpus.c module accelerator operations (in accel/accel-softmmu.c), and trigger it automatically using object hierarchy lookup from the new accel_init_interfaces() initialization step, depending just on which accelerators are available in the code. Rename all tcg-cpus.c, kvm-cpus.c, etc to tcg-accel-ops.c, kvm-accel-ops.c, etc, matching the object type names. Signed-off-by: Claudio Fontana <cfontana@suse.de> Message-Id: <20210204163931.7358-18-cfontana@suse.de> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-05 10:24:15 -10:00
Jiaxun Yang	eb8b1a797a	accel/kvm: avoid using predefined PAGE_SIZE As per POSIX specification of limits.h [1], OS libc may define PAGE_SIZE in limits.h. PAGE_SIZE is used in included kernel uapi headers. To prevent collosion of definition, we discard PAGE_SIZE from defined by libc and take QEMU's variable. [1]: https://pubs.opengroup.org/onlinepubs/7908799/xsh/limits.h.html Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Message-Id: <20210118063808.12471-8-jiaxun.yang@flygoat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>	2021-01-20 10:46:54 +01:00
Zenghui Yu	4054adbdd2	kvm: Take into account the unaligned section size when preparing bitmap The kernel KVM_CLEAR_DIRTY_LOG interface has align requirement on both the start and the size of the given range of pages. We have been careful to handle the unaligned cases when performing CLEAR on one slot. But it seems that we forget to take the unaligned size case into account when preparing bitmap for the interface, and we may end up clearing dirty status for pages outside of [start, start + size). If the size is unaligned, let's go through the slow path to manipulate a temp bitmap for the interface so that we won't bother with those unaligned bits at the end of bitmap. I don't think this can happen in practice since the upper layer would provide us with the alignment guarantee. I'm not sure if kvm-all could rely on it. And this patch is mainly intended to address correctness of the specific algorithm used inside kvm_log_clear_one_slot(). Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Message-Id: <20201208114013.875-1-yuzenghui@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-12-15 12:52:05 -05:00
Paolo Bonzini	f2ce39b4f0	vl: make qemu_get_machine_opts static Machine options can be retrieved as properties of the machine object. Encourage that by removing the "easy" accessor to machine options. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-12-15 12:51:55 -05:00
Elena Afanasova	f9b4908895	accel/kvm: add PIO ioeventfds only in case kvm_eventfds_allowed is true Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Elena Afanasova <eafanasova@gmail.com> Message-Id: <20201017210102.26036-1-eafanasova@gmail.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2020-11-03 18:55:13 +00:00
Dr. David Alan Gilbert	d0a92b353e	kvm: kvm_init_vcpu take Error pointer Clean up the error handling in kvm_init_vcpu so we can see what went wrong more easily. Make it take an Error ** and fill it out with what failed, including the cpu id, so you can tell if it only fails at a given ID. Replace the remaining DPRINTF by a trace. This turns a: kvm_init_vcpu failed: Invalid argument into: kvm_init_vcpu: kvm_get_vcpu failed (256): Invalid argument and with the trace you then get to see: 19049@1595520414.310107:kvm_init_vcpu index: 169 id: 212 19050@1595520414.310635:kvm_init_vcpu index: 170 id: 256 qemu-system-x86_64: kvm_init_vcpu: kvm_get_vcpu failed (256): Invalid argument which makes stuff a lot more obvious. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200723160915.129069-1-dgilbert@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-10-05 16:41:22 +02:00
Claudio Fontana	e0715f6abc	kvm: remove kvm specific functions from global includes Signed-off-by: Claudio Fontana <cfontana@suse.de> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-10-05 16:41:22 +02:00
Claudio Fontana	57038a92bb	cpus: extract out kvm-specific code to accel/kvm register a "CpusAccel" interface for KVM as well. Signed-off-by: Claudio Fontana <cfontana@suse.de> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> [added const] Signed-off-by: Claudio Fontana <cfontana@suse.de> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-10-05 16:41:22 +02:00
Stefan Hajnoczi	d73415a315	qemu/atomic.h: rename atomic_ to qatomic_ clang's C11 atomic_fetch_() functions only take a C11 atomic type pointer argument. QEMU uses direct types (int, etc) and this causes a compiler error when a QEMU code calls these functions in a source file that also included <stdatomic.h> via a system header file: $ CC=clang CXX=clang++ ./configure ... && make ../util/async.c:79:17: error: address argument to atomic operation must be a pointer to _Atomic type ('unsigned int ' invalid) Avoid using atomic_*() names in QEMU's atomic.h since that namespace is used by <stdatomic.h>. Prefix QEMU's APIs with 'q' so that atomic.h and <stdatomic.h> can co-exist. I checked /usr/include on my machine and searched GitHub for existing "qatomic_" users but there seem to be none. This patch was generated using: $ git grep -h -o '\<atomic$64$\?_[a-z0-9_]\+' include/qemu/atomic.h \| \ sort -u >/tmp/changed_identifiers $ for identifier in $(</tmp/changed_identifiers); do sed -i "s%\<$identifier\>%q$identifier%g" \ $(git grep -I -l "\<$identifier\>") done I manually fixed line-wrap issues and misaligned rST tables. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200923105646.47864-1-stefanha@redhat.com>	2020-09-23 16:07:44 +01:00
Daniel P. Berrangé	448058aa99	util: rename qemu_open() to qemu_open_old() We want to introduce a new version of qemu_open() that uses an Error object for reporting problems and make this it the preferred interface. Rename the existing method to release the namespace for the new impl. Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2020-09-16 10:33:48 +01:00
Marc-André Lureau	1a82878a08	meson: accel Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-08-21 06:30:36 -04:00
Paolo Bonzini	243af0225a	trace: switch position of headers to what Meson requires Meson doesn't enjoy the same flexibility we have with Make in choosing the include path. In particular the tracing headers are using $(build_root)/$(<D). In order to keep the include directives unchanged, the simplest solution is to generate headers with patterns like "trace/trace-audio.h" and place forwarding headers in the source tree such that for example "audio/trace.h" includes "trace/trace-audio.h". This patch is too ugly to be applied to the Makefiles now. It's only a way to separate the changes to the tracing header files from the Meson rewrite of the tracing logic. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-08-21 06:18:24 -04:00
Markus Armbruster	668f62ec62	error: Eliminate error_propagate() with Coccinelle, part 1 When all we do with an Error we receive into a local variable is propagating to somewhere else, we can just as well receive it there right away. Convert if (!foo(..., &err)) { ... error_propagate(errp, err); ... return ... } to if (!foo(..., errp)) { ... ... return ... } where nothing else needs @err. Coccinelle script: @rule1 forall@ identifier fun, err, errp, lbl; expression list args, args2; binary operator op; constant c1, c2; symbol false; @@ if ( ( - fun(args, &err, args2) + fun(args, errp, args2) \| - !fun(args, &err, args2) + !fun(args, errp, args2) \| - fun(args, &err, args2) op c1 + fun(args, errp, args2) op c1 ) ) { ... when != err when != lbl: when strict - error_propagate(errp, err); ... when != err ( return; \| return c2; \| return false; ) } @rule2 forall@ identifier fun, err, errp, lbl; expression list args, args2; expression var; binary operator op; constant c1, c2; symbol false; @@ - var = fun(args, &err, args2); + var = fun(args, errp, args2); ... when != err if ( ( var \| !var \| var op c1 ) ) { ... when != err when != lbl: when strict - error_propagate(errp, err); ... when != err ( return; \| return c2; \| return false; \| return var; ) } @depends on rule1 \|\| rule2@ identifier err; @@ - Error *err = NULL; ... when != err Not exactly elegant, I'm afraid. The "when != lbl:" is necessary to avoid transforming if (fun(args, &err)) { goto out } ... out: error_propagate(errp, err); even though other paths to label out still need the error_propagate(). For an actual example, see sclp_realize(). Without the "when strict", Coccinelle transforms vfio_msix_setup(), incorrectly. I don't know what exactly "when strict" does, only that it helps here. The match of return is narrower than what I want, but I can't figure out how to express "return where the operand doesn't use @err". For an example where it's too narrow, see vfio_intx_enable(). Silently fails to convert hw/arm/armsse.c, because Coccinelle gets confused by ARMSSE being used both as typedef and function-like macro there. Converted manually. Line breaks tidied up manually. One nested declaration of @local_err deleted manually. Preexisting unwanted blank line dropped in hw/riscv/sifive_e.c. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200707160613.848843-35-armbru@redhat.com>	2020-07-10 15:18:08 +02:00
Markus Armbruster	14217038bc	qapi: Use returned bool to check for failure, manual part The previous commit used Coccinelle to convert from checking the Error object to checking the return value. Convert a few more manually. Also tweak control flow in places to conform to the conventional "if error bail out" pattern. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200707160613.848843-20-armbru@redhat.com>	2020-07-10 15:18:08 +02:00
Markus Armbruster	62a35aaa31	qapi: Use returned bool to check for failure, Coccinelle part The previous commit enables conversion of visit_foo(..., &err); if (err) { ... } to if (!visit_foo(..., errp)) { ... } for visitor functions that now return true / false on success / error. Coccinelle script: @@ identifier fun =~ "check_list\|input_type_enum\|lv_start_struct\|lv_type_bool\|lv_type_int64\|lv_type_str\|lv_type_uint64\|output_type_enum\|parse_type_bool\|parse_type_int64\|parse_type_null\|parse_type_number\|parse_type_size\|parse_type_str\|parse_type_uint64\|print_type_bool\|print_type_int64\|print_type_null\|print_type_number\|print_type_size\|print_type_str\|print_type_uint64\|qapi_clone_start_alternate\|qapi_clone_start_list\|qapi_clone_start_struct\|qapi_clone_type_bool\|qapi_clone_type_int64\|qapi_clone_type_null\|qapi_clone_type_number\|qapi_clone_type_str\|qapi_clone_type_uint64\|qapi_dealloc_start_list\|qapi_dealloc_start_struct\|qapi_dealloc_type_anything\|qapi_dealloc_type_bool\|qapi_dealloc_type_int64\|qapi_dealloc_type_null\|qapi_dealloc_type_number\|qapi_dealloc_type_str\|qapi_dealloc_type_uint64\|qobject_input_check_list\|qobject_input_check_struct\|qobject_input_start_alternate\|qobject_input_start_list\|qobject_input_start_struct\|qobject_input_type_any\|qobject_input_type_bool\|qobject_input_type_bool_keyval\|qobject_input_type_int64\|qobject_input_type_int64_keyval\|qobject_input_type_null\|qobject_input_type_number\|qobject_input_type_number_keyval\|qobject_input_type_size_keyval\|qobject_input_type_str\|qobject_input_type_str_keyval\|qobject_input_type_uint64\|qobject_input_type_uint64_keyval\|qobject_output_start_list\|qobject_output_start_struct\|qobject_output_type_any\|qobject_output_type_bool\|qobject_output_type_int64\|qobject_output_type_null\|qobject_output_type_number\|qobject_output_type_str\|qobject_output_type_uint64\|start_list\|visit_check_list\|visit_check_struct\|visit_start_alternate\|visit_start_list\|visit_start_struct\|visit_type_."; expression list args; typedef Error; Error err; @@ - fun(args, &err); - if (err) + if (!fun(args, &err)) { ... } A few line breaks tidied up manually. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200707160613.848843-19-armbru@redhat.com>	2020-07-10 15:18:08 +02:00
David Hildenbrand	956b109fe3	accel/kvm: Convert to ram_block_discard_disable() Discarding memory does not work as expected. At the time this is called, we cannot have anyone active that relies on discards to work properly. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20200626072248.78761-5-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2020-07-02 05:54:59 -04:00
Jay Zhou	494cd11d76	kvm: support to get/set dirty log initial-all-set capability Since the new capability KVM_DIRTY_LOG_INITIALLY_SET of KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 has been introduced in the kernel, tweak the userspace side to detect and enable this capability. Signed-off-by: Jay Zhou <jianjay.zhou@huawei.com> Reviewed-by: Peter Xu <peterx@redhat.com> Message-Id: <20200304025554.2159-1-jianjay.zhou@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-06-26 06:45:29 -04:00
Peter Xu	c82d9d43ed	KVM: Kick resamplefd for split kernel irqchip This is majorly only for X86 because that's the only one that supports split irqchip for now. When the irqchip is split, we face a dilemma that KVM irqfd will be enabled, however the slow irqchip is still running in the userspace. It means that the resamplefd in the kernel irqfds won't take any effect and it will miss to ack INTx interrupts on EOIs. One example is split irqchip with VFIO INTx, which will break if we use the VFIO INTx fast path. This patch can potentially supports the VFIO fast path again for INTx, that the IRQ delivery will still use the fast path, while we don't need to trap MMIOs in QEMU for the device to emulate the EIOs (see the callers of vfio_eoi() hook). However the EOI of the INTx will still need to be done from the userspace by caching all the resamplefds in QEMU and kick properly for IOAPIC EOI broadcast. This is tricky because in this case the userspace ioapic irr & remote-irr will be bypassed. However such a change will greatly boost performance for assigned devices using INTx irqs (TCP_RR boosts 46% after this patch applied). When the userspace is responsible for the resamplefd kickup, don't register it on the kvm_irqfd anymore, because on newer kernels (after commit 654f1f13ea56, 5.2+) the KVM_IRQFD will fail if with both split irqchip and resamplefd. This will make sure that the fast path will work for all supported kernels. https://patchwork.kernel.org/patch/10738541/#22609933 Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20200318145204.74483-5-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-06-10 12:10:33 -04:00
Peter Xu	ff66ba87ba	KVM: Pass EventNotifier into kvm_irqchip_assign_irqfd So that kvm_irqchip_assign_irqfd() can have access to the EventNotifiers, especially the resample event. It is needed in follow up patch to cache and kick resamplefds from QEMU. Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20200318145204.74483-4-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-06-10 12:10:28 -04:00
Markus Armbruster	d2623129a7	qom: Drop parameter @errp of object_property_add() & friends The only way object_property_add() can fail is when a property with the same name already exists. Since our property names are all hardcoded, failure is a programming error, and the appropriate way to handle it is passing &error_abort. Same for its variants, except for object_property_add_child(), which additionally fails when the child already has a parent. Parentage is also under program control, so this is a programming error, too. We have a bit over 500 callers. Almost half of them pass &error_abort, slightly fewer ignore errors, one test case handles errors, and the remaining few callers pass them to their own callers. The previous few commits demonstrated once again that ignoring programming errors is a bad idea. Of the few ones that pass on errors, several violate the Error API. The Error ** argument must be NULL, &error_abort, &error_fatal, or a pointer to a variable containing NULL. Passing an argument of the latter kind twice without clearing it in between is wrong: if the first call sets an error, it no longer points to NULL for the second call. ich9_pm_add_properties(), sparc32_ledma_realize(), sparc32_dma_realize(), xilinx_axidma_realize(), xilinx_enet_realize() are wrong that way. When the one appropriate choice of argument is &error_abort, letting users pick the argument is a bad idea. Drop parameter @errp and assert the preconditions instead. There's one exception to "duplicate property name is a programming error": the way object_property_add() implements the magic (and undocumented) "automatic arrayification". Don't drop @errp there. Instead, rename object_property_add() to object_property_try_add(), and add the obvious wrapper object_property_add(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200505152926.18877-15-armbru@redhat.com> [Two semantic rebase conflicts resolved]	2020-05-15 07:07:58 +02:00
Markus Armbruster	7eecec7d12	qom: Drop object_property_set_description() parameter @errp object_property_set_description() and object_class_property_set_description() fail only when property @name is not found. There are 85 calls of object_property_set_description() and object_class_property_set_description(). None of them can fail: * 84 immediately follow the creation of the property. * The one in spapr_rng_instance_init() refers to a property created in spapr_rng_class_init(), from spapr_rng_properties[]. Every one of them still gets to decide what to pass for @errp. 51 calls pass &error_abort, 32 calls pass NULL, one receives the error and propagates it to &error_abort, and one propagates it to &error_fatal. I'm actually surprised none of them violates the Error API. What are we gaining by letting callers handle the "property not found" error? Use when the property is not known to exist is simpler: you don't have to guard the call with a check. We haven't found such a use in 5+ years. Until we do, let's make life a bit simpler and drop the @errp parameter. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200505152926.18877-8-armbru@redhat.com> [One semantic rebase conflict resolved]	2020-05-15 07:06:49 +02:00
Dongjiu Geng	6b552b9bc8	KVM: Move hwpoison page related functions into kvm-all.c kvm_hwpoison_page_add() and kvm_unpoison_all() will both be used by X86 and ARM platforms, so moving them into "accel/kvm/kvm-all.c" to avoid duplicate code. For architectures that don't use the poison-list functionality the reset handler will harmlessly do nothing, so let's register the kvm_unpoison_all() function in the generic kvm_init() function. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com> Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com> Acked-by: Xiang Zheng <zhengxiang9@huawei.com> Message-id: 20200512030609.19593-8-gengdongjiu@huawei.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2020-05-14 15:03:09 +01:00
Paolo Bonzini	9e264985ff	Merge branch 'exec_rw_const_v4' of https://github.com/philmd/qemu into HEAD	2020-02-25 13:41:48 +01:00
Philippe Mathieu-Daudé	88cd34ee9e	accel/kvm: Check ioctl(KVM_SET_USER_MEMORY_REGION) return value kvm_vm_ioctl() can fail, check its return value, and log an error when it failed. This fixes Coverity CID 1412229: Unchecked return value (CHECKED_RETURN) check_return: Calling kvm_vm_ioctl without checking return value Reported-by: Coverity (CID 1412229) Fixes: `235e8982ad` ("support using KVM_MEM_READONLY flag for regions") Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Message-Id: <20200221163336.2362-1-philmd@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-02-25 09:18:01 +01:00
Peter Maydell	19f7034773	Avoid address_space_rw() with a constant is_write argument The address_space_rw() function allows either reads or writes depending on the is_write argument passed to it; this is useful when the direction of the access is determined programmatically (as for instance when handling the KVM_EXIT_MMIO exit reason). Under the hood it just calls either address_space_write() or address_space_read_full(). We also use it a lot with a constant is_write argument, though, which has two issues: * when reading "address_space_rw(..., 1)" this is less immediately clear to the reader as being a write than "address_space_write(...)" * calling address_space_rw() bypasses the optimization in address_space_read() that fast-paths reads of a fixed length This commit was produced with the included Coccinelle script scripts/coccinelle/exec_rw_const.cocci. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20200218112457.22712-1-peter.maydell@linaro.org> [PMD: Update macvm_set_cr0() reported by Laurent Vivier] Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>	2020-02-20 14:47:08 +01:00
Philippe Mathieu-Daudé	4f7f589381	accel: Replace current_machine->accelerator by current_accel() wrapper We actually want to access the accelerator, not the machine, so use the current_accel() wrapper instead. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200121110349.25842-10-philmd@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-01-24 20:59:11 +01:00
Xiaoyao Li	d1972be13f	accel/kvm: Make "kernel_irqchip" default on Commit `11bc4a13d1` ("kvm: convert "-machine kernel_irqchip" to an accelerator property") moves kernel_irqchip property from "-machine" to "-accel kvm", but it forgets to set the default value of kernel_irqchip_allowed and kernel_irqchip_split. Also cleaning up the three useless members (kernel_irqchip_allowed, kernel_irqchip_required, kernel_irqchip_split) in struct MachineState. Fixes: `11bc4a13d1` ("kvm: convert "-machine kernel_irqchip" to an accelerator property") Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Message-Id: <20191228104326.21732-1-xiaoyao.li@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-01-07 12:08:39 +01:00
Paolo Bonzini	11bc4a13d1	kvm: convert "-machine kernel_irqchip" to an accelerator property Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-12-17 19:32:46 +01:00
Paolo Bonzini	4376c40ded	kvm: introduce kvm_kernel_irqchip_* functions The KVMState struct is opaque, so provide accessors for the fields that will be moved from current_machine to the accelerator. For now they just forward to the machine object, but this will change. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-12-17 19:32:45 +01:00
Paolo Bonzini	23b0898e44	kvm: convert "-machine kvm_shadow_mem" to an accelerator property Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-12-17 19:32:27 +01:00
Dr. David Alan Gilbert	9b3a31c745	kvm: Reallocate dirty_bmap when we change a slot kvm_set_phys_mem can be called to reallocate a slot by something the guest does (e.g. writing to PAM and other chipset registers). This can happen in the middle of a migration, and if we're unlucky it can now happen between the split 'sync' and 'clear'; the clear asserts if there's no bmap to clear. Recreate the bmap whenever we change the slot, keeping the clear path happy. Typically this is triggered by the guest rebooting during a migrate. Corresponds to: https://bugzilla.redhat.com/show_bug.cgi?id=1772774 https://bugzilla.redhat.com/show_bug.cgi?id=1771032 Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com>	2019-12-17 19:32:23 +01:00
David Gibson	3607715a30	kvm: Introduce KVM irqchip change notifier Awareness of an in kernel irqchip is usually local to the machine and its top-level interrupt controller. However, in a few cases other things need to know about it. In particular vfio devices need this in order to accelerate interrupt delivery. If interrupt routing is changed, such devices may need to readjust their connection to the KVM irqchip. pci_bus_fire_intx_routing_notifier() exists to do just this. However, for the pseries machine type we have a situation where the routing remains constant but the top-level irq chip itself is changed. This occurs because of PAPR feature negotiation which allows the guest to decide between the older XICS and newer XIVE irq chip models (both of which are paravirtualized). To allow devices like vfio to adjust to this change, introduce a new notifier for the purpose kvm_irqchip_change_notify(). Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Tested-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Greg Kurz <groug@kaod.org> Acked-by: Alex Williamson <alex.williamson@redhat.com>	2019-11-26 10:11:30 +11:00
Wei Yang	038adc2f58	core: replace getpagesize() with qemu_real_host_page_size There are three page size in qemu: real host page size host page size target page size All of them have dedicate variable to represent. For the last two, we use the same form in the whole qemu project, while for the first one we use two forms: qemu_real_host_page_size and getpagesize(). qemu_real_host_page_size is defined to be a replacement of getpagesize(), so let it serve the role. [Note] Not fully tested for some arch or device. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20191013021145.16011-3-richardw.yang@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-10-26 15:38:06 +02:00
Alex Bennée	87287ac02c	accel/kvm: ensure ret always set Some of the cross compilers rightly complain there are cases where ret may not be set. 0 seems to be the reasonable default unless particular slot explicitly returns -1. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-10-03 12:48:42 +02:00
Igor Mammedov	023ae9a88a	kvm: split too big memory section on several memslots Max memslot size supported by kvm on s390 is 8Tb, move logic of splitting RAM in chunks upto 8T to KVM code. This way it will hide KVM specific restrictions in KVM code and won't affect board level design decisions. Which would allow us to avoid misusing memory_region_allocate_system_memory() API and eventually use a single hostmem backend for guest RAM. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20190924144751.24149-4-imammedo@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2019-09-30 13:51:50 +02:00
Paolo Bonzini	84516e5b8d	kvm: clear dirty bitmaps from all overlapping memslots Currently MemoryRegionSection has 1:1 mapping to KVMSlot. However next patch will allow splitting MemoryRegionSection into several KVMSlot-s, make sure that kvm_physical_log_slot_clear() is able to handle such 1:N mapping. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Message-Id: <20190924144751.24149-3-imammedo@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2019-09-30 13:51:50 +02:00
Paolo Bonzini	4222147dfb	kvm: extract kvm_log_clear_one_slot We may need to clear the dirty bitmap for more than one KVM memslot. First do some code movement with no semantic change. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Message-Id: <20190924144751.24149-2-imammedo@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> [fixup line break]	2019-09-30 13:51:50 +02:00
Markus Armbruster	54d31236b9	sysemu: Split sysemu/runstate.h off sysemu/sysemu.h sysemu/sysemu.h is a rather unfocused dumping ground for stuff related to the system-emulator. Evidence: * It's included widely: in my "build everything" tree, changing sysemu/sysemu.h still triggers a recompile of some 1100 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h, down from 5400 due to the previous two commits). * It pulls in more than a dozen additional headers. Split stuff related to run state management into its own header sysemu/runstate.h. Touching sysemu/sysemu.h now recompiles some 850 objects. qemu/uuid.h also drops from 1100 to 850, and qapi/qapi-types-run-state.h from 4400 to 4200. Touching new sysemu/runstate.h recompiles some 500 objects. Since I'm touching MAINTAINERS to add sysemu/runstate.h anyway, also add qemu/main-loop.h. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190812052359.30071-30-armbru@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> [Unbreak OS-X build]	2019-08-16 13:37:36 +02:00
Markus Armbruster	46517dd497	Include sysemu/sysemu.h a lot less In my "build everything" tree, changing sysemu/sysemu.h triggers a recompile of some 5400 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). hw/qdev-core.h includes sysemu/sysemu.h since recent commit `e965ffa70a` "qdev: add qdev_add_vm_change_state_handler()". This is a bad idea: hw/qdev-core.h is widely included. Move the declaration of qdev_add_vm_change_state_handler() to sysemu/sysemu.h, and drop the problematic include from hw/qdev-core.h. Touching sysemu/sysemu.h now recompiles some 1800 objects. qemu/uuid.h also drops from 5400 to 1800. A few more headers show smaller improvement: qemu/notify.h drops from 5600 to 5200, qemu/timer.h from 5600 to 4500, and qapi/qapi-types-run-state.h from 5500 to 5000. Cc: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20190812052359.30071-28-armbru@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>	2019-08-16 13:31:53 +02:00
Markus Armbruster	db72581598	Include qemu/main-loop.h less In my "build everything" tree, changing qemu/main-loop.h triggers a recompile of some 5600 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). It includes block/aio.h, which in turn includes qemu/event_notifier.h, qemu/notify.h, qemu/processor.h, qemu/qsp.h, qemu/queue.h, qemu/thread-posix.h, qemu/thread.h, qemu/timer.h, and a few more. Include qemu/main-loop.h only where it's needed. Touching it now recompiles only some 1700 objects. For block/aio.h and qemu/event_notifier.h, these numbers drop from 5600 to 2800. For the others, they shrink only slightly. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190812052359.30071-21-armbru@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>	2019-08-16 13:31:52 +02:00
Markus Armbruster	650d103d3e	Include hw/hw.h exactly where needed In my "build everything" tree, changing hw/hw.h triggers a recompile of some 2600 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). The previous commits have left only the declaration of hw_error() in hw/hw.h. This permits dropping most of its inclusions. Touching it now recompiles less than 200 objects. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Message-Id: <20190812052359.30071-19-armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>	2019-08-16 13:31:52 +02:00
Alexey Kardashevskiy	8072aae377	hmp: Print if memory section is registered with an accelerator This adds an accelerator name to the "into mtree -f" to tell the user if a particular memory section is registered with the accelerator; the primary user for this is KVM and such information is useful for debugging purposes. This adds a has_memory() callback to the accelerator class allowing any accelerator to have a label in that memory tree dump. Since memory sections are passed to memory listeners and get registered in accelerators (rather than memory regions), this only prints new labels for flatviews attached to the system address space. An example: Root memory region: system 0000000000000000-0000002fffffffff (prio 0, ram): /objects/mem0 kvm 0000003000000000-0000005fffffffff (prio 0, ram): /objects/mem1 kvm 0000200000000020-000020000000003f (prio 1, i/o): virtio-pci 0000200080000000-000020008000003f (prio 0, i/o): capabilities Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20190614015237.82463-1-aik@ozlabs.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-07-19 19:04:49 +02:00
Peter Xu	ff4aa11419	kvm: Support KVM_CLEAR_DIRTY_LOG Firstly detect the interface using KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 and mark it. When failed to enable the new feature we'll fall back to the old sync. Provide the log_clear() hook for the memory listeners for both address spaces of KVM (normal system memory, and SMM) and deliever the clear message to kernel. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20190603065056.25211-11-peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-07-15 15:39:03 +02:00
Peter Xu	36adac4934	kvm: Introduce slots lock for memory listener Introduce KVMMemoryListener.slots_lock to protect the slots inside the kvm memory listener. Currently it is close to useless because all the KVM code path now is always protected by the BQL. But it'll start to make sense in follow up patches where we might do remote dirty bitmap clear and also we'll update the per-slot cached dirty bitmap even without the BQL. So let's prepare for it. We can also use per-slot lock for above reason but it seems to be an overkill. Let's just use this bigger one (which covers all the slots of a single address space) but anyway this lock is still much smaller than the BQL. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20190603065056.25211-10-peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-07-15 15:39:03 +02:00
Peter Xu	9f4bf4baa8	kvm: Persistent per kvmslot dirty bitmap When synchronizing dirty bitmap from kernel KVM we do it in a per-kvmslot fashion and we allocate the userspace bitmap for each of the ioctl. This patch instead make the bitmap cache be persistent then we don't need to g_malloc0() every time. More importantly, the cached per-kvmslot dirty bitmap will be further used when we want to add support for the KVM_CLEAR_DIRTY_LOG and this cached bitmap will be used to guarantee we won't clear any unknown dirty bits otherwise that can be a severe data loss issue for migration code. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20190603065056.25211-9-peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-07-15 15:39:03 +02:00
Peter Xu	4a12a11a39	kvm: Update comments for sync_dirty_bitmap It's obviously obsolete. Do some update. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20190603065056.25211-8-peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-07-15 15:39:02 +02:00
Like Xu	5cc8767d05	general: Replace global smp variables with smp machine properties Basically, the context could get the MachineState reference via call chains or unrecommended qdev_get_machine() in !CONFIG_USER_ONLY mode. A local variable of the same name would be introduced in the declaration phase out of less effort OR replace it on the spot if it's only used once in the context. No semantic changes. Signed-off-by: Like Xu <like.xu@linux.intel.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Message-Id: <20190518205428.90532-4-like.xu@linux.intel.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2019-07-05 17:07:36 -03:00
Liran Alon	ebbfef2f34	target/i386: kvm: Add support for save and restore nested state Kernel commit 8fcc4b5923af ("kvm: nVMX: Introduce KVM_CAP_NESTED_STATE") introduced new IOCTLs to extract and restore vCPU state related to Intel VMX & AMD SVM. Utilize these IOCTLs to add support for migration of VMs which are running nested hypervisors. Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com> Reviewed-by: Maran Wilson <maran.wilson@oracle.com> Tested-by: Maran Wilson <maran.wilson@oracle.com> Signed-off-by: Liran Alon <liran.alon@oracle.com> Message-Id: <20190619162140.133674-9-liran.alon@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-06-21 13:23:47 +02:00
Liran Alon	b1115c9991	KVM: Introduce kvm_arch_destroy_vcpu() Simiar to how kvm_init_vcpu() calls kvm_arch_init_vcpu() to perform arch-dependent initialisation, introduce kvm_arch_destroy_vcpu() to be called from kvm_destroy_vcpu() to perform arch-dependent destruction. This was added because some architectures (Such as i386) currently do not free memory that it have allocated in kvm_arch_init_vcpu(). Suggested-by: Maran Wilson <maran.wilson@oracle.com> Reviewed-by: Maran Wilson <maran.wilson@oracle.com> Signed-off-by: Liran Alon <liran.alon@oracle.com> Message-Id: <20190619162140.133674-3-liran.alon@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-06-21 02:29:39 +02:00
Yury Kotov	e346bcbf44	kvm-all: Add/update fprintf's for kvm_*_ioeventfd_del Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru> Message-Id: <20190607090830.18807-1-yury-kotov@yandex-team.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-06-21 02:29:38 +02:00
Markus Armbruster	a8d2532645	Include qemu-common.h exactly where needed No header includes qemu-common.h after this commit, as prescribed by qemu-common.h's file comment. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190523143508.25387-5-armbru@redhat.com> [Rebased with conflicts resolved automatically, except for include/hw/arm/xlnx-zynqmp.h hw/arm/nrf51_soc.c hw/arm/msf2-soc.c block/qcow2-refcount.c block/qcow2-cluster.c block/qcow2-cache.c target/arm/cpu.h target/lm32/cpu.h target/m68k/cpu.h target/mips/cpu.h target/moxie/cpu.h target/nios2/cpu.h target/openrisc/cpu.h target/riscv/cpu.h target/tilegx/cpu.h target/tricore/cpu.h target/unicore32/cpu.h target/xtensa/cpu.h; bsd-user/main.c and net/tap-bsd.c fixed up]	2019-06-12 13:20:20 +02:00
Markus Armbruster	90c84c5600	qom/cpu: Simplify how CPUClass:cpu_dump_state() prints CPUClass method dump_statistics() takes an fprintf()-like callback and a FILE * to pass to it. Most callers pass fprintf() and stderr. log_cpu_state() passes fprintf() and qemu_log_file. hmp_info_registers() passes monitor_fprintf() and the current monitor cast to FILE . monitor_fprintf() casts it right back, and is otherwise identical to monitor_printf(). The callback gets passed around a lot, which is tiresome. The type-punning around monitor_fprintf() is ugly. Drop the callback, and call qemu_fprintf() instead. Also gets rid of the type-punning, since qemu_fprintf() takes NULL instead of the current monitor cast to FILE . Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20190417191805.28198-15-armbru@redhat.com>	2019-04-18 22:18:59 +02:00
Markus Armbruster	e68b3baa25	trace-events: Consistently point to docs/devel/tracing.txt Almost all trace-events point to docs/devel/tracing.txt in a comment right at the beginning. Touch up the ones that don't. [Updated with Markus' new commit description wording. --Stefan] Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20190314180929.27722-2-armbru@redhat.com Message-Id: <20190314180929.27722-2-armbru@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2019-03-22 16:17:37 +00:00
Eric Auger	dc0ca80eb1	hw/boards: Add a MachineState parameter to kvm_type callback On ARM, the kvm_type will be resolved by querying the KVMState. Let's add the MachineState handle to the callback so that we can retrieve the KVMState handle. in kvm_init, when the callback is called, the kvm_state variable is not yet set. Signed-off-by: Eric Auger <eric.auger@redhat.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Message-id: 20190304101339.25970-5-eric.auger@redhat.com [ppc parts] Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-03-05 15:55:09 +00:00
Dr. David Alan Gilbert	876d16cd31	kvm: Add kvm_set_ioeventfd* traces Add a couple of traces around the kvm_set_ioeventfd* calls. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20190212134758.10514-4-dgilbert@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2019-02-14 11:46:30 +01:00
Paolo Bonzini	b58deb344d	qemu/queue.h: leave head structs anonymous unless necessary Most list head structs need not be given a name. In most cases the name is given just in case one is going to use QTAILQ_LAST, QTAILQ_PREV or reverse iteration, but this does not apply to lists of other kinds, and even for QTAILQ in practice this is only rarely needed. In addition, we will soon reimplement those macros completely so that they do not need a name for the head struct. So clean up everything, not giving a name except in the rare case where it is necessary. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-01-11 15:46:55 +01:00
Peng Hao	e6d34aeea6	target-i386 : add coalesced_pio API the primary API realization. Signed-off-by: Peng Hao <peng.hao2@zte.com.cn> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <1539795177-21038-3-git-send-email-peng.hao2@zte.com.cn> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-10-19 13:44:11 +02:00
Paolo Bonzini	b2e78fac6f	KVM: cleanup unnecessary #ifdef KVM_CAP_... The capability macros are always defined, since they come from kernel headers that are copied into the QEMU tree. Remove the unnecessary #ifdefs. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-08-23 18:46:25 +02:00
Alex Williamson	f59489423a	kvm: Use inhibit to prevent ballooning without synchronous mmu Remove KVM specific tests in balloon_page(), instead marking ballooning as inhibited without KVM_CAP_SYNC_MMU support. Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2018-08-17 09:27:15 -06:00
Shannon Zhao	6c090d4a75	kvm: Delete the slot if and only if the KVM_MEM_READONLY flag is changed According to KVM commit 75d61fbc, it needs to delete the slot before changing the KVM_MEM_READONLY flag. But QEMU commit `235e8982` only check whether KVM_MEM_READONLY flag is set instead of changing. It doesn't need to delete the slot if the KVM_MEM_READONLY flag is not changed. This fixes a issue that migrating a VM at the OVMF startup stage and VM is executing the codes in rom. Between the deleting and adding the slot in kvm_set_user_memory_region, there is a chance that guest access rom and trap to KVM, then KVM can't find the corresponding memslot. While KVM (on ARM) injects an abort to guest due to the broken hva, then guest will get stuck. Signed-off-by: Shannon Zhao <zhaoshenglong@huawei.com> Message-Id: <1526462314-19720-1-git-send-email-zhaoshenglong@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-06-28 19:05:31 +02:00
Brijesh Singh	54e8953967	kvm: introduce memory encryption APIs Inorder to integerate the Secure Encryption Virtualization (SEV) support add few high-level memory encryption APIs which can be used for encrypting the guest memory region. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: kvm@vger.kernel.org Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-03-13 12:04:03 +01:00
Brijesh Singh	b20e37801f	kvm: add memory encryption context Split from a patch by Brijesh Singh (brijesh.singh@amd.com). Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>	2018-03-13 12:04:03 +01:00
Alexey Kardashevskiy	fe29141bee	kvm: Add kvm_set_user_memory tracepoint This adds a tracepoint to trace the KVM_SET_USER_MEMORY_REGION ioctl parameters which is quite useful for debugging VFIO memory regions being actually registered with KVM. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20171215052326.21386-1-aik@ozlabs.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-02-05 13:54:38 +01:00
Michael S. Tsirkin	1ef8185a06	Revert "virtio: postpone the execution of event_notifier_cleanup function" This reverts commit `4fe6d78b2e` as it is reported to break cleanup and migration. Cc: Gal Hammer <ghammer@redhat.com> Cc: Sitong Liu <siliu@redhat.com> Cc: Xiaoling Gao <xiagao@redhat.com> Suggested-by: Greg Kurz <groug@kaod.org> Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Reported-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com> Reported-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>	2018-01-24 19:20:19 +02:00
Gal Hammer	4fe6d78b2e	virtio: postpone the execution of event_notifier_cleanup function Use the EventNotifier's cleanup callback function to execute the event_notifier_cleanup function after kvm unregistered the eventfd. This change supports running the virtio_bus_set_host_notifier function inside a memory region transaction. Otherwise, a closed fd is sent to kvm, which results in a failure. Signed-off-by: Gal Hammer <ghammer@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2018-01-18 21:52:37 +02:00
David Hildenbrand	a6ffc4232a	kvm: simplify kvm_align_section() Use ROUND_UP and simplify the code a bit. Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20171016144302.24284-7-david@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-10-18 10:15:00 +02:00
David Hildenbrand	90ed4bcc3a	kvm: region_add and region_del is not called on updates Attributes are not updated via region_add()/region_del(). Attribute changes lead to a delete first, followed by a new add. If this would ever not be the case, we would get an error when trying to register the new slot. Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20171016144302.24284-6-david@redhat.com> Tested-by: Joe Clifford <joeclifford@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-10-18 10:14:52 +02:00
David Hildenbrand	1c4fdabaf7	kvm: fix error message when failing to unregister slot "overlapping" is a leftover, let's drop it. Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20171016144302.24284-5-david@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-10-18 10:14:48 +02:00
David Hildenbrand	e377e87ca6	kvm: tolerate non-existing slot for log_start/log_stop/log_sync If we want to trap every access to a section, we might not have a slot. So let's just tolerate if we don't have one. Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20171016144302.24284-4-david@redhat.com> Tested-by: Joe Clifford <joeclifford@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-10-18 10:14:42 +02:00
David Hildenbrand	bbfd3017eb	kvm: fix alignment of ram address Fix the wrong calculation of the delta, used to align the ram address. This only strikes if alignment has to be done. Reported-by: Joe Clifford <joeclifford@gmail.com> Fixes: `5ea69c2e36` ("kvm: factor out alignment of memory section") Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20171016144302.24284-3-david@redhat.com> Tested-by: Joe Clifford <joeclifford@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-10-18 10:14:35 +02:00
Greg Kurz	11748ba72e	kvm: check KVM_CAP_NR_VCPUS with kvm_vm_check_extension() On a modern server-class ppc host with the following CPU topology: Architecture: ppc64le Byte Order: Little Endian CPU(s): 32 On-line CPU(s) list: 0,8,16,24 Off-line CPU(s) list: 1-7,9-15,17-23,25-31 Thread(s) per core: 1 If both KVM PR and KVM HV loaded and we pass: -machine pseries,accel=kvm,kvm-type=PR -smp 8 We expect QEMU to warn that this exceeds the number of online CPUs: Warning: Number of SMP cpus requested (8) exceeds the recommended cpus supported by KVM (4) Warning: Number of hotpluggable cpus requested (8) exceeds the recommended cpus supported by KVM (4) but nothing is printed... This happens because on ppc the KVM_CAP_NR_VCPUS capability is VM specific ndreally depends on the KVM type, but we currently use it as a global capability. And KVM returns a fallback value based on KVM HV being present. Maybe KVM on POWER shouldn't presume anything as long as it doesn't have a VM, but in all cases, we should call KVM_CREATE_VM first and use KVM_CAP_NR_VCPUS as a VM capability. This patch hence changes kvm_recommended_vcpus() accordingly and moves the sanity checking of smp_cpus after the VM creation. It is okay for the other archs that also implement KVM_CAP_NR_VCPUS, ie, mips, s390, x86 and arm, because they don't depend on the VM being created or not. Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-Id: <150600966286.30533.10909862523552370889.stgit@bahia.lan> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-10-02 14:38:06 +02:00
Greg Kurz	62dd4edaaf	kvm: check KVM_CAP_SYNC_MMU with kvm_vm_check_extension() On a server-class ppc host, this capability depends on the KVM type, ie, HV or PR. If both KVM are present in the kernel, we will always get the HV specific value, even if we explicitely requested PR on the command line. This can have an impact if we're using hugepages or a balloon device. Since we've already created the VM at the time any user calls kvm_has_sync_mmu(), switching to kvm_vm_check_extension() is enough to fix any potential issue. It is okay for the other archs that also implement KVM_CAP_SYNC_MMU, ie, mips, s390, x86 and arm, because they don't depend on the VM being created or not. While here, let's cache the state of this extension in a bool variable, since it has several users in the code, as suggested by Thomas Huth. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <150600965332.30533.14702405809647835716.stgit@bahia.lan> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-10-02 14:38:06 +02:00
David Hildenbrand	3110cdbd8a	kvm: drop wrong assertion creating problems with pflash pflash toggles mr->romd_mode. So this assert does not always hold. 1) a device was added with !mr->romd_mode, therefore effectively not creating a kvm slot as we want to trap every access (add = false). 2) mr->romd_mode was toggled on before remove it. There is now actually no slot to remove and the assert is wrong. So let's just drop the assert. Reported-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20170920145025.19403-1-david@redhat.com> Tested-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-09-21 12:40:08 +02:00
Alistair Francis	8297be80f7	Convert multi-line fprintf() to warn_report() Convert all the multi-line uses of fprintf(stderr, "warning:"..."\n"... to use warn_report() instead. This helps standardise on a single method of printing warnings to the user. All of the warnings were changed using these commands: find ./* -type f -exec sed -i \ 'N; {s\|fprintf(.".warning[,:] $.$\\n"$.$);\|warn_report("\1"\2);\|Ig}' \ {} + find ./* -type f -exec sed -i \ 'N;N; {s\|fprintf(.".warning[,:] $.$\\n"$.$);\|warn_report("\1"\2);\|Ig}' \ {} + find ./* -type f -exec sed -i \ 'N;N;N; {s\|fprintf(.".warning[,:] $.$\\n"$.$);\|warn_report("\1"\2);\|Ig}' \ {} + find ./* -type f -exec sed -i \ 'N;N;N;N {s\|fprintf(.".warning[,:] $.$\\n"$.$);\|warn_report("\1"\2);\|Ig}' \ {} + find ./* -type f -exec sed -i \ 'N;N;N;N;N {s\|fprintf(.".warning[,:] $.$\\n"$.$);\|warn_report("\1"\2);\|Ig}' \ {} + find ./* -type f -exec sed -i \ 'N;N;N;N;N;N {s\|fprintf(.".warning[,:] $.$\\n"$.$);\|warn_report("\1"\2);\|Ig}' \ {} + find ./* -type f -exec sed -i \ 'N;N;N;N;N;N;N; {s\|fprintf(.".warning[,:] $.$\\n"$.$);\|warn_report("\1"\2);\|Ig}' \ {} + Indentation fixed up manually afterwards. Some of the lines were manually edited to reduce the line length to below 80 charecters. Some of the lines with newlines in the middle of the string were also manually edit to avoid checkpatch errrors. The #include lines were manually updated to allow the code to compile. Several of the warning messages can be improved after this patch, to keep this patch mechanical this has been moved into a later patch. Signed-off-by: Alistair Francis <alistair.francis@xilinx.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Kevin Wolf <kwolf@redhat.com> Cc: Max Reitz <mreitz@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Peter Maydell <peter.maydell@linaro.org> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Anthony Perard <anthony.perard@citrix.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Aurelien Jarno <aurelien@aurel32.net> Cc: Yongbok Kim <yongbok.kim@imgtec.com> Cc: Cornelia Huck <cohuck@redhat.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Alexander Graf <agraf@suse.de> Cc: Jason Wang <jasowang@redhat.com> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: Gerd Hoffmann <kraxel@redhat.com> Acked-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Message-Id: <5def63849ca8f551630c6f2b45bcb1c482f765a6.1505158760.git.alistair.francis@xilinx.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-09-19 14:09:34 +02:00
David Hildenbrand	67548f0965	kvm: kvm_log_sync() is only called with known memory sections Flatview will make sure that we can only end up in this function with memory sections that correspond to exactly one slot. So we don't have to iterate multiple times. There won't be overlapping slots but only matching slots. Properly align the section and look up the corresponding slot. This heavily simplifies this function. We can now get rid of kvm_lookup_overlapping_slot(). Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20170911174933.20789-7-david@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-09-19 14:09:34 +02:00
David Hildenbrand	343562e8fa	kvm: kvm_log_start/stop are only called with known sections Let's properly align the sections first and bail out if we would ever get called with a memory section we don't know yet. Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20170911174933.20789-6-david@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-09-19 14:09:33 +02:00
David Hildenbrand	f357f564be	kvm: we never have overlapping slots in kvm_set_phys_mem() The way flatview handles memory sections, we will never have overlapping memory sections in kvm. address_space_update_topology_pass() will make sure that we will only get called for a) an existing memory section for which we only update parameters (log_start, log_stop). b) an existing memory section we want to delete (region_del) c) a brand new memory section we want to add (region_add) We cannot have overlapping memory sections in kvm as we will first remove the overlapping sections and then add the ones without conflicts. Therefore we can remove the complexity for handling prefix and suffix slots. Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20170911174933.20789-5-david@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-09-19 14:09:33 +02:00
David Hildenbrand	2747e71672	kvm: use start + size for memory ranges Convert kvm_lookup_matching_slot(). Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20170911174933.20789-4-david@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-09-19 14:09:33 +02:00
David Hildenbrand	5ea69c2e36	kvm: factor out alignment of memory section Factor it out, so we can reuse it later. Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20170911174933.20789-3-david@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-09-19 14:09:33 +02:00
David Hildenbrand	89de4b9138	kvm: require JOIN_MEMORY_REGIONS_WORKS We already require DESTROY_MEMORY_REGION_WORKS, JOIN_MEMORY_REGIONS_WORKS was added just half a year later. In addition, with flatview overlapping memory regions are first removed before adding the changed one. So we can't really detect joining memory regions this way. Let's just get rid of this special handling. Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20170911174933.20789-2-david@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-09-19 14:09:33 +02:00
Cornelia Huck	88c725c78e	kvm: remove hard dependency on pci The msi routing code in kvm calls some pci functions: provide some stubs to enable builds without pci. Also, to make this more obvious, guard them via a pci_available boolean (which also can be reused in other places). Fixes: `e1d4fb2de` ("kvm-irqchip: x86: add msi route notify fn") Fixes: `767a554a0` ("kvm-all: Pass requester ID to MSI routing functions") Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>	2017-08-30 18:23:25 +02:00
Alexander Graf	5d721b785f	ARM: KVM: Enable in-kernel timers with user space gic When running with KVM enabled, you can choose between emulating the gic in kernel or user space. If the kernel supports in-kernel virtualization of the interrupt controller, it will default to that. If not, if will default to user space emulation. Unfortunately when running in user mode gic emulation, we miss out on interrupt events which are only available from kernel space, such as the timer. This patch leverages the new kernel/user space pending line synchronization for timer events. It does not handle PMU events yet. Signed-off-by: Alexander Graf <agraf@suse.de> Reviewed-by: Andrew Jones <drjones@redhat.com> Message-id: 1498577737-130264-1-git-send-email-agraf@suse.de Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-07-11 11:21:26 +01:00
Paolo Bonzini	290dae4678	cpu: move interrupt handling out of translate-common.c translate-common.c will not be available anymore with --disable-tcg, so we cannot leave cpu_interrupt_handler there. Move the TCG-specific handler to accel/tcg/tcg-all.c, and adopt KVM's handler as the default one, since it works just as well for Xen and qtest. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-07-04 16:00:43 +02:00
Sergio Andres Gomez Del Real	99f318322e	vcpu_dirty: share the same field in CPUState for all accelerators This patch simply replaces the separate boolean field in CPUState that kvm, hax (and upcoming hvf) have for keeping track of vcpu dirtiness with a single shared field. Signed-off-by: Sergio Andres Gomez Del Real <Sergio.G.DelReal@gmail.com> Message-Id: <20170618191101.3457-1-Sergio.G.DelReal@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-07-04 14:30:03 +02:00
Yang Zhong	92229a57bb	accel: move kvm related accelerator files into accel/ move kvm related accelerator files into accel/ subdirectory, also create one stub subdirectory, which will include accelerator's stub files. Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <1496383606-18060-5-git-send-email-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-06-15 11:04:06 +02:00

1 2 3 4 5

212 Commits