When synchronizing dirty bitmap from kernel KVM we do it in a
per-kvmslot fashion and we allocate the userspace bitmap for each of
the ioctl. This patch instead make the bitmap cache be persistent
then we don't need to g_malloc0() every time.
More importantly, the cached per-kvmslot dirty bitmap will be further
used when we want to add support for the KVM_CLEAR_DIRTY_LOG and this
cached bitmap will be used to guarantee we won't clear any unknown
dirty bits otherwise that can be a severe data loss issue for
migration code.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20190603065056.25211-9-peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
It's obviously obsolete. Do some update.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20190603065056.25211-8-peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Basically, the context could get the MachineState reference via call
chains or unrecommended qdev_get_machine() in !CONFIG_USER_ONLY mode.
A local variable of the same name would be introduced in the declaration
phase out of less effort OR replace it on the spot if it's only used
once in the context. No semantic changes.
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190518205428.90532-4-like.xu@linux.intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Kernel commit 8fcc4b5923af ("kvm: nVMX: Introduce KVM_CAP_NESTED_STATE")
introduced new IOCTLs to extract and restore vCPU state related to
Intel VMX & AMD SVM.
Utilize these IOCTLs to add support for migration of VMs which are
running nested hypervisors.
Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
Reviewed-by: Maran Wilson <maran.wilson@oracle.com>
Tested-by: Maran Wilson <maran.wilson@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Message-Id: <20190619162140.133674-9-liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Simiar to how kvm_init_vcpu() calls kvm_arch_init_vcpu() to perform
arch-dependent initialisation, introduce kvm_arch_destroy_vcpu()
to be called from kvm_destroy_vcpu() to perform arch-dependent
destruction.
This was added because some architectures (Such as i386)
currently do not free memory that it have allocated in
kvm_arch_init_vcpu().
Suggested-by: Maran Wilson <maran.wilson@oracle.com>
Reviewed-by: Maran Wilson <maran.wilson@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Message-Id: <20190619162140.133674-3-liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
No header includes qemu-common.h after this commit, as prescribed by
qemu-common.h's file comment.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190523143508.25387-5-armbru@redhat.com>
[Rebased with conflicts resolved automatically, except for
include/hw/arm/xlnx-zynqmp.h hw/arm/nrf51_soc.c hw/arm/msf2-soc.c
block/qcow2-refcount.c block/qcow2-cluster.c block/qcow2-cache.c
target/arm/cpu.h target/lm32/cpu.h target/m68k/cpu.h target/mips/cpu.h
target/moxie/cpu.h target/nios2/cpu.h target/openrisc/cpu.h
target/riscv/cpu.h target/tilegx/cpu.h target/tricore/cpu.h
target/unicore32/cpu.h target/xtensa/cpu.h; bsd-user/main.c and
net/tap-bsd.c fixed up]
CPUClass method dump_statistics() takes an fprintf()-like callback and
a FILE * to pass to it. Most callers pass fprintf() and stderr.
log_cpu_state() passes fprintf() and qemu_log_file.
hmp_info_registers() passes monitor_fprintf() and the current monitor
cast to FILE *. monitor_fprintf() casts it right back, and is
otherwise identical to monitor_printf().
The callback gets passed around a lot, which is tiresome. The
type-punning around monitor_fprintf() is ugly.
Drop the callback, and call qemu_fprintf() instead. Also gets rid of
the type-punning, since qemu_fprintf() takes NULL instead of the
current monitor cast to FILE *.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20190417191805.28198-15-armbru@redhat.com>
On ARM, the kvm_type will be resolved by querying the KVMState.
Let's add the MachineState handle to the callback so that we
can retrieve the KVMState handle. in kvm_init, when the callback
is called, the kvm_state variable is not yet set.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-id: 20190304101339.25970-5-eric.auger@redhat.com
[ppc parts]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add a couple of traces around the kvm_set_ioeventfd* calls.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20190212134758.10514-4-dgilbert@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Most list head structs need not be given a name. In most cases the
name is given just in case one is going to use QTAILQ_LAST, QTAILQ_PREV
or reverse iteration, but this does not apply to lists of other kinds,
and even for QTAILQ in practice this is only rarely needed. In addition,
we will soon reimplement those macros completely so that they do not
need a name for the head struct. So clean up everything, not giving a
name except in the rare case where it is necessary.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The capability macros are always defined, since they come from kernel
headers that are copied into the QEMU tree. Remove the unnecessary #ifdefs.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Remove KVM specific tests in balloon_page(), instead marking
ballooning as inhibited without KVM_CAP_SYNC_MMU support.
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
According to KVM commit 75d61fbc, it needs to delete the slot before
changing the KVM_MEM_READONLY flag. But QEMU commit 235e8982 only check
whether KVM_MEM_READONLY flag is set instead of changing. It doesn't
need to delete the slot if the KVM_MEM_READONLY flag is not changed.
This fixes a issue that migrating a VM at the OVMF startup stage and
VM is executing the codes in rom. Between the deleting and adding the
slot in kvm_set_user_memory_region, there is a chance that guest access
rom and trap to KVM, then KVM can't find the corresponding memslot.
While KVM (on ARM) injects an abort to guest due to the broken hva, then
guest will get stuck.
Signed-off-by: Shannon Zhao <zhaoshenglong@huawei.com>
Message-Id: <1526462314-19720-1-git-send-email-zhaoshenglong@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Inorder to integerate the Secure Encryption Virtualization (SEV) support
add few high-level memory encryption APIs which can be used for encrypting
the guest memory region.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Split from a patch by Brijesh Singh (brijesh.singh@amd.com).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
This adds a tracepoint to trace the KVM_SET_USER_MEMORY_REGION ioctl
parameters which is quite useful for debugging VFIO memory regions
being actually registered with KVM.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20171215052326.21386-1-aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This reverts commit 4fe6d78b2e as it is
reported to break cleanup and migration.
Cc: Gal Hammer <ghammer@redhat.com>
Cc: Sitong Liu <siliu@redhat.com>
Cc: Xiaoling Gao <xiagao@redhat.com>
Suggested-by: Greg Kurz <groug@kaod.org>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Reported-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Reported-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Use the EventNotifier's cleanup callback function to execute the
event_notifier_cleanup function after kvm unregistered the eventfd.
This change supports running the virtio_bus_set_host_notifier
function inside a memory region transaction. Otherwise, a closed
fd is sent to kvm, which results in a failure.
Signed-off-by: Gal Hammer <ghammer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Use ROUND_UP and simplify the code a bit.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20171016144302.24284-7-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Attributes are not updated via region_add()/region_del(). Attribute changes
lead to a delete first, followed by a new add.
If this would ever not be the case, we would get an error when trying to
register the new slot.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20171016144302.24284-6-david@redhat.com>
Tested-by: Joe Clifford <joeclifford@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
"overlapping" is a leftover, let's drop it.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20171016144302.24284-5-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If we want to trap every access to a section, we might not have a
slot. So let's just tolerate if we don't have one.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20171016144302.24284-4-david@redhat.com>
Tested-by: Joe Clifford <joeclifford@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Fix the wrong calculation of the delta, used to align the ram address.
This only strikes if alignment has to be done.
Reported-by: Joe Clifford <joeclifford@gmail.com>
Fixes: 5ea69c2e36 ("kvm: factor out alignment of memory section")
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20171016144302.24284-3-david@redhat.com>
Tested-by: Joe Clifford <joeclifford@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
On a modern server-class ppc host with the following CPU topology:
Architecture: ppc64le
Byte Order: Little Endian
CPU(s): 32
On-line CPU(s) list: 0,8,16,24
Off-line CPU(s) list: 1-7,9-15,17-23,25-31
Thread(s) per core: 1
If both KVM PR and KVM HV loaded and we pass:
-machine pseries,accel=kvm,kvm-type=PR -smp 8
We expect QEMU to warn that this exceeds the number of online CPUs:
Warning: Number of SMP cpus requested (8) exceeds the recommended
cpus supported by KVM (4)
Warning: Number of hotpluggable cpus requested (8) exceeds the
recommended cpus supported by KVM (4)
but nothing is printed...
This happens because on ppc the KVM_CAP_NR_VCPUS capability is VM
specific ndreally depends on the KVM type, but we currently use it
as a global capability. And KVM returns a fallback value based on
KVM HV being present. Maybe KVM on POWER shouldn't presume anything
as long as it doesn't have a VM, but in all cases, we should call
KVM_CREATE_VM first and use KVM_CAP_NR_VCPUS as a VM capability.
This patch hence changes kvm_recommended_vcpus() accordingly and
moves the sanity checking of smp_cpus after the VM creation.
It is okay for the other archs that also implement KVM_CAP_NR_VCPUS,
ie, mips, s390, x86 and arm, because they don't depend on the VM
being created or not.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <150600966286.30533.10909862523552370889.stgit@bahia.lan>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
On a server-class ppc host, this capability depends on the KVM type,
ie, HV or PR. If both KVM are present in the kernel, we will always
get the HV specific value, even if we explicitely requested PR on
the command line.
This can have an impact if we're using hugepages or a balloon device.
Since we've already created the VM at the time any user calls
kvm_has_sync_mmu(), switching to kvm_vm_check_extension() is
enough to fix any potential issue.
It is okay for the other archs that also implement KVM_CAP_SYNC_MMU,
ie, mips, s390, x86 and arm, because they don't depend on the VM being
created or not.
While here, let's cache the state of this extension in a bool variable,
since it has several users in the code, as suggested by Thomas Huth.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <150600965332.30533.14702405809647835716.stgit@bahia.lan>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
pflash toggles mr->romd_mode. So this assert does not always hold.
1) a device was added with !mr->romd_mode, therefore effectively not
creating a kvm slot as we want to trap every access (add = false).
2) mr->romd_mode was toggled on before remove it. There is now
actually no slot to remove and the assert is wrong.
So let's just drop the assert.
Reported-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170920145025.19403-1-david@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Convert all the multi-line uses of fprintf(stderr, "warning:"..."\n"...
to use warn_report() instead. This helps standardise on a single
method of printing warnings to the user.
All of the warnings were changed using these commands:
find ./* -type f -exec sed -i \
'N; {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N; {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N; {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N; {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' \
{} +
Indentation fixed up manually afterwards.
Some of the lines were manually edited to reduce the line length to below
80 charecters. Some of the lines with newlines in the middle of the
string were also manually edit to avoid checkpatch errrors.
The #include lines were manually updated to allow the code to compile.
Several of the warning messages can be improved after this patch, to
keep this patch mechanical this has been moved into a later patch.
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Max Reitz <mreitz@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Anthony Perard <anthony.perard@citrix.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Yongbok Kim <yongbok.kim@imgtec.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Alexander Graf <agraf@suse.de>
Cc: Jason Wang <jasowang@redhat.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <5def63849ca8f551630c6f2b45bcb1c482f765a6.1505158760.git.alistair.francis@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Flatview will make sure that we can only end up in this function with
memory sections that correspond to exactly one slot. So we don't
have to iterate multiple times. There won't be overlapping slots but
only matching slots.
Properly align the section and look up the corresponding slot. This
heavily simplifies this function.
We can now get rid of kvm_lookup_overlapping_slot().
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170911174933.20789-7-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Let's properly align the sections first and bail out if we would ever
get called with a memory section we don't know yet.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170911174933.20789-6-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The way flatview handles memory sections, we will never have overlapping
memory sections in kvm.
address_space_update_topology_pass() will make sure that we will only
get called for
a) an existing memory section for which we only update parameters
(log_start, log_stop).
b) an existing memory section we want to delete (region_del)
c) a brand new memory section we want to add (region_add)
We cannot have overlapping memory sections in kvm as we will first remove
the overlapping sections and then add the ones without conflicts.
Therefore we can remove the complexity for handling prefix and suffix
slots.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170911174933.20789-5-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Factor it out, so we can reuse it later.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170911174933.20789-3-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We already require DESTROY_MEMORY_REGION_WORKS, JOIN_MEMORY_REGIONS_WORKS
was added just half a year later.
In addition, with flatview overlapping memory regions are first
removed before adding the changed one. So we can't really detect joining
memory regions this way.
Let's just get rid of this special handling.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170911174933.20789-2-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The msi routing code in kvm calls some pci functions: provide
some stubs to enable builds without pci.
Also, to make this more obvious, guard them via a pci_available boolean
(which also can be reused in other places).
Fixes: e1d4fb2de ("kvm-irqchip: x86: add msi route notify fn")
Fixes: 767a554a0 ("kvm-all: Pass requester ID to MSI routing functions")
Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
When running with KVM enabled, you can choose between emulating the
gic in kernel or user space. If the kernel supports in-kernel virtualization
of the interrupt controller, it will default to that. If not, if will
default to user space emulation.
Unfortunately when running in user mode gic emulation, we miss out on
interrupt events which are only available from kernel space, such as the timer.
This patch leverages the new kernel/user space pending line synchronization for
timer events. It does not handle PMU events yet.
Signed-off-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-id: 1498577737-130264-1-git-send-email-agraf@suse.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
translate-common.c will not be available anymore with --disable-tcg,
so we cannot leave cpu_interrupt_handler there.
Move the TCG-specific handler to accel/tcg/tcg-all.c, and adopt
KVM's handler as the default one, since it works just as well for
Xen and qtest.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch simply replaces the separate boolean field in CPUState that
kvm, hax (and upcoming hvf) have for keeping track of vcpu dirtiness
with a single shared field.
Signed-off-by: Sergio Andres Gomez Del Real <Sergio.G.DelReal@gmail.com>
Message-Id: <20170618191101.3457-1-Sergio.G.DelReal@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
move kvm related accelerator files into accel/ subdirectory, also
create one stub subdirectory, which will include accelerator's stub
files.
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <1496383606-18060-5-git-send-email-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>