qemu/accel/kvm/trace-events

Ignoring revisions in .git-blame-ignore-revs. Click here to bypass and see the normal blame view.

40 lines
3.0 KiB
Plaintext
Raw Normal View History

# See docs/devel/tracing.rst for syntax documentation.
# kvm-all.c
kvm: Use 'unsigned long' for request argument in functions wrapping ioctl() Change the data type of the ioctl _request_ argument from 'int' to 'unsigned long' for the various accel/kvm functions which are essentially wrappers around the ioctl() syscall. The correct type for ioctl()'s 'request' argument is confused: * POSIX defines the request argument as 'int' * glibc uses 'unsigned long' in the prototype in sys/ioctl.h * the glibc info documentation uses 'int' * the Linux manpage uses 'unsigned long' * the Linux implementation of the syscall uses 'unsigned int' If we wrap ioctl() with another function which uses 'int' as the type for the request argument, then requests with the 0x8000_0000 bit set will be sign-extended when the 'int' is cast to 'unsigned long' for the call to ioctl(). On x86_64 one such example is the KVM_IRQ_LINE_STATUS request. Bit requests with the _IOC_READ direction bit set, will have the high bit set. Fortunately the Linux Kernel truncates the upper 32bit of the request on 64bit machines (because it uses 'unsigned int', and see also Linus Torvalds' comments in https://sourceware.org/bugzilla/show_bug.cgi?id=14362 ) so this doesn't cause active problems for us. However it is more consistent to follow the glibc ioctl() prototype when we define functions that are essentially wrappers around ioctl(). This resolves a Coverity issue where it points out that in kvm_get_xsave() we assign a value (KVM_GET_XSAVE or KVM_GET_XSAVE2) to an 'int' variable which can't hold it without overflow. Resolves: Coverity CID 1547759 Signed-off-by: Johannes Stoelp <johannes.stoelp@gmail.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20240815122747.3053871-1-peter.maydell@linaro.org [PMM: Rebased patch, adjusted commit message, included note about Coverity fix, updated the type of the local var in kvm_get_xsave, updated the comment in the KVMState struct definition] Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-09-13 17:31:46 +03:00
kvm_ioctl(unsigned long type, void *arg) "type 0x%lx, arg %p"
kvm_vm_ioctl(unsigned long type, void *arg) "type 0x%lx, arg %p"
kvm_vcpu_ioctl(int cpu_index, unsigned long type, void *arg) "cpu_index %d, type 0x%lx, arg %p"
kvm_run_exit(int cpu_index, uint32_t reason) "cpu_index %d, reason %d"
kvm: Use 'unsigned long' for request argument in functions wrapping ioctl() Change the data type of the ioctl _request_ argument from 'int' to 'unsigned long' for the various accel/kvm functions which are essentially wrappers around the ioctl() syscall. The correct type for ioctl()'s 'request' argument is confused: * POSIX defines the request argument as 'int' * glibc uses 'unsigned long' in the prototype in sys/ioctl.h * the glibc info documentation uses 'int' * the Linux manpage uses 'unsigned long' * the Linux implementation of the syscall uses 'unsigned int' If we wrap ioctl() with another function which uses 'int' as the type for the request argument, then requests with the 0x8000_0000 bit set will be sign-extended when the 'int' is cast to 'unsigned long' for the call to ioctl(). On x86_64 one such example is the KVM_IRQ_LINE_STATUS request. Bit requests with the _IOC_READ direction bit set, will have the high bit set. Fortunately the Linux Kernel truncates the upper 32bit of the request on 64bit machines (because it uses 'unsigned int', and see also Linus Torvalds' comments in https://sourceware.org/bugzilla/show_bug.cgi?id=14362 ) so this doesn't cause active problems for us. However it is more consistent to follow the glibc ioctl() prototype when we define functions that are essentially wrappers around ioctl(). This resolves a Coverity issue where it points out that in kvm_get_xsave() we assign a value (KVM_GET_XSAVE or KVM_GET_XSAVE2) to an 'int' variable which can't hold it without overflow. Resolves: Coverity CID 1547759 Signed-off-by: Johannes Stoelp <johannes.stoelp@gmail.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20240815122747.3053871-1-peter.maydell@linaro.org [PMM: Rebased patch, adjusted commit message, included note about Coverity fix, updated the type of the local var in kvm_get_xsave, updated the comment in the KVMState struct definition] Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-09-13 17:31:46 +03:00
kvm_device_ioctl(int fd, unsigned long type, void *arg) "dev fd %d, type 0x%lx, arg %p"
kvm_failed_reg_get(uint64_t id, const char *msg) "Warning: Unable to retrieve ONEREG %" PRIu64 " from KVM: %s"
kvm_failed_reg_set(uint64_t id, const char *msg) "Warning: Unable to set ONEREG %" PRIu64 " to KVM: %s"
kvm_init_vcpu(int cpu_index, unsigned long arch_cpu_id) "index: %d id: %lu"
accel/kvm: Extract common KVM vCPU {creation,parking} code KVM vCPU creation is done once during the vCPU realization when Qemu vCPU thread is spawned. This is common to all the architectures as of now. Hot-unplug of vCPU results in destruction of the vCPU object in QOM but the corresponding KVM vCPU object in the Host KVM is not destroyed as KVM doesn't support vCPU removal. Therefore, its representative KVM vCPU object/context in Qemu is parked. Refactor architecture common logic so that some APIs could be reused by vCPU Hotplug code of some architectures likes ARM, Loongson etc. Update new/old APIs with trace events. New APIs qemu_{create,park,unpark}_vcpu() can be externally called. No functional change is intended here. Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Xianglai Li <lixianglai@loongson.cn> Tested-by: Miguel Luis <miguel.luis@oracle.com> Reviewed-by: Shaoqin Huang <shahuang@redhat.com> Reviewed-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Tested-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20240716111502.202344-2-salil.mehta@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-16 14:14:56 +03:00
kvm_create_vcpu(int cpu_index, unsigned long arch_cpu_id, int kvm_fd) "index: %d, id: %lu, kvm fd: %d"
kvm_destroy_vcpu(int cpu_index, unsigned long arch_cpu_id) "index: %d id: %lu"
kvm_park_vcpu(int cpu_index, unsigned long arch_cpu_id) "index: %d id: %lu"
kvm_unpark_vcpu(unsigned long arch_cpu_id, const char *msg) "id: %lu %s"
kvm_irqchip_commit_routes(void) ""
kvm_irqchip_add_msi_route(char *name, int vector, int virq) "dev %s vector %d virq %d"
kvm_irqchip_update_msi_route(int virq) "Updating MSI route virq=%d"
kvm_irqchip_release_virq(int virq) "virq %d"
kvm_set_ioeventfd_mmio(int fd, uint64_t addr, uint32_t val, bool assign, uint32_t size, bool datamatch) "fd: %d @0x%" PRIx64 " val=0x%x assign: %d size: %d match: %d"
kvm_set_ioeventfd_pio(int fd, uint16_t addr, uint32_t val, bool assign, uint32_t size, bool datamatch) "fd: %d @0x%x val=0x%x assign: %d size: %d match: %d"
kvm_set_user_memory(uint16_t as, uint16_t slot, uint32_t flags, uint64_t guest_phys_addr, uint64_t memory_size, uint64_t userspace_addr, uint32_t fd, uint64_t fd_offset, int ret) "AddrSpace#%d Slot#%d flags=0x%x gpa=0x%"PRIx64 " size=0x%"PRIx64 " ua=0x%"PRIx64 " guest_memfd=%d" " guest_memfd_offset=0x%" PRIx64 " ret=%d"
kvm_clear_dirty_log(uint32_t slot, uint64_t start, uint32_t size) "slot#%"PRId32" start 0x%"PRIx64" size 0x%"PRIx32
KVM: Kick resamplefd for split kernel irqchip This is majorly only for X86 because that's the only one that supports split irqchip for now. When the irqchip is split, we face a dilemma that KVM irqfd will be enabled, however the slow irqchip is still running in the userspace. It means that the resamplefd in the kernel irqfds won't take any effect and it will miss to ack INTx interrupts on EOIs. One example is split irqchip with VFIO INTx, which will break if we use the VFIO INTx fast path. This patch can potentially supports the VFIO fast path again for INTx, that the IRQ delivery will still use the fast path, while we don't need to trap MMIOs in QEMU for the device to emulate the EIOs (see the callers of vfio_eoi() hook). However the EOI of the INTx will still need to be done from the userspace by caching all the resamplefds in QEMU and kick properly for IOAPIC EOI broadcast. This is tricky because in this case the userspace ioapic irr & remote-irr will be bypassed. However such a change will greatly boost performance for assigned devices using INTx irqs (TCP_RR boosts 46% after this patch applied). When the userspace is responsible for the resamplefd kickup, don't register it on the kvm_irqfd anymore, because on newer kernels (after commit 654f1f13ea56, 5.2+) the KVM_IRQFD will fail if with both split irqchip and resamplefd. This will make sure that the fast path will work for all supported kernels. https://patchwork.kernel.org/patch/10738541/#22609933 Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20200318145204.74483-5-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-18 17:52:03 +03:00
kvm_resample_fd_notify(int gsi) "gsi %d"
kvm_dirty_ring_full(int id) "vcpu %d"
kvm_dirty_ring_reap_vcpu(int id) "vcpu %d"
kvm_dirty_ring_page(int vcpu, uint32_t slot, uint64_t offset) "vcpu %d fetch %"PRIu32" offset 0x%"PRIx64
kvm_dirty_ring_reaper(const char *s) "%s"
kvm_dirty_ring_reap(uint64_t count, int64_t t) "reaped %"PRIu64" pages (took %"PRIi64" us)"
kvm_dirty_ring_reaper_kick(const char *reason) "%s"
kvm_dirty_ring_flush(int finished) "%d"
kvm_failed_get_vcpu_mmap_size(void) ""
kvm_cpu_exec(void) ""
kvm_interrupt_exit_request(void) ""
kvm_io_window_exit(void) ""
kvm_run_exit_system_event(int cpu_index, uint32_t event_type) "cpu_index %d, system_even_type %"PRIu32
kvm_convert_memory(uint64_t start, uint64_t size, const char *msg) "start 0x%" PRIx64 " size 0x%" PRIx64 " %s"
kvm_memory_fault(uint64_t start, uint64_t size, uint64_t flags) "start 0x%" PRIx64 " size 0x%" PRIx64 " flags 0x%" PRIx64
KVM: Dynamic sized kvm memslots array Zhiyi reported an infinite loop issue in VFIO use case. The cause of that was a separate discussion, however during that I found a regression of dirty sync slowness when profiling. Each KVMMemoryListerner maintains an array of kvm memslots. Currently it's statically allocated to be the max supported by the kernel. However after Linux commit 4fc096a99e ("KVM: Raise the maximum number of user memslots"), the max supported memslots reported now grows to some number large enough so that it may not be wise to always statically allocate with the max reported. What's worse, QEMU kvm code still walks all the allocated memslots entries to do any form of lookups. It can drastically slow down all memslot operations because each of such loop can run over 32K times on the new kernels. Fix this issue by making the memslots to be allocated dynamically. Here the initial size was set to 16 because it should cover the basic VM usages, so that the hope is the majority VM use case may not even need to grow at all (e.g. if one starts a VM with ./qemu-system-x86_64 by default it'll consume 9 memslots), however not too large to waste memory. There can also be even better way to address this, but so far this is the simplest and should be already better even than before we grow the max supported memslots. For example, in the case of above issue when VFIO was attached on a 32GB system, there are only ~10 memslots used. So it could be good enough as of now. In the above VFIO context, measurement shows that the precopy dirty sync shrinked from ~86ms to ~3ms after this patch applied. It should also apply to any KVM enabled VM even without VFIO. NOTE: we don't have a FIXES tag for this patch because there's no real commit that regressed this in QEMU. Such behavior existed for a long time, but only start to be a problem when the kernel reports very large nr_slots_max value. However that's pretty common now (the kernel change was merged in 2021) so we attached cc:stable because we'll want this change to be backported to stable branches. Cc: qemu-stable <qemu-stable@nongnu.org> Reported-by: Zhiyi Guo <zhguo@redhat.com> Tested-by: Zhiyi Guo <zhguo@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240917163835.194664-2-peterx@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-09-17 19:38:32 +03:00
kvm_slots_grow(unsigned int old, unsigned int new) "%u -> %u"