mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Peter Maydell	614e41bc26	kvm: Decouple 'MSI routing via irqfds' from 'kernel irqchip' Decouple another x86-specific assumption about what irqchips imply. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-08-09 16:16:56 +03:00
Peter Maydell	cc7e0ddf5a	kvm: Decouple 'irqfds usable' from 'kernel irqchip' Instead of assuming that we can use irqfds if and only if kvm_irqchip_in_kernel(), add a bool to the KVMState which indicates this, and is set only on x86 and only if the irqchip is in the kernel. The kernel documentation implies that the only thing you need to use KVM_IRQFD is that KVM_CAP_IRQFD is advertised, but this seems to be untrue. In particular the kernel does not (alas) return a sensible error if you try to set up an irqfd when you haven't created an irqchip. If it did we could remove all this nonsense and let the kernel return the error code. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Acked-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-08-09 16:16:56 +03:00
Peter Maydell	1d31f66bbc	kvm: Move kvm_allows_irq0_override() to target-i386, fix return type kvm_allows_irq0_override() is a totally x86 specific concept: move it to the target-specific source file where it belongs. This means we need a new header file for the prototype: kvm_i386.h, in line with the existing kvm_ppc.h. While we are moving it, fix the return type to be 'bool' rather than 'int'. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-08-09 16:16:55 +03:00
Peter Maydell	3889c3fab1	kvm: Rename kvm_irqchip_set_irq() to kvm_set_irq() Rename the function kvm_irqchip_set_irq() to kvm_set_irq(), since it can be used for sending (asynchronous) interrupts whether there is a full irqchip model in the kernel or not. (We don't include 'async' in the function name since asynchronous is the normal case.) Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-08-09 16:16:54 +03:00
Peter Maydell	7ae26bd484	kvm: Decouple 'async interrupt delivery' from 'kernel irqchip' On x86 userspace delivers interrupts to the kernel asynchronously (and therefore VCPU idle management is done in the kernel) if and only if there is an in-kernel irqchip. On other architectures this isn't necessarily true (they may always send interrupts asynchronously), so define a new kvm_async_interrupts_enabled() function instead of misusing kvm_irqchip_in_kernel(). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-08-09 16:16:53 +03:00
Dunrong Huang	3ed444e949	kvm: Check if smp_cpus exceeds max cpus supported by kvm Add a helper function for fetching max cpus supported by kvm. Make QEMU exit with an error message if smp_cpus exceeds limit of VCPU count retrieved by invoking this helper function. Signed-off-by: Dunrong Huang <riegamaths@gmail.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-08-09 16:16:36 +03:00
Anthony Liguori	09f06a6c60	Merge remote-tracking branch 'qemu-kvm/uq/master' into staging * qemu-kvm/uq/master: virtio: move common irqfd handling out of virtio-pci virtio: move common ioeventfd handling out of virtio-pci event_notifier: add event_notifier_set_handler memory: pass EventNotifier, not eventfd ivshmem: wrap ivshmem_del_eventfd loops with transaction ivshmem: use EventNotifier and memory API event_notifier: add event_notifier_init_fd event_notifier: remove event_notifier_test event_notifier: add event_notifier_set apic: Defer interrupt updates to VCPU thread apic: Reevaluate pending interrupts on LVT_LINT0 changes apic: Resolve potential endless loop around apic_update_irq kvm: expose tsc deadline timer feature to guest kvm_pv_eoi: add flag support kvm: Don't abort on kvm_irqchip_add_msi_route()	2012-07-18 14:44:43 -05:00
Paolo Bonzini	15b2bd1847	virtio: move common irqfd handling out of virtio-pci All transports can use the same event handler for the irqfd, though the exact mechanics of the assignment will be specific. Note that there are three states: handled by the kernel, handled in userspace, disabled. This also lets virtio use event_notifier_set_handler. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-07-12 14:08:11 +03:00
Paolo Bonzini	753d5e14c4	memory: pass EventNotifier, not eventfd Under Win32, EventNotifiers will not have event_notifier_get_fd, so we cannot call it in common code such as hw/virtio-pci.c. Pass a pointer to the notifier, and only retrieve the file descriptor in kvm-specific code. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-07-12 14:08:10 +03:00
Christian Borntraeger	fdec991857	s390: autodetect map private By default qemu will use MAP_PRIVATE for guest pages. This will write protect pages and thus break on s390 systems that dont support this feature. Therefore qemu has a hack to always use MAP_SHARED for s390. But MAP_SHARED has other problems (no dirty pages tracking, a lot more swap overhead etc.) Newer systems allow the distinction via KVM_CAP_S390_COW. With this feature qemu can use the standard qemu alloc if available, otherwise it will use the old s390 hack. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com> Acked-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2012-07-10 18:27:33 +02:00
Alex Williamson	df410675e5	kvm: Don't abort on kvm_irqchip_add_msi_route() Anyone using these functions has to be prepared that irqchip support may not be present. It shouldn't be up to the core code to determine whether this is a fatal error. Currently code written as: virq = kvm_irqchip_add_msi_route(...) if (virq < 0) { <slow path> } else { <fast path> } works on x86 with and without kvm irqchip enabled, works without kvm support compiled in, but aborts() on !x86 with kvm support. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2012-06-29 20:02:27 -03:00
Paolo Bonzini	ebd063d150	kvm: add missing include files These are included via monitor.h right now, add them explicitly. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>	2012-06-15 10:41:05 -03:00
Jan Kiszka	d3d3bef0a0	kvm: Fix build for non-CAP_IRQ_ROUTING targets A type definition and a KVMState field initialization escaped the required wrapping with KVM_CAP_IRQ_ROUTING. Also, we need to provide a dummy kvm_irqchip_release_virq as virtio-pci references (but does not use) it. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Acked-by: Ben Collins <bcollins@ubuntu.com> Tested-by: Andreas Färber <afaerber@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-06-06 11:41:33 +03:00
Jan Kiszka	39853bbc49	kvm: Introduce kvm_irqchip_add/remove_irqfd Add services to associate an eventfd file descriptor as input with an IRQ line as output. Such a line can be an input pin of an in-kernel irqchip or a virtual line returned by kvm_irqchip_add_route. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-05-21 19:22:50 +03:00
Jan Kiszka	e7b2030862	kvm: Make kvm_irqchip_commit_routes an internal service Automatically commit route changes after kvm_add_routing_entry and kvm_irqchip_release_virq. There is no performance relevant use case for which collecting multiple route changes is beneficial. This makes kvm_irqchip_commit_routes an internal service which assert()s that the corresponding IOCTL will always succeed. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-05-21 19:22:49 +03:00
Jan Kiszka	1e2aa8be09	kvm: Publicize kvm_irqchip_release_virq This allows to drop routes created by kvm_irqchip_add_irq/msi_route again. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-05-21 19:22:49 +03:00
Jan Kiszka	92b4e48982	kvm: Introduce kvm_irqchip_add_msi_route Add a service that establishes a static route from a virtual IRQ line to an MSI message. Will be used for IRQFD and device assignment. As we will use this service outside of CONFIG_KVM protected code, stub it properly. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-05-21 19:22:49 +03:00
Jan Kiszka	1df186df35	kvm: Rename kvm_irqchip_add_route to kvm_irqchip_add_irq_route We will add kvm_irqchip_add_msi_route, so let's make the difference clearer. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-05-21 19:22:49 +03:00
Jan Kiszka	a24b9106fa	kvm: Enable in-kernel irqchip support by default As MSI is now fully supported by KVM (/wrt available features in upstream), we can finally enable the in-kernel irqchip by default. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2012-05-16 18:04:45 -03:00
Jan Kiszka	4a3adebb18	kvm: Add support for direct MSI injections If the kernel supports KVM_SIGNAL_MSI, we can avoid the route-based MSI injection mechanism. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2012-05-16 18:04:45 -03:00
Jan Kiszka	04fa27f5ae	kvm: Introduce basic MSI support for in-kernel irqchips This patch basically adds kvm_irqchip_send_msi, a service for sending arbitrary MSI messages to KVM's in-kernel irqchip models. As the original KVM API requires us to establish a static route from a pseudo GSI to the target MSI message and inject the MSI via toggling that virtual IRQ, we need to play some tricks to make this interface transparent. We create those routes on demand and keep them in a hash table. Succeeding messages can then search for an existing route in the table first and reuse it whenever possible. If we should run out of limited GSIs, we simply flush the table and rebuild it as messages are sent. This approach is rather simple and could be optimized further. However, latest kernels contains a more efficient MSI injection interface that will obsolete the GSI-based dynamic injection. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2012-05-16 18:04:44 -03:00
Jan Kiszka	4e2e4e6355	kvm: Refactor KVMState::max_gsi to gsi_count Instead of the bitmap size, store the maximum of GSIs the kernel support. Move the GSI limit assertion to the API function kvm_irqchip_add_route and make it stricter. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2012-05-16 18:04:44 -03:00
David Gibson	3145fcb605	kvm: Fix dirty tracking with large kernel page size If the kernel page size is larger than TARGET_PAGE_SIZE, which happens for example on ppc64 with kernels compiled for 64K pages, the dirty tracking doesn't work. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-05-10 12:40:08 +03:00
Jason Baron	bc8c6788b6	kvm: set gsi_bits and max_gsi correctly The current kvm_init_irq_routing() doesn't set up the used_gsi_bitmap correctly, and as a consequence pins max_gsi to 32 when it really should be 1024. I ran into this limitation while testing pci passthrough, where I consistently got an -ENOSPC return from kvm_get_irq_route_gsi() called from assigned_dev_update_msix_mmio(). Signed-off-by: Jason Baron <jbaron@redhat.com> Acked-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2012-04-12 19:01:42 -03:00
Jan Kiszka	c73b00973b	kvm: Drop unused kvm_pit_in_kernel This is now implied by kvm_irqchip_in_kernel. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2012-04-12 19:01:41 -03:00
Michael S. Tsirkin	4b8f1c88e9	kvm: allow arbitrarily sized mmio ioeventfd We use a 2 byte ioeventfd for virtio memory, add support for this. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Amos Kong <akong@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-04-12 19:01:41 -03:00
David Gibson	92e4b519e0	kvm: Comparison with ioctl number macros needs to be unsigned In kvm-all.c we store an ioctl cmd number in the irqchip_inject_ioctl field of KVMState, which has type 'int'. This seems to make sense since the ioctl() man page says that the cmd parameter has type int. However, the kernel treats ioctl numbers as unsigned - sys_ioctl() takes an unsigned int, and the macros which generate ioctl numbers expand to unsigned expressions. Furthermore, some ioctls (IOC_READ ioctls on x86 and IOC_WRITE ioctls on powerpc) have bit 31 set, and so would be negative if interpreted as an int. This has the surprising and compile-breaking consequence that in kvm_irqchip_set_irq() where we do: return (s->irqchip_inject_ioctl == KVM_IRQ_LINE) ? 1 : event.status; We will get a "comparison is always false due to limited range of data type" warning from gcc if KVM_IRQ_LINE is one of the bit-31-set ioctls, which it is on powerpc. So, despite the fact that the man page and posix say ioctl numbers are signed, they're actually unsigned. The kernel uses unsigned, the glibc header uses unsigned long, and FreeBSD, NetBSD and OSX also use unsigned long ioctl numbers in the code. Therefore, this patch changes the variable to be unsigned, fixing the compile. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>	2012-03-15 13:12:12 +01:00
Andreas Färber	9349b4f9fd	Rename CPUState -> CPUArchState Scripted conversion: for file in .[hc] hw/.[hc] hw/kvm/.[hc] linux-user/.[hc] linux-user/m68k/.[hc] bsd-user/.[hc] darwin-user/.[hc] tcg//.[hc] target-/cpu.h; do sed -i "s/CPUState/CPUArchState/g" $file done All occurrences of CPUArchState are expected to be replaced by QOM CPUState, once all targets are QOM'ified and common fields have been extracted. Signed-off-by: Andreas Färber <afaerber@suse.de> Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>	2012-03-14 22:20:27 +01:00
Anthony Liguori	dac6b1b22c	Merge remote-tracking branch 'stefanha/trivial-patches' into staging * stefanha/trivial-patches: configure: Quote the configure args printed in config.log osdep: Remove local definition of macro offsetof libcacard: Spelling and grammar fixes in documentation Spelling fixes in comments (it's -> its) vnc: Add break statement libcacard: Use format specifier %u instead of %d for unsigned values Fix sign of sscanf format specifiers block/vmdk: Fix warning from splint (comparision of unsigned value) qmp: Fix spelling fourty -> forty qom: Fix spelling in documentation sh7750: Remove redundant 'struct' from MemoryRegionOps	2012-03-09 12:30:20 -06:00
Stefan Weil	93148aa5c3	Spelling fixes in comments (it's -> its) * it's -> its (fixed for all files) * dont -> don't (only fixed in a line which was touched by the previous fix) * distrub -> disturb (fixed in the same line) Reviewed-by: Andreas Färber <afaerber@suse.de> Signed-off-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>	2012-03-08 15:22:11 +00:00
Michael S. Tsirkin	7e680753cf	kvm: fill in padding to help valgrind valgrind warns about padding fields which are passed to vcpu ioctls uninitialized. This is not an error in practice because kvm ignored padding. Since the ioctls in question are off data path and the cost is zero anyway, initialize padding to 0 to suppress these errors. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-07 12:31:48 +02:00
Jan Kiszka	8a7c73932e	kvm: Add kvm_has_pit_state2 helper To be used for in-kernel PIT emulation. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-07 12:27:42 +02:00
Anthony Liguori	88e6c60671	Merge remote-tracking branch 'qemu-kvm/memory/urgent' into staging * qemu-kvm/memory/urgent: kvm: fix unaligned slots	2012-03-01 15:26:55 -06:00
Anthony Liguori	14655e482b	Merge remote-tracking branch 'qemu-kvm/memory/core' into staging * qemu-kvm/memory/core: (30 commits) memory: allow phys_map tree paths to terminate early memory: unify PhysPageEntry::node and ::leaf memory: change phys_page_set() to set multiple pages memory: switch phys_page_set() to a recursive implementation memory: replace phys_page_find_alloc() with phys_page_set() memory: simplify multipage/subpage registration memory: give phys_page_find() its own tree search loop memory: make phys_page_find() return a MemoryRegionSection memory: move tlb flush to MemoryListener commit callback memory: unify the two branches of cpu_register_physical_memory_log() memory: fix RAM subpages in newly initialized pages memory: compress phys_map node pointers to 16 bits memory: store MemoryRegionSection pointers in phys_map memory: unify phys_map last level with intermediate levels memory: remove first level of l1_phys_map memory: change memory registration to rebuild the memory map on each change memory: support stateless memory listeners memory: split memory listener for the two address spaces xen: ignore I/O memory regions memory: allow MemoryListeners to observe a specific address space ...	2012-03-01 15:26:25 -06:00
Avi Kivity	8f6f962b99	kvm: fix unaligned slots kvm_set_phys_mem() may be passed sections that are not aligned to a page boundary. The current code simply brute-forces the alignment which leads to an inconsistency and an abort(). Fix by aligning the start and the end of the section correctly, discarding and unaligned head or tail. This was triggered by a guest sizing a 64-bit BAR that is smaller than a page with PCI_COMMAND_MEMORY enabled and the upper dword clear. Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-01 14:50:07 +02:00
Avi Kivity	50c1e1491e	memory: support stateless memory listeners Current memory listeners are incremental; that is, they are expected to maintain their own state, and receive callbacks for changes to that state. This patch adds support for stateless listeners; these work by receiving a ->begin() callback (which tells them that new state is coming), a sequence of ->region_add() and ->region_nop() callbacks, and then a ->commit() callback which signifies the end of the new state. They should ignore ->region_del() callbacks. Signed-off-by: Avi Kivity <avi@redhat.com>	2012-02-29 13:44:42 +02:00
Avi Kivity	7376e5827a	memory: allow MemoryListeners to observe a specific address space Ignore any regions not belonging to a specified address space. Signed-off-by: Avi Kivity <avi@redhat.com>	2012-02-29 13:44:42 +02:00
Avi Kivity	80a1ea3748	memory: move ioeventfd ops to MemoryListener This way the accelerator (kvm) can handle them directly. Signed-off-by: Avi Kivity <avi@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net>	2012-02-29 13:44:41 +02:00
Avi Kivity	72e22d2fe1	memory: switch memory listeners to a QTAILQ This allows reverse iteration, which in turns allows consistent ordering among multiple listeners: l1->add l2->add l2->del l1->del Signed-off-by: Avi Kivity <avi@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net>	2012-02-29 13:44:41 +02:00
Jan Kiszka	e479c207d7	kvm: Set cpu_single_env only once As we have thread-local cpu_single_env now and KVM uses exactly one thread per VCPU, we can drop the cpu_single_env updates from the loop and initialize this variable only once during setup. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-02-18 12:15:52 +02:00
Jan Kiszka	3d4b26494f	kvm: Implement kvm_irqchip_in_kernel like kvm_enabled To both avoid that kvm_irqchip_in_kernel always has to be paired with kvm_enabled and that the former ends up in a function call, implement it like the latter. This means keeping the state in a global variable and defining kvm_irqchip_in_kernel as a preprocessor macro. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2012-02-08 15:57:50 -02:00
Alexander Graf	a9c5eb0db1	KVM: Fix compilation on non-x86 Commit `84b058d` broke compilation for KVM on non-x86 targets, which don't have KVM_CAP_IRQ_ROUTING defined. Fix by not using the unavailable constant when it's not around. Signed-off-by: Alexander Graf <agraf@suse.de>	2012-02-02 02:47:46 +01:00
Blue Swirl	fd4aa97903	memory: change dirty setting APIs to take a size Instead of each target knowing or guessing the guest page size, just pass the desired size of dirtied memory area. Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2012-01-25 18:32:47 +00:00
Anthony Liguori	5b4448d27d	Merge remote-tracking branch 'qemu-kvm/uq/master' into staging * qemu-kvm/uq/master: kvm: Activate in-kernel irqchip support kvm: x86: Add user space part for in-kernel IOAPIC kvm: x86: Add user space part for in-kernel i8259 kvm: x86: Add user space part for in-kernel APIC kvm: x86: Establish IRQ0 override control kvm: Introduce core services for in-kernel irqchip support memory: Introduce memory_region_init_reservation ioapic: Factor out base class for KVM reuse ioapic: Drop post-load irr initialization i8259: Factor out base class for KVM reuse i8259: Completely privatize PicState apic: Open-code timer save/restore apic: Factor out base class for KVM reuse apic: Introduce apic_report_irq_delivered apic: Inject external NMI events via LINT1 apic: Stop timer on reset kvm: Move kvmclock into hw/kvm folder msi: Generalize msix_supported to msi_supported hyper-v: initialize Hyper-V CPUID leaves. hyper-v: introduce Hyper-V support infrastructure. Conflicts: Makefile.target Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2012-01-23 11:00:26 -06:00
Benjamin Herrenschmidt	aa90fec7ad	Fix dirty logging with 32-bit qemu & 64-bit guests The kvm_get_dirty_pages_log_range() function uses two address variables to step through the monitored memory region to update the dirty log. However, these variables have type unsigned long, which can overflow if running a 64-bit guest with a 32-bit qemu binary. This patch changes these to target_phys_addr_t which will have the correct size. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>	2012-01-21 05:17:01 +01:00
Jan Kiszka	9b5b76d449	kvm: x86: Establish IRQ0 override control KVM is forced to disable the IRQ0 override when we run with in-kernel irqchip but without IRQ routing support of the kernel. Set the fwcfg value correspondingly. This aligns us with qemu-kvm. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>	2012-01-19 12:14:42 +01:00
Jan Kiszka	84b058d7df	kvm: Introduce core services for in-kernel irqchip support Add the basic infrastructure to active in-kernel irqchip support, inject interrupts into these models, and maintain IRQ routes. Routing is optional and depends on the host arch supporting KVM_CAP_IRQ_ROUTING. When it's not available on x86, we looe the HPET as we can't route GSI0 to IOAPIC pin 2. In-kernel irqchip support will once be controlled by the machine property 'kernel_irqchip', but this is not yet wired up. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>	2012-01-19 12:14:42 +01:00
Avi Kivity	3fbffb628c	kvm: flush the dirty log when unregistering a slot Otherwise, the dirty log information is lost in the kernel forever. Fixes opensuse-12.1 boot screen, which changes the vga windows rapidly. Signed-off-by: Avi Kivity <avi@redhat.com>	2012-01-15 16:13:59 +02:00
Avi Kivity	ffcde12f6c	kvm: avoid cpu_get_physical_page_desc() This reaches into the innards of the memory core, which are being changed. Switch to a memory API version. Signed-off-by: Avi Kivity <avi@redhat.com>	2012-01-03 19:19:28 +02:00
Avi Kivity	a01672d396	kvm: convert to MemoryListener API Drop the use of cpu_register_phys_memory_client() in favour of the new MemoryListener API. The new API simplifies the caller, since there is no need to deal with splitting and merging slots; however this is not exploited in this patch. Signed-off-by: Avi Kivity <avi@redhat.com>	2011-12-20 14:14:07 +02:00
Avi Kivity	9f213ed92c	kvm: switch kvm slots to use host virtual address instead of ram_addr_t This simplifies a later switch to the memory API in slot management. Signed-off-by: Avi Kivity <avi@redhat.com>	2011-12-20 14:14:07 +02:00
Michael Ellerman	7b011fbc05	kvm: Print something before calling abort() if KVM_RUN fails It's a little unfriendly to call abort() without printing any sort of error message. So turn the DPRINTK() into an fprintf(stderr, ...). Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>	2011-12-16 10:45:17 +00:00
Xu He Jie	db9eae1c49	kvm_init didn't set return value after create vm failed And kvm_ioctl(s, KVM_CREATE_VM, 0)'s return value can be < -1, so change the check of vmfd at label 'err'. Signed-off-by: Xu He Jie <xuhj@linux.vnet.ibm.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2011-11-01 16:52:05 -05:00
Avi Kivity	1cae88b9f4	kvm: avoid reentring kvm_flush_coalesced_mmio_buffer() mmio callbacks invoked by kvm_flush_coalesced_mmio_buffer() may themselves indirectly call kvm_flush_coalesced_mmio_buffer(). Prevent reentering the function by checking a flag that indicates we're processing coalesced mmio requests. Signed-off-by: Avi Kivity <avi@redhat.com>	2011-10-24 21:26:53 -02:00
Luiz Capitulino	0461d5a699	RunState: Rename enum values as generated by the QAPI Next commit will convert the query-status command to use the RunState type as generated by the QAPI. In order to "transparently" replace the current enum by the QAPI one, we have to make some changes to some enum values. As the changes are simple renames, I'll do them in one shot. The changes are: - Rename the prefix from RSTATE_ to RUN_STATE_ - RUN_STATE_SAVEVM to RUN_STATE_SAVE_VM - RUN_STATE_IN_MIGRATE to RUN_STATE_INMIGRATE - RUN_STATE_PANICKED to RUN_STATE_INTERNAL_ERROR - RUN_STATE_POST_MIGRATE to RUN_STATE_POSTMIGRATE - RUN_STATE_PRE_LAUNCH to RUN_STATE_PRELAUNCH - RUN_STATE_PRE_MIGRATE to RUN_STATE_PREMIGRATE - RUN_STATE_RESTORE to RUN_STATE_RESTORE_VM - RUN_STATE_PRE_MIGRATE to RUN_STATE_FINISH_MIGRATE Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>	2011-10-04 11:02:57 -03:00
Luiz Capitulino	1dfb4dd993	Replace the VMSTOP macros with a proper state type Today, when notifying a VM state change with vm_state_notify(), we pass a VMSTOP macro as the 'reason' argument. This is not ideal because the VMSTOP macros tell why qemu stopped and not exactly what the current VM state is. One example to demonstrate this problem is that vm_start() calls vm_state_notify() with reason=0, which turns out to be VMSTOP_USER. This commit fixes that by replacing the VMSTOP macros with a proper state type called RunState. Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>	2011-09-15 16:39:32 -03:00
Anthony Liguori	12d4536f7d	main: force enabling of I/O thread Enabling the I/O thread by default seems like an important part of declaring 1.0. Besides allowing true SMP support with KVM, the I/O thread means that the TCG VCPU doesn't have to multiplex itself with the I/O dispatch routines which currently requires a (racey) signal based alarm system. I know there have been concerns about performance. I think so far the ones that have come up (virtio-net) are most likely due to secondary reasons like decreased batching. I think we ought to force enabling I/O thread early in 1.0 development and commit to resolving any lingering issues. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2011-09-02 10:34:55 -05:00
Anthony Liguori	7267c0947d	Use glib memory allocation and free functions qemu_malloc/qemu_free no longer exist after this commit. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2011-08-20 23:01:08 -05:00
Jan Kiszka	2a1ac12b0b	kvm: Drop obsolete KVM_IOEVENTFD #ifdefs Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-08-05 12:04:27 -03:00
Jan Kiszka	ba4047cf84	kvm: Drop KVM_CAP build dependencies No longer needed with accompanied kernel headers. We are only left with build dependencies that are controlled by kvm arch headers. CC: Alexander Graf <agraf@suse.de> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-06-20 15:18:43 -03:00
Jan Kiszka	1480b74f9a	kvm: Drop useless zero-initializations Backing KVMState is alreay zero-initialized. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-06-20 15:17:55 -03:00
Alexander Graf	d4d6868f7b	kvm: ppc: warn user on PAGE_SIZE mismatch On PPC, the default PAGE_SIZE is 64kb. Unfortunately, the hardware alignments don't match here: There are RAM and MMIO regions within a single page when it's 64kb in size. So the only way out for now is to tell the user that he should use 4k PAGE_SIZE. This patch gives the user a hint on that, telling him that failing to register a prefix slot is most likely to be caused by mismatching PAGE_SIZE. This way it's also more future-proof, as bigger PAGE_SIZE can easily be supported by other machines then, as long as they stick to 64kb granularities. Signed-off-by: Alexander Graf <agraf@suse.de>	2011-05-09 22:31:38 +02:00
Michael S. Tsirkin	5300f1a548	Merge remote branch 'origin/master' into pci Conflicts: exec.c	2011-05-05 16:39:47 +03:00
Paolo Bonzini	4a043713b3	kvm: use qemu_free consistently Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-05-02 09:51:59 -03:00
Michael Tokarev	51b0c6065a	fix crash in migration, 32-bit userspace on 64-bit host This change fixes a long-standing immediate crash (memory corruption and abort in glibc malloc code) in migration on 32bits. The bug is present since this commit: commit 692d9aca97b865b0f7903565274a52606910f129 Author: Bruce Rogers <brogers@novell.com> Date: Wed Sep 23 16:13:18 2009 -0600 qemu-kvm: allocate correct size for dirty bitmap The dirty bitmap copied out to userspace is stored in a long array, and gets copied out to userspace accordingly. This patch accounts for that correctly. Currently I'm seeing kvm crashing due to writing beyond the end of the alloc'd dirty bitmap memory, because the buffer has the wrong size. Signed-off-by: Bruce Rogers Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> --- a/qemu-kvm.c +++ b/qemu-kvm.c @@ int kvm_get_dirty_pages_range(kvm_context_t kvm, unsigned long phys_addr, - buf = qemu_malloc((slots[i].len / 4096 + 7) / 8 + 2); + buf = qemu_malloc(BITMAP_SIZE(slots[i].len)); r = kvm_get_map(kvm, KVM_GET_DIRTY_LOG, i, buf); BITMAP_SIZE is now open-coded in that function, like this: size = ALIGN(((mem->memory_size) >> TARGET_PAGE_BITS), HOST_LONG_BITS) / 8; The problem is that HOST_LONG_BITS in 32bit userspace is 32 but it's 64 in 64bit kernel. So userspace aligns this to 32, and kernel to 64, but since no length is passed from userspace to kernel on ioctl, kernel uses its size calculation and copies 4 extra bytes to userspace, corrupting memory. Here's how it looks like during migrate execution: our=20, kern=24 our=4, kern=8 ... our=4, kern=8 our=4064, kern=4064 our=512, kern=512 our=4, kern=8 our=20, kern=24 our=4, kern=8 ... our=4, kern=8 our=4064, kern=4064 * glibc detected * ./x86_64-softmmu/qemu-system-x86_64: realloc(): invalid next size: 0x08f20528 *** (our is userspace size above, kern is the size as calculated by the kernel). Fix this by always aligning to 64 in a hope that no platform will have sizeof(long)>8 any time soon, and add a comment describing it all. It's a small price to pay for bad kernel design. Alternatively it's possible to fix that in the kernel by using different size calculation depending on the current process. But this becomes quite ugly. Special thanks goes to Stefan Hajnoczi for spotting the fundamental cause of the issue, and to Alexander Graf for his support in #qemu. Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> CC: Bruce Rogers <brogers@novell.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-05-02 09:38:35 -03:00
Jan Kiszka	aa7f74d119	kvm: Install specialized interrupt handler KVM only requires to set the raised IRQ in CPUState and to kick the receiving vcpu if it is remote. Installing a specialized handler allows potential future changes to the TCG code path without risking KVM side effects. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-05-02 09:38:35 -03:00
Michael S. Tsirkin	25254bbc4d	kvm: halve number of set memory calls for vga use the new api to reduce the number of these (expensive) system calls. Note: using this API, we should be able to get rid of vga_dirty_log_xxx APIs. Using them doesn't affect the performance though because we detects the log_dirty flag set and ignores the call. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2011-04-06 23:08:03 +03:00
Michael S. Tsirkin	0fd542fb7d	cpu: add set_memory flag to request dirty logging Pass the flag to all cpu notifiers, doing nothing at this point. Will be used by follow-up patches. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2011-04-06 22:28:40 +03:00
Jan Kiszka	f2574737f6	kvm: x86: Push kvm_arch_debug to kvm_arch_handle_exit There are no generic bits remaining in the handling of KVM_EXIT_DEBUG. So push its logic completely into arch hands, i.e. only x86 so far. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-03-16 17:11:06 -03:00
Jan Kiszka	bb4ea39329	kvm: Align kvm_arch_handle_exit to kvm_cpu_exec changes Make the return code of kvm_arch_handle_exit directly usable for kvm_cpu_exec. This is straightforward for x86 and ppc, just s390 would require more work. Avoid this for now by pushing the return code translation logic into s390's kvm_arch_handle_exit. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> CC: Alexander Graf <agraf@suse.de> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-03-16 17:11:06 -03:00
Jan Kiszka	d73cd8f4ea	kvm: Rework inner loop of kvm_cpu_exec Let kvm_cpu_exec return EXCP_* values consistently and generate those codes already inside its inner loop. This means we will now re-enter the kernel while ret == 0. Update kvm_handle_internal_error accordingly, but keep kvm_arch_handle_exit untouched, it will be converted in a separate step. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-03-16 17:11:06 -03:00
Jan Kiszka	dc77d341d2	kvm: Reorder error handling of KVM_RUN Test for general errors first as this is the slower path. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-03-16 17:11:06 -03:00
Jan Kiszka	7cbb533f92	kvm: Keep KVM_RUN return value in separate variable Avoid using 'ret' both for the return value of KVM_RUN as well as the code kvm_cpu_exec is supposed to return. Both have no direct relation. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-03-16 17:11:06 -03:00
Jan Kiszka	51e8fa606e	kvm: Consider EXIT_DEBUG unknown without CAP_SET_GUEST_DEBUG Without KVM_CAP_SET_GUEST_DEBUG, we neither motivate the kernel to report KVM_EXIT_DEBUG nor do we expect such exits. So fall through to the arch code which will simply report an unknown exit reason. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-03-16 17:11:06 -03:00
Jan Kiszka	d841b6c4f1	kvm: Mark VCPU state dirty on creation This avoids that early cpu_synchronize_state calls try to retrieve an uninitialized state from the kernel. That even causes a deadlock if io-thread is enabled. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-03-16 17:11:05 -03:00
Jan Kiszka	990368650f	kvm: Rename kvm_arch_process_irqchip_events to async_events We will broaden the scope of this function on x86 beyond irqchip events. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-03-15 01:19:05 -03:00
Jan Kiszka	fbc1c7e688	kvm: Fix build warning when KVM_CAP_SET_GUEST_DEBUG is lacking Original fix by David Gibson. CC: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-03-15 01:19:04 -03:00
Jan Kiszka	6a7af8cb04	kvm: Make kvm_state globally available KVM-assisted devices need access to it but we have no clean channel to distribute a reference. As a workaround until there is a better solution, export kvm_state for global use, though use should remain restricted to the mentioned scenario. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-02-14 12:39:47 -02:00
Anthony PERARD	e5896b12e2	Introduce log_start/log_stop in CPUPhysMemoryClient In order to use log_start/log_stop with Xen as well in the vga code, this two operations have been put in CPUPhysMemoryClient. The two new functions cpu_physical_log_start,cpu_physical_log_stop are used in hw/vga.c and replace the kvm_log_start/stop. With this, vga does no longer depends on kvm header. [ Jan: rebasing and style fixlets ] Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-02-14 12:39:47 -02:00
Jan Kiszka	8e045ac407	kvm: Remove unneeded memory slot reservation The number of slots and the location of private ones changed several times in KVM's early days. However, it's stable since 2.6.29 (our required baseline), and slots 8..11 are no longer reserved since then. So remove this unneeded restriction. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> CC: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-02-14 12:39:46 -02:00
Jan Kiszka	6792a57bf1	kvm: Separate TCG from KVM cpu execution Mixing up TCG bits with KVM already led to problems around eflags emulation on x86. Moreover, quite some code that TCG requires on cpu enty/exit is useless for KVM. So dispatch between tcg_cpu_exec and kvm_cpu_exec as early as possible. The core logic of cpu_halted from cpu_exec is added to kvm_arch_process_irqchip_events. Moving away from cpu_exec makes exception_index meaningless for KVM, we can simply pass the exit reason directly (only "EXCP_DEBUG vs. rest" is relevant). Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-02-14 12:39:46 -02:00
Jan Kiszka	e07bbac542	Improve vm_stop reason declarations Define and use dedicated constants for vm_stop reasons, they actually have nothing to do with the EXCP_* defines used so far. At this chance, specify more detailed reasons so that VM state change handlers can evaluate them. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-02-14 12:39:46 -02:00
Jan Kiszka	cdea50ede1	kvm: Leave kvm_cpu_exec directly after KVM_EXIT_SHUTDOWN The reset we issue on KVM_EXIT_SHUTDOWN implies that we should also leave the VCPU loop. As we now check for exit_request which is set by qemu_system_reset_request, this bug is no longer critical. Still it's an unneeded extra turn. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-02-14 12:39:45 -02:00
Jan Kiszka	b30e93e9ec	kvm: Remove static return code of kvm_handle_io Improve the readability of the exit dispatcher by moving the static return value of kvm_handle_io to its caller. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-02-14 12:39:45 -02:00
Jan Kiszka	9ccfac9ea4	kvm: Unconditionally reenter kernel after IO exits KVM requires to reenter the kernel after IO exits in order to complete instruction emulation. Failing to do so will leave the kernel state inconsistently behind. To ensure that we will get back ASAP, we issue a self-signal that will cause KVM_RUN to return once the pending operations are completed. We can move kvm_arch_process_irqchip_events out of the inner VCPU loop. The only state that mattered at its old place was a pending INIT request. Catch it in kvm_arch_pre_run and also trigger a self-signal to process the request on next kvm_cpu_exec. This patch also fixes the missing exit_request check in kvm_cpu_exec in the CONFIG_IOTHREAD case. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> CC: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-02-14 12:39:45 -02:00
Jan Kiszka	a1b87fe046	kvm: Provide sigbus services arch-independently Provide arch-independent kvm_on_sigbus* stubs to remove the #ifdef'ery from cpus.c. This patch also fixes --disable-kvm build by providing the missing kvm_on_sigbus_vcpu kvm-stub. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Alexander Graf <agraf@suse.de> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-02-14 12:39:45 -02:00
Jan Kiszka	748a680b96	kvm: Report proper error on GET_VCPU_MMAP_SIZE failures Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-02-14 12:39:45 -02:00
Anthony Liguori	b3a98367ee	Merge remote branch 'qemu-kvm/uq/master' into staging aliguori: fix build with !defined(KVM_CAP_ASYNC_PF) Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2011-02-01 15:32:18 -06:00
Stefan Hajnoczi	d0dcac833a	virtio-pci: Disable virtio-ioeventfd when !CONFIG_IOTHREAD It is not possible to use virtio-ioeventfd when building without an I/O thread. We rely on a signal to kick us out of vcpu execution. Timers and AIO use SIGALRM and SIGUSR2 respectively. Unfortunately eventfd does not support O_ASYNC (SIGIO) so eventfd cannot be used in a signal driven manner. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-01-31 10:03:00 +01:00
Jan Kiszka	b0c883b5e0	kvm: Flush coalesced mmio buffer on IO window exits We must flush pending mmio writes if we leave kvm_cpu_exec for an IO window. Otherwise we risk to loose those requests when migrating to a different host during that window. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-01-23 02:27:22 -02:00
Jan Kiszka	94a8d39afd	kvm: Consolidate must-have capability checks Instead of splattering the code with #ifdefs and runtime checks for capabilities we cannot work without anyway, provide central test infrastructure for verifying their availability both at build and runtime. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-01-23 02:27:22 -02:00
Jan Kiszka	cad1e2827b	kvm: Drop smp_cpus argument from init functions No longer used. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-01-23 02:27:21 -02:00
Jan Kiszka	f5c848eed7	x86: Optionally dump code bytes on cpu_dump_state Introduce the cpu_dump_state flag CPU_DUMP_CODE and implement it for x86. This writes out the code bytes around the current instruction pointer. Make use of this feature in KVM to help debugging fatal vm exits. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-01-23 02:27:20 -02:00
Jan Kiszka	bb44e0d12d	kvm: Improve reporting of fatal errors Report KVM_EXIT_UNKNOWN, KVM_EXIT_FAIL_ENTRY, and KVM_EXIT_EXCEPTION with more details to stderr. The latter two are so far x86-only, so move them into the arch-specific handler. Integrate the Intel real mode warning on KVM_EXIT_FAIL_ENTRY that qemu-kvm carries, but actually restrict it to Intel CPUs. Moreover, always dump the CPU state in case we fail. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-01-23 02:27:20 -02:00
Jan Kiszka	73aaec4a39	kvm: Stop on all fatal exit reasons Ensure that we stop the guest whenever we face a fatal or unknown exit reason. If we stop, we also have to enforce a cpu loop exit. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-01-23 02:27:20 -02:00
Jan Kiszka	a426e12217	kvm: Fix coding style violations No functional changes. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-01-21 14:12:49 -02:00
Lai Jiangshan	14a095184c	kvm: convert kvm_ioctl(KVM_CHECK_EXTENSION) to kvm_check_extension() simple cleanup and use existing helper: kvm_check_extension(). Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-01-21 14:05:21 -02:00
Stefan Hajnoczi	d2f2b8a740	kvm: test for ioeventfd support on old kernels There used to be a limit of 6 KVM io bus devices in the kernel. On such a kernel, we can't use many ioeventfds for host notification since the limit is reached too easily. Add an API to test for this condition. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2011-01-10 14:44:16 +02:00
Michael S. Tsirkin	b2e0a138e7	migration: stable ram block ordering This makes ram block ordering under migration stable, ordered by offset. This is especially useful for migration to exec, for debugging. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Jason Wang <jasowang@redhat.com>	2010-12-02 21:13:39 +02:00
Huang Ying	983dfc3b13	Add RAM -> physical addr mapping in MCE simulation In QEMU-KVM, physical address != RAM address. While MCE simulation needs physical address instead of RAM address. So kvm_physical_memory_addr_from_ram() is implemented to do the conversion, and it is invoked before being filled in the IA32_MCi_ADDR MSR. Reported-by: Dean Nelson <dnelson@redhat.com> Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-20 16:15:04 -05:00
Andreas Färber	e78815a554	Introduce qemu_madvise() vl.c has a Sun-specific hack to supply a prototype for madvise(), but the call site has apparently moved to arch_init.c. Haiku doesn't implement madvise() in favor of posix_madvise(). OpenBSD and Solaris 10 don't implement posix_madvise() but madvise(). MinGW implements neither. Check for madvise() and posix_madvise() in configure and supply qemu_madvise() as wrapper. Prefer madvise() over posix_madvise() due to flag availability. Convert all callers to use qemu_madvise() and QEMU_MADV_. Note that on Solaris the warning is fixed by moving the madvise() prototype, not by qemu_madvise() itself. It helps with porting though, and it simplifies most call sites. v7 -> v8: Some versions of MinGW have no sys/mman.h header. Reported by Blue Swirl. v6 -> v7: * Adopt madvise() rather than posix_madvise() semantics for returning errors. * Use EINVAL in place of ENOTSUP. v5 -> v6: * Replace two leftover instances of POSIX_MADV_NORMAL with QEMU_MADV_INVALID. Spotted by Blue Swirl. v4 -> v5: * Introduce QEMU_MADV_INVALID, suggested by Alexander Graf. Note that this relies on -1 not being a valid advice value. v3 -> v4: * Eliminate #ifdefs at qemu_advise() call sites. Requested by Blue Swirl. This will currently break the check in kvm-all.c by calling madvise() with a supported flag, which will not fail. Ideas/patches welcome. v2 -> v3: * Reuse the _MADV_ defines for QEMU_MADV_. Suggested by Alexander Graf. Add configure check for madvise(), too. Add defines to Makefile, not QEMU_CFLAGS. Convert all callers, untested. Suggested by Blue Swirl. * Keep Solaris' madvise() prototype around. Pointed out by Alexander Graf. * Display configure check results. v1 -> v2: * Don't rely on posix_madvise() availability, add qemu_madvise(). Suggested by Blue Swirl. Signed-off-by: Andreas Färber <afaerber@opensolaris.org> Cc: Blue Swirl <blauwirbel@gmail.com> Cc: Alexander Graf <agraf@suse.de> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2010-09-25 11:26:05 +00:00
Anthony Liguori	e945fb7627	Merge remote branch 'qemu-kvm/uq/master' into staging	2010-08-23 15:10:12 -05:00
Cam Macdonell	44f1a3d876	Add function to assign ioeventfd to MMIO. Signed-off-by: Cam Macdonell <cam@cs.ualberta.ca> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2010-08-10 16:25:15 -05:00
Gleb Natapov	14542fea73	kvm: remove guest triggerable abort() This abort() condition is easily triggerable by a guest if it configures pci bar with unaligned address that overlaps main memory. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-08-05 09:57:36 -03:00
Alex Williamson	70fedd76d9	kvm: Don't walk memory_size == 0 slots in kvm_client_migration_log If we've unregistered a memory area, we should avoid calling qemu_get_ram_ptr() on the left over phys_offset cruft in the slot array. Now that we support removing ramblocks, the phys_offset ram_addr_t can go away and cause a lookup fault and abort. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-08-05 09:57:35 -03:00
Jan Kiszka	3c638d0690	Rework debug exception processing for gdb use Guest debugging is currently broken under CONFIG_IOTHREAD. The reason is inconsistent or even lacking signaling the debug events from the source VCPU to the main loop and the gdbstub. This patch addresses the issue by pushing this signaling into a CPUDebugExcpHandler: cpu_debug_handler is registered as first handler, thus will be executed last after potential breakpoint emulation handlers. It sets informs the gdbstub about the debug event source, requests a debug exit of the main loop and stops the current VCPU. This mechanism works both for TCG and KVM, with and without IO-thread. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>	2010-07-22 05:52:09 +02:00
Sheng Yang	f1665b21f1	kvm: Enable XSAVE live migration support Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-06-28 13:06:03 -03:00
Jan Kiszka	be41cbe036	kvm: Switch kvm_update_guest_debug to run_on_cpu Guest debugging under KVM is currently broken once io-threads are enabled. Easily fixable by switching the fake on_vcpu to the real run_on_cpu implementation. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-06-28 13:06:03 -03:00
Gleb Natapov	4513d9232b	Do not stop VM if emulation failed in userspace. Continue vcpu execution in case emulation failure happened while vcpu was in userspace. In this case #UD will be injected into the guest allowing guest OS to kill offending process and continue. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-05-11 14:03:44 -03:00
Marcelo Tosatti	0af691d779	kvm: enable smp > 1 Process INIT/SIPI requests and enable -smp > 1. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-11 14:02:22 -03:00
Jan Kiszka	2705d56a6a	kvm: synchronize state from cpu context It is not safe to retrieve the KVM internal state of a given cpu while its potentially modifying it. Queue the request to run on cpu context, similarly to qemu-kvm. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-11 14:02:22 -03:00
Marcelo Tosatti	273faf1b6d	kvm: set cpu_single_env around KVM_RUN ioctl Zero cpu_single_env before leaving global lock protection, and restore on return. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-11 14:02:21 -03:00
Marcelo Tosatti	8369e01ce4	kvm: port qemu-kvm's bitmap scanning Which is significantly faster. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-04-28 13:41:15 -03:00
Marcelo Tosatti	7c80eef899	kvm: handle internal error Port qemu-kvm's KVM_EXIT_INTERNAL_ERROR handling to upstream. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-04-26 11:28:35 -03:00
Jan Kiszka	ff44f1a373	KVM: x86: Add debug register saving and restoring Make use of the new KVM_GET/SET_DEBUGREGS to save/restore the x86 debug registers. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-04-26 11:28:35 -03:00
Paolo Bonzini	98c8573eb3	provide a stub version of kvm-all.c if !CONFIG_KVM This allows limited use of kvm functions (which will return ENOSYS) even in once-compiled modules. The patch also improves a bit the error messages for KVM initialization. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> [blauwirbel@gmail.com: fixed Win32 build] Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2010-04-19 18:59:30 +00:00
Blue Swirl	8c0d577eb5	kvm: avoid collision with dprintf macro in stdio.h, spotted by clang Fixes clang errors: CC i386-softmmu/kvm.o /src/qemu/target-i386/kvm.c:40:9: error: 'dprintf' macro redefined In file included from /src/qemu/target-i386/kvm.c:21: In file included from /src/qemu/qemu-common.h:27: In file included from /usr/include/stdio.h:910: /usr/include/bits/stdio2.h:189:12: note: previous definition is here CC i386-softmmu/kvm-all.o /src/qemu/kvm-all.c:39:9: error: 'dprintf' macro redefined In file included from /src/qemu/kvm-all.c:23: In file included from /src/qemu/qemu-common.h:27: In file included from /usr/include/stdio.h:910: /usr/include/bits/stdio2.h:189:12: note: previous definition is here Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2010-04-18 14:22:14 +00:00
Alexander Graf	0104dcacf3	S390: Tell user why VM creation failed The KVM kernel module on S390 refuses to create a VM when the switch_amode kernel parameter is not used. Since that is not exactly obvious, let's give the user a nice warning. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>	2010-04-01 21:51:59 +02:00
Michael S. Tsirkin	ca82180603	kvm: add API to set ioeventfd Comment on kvm usage: rather than require users to do if (kvm_enabled()) and/or ifdefs, this patch adds an API that, internally, is defined to stub function on non-kvm build, and checks kvm_enabled for non-kvm run. While rest of qemu code still uses if (kvm_enabled()), I think this approach is cleaner, and we should convert rest of code to it long term. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2010-04-01 13:56:43 -05:00
Blue Swirl	d745bef890	Move KVM and Xen global flags to vl.c Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2010-03-29 19:23:49 +00:00
Jan Kiszka	ea375f9ab8	KVM: Rework VCPU state writeback API This grand cleanup drops all reset and vmsave/load related synchronization points in favor of four(!) generic hooks: - cpu_synchronize_all_states in qemu_savevm_state_complete (initial sync from kernel before vmsave) - cpu_synchronize_all_post_init in qemu_loadvm_state (writeback after vmload) - cpu_synchronize_all_post_init in main after machine init - cpu_synchronize_all_post_reset in qemu_system_reset (writeback after system reset) These writeback points + the existing one of VCPU exec after cpu_synchronize_state map on three levels of writeback: - KVM_PUT_RUNTIME_STATE (during runtime, other VCPUs continue to run) - KVM_PUT_RESET_STATE (on synchronous system reset, all VCPUs stopped) - KVM_PUT_FULL_STATE (on init or vmload, all VCPUs stopped as well) This level is passed to the arch-specific VCPU state writing function that will decide which concrete substates need to be written. That way, no writer of load, save or reset functions that interact with in-kernel KVM states will ever have to worry about synchronization again. That also means that a lot of reasons for races, segfaults and deadlocks are eliminated. cpu_synchronize_state remains untouched, just as Anthony suggested. We continue to need it before reading or writing of VCPU states that are also tracked by in-kernel KVM subsystems. Consequently, this patch removes many cpu_synchronize_state calls that are now redundant, just like remaining explicit register syncs. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-03-04 00:29:28 -03:00
Jan Kiszka	b0b1d69079	KVM: Rework of guest debug state writing So far we synchronized any dirty VCPU state back into the kernel before updating the guest debug state. This was a tribute to a deficite in x86 kernels before 2.6.33. But as this is an arch-dependent issue, it is better handle in the x86 part of KVM and remove the writeback point for generic code. This also avoids overwriting the flushed state later on if user space decides to change some more registers before resuming the guest. We furthermore need to reinject guest exceptions via the appropriate mechanism. That is KVM_SET_GUEST_DEBUG for older kernels and KVM_SET_VCPU_EVENTS for recent ones. Using both mechanisms at the same time will cause state corruptions. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-03-04 00:29:26 -03:00
Marcelo Tosatti	85199474d0	kvm-all.c: define smp_wmb and use it for coalesced mmio Acked-by: "Michael S. Tsirkin" <mst@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-02-22 19:04:13 +02:00
Marcelo Tosatti	6312b92853	kvm: remove pre-entry exit_request check with iothread enabled With SIG_IPI blocked vcpu loop exit notification happens via -EAGAIN from KVM_RUN. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-02-22 10:58:33 +02:00
Marcelo Tosatti	cc84de9570	kvm: consume internal signal with sigtimedwait Change the way the internal qemu signal, used for communication between iothread and vcpus, is handled. Block and consume it with sigtimedwait on the outer vcpu loop, which allows more precise timing control. Change from standard signal (SIGUSR1) to real-time one, so multiple signals are not collapsed. Set the signal number on KVM's in-kernel allowed sigmask. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-02-22 10:58:33 +02:00
Amit Shah	a2eebe88fd	kvm: reduce code duplication in config_iothread We have some duplicated code in the CONFIG_IOTHREAD #ifdef and #else cases. Fix that. Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2010-02-10 12:45:04 -06:00
Michael S. Tsirkin	7b8f3b7834	kvm: move kvm to use memory notifiers remove direct kvm calls from exec.c, make kvm use memory notifiers framework instead. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Avi Kivity <avi@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2010-02-09 16:56:13 -06:00
Michael S. Tsirkin	46dbef6ade	kvm: move kvm_set_phys_mem around move kvm_set_phys_mem so that it will be later available earlier in the file. needed for next patch using memory notifiers. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Avi Kivity <avi@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2010-02-09 16:56:13 -06:00
Jan Kiszka	9ded274466	KVM: Move and rename regs_modified Touching the user space representation of KVM's VCPU state is - naturally - a per-VCPU thing. So move the dirty flag into KVM_CPU_COMMON and rename it at this chance to reflect its true meaning. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>	2010-02-03 19:47:34 -02:00
Sheng Yang	62a2744ca0	kvm: Flush coalesced MMIO buffer periodly The default action of coalesced MMIO is, cache the writing in buffer, until: 1. The buffer is full. 2. Or the exit to QEmu due to other reasons. But this would result in a very late writing in some condition. 1. The each time write to MMIO content is small. 2. The writing interval is big. 3. No need for input or accessing other devices frequently. This issue was observed in a experimental embbed system. The test image simply print "test" every 1 seconds. The output in QEmu meets expectation, but the output in KVM is delayed for seconds. Per Avi's suggestion, I hooked flushing coalesced MMIO buffer in VGA update handler. By this way, We don't need vcpu explicit exit to QEmu to handle this issue. Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-02-03 19:47:33 -02:00
Jan Kiszka	a0fb002c64	kvm: x86: Add support for VCPU event states This patch extends the qemu-kvm state sync logic with support for KVM_GET/SET_VCPU_EVENTS, giving access to yet missing exception, interrupt and NMI states. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2009-12-03 15:25:57 -06:00
Kevin Wolf	40ff6d7e8d	Don't leak file descriptors We're leaking file descriptors to child processes. Set FD_CLOEXEC on file descriptors that don't need to be passed to children to stop this misbehaviour. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2009-12-03 11:45:50 -06:00
Jan Kiszka	caa5af0ff3	kvm: Add arch reset handler Will be required by succeeding changes. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2009-11-17 08:49:37 -06:00
Hollis Blanchard	9bdbe550f0	kvm: Move KVM mp_state accessors to i386-specific code Unbreaks PowerPC and S390 KVM builds. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2009-11-12 11:23:55 -06:00
Glauber Costa	d549db5a73	unlock iothread mutex before running kvm ioctl Without this, kvm will hold the mutex while it issues its run ioctl, and never be able to step out of it, causing a deadlock. Patchworks-ID: 35359 Signed-off-by: Glauber Costa <glommer@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2009-10-12 09:42:31 -05:00
Glauber Costa	828566bc33	temporary fix for on_vcpu Recent changes made on_vcpu hit the abort() path, even with the IO thread disabled. This is because cpu_single_env is no longer set when we call this function. Although the correct fix is a little bit more complicated that that, the recent thread in which I proposed qemu_queue_work (which fixes that, btw), is likely to go on a quite different direction. So for the benefit of those using guest debugging, I'm proposing this simple fix in the interim. Signed-off-by: Glauber Costa <glommer@redhat.com> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2009-10-05 09:32:45 -05:00
Jan Kiszka	b3807725f6	kvm: Fix guest single-stepping Hopefully the last regression of `4c0960c0`: KVM_SET_GUEST_DEBUG requires properly synchronized guest registers (on x86: eflags) on entry. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2009-10-05 09:32:45 -05:00
Anthony Liguori	c227f0995e	Revert "Get rid of _t suffix" In the very least, a change like this requires discussion on the list. The naming convention is goofy and it causes a massive merge problem. Something like this _must_ be presented on the list first so people can provide input and cope with it. This reverts commit `99a0949b72`. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2009-10-01 16:12:16 -05:00
malc	99a0949b72	Get rid of _t suffix Some not so obvious bits, slirp and Xen were left alone for the time being. Signed-off-by: malc <av1474@comtv.ru>	2009-10-01 22:45:02 +04:00
Blue Swirl	afcea8cbde	ioports: remove unused env parameter and compile only once The CPU state parameter is not used, remove it and adjust callers. Now we can compile ioport.c once for all targets. Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2009-09-20 16:05:47 +00:00
Blue Swirl	72cf2d4f0e	Fix sys-queue.h conflict for good Problem: Our file sys-queue.h is a copy of the BSD file, but there are some additions and it's not entirely compatible. Because of that, there have been conflicts with system headers on BSD systems. Some hacks have been introduced in the commits `15cc923584`, `f40d753718`, `96555a96d7` and `3990d09adf` but the fixes were fragile. Solution: Avoid the conflict entirely by renaming the functions and the file. Revert the previous hacks. Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2009-09-12 07:36:22 +00:00
Avi Kivity	4c0960c0c4	kvm: Simplify cpu_synchronize_state() cpu_synchronize_state() is a little unreadable since the 'modified' argument isn't self-explanatory. Simplify it by making it always synchronize the kernel state into qemu, and automatically flush the registers back to the kernel if they've been synchronized on this exit. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2009-08-27 20:35:30 -05:00
Anthony Liguori	6e489f3f88	Revert "Fake dirty loggin when it's not there" This reverts commit `bd83677612`. PPC should just implement dirty logging so we can avoid all the fall-out from this changeset. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2009-07-27 15:26:43 -05:00
Luiz Capitulino	fc5d642fca	Fix broken build The only caller of on_vcpu() is protected by ifdef KVM_CAP_SET_GUEST_DEBUG, so protect on_vcpu() too otherwise QEMU may not to build. Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2009-07-27 14:09:15 -05:00
Alexander Graf	96c1606b33	Use Little Endian for Dirty Log We currently use host endian long types to store information in the dirty bitmap. This works reasonably well on Little Endian targets, because the u32 after the first contains the next 32 bits. On Big Endian this breaks completely though, forcing us to be inventive here. So Ben suggested to always use Little Endian, which looks reasonable. We only have dirty bitmap implemented in Little Endian targets so far and since PowerPC would be the first Big Endian platform, we can just as well switch to Little Endian always with little effort without breaking existing targets. This is the userspace part of the patch. It shouldn't change anything for existing targets, but help PowerPC. It replaces my older patch called "Use 64bit pointer for dirty log". Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2009-07-27 14:09:14 -05:00
Alexander Graf	1c7936e377	Use 64bit pointer for dirty log Dirty logs currently get written with native "long" size. On little endian it doesn't matter if we use uint64_t instead though, because we'd still end up using the right bytes. On big endian, this does become a bigger problem, so we need to ensure that kernel and userspace talk the same language, which means getting rid of "long" and using a defined size instead. So I decided to use 64 bit types at all times. This doesn't break existing targets but will in conjunction with a patch I'll send to the KVM ML make dirty logs work with 32 bit userspace on 64 kernel with big endian. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2009-07-27 14:09:14 -05:00
Glauber Costa	6f725c139a	provide tests for pit in kernel and irqchip in kernel KVM can have an in-kernel pit or irqchip. While we don't implement it yet, having a way for test for it (that always returns zero) will allow us to reuse code in qemu-kvm that tests for it. Signed-off-by: Glauber Costa <glommer@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2009-07-22 10:58:49 -05:00
Glauber Costa	452e475196	introduce on_vcpu on_vcpu is a qemu-kvm function that will make sure that a specific piece of code will run on a requested cpu. We don't need that because we're restricted to -smp 1 right now, but those days are likely to end soon. So for the benefit of having qemu-kvm share more code with us, I'm introducing our own version of on_vcpu(). Right now, we either run a function on the current cpu, or abort the execution, because it would mean something is seriously wrong. As an example code, I "ported" kvm_update_guest_debug to use it, with some slight differences from qemu-kvm. This is probably 0.12 material Signed-off-by: Glauber Costa <glommer@redhat.com> CC: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2009-07-22 10:58:49 -05:00
Alexander Graf	bd83677612	Fake dirty loggin when it's not there Some KVM platforms don't support dirty logging yet, like IA64 and PPC, so in order to still have screen updates on those, we need to fake it. This patch just tells the getter function for dirty bitmaps, that all pages within a slot are dirty when the slot has dirty logging enabled. That way we can implement dirty logging on those platforms sometime when it drags down performance, but share the rest of the code with dirty logging capable platforms. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2009-07-22 10:58:46 -05:00
Alexander Graf	b80a55e67b	Fix warning in kvm-all.c This fixes a warning I stumbled across while compiling qemu on PPC64. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2009-07-22 10:58:46 -05:00
Jan Kiszka	a08d43677f	Revert "Introduce reset notifier order" This reverts commit `8217606e6e` (and updates later added users of qemu_register_reset), we solved the problem it originally addressed less invasively. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2009-06-29 14:18:08 -05:00
Jan Kiszka	8d2ba1fb9c	kvm: Rework VCPU synchronization During startup and after reset we have to synchronize user space to the in-kernel KVM state. Namely, we need to transfer the VCPU registers when they change due to VCPU as well as APIC reset. This patch refactors the required hooks so that kvm_init_vcpu registers its own per-VCPU reset handler and adds a cpu_synchronize_state to the APIC reset. That way we no longer depend on the new reset order (and can drop this disliked interface again) and we can even drop a KVM hook in main(). Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2009-06-29 14:18:07 -05:00
Jan Kiszka	8c14c17395	kvm: Fix IRQ injection into full queue User space may only inject interrupts during kvm_arch_pre_run if ready_for_interrupt_injection is set in kvm_run. But that field is updated on exit from KVM_RUN, so we must ensure that we enter the kernel after potentially queuing an interrupt, otherwise we risk to loose one - like it happens with the current code against latest kernel modules (since kvm-86) that started to queue only a single interrupt. Fix the problem by reordering kvm_cpu_exec. Credits go to Gleb Natapov for analyzing the issue in details. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2009-06-16 15:36:47 -05:00
Jan Kiszka	168ccc11c3	kvm: Improve upgrade notes when facing unsupported kernels Users complained that it is not obvious what to do when kvm refuses to build or run due to an unsupported host kernel, so let's improve the hints. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>	2009-06-07 16:40:22 +03:00
Jan Kiszka	f8d926e9cd	kvm: x86: Save/restore KVM-specific CPU states Save and restore all so far neglected KVM-specific CPU states. Handling the TSC stabilizes migration in KVM mode. The interrupt_bitmap and mp_state are currently unused, but will become relevant for in-kernel irqchip support. By including proper saving/restoring already, we avoid having to increment CPU_SAVE_VERSION later on once again. v2: - initialize mp_state runnable (for the boot CPU) Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2009-05-22 10:50:34 -05:00
Jan Kiszka	d33a1810d7	kvm: Rework VCPU reset Use standard callback with highest order to synchronize VCPU on reset after all device callbacks were execute. This allows to remove the special kvm hook in qemu_system_reset. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2009-05-22 10:50:34 -05:00
Jan Kiszka	151f7749f2	kvm: Rework dirty bitmap synchronization Extend kvm_physical_sync_dirty_bitmap() so that is can sync across multiple slots. Useful for updating the whole dirty log during migration. Moreover, properly pass down errors the whole call chain. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2009-05-22 10:50:33 -05:00
Jan Kiszka	62518b8b1d	kvm: Fix dirty log temporary buffer size The buffer passed to KVM_GET_DIRTY_LOG requires one bit per page. Fix the size calculation in kvm_physical_sync_dirty_bitmap accordingly, avoiding allocation of extremly oversized buffers. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2009-05-22 10:50:33 -05:00
Jan Kiszka	4495d6a745	kvm: Introduce kvm_set_migration_log Introduce a global dirty logging flag that enforces logging for all slots. This can be used by the live migration code to enable/disable global logging withouth destroying the per-slot setting. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2009-05-22 10:50:33 -05:00
Jan Kiszka	e69917e29a	kvm: Conditionally apply workaround for KVM slot handling bug Only apply the workaround for broken slot joining in KVM when the capability was not found that signals the corresponding fix existence. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2009-05-22 10:50:32 -05:00
Mark McLoughlin	9f8fd69460	kvm: add error message for when SMP is requested Right now, if you try e.g. '-smp 2' you just get 'failed to initialize KVM'. Signed-off-by: Mark McLoughlin <markmc@redhat.com>	2009-05-20 09:24:23 -05:00
Anthony Liguori	ad7b8b3310	Introduce kvm_check_extension to check if KVM extensions are supported Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2009-05-08 15:42:42 -05:00
Jan Kiszka	6f0437e8de	kvm: Avoid COW if KVM MMU is asynchronous Avi Kivity wrote: > Suggest wrapping in a function and hiding it deep inside kvm-all.c. > Done in v2: ----------> If the KVM MMU is asynchronous (kernel does not support MMU_NOTIFIER), we have to avoid COW for the guest memory. Otherwise we risk serious breakage when guest pages change there physical locations due to COW after fork. Seen when forking smbd during runtime via -smb. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2009-05-01 09:44:11 -05:00
Jan Kiszka	e6f4afe029	kvm: Relax aligment check of kvm_set_phys_mem There is no need to reject an unaligned memory region registration if the region will be I/O memory and it will not split an existing KVM slot. This fixes KVM support on PPC. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2009-05-01 09:38:33 -05:00
aliguori	6152e2ae43	kvm: improve handling of overlapping slots (Jan Kiszka) This reworks the slot management to handle more patterns of cpu_register_physical_memory*, finally allowing to reset KVM guests (so far address remapping on reset broke the slot management). We could actually handle all possible ones without failing, but a KVM kernel bug in older versions would force us to track all previous fragmentations and maintain them (as that bug prevents registering larger slots that overlap also deleted ones). To remain backward compatible but avoid overly complicated workarounds, we apply a simpler workaround that covers all currently used patterns. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@7139 c046a42c-6fe2-441c-8c8c-71466251a162	2009-04-17 14:26:33 +00:00
aliguori	d3f8d37fe2	kvm: Add sanity checks to slot management (Jan Kiszka) Fail loudly if we run out of memory slot. Make sure that dirty log start/stop works with consistent memory regions by reporting invalid parameters. This reveals several inconsistencies in the vga code, patch to fix them follows later in this series. And, for simplicity reasons, also catch and report unaligned memory regions passed to kvm_set_phys_mem (KVM works on page basis). Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@7138 c046a42c-6fe2-441c-8c8c-71466251a162	2009-04-17 14:26:29 +00:00
aliguori	494ada4234	kvm: Cleanup unmap condition in kvm_set_phys_mem (Jan Kiszka) Testing for TLB_MMIO on unmap makes no sense as A) that flag belongs to CPUTLBEntry and not to io_memory slots or physical addresses and B) we already use a different condition before mapping. So make this test consistent. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@7137 c046a42c-6fe2-441c-8c8c-71466251a162	2009-04-17 14:26:25 +00:00
pbrook	5579c7f37e	Remove code phys_ram_base uses. Signed-off-by: Paul Brook <paul@codesourcery.com> git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@7085 c046a42c-6fe2-441c-8c8c-71466251a162	2009-04-11 14:47:08 +00:00
aliguori	e22a25c936	Guest debugging support for KVM (Jan Kiszka) This is a backport of the guest debugging support for the KVM accelerator that is now part of the KVM tree. It implements the reworked KVM kernel API for guest debugging (KVM_CAP_SET_GUEST_DEBUG) which is not yet part of any mainline kernel but will probably be 2.6.30 stuff. So far supported is x86, but PPC is expected to catch up soon. Core features are: - unlimited soft-breakpoints via code patching - hardware-assisted x86 breakpoints and watchpoints Changes in this version: - use generic hook cpu_synchronize_state to transfer registers between user space and kvm - push kvm_sw_breakpoints into KVMState Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6825 c046a42c-6fe2-441c-8c8c-71466251a162	2009-03-12 20:12:48 +00:00
aurel32	be214e6c05	Fix race condition on access to env->interrupt_request env->interrupt_request is accessed as the bit level from both main code and signal handler, making a race condition possible even on CISC CPU. This causes freeze of QEMU under high load when running the dyntick clock. The patch below move the bit corresponding to CPU_INTERRUPT_EXIT in a separate variable, declared as volatile sig_atomic_t, so it should be work even on RISC CPU. We may want to move the cpu_interrupt(env, CPU_INTERRUPT_EXIT) case in its own function and get rid of CPU_INTERRUPT_EXIT. That can be done later, I wanted to keep the patch short for easier review. Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6728 c046a42c-6fe2-441c-8c8c-71466251a162	2009-03-06 21:48:00 +00:00
aliguori	1eec614b36	toplevel: remove error handling from qemu_malloc() callers (Avi Kivity) Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6531 c046a42c-6fe2-441c-8c8c-71466251a162	2009-02-05 22:06:18 +00:00
aurel32	a9c11522bb	KVM: Silence unused s warning Move s under #ifdef to avoid compiler warning. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6086 c046a42c-6fe2-441c-8c8c-71466251a162	2008-12-18 22:42:51 +00:00
aliguori	f5d6f51ba6	kvm: sync vcpu state during initialization (Hollis Blanchard) Currently on x86, qemu initializes CPUState but KVM ignores it and does its own vcpu initialization. However, PowerPC KVM needs to be able to set the initial register state to support the -kernel and -append options. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6060 c046a42c-6fe2-441c-8c8c-71466251a162	2008-12-15 22:20:42 +00:00
aliguori	f65ed4c152	KVM: Coalesced MMIO support MMIO exits are more expensive in KVM or Xen than in QEMU because they involve, at least, privilege transitions. However, MMIO write operations can be effectively batched if those writes do not have side effects. Good examples of this include VGA pixel operations when in a planar mode. As it turns out, we can get a nice boost in other areas too. Laurent mentioned a 9.7% performance boost in iperf with the coalesced MMIO changes for the e1000 when he originally posted this work for KVM. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5961 c046a42c-6fe2-441c-8c8c-71466251a162	2008-12-09 20:09:57 +00:00
aliguori	d85dc283fa	Disable KVM support if the kernel modules have broken memory slot handling Prior to kvm-80, memory slot deletion was broken in the KVM kernel modules. In kvm-81, a new capability is introduced to signify that this problem has been fixed. Since we rely on being able to delete memory slots, refuse to work with any kernel module that does not have this capability present. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5960 c046a42c-6fe2-441c-8c8c-71466251a162	2008-12-09 19:59:09 +00:00
aliguori	bd3220870f	Add virtio-balloon support This adds a VirtIO based balloon driver. It uses madvise() to actually balloon the memory when possible. Until 2.6.27, KVM forced memory pinning so we must disable ballooning unless the kernel actually supports it when using KVM. It's always safe when using TCG. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5874 c046a42c-6fe2-441c-8c8c-71466251a162	2008-12-04 20:33:06 +00:00
aliguori	5832d1f2f5	kvm: Introduce kvm logging interface (Glauber Costa) Introduce functions to control logging of memory regions. We select regions based on its start address, a guest_physical_addr (target_phys_addr_t, in qemu nomenclature). The main user of this interface right now is VGA optimization (a way of reducing the number of mmio exits). Signed-off-by: Glauber Costa <glommer@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5792 c046a42c-6fe2-441c-8c8c-71466251a162	2008-11-24 19:36:26 +00:00
aliguori	34fc643fd3	Make KVMSlot a real structure struct kvm_userspace_memory_region does not use QEMU friendly types to define memory slots. This results in lots of ugly casting with warnings on 32-bit platforms. This patch introduces a proper KVMSlot structure that uses QEMU types to describe memory slots. This eliminates many of the casts and isolates the type conversions to one spot. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5755 c046a42c-6fe2-441c-8c8c-71466251a162	2008-11-19 17:41:58 +00:00
aliguori	a3d6841ff8	kvm: de-register mem region for MMIO (Glauber Costa) Besides unassigned memory, we also don't care about MMIO. So if we're giving an MMIO area that is already registered, wipe it out. Signed-off-by: Glauber Costa <glommer@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5753 c046a42c-6fe2-441c-8c8c-71466251a162	2008-11-19 15:09:54 +00:00
aliguori	62d60e8cc4	Make KVM slot management more robust KVM keeps track of physical memory based on slots in the kernel. The current code that translates QEMU memory mappings to slots work but is not robust in the fact of reregistering partial regions of memory. This patch does the right thing for reregistering partial regions of memory. It also prevents QEMU from using KVM private slots. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5734 c046a42c-6fe2-441c-8c8c-71466251a162	2008-11-18 15:41:18 +00:00
aliguori	984b518147	Define kvm_ioctl in the same way as ioctl The third argument to ioctl is a ... which allows any value to be passed. In practice, glibc always treats the argument as a void *. Do the same thing for the kvm ioctls to keep things consistent with a traditional ioctl. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5715 c046a42c-6fe2-441c-8c8c-71466251a162	2008-11-13 19:21:00 +00:00
aliguori	becfc39041	KVM: simplify kvm_cpu_exec hook We don't need to use cpu_loop_exit() because we never use the condition codes so everything can be folded into a single case. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5669 c046a42c-6fe2-441c-8c8c-71466251a162	2008-11-10 15:55:14 +00:00
aliguori	053304480a	Add missing files to KVM commit. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5629 c046a42c-6fe2-441c-8c8c-71466251a162	2008-11-05 16:29:27 +00:00

... 3 4 5 6 7 ...

383 Commits