Commit Graph

326 Commits

Author SHA1 Message Date
Eduardo Habkost
bc74b7db86 i386: kvm: filter CPUID feature words earlier, on cpu.c
cpu.c contains the code that will check if all requested CPU features
are available, so the filtering of KVM features must be there, so we can
implement "check" and "enforce" properly.

The only point where kvm_arch_init_vcpu() is called on i386 is:

- cpu_x86_init()
  - x86_cpu_realize() (after cpu_x86_register() is called)
    - qemu_init_vcpu()
      - qemu_kvm_start_vcpu()
        - qemu_kvm_thread_fn() (on a new thread)
          - kvm_init_vcpu()
            - kvm_arch_init_vcpu()

With this patch, the filtering will be done earlier, at:
- cpu_x86_init()
  - cpu_x86_register() (before x86_cpu_realize() is called)

Also, the KVM CPUID filtering will now be done at the same place where
the TCG CPUID feature filtering is done. Later, the code can be changed
to use the same filtering code for the "check" and "enforce" modes, as
now the cpu.c code knows exactly which CPU features are going to be
exposed to the guest (and much earlier).

One thing I was worrying about when doing this is that
kvm_arch_get_supported_cpuid() depends on kvm_irqchip_in_kernel(), and
maybe the 'kvm_kernel_irqchip' global variable wasn't initialized yet at
CPU creation time. But kvm_kernel_irqchip is initialized during
kvm_init(), that is called very early (much earlier than the machine
init function), and kvm_init() is already a requirement to run the
GET_SUPPORTED_CPUID ioctl() (as kvm_init() initializes the kvm_state
global variable).

Side note: it would be nice to keep KVM-specific code inside kvm.c. The
problem is that properly implementing -cpu check/enforce code (that's
inside cpu.c) depends directly on the feature bit filtering done using
kvm_arch_get_supported_cpuid(). Currently -cpu check/enforce is broken
because it simply uses the host CPU feature bits instead of
GET_SUPPORTED_CPUID, and we need to fix that.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-10-30 23:39:49 -02:00
Eduardo Habkost
c9da8382c1 i386: kvm: mask cpuid_ext4_features bits earlier
This way all the filtering by GET_SUPPORTED_CPUID is being done at the
same place in the code.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-10-30 23:39:49 -02:00
Eduardo Habkost
ea85c9e456 i386: kvm: mask cpuid_kvm_features earlier
Instead of masking the KVM feature bits very late (while building the
KVM_SET_CPUID2 data), mask it out on env->cpuid_kvm_features, at the
same point where the other feature words are masked out.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-10-30 23:39:48 -02:00
Eduardo Habkost
41e5e76db0 i386: kvm: x2apic is not supported without in-kernel irqchip
This is necessary so that x2apic is not improperly enabled when the
in-kernel irqchip is disabled.

This won't generate a warning with "-cpu ...,check" because the current
check/enforce code is broken (it checks the host CPU data directly,
instead of using kvm_arch_get_supported_cpuid()), but it will be
eventually fixed to properly report the missing x2apic flag.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-10-30 23:39:48 -02:00
Eduardo Habkost
ac67ee260a i386: kvm: set CPUID_EXT_TSC_DEADLINE_TIMER on kvm_arch_get_supported_cpuid()
This moves the CPUID_EXT_TSC_DEADLINE_TIMER CPUID flag hacking from
kvm_arch_init_vcpu() to kvm_arch_get_supported_cpuid().

Full git grep for kvm_arch_get_supported_cpuid:

   kvm.h:uint32_t kvm_arch_get_supported_cpuid(KVMState *env, uint32_t function,
   target-i386/cpu.c:        x86_cpu_def->cpuid_7_0_ebx_features = kvm_arch_get_supported_cpuid(kvm_state, 0x7, 0, R_EBX);
   target-i386/cpu.c:            *eax = kvm_arch_get_supported_cpuid(s, 0xA, count, R_EAX);
   target-i386/cpu.c:            *ebx = kvm_arch_get_supported_cpuid(s, 0xA, count, R_EBX);
   target-i386/cpu.c:            *ecx = kvm_arch_get_supported_cpuid(s, 0xA, count, R_ECX);
   target-i386/cpu.c:            *edx = kvm_arch_get_supported_cpuid(s, 0xA, count, R_EDX);
   target-i386/cpu.c:            *eax = kvm_arch_get_supported_cpuid(s, 0xd, count, R_EAX);
   target-i386/cpu.c:            *ebx = kvm_arch_get_supported_cpuid(s, 0xd, count, R_EBX);
   target-i386/cpu.c:            *ecx = kvm_arch_get_supported_cpuid(s, 0xd, count, R_ECX);
   target-i386/cpu.c:            *edx = kvm_arch_get_supported_cpuid(s, 0xd, count, R_EDX);
   target-i386/kvm.c:uint32_t kvm_arch_get_supported_cpuid(KVMState *s, uint32_t function,
   target-i386/kvm.c:        cpuid_1_edx = kvm_arch_get_supported_cpuid(s, 1, 0, R_EDX);
   target-i386/kvm.c:    env->cpuid_features &= kvm_arch_get_supported_cpuid(s, 1, 0, R_EDX);
 * target-i386/kvm.c:    env->cpuid_ext_features &= kvm_arch_get_supported_cpuid(s, 1, 0, R_ECX);
   target-i386/kvm.c:    env->cpuid_ext2_features &= kvm_arch_get_supported_cpuid(s, 0x80000001,
   target-i386/kvm.c:    env->cpuid_ext3_features &= kvm_arch_get_supported_cpuid(s, 0x80000001,
   target-i386/kvm.c:    env->cpuid_svm_features  &= kvm_arch_get_supported_cpuid(s, 0x8000000A,
   target-i386/kvm.c:        kvm_arch_get_supported_cpuid(s, KVM_CPUID_FEATURES, 0, R_EAX);
   target-i386/kvm.c:            kvm_arch_get_supported_cpuid(s, 0xC0000001, 0, R_EDX);

Note that there is only one call for CPUID[1].ECX above (*), and it is
the one that gets hacked to include CPUID_EXT_TSC_DEADLINE_TIMER, so we
can simply make kvm_arch_get_supported_cpuid() set it, to let the rest
of the code know the flag can be safely set by QEMU.

One thing I was worrying about when doing this is that now
kvm_arch_get_supported_cpuid() depends on kvm_irqchip_in_kernel(). But
the 'kvm_kernel_irqchip' global variable is initialized during
kvm_init(), that is called very early, and kvm_init() is already a
requirement to run the GET_SUPPORTED_CPUID ioctl() (as kvm_init() is the
function that initializes the 'kvm_state' global variable).

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-10-30 23:39:48 -02:00
Eduardo Habkost
84bd945cf2 i386: kvm: set CPUID_EXT_HYPERVISOR on kvm_arch_get_supported_cpuid()
Full grep for kvm_arch_get_supported_cpuid:

   kvm.h:uint32_t kvm_arch_get_supported_cpuid(KVMState *env, uint32_t function,
   target-i386/cpu.c:        x86_cpu_def->cpuid_7_0_ebx_features = kvm_arch_get_supported_cpuid(kvm_state, 0x7, 0, R_EBX);
   target-i386/cpu.c:            *eax = kvm_arch_get_supported_cpuid(s, 0xA, count, R_EAX);
   target-i386/cpu.c:            *ebx = kvm_arch_get_supported_cpuid(s, 0xA, count, R_EBX);
   target-i386/cpu.c:            *ecx = kvm_arch_get_supported_cpuid(s, 0xA, count, R_ECX);
   target-i386/cpu.c:            *edx = kvm_arch_get_supported_cpuid(s, 0xA, count, R_EDX);
   target-i386/cpu.c:            *eax = kvm_arch_get_supported_cpuid(s, 0xd, count, R_EAX);
   target-i386/cpu.c:            *ebx = kvm_arch_get_supported_cpuid(s, 0xd, count, R_EBX);
   target-i386/cpu.c:            *ecx = kvm_arch_get_supported_cpuid(s, 0xd, count, R_ECX);
   target-i386/cpu.c:            *edx = kvm_arch_get_supported_cpuid(s, 0xd, count, R_EDX);
   target-i386/kvm.c:uint32_t kvm_arch_get_supported_cpuid(KVMState *s, uint32_t function,
   target-i386/kvm.c:        cpuid_1_edx = kvm_arch_get_supported_cpuid(s, 1, 0, R_EDX);
   target-i386/kvm.c:    env->cpuid_features &= kvm_arch_get_supported_cpuid(s, 1, 0, R_EDX);
 * target-i386/kvm.c:    env->cpuid_ext_features &= kvm_arch_get_supported_cpuid(s, 1, 0, R_ECX);
   target-i386/kvm.c:    env->cpuid_ext2_features &= kvm_arch_get_supported_cpuid(s, 0x80000001,
   target-i386/kvm.c:    env->cpuid_ext3_features &= kvm_arch_get_supported_cpuid(s, 0x80000001,
   target-i386/kvm.c:    env->cpuid_svm_features  &= kvm_arch_get_supported_cpuid(s, 0x8000000A,
   target-i386/kvm.c:        kvm_arch_get_supported_cpuid(s, KVM_CPUID_FEATURES, 0, R_EAX);
   target-i386/kvm.c:            kvm_arch_get_supported_cpuid(s, 0xC0000001, 0, R_EDX);

Note that there is only one call for CPUID[1].ECX above (*), and it is
the one that gets hacked to include CPUID_EXT_HYPERVISOR, so we can
simply make kvm_arch_get_supported_cpuid() set it, to let the rest of
the code automatically know that the flag can be safely set by QEMU.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-10-30 23:39:47 -02:00
Eduardo Habkost
c2acb022c8 i386: kvm: kvm_arch_get_supported_cpuid: replace if+switch with single 'if'
Additional fixups will be added, and making them a single 'if/else if'
chain makes it clearer than two nested switch statements.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-10-30 23:39:47 -02:00
Eduardo Habkost
dd87f8a690 i386: kvm: extract try_get_cpuid() loop to get_supported_cpuid() function
No behavior change, just code movement.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-10-30 23:39:46 -02:00
Eduardo Habkost
4fb73f1d3b i386: kvm: extract CPUID entry lookup to cpuid_find_entry() function
No behavior change, just code movement.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-10-30 23:39:46 -02:00
Eduardo Habkost
829ae2f9fa i386: kvm: extract register switch to cpuid_entry_get_reg() function
No behavior change: just code movement.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-10-30 23:39:46 -02:00
Eduardo Habkost
47111e2cfa i386: kvm: kvm_arch_get_supported_cpuid: use 'entry' variable
The reg switch will be moved to a separate function, so store the entry
pointer in a variable.

No behavior change, just code movement.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-10-30 23:39:46 -02:00
Eduardo Habkost
8c723b7958 i386: kvm: kvm_arch_get_supported_cpuid: clean up has_kvm_features check
Instead of a function-specific has_kvm_features variable, simply use a
"found" variable that will be checked in case we have to use the legacy
get_para_features() interface.

No behavior change, just code cleanup.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-10-30 23:39:46 -02:00
Eduardo Habkost
7b46e5ce81 i386: kvm: kvm_arch_get_supported_cpuid: move R_EDX hack outside of for loop
The for loop will become a separate function, so clean it up so it can
become independent from the bit hacking for R_EDX.

No behavior change[1], just code movement.

[1] Well, only if the kernel returned CPUID leafs 1 or 0x80000001 as
    unsupported, but there's no kernel version that does that.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-10-30 23:39:45 -02:00
Andreas Färber
2fa45344a9 cpus: Pass CPUState to cpu_is_stopped()
CPUArchState is no longer needed there.

Also change the return type to bool.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2012-10-31 01:02:44 +01:00
Andreas Färber
60e82579c7 cpus: Pass CPUState to qemu_cpu_is_self()
Change return type to bool, move to include/qemu/cpu.h and
add documentation.

Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
[AF: Updated new caller qemu_in_vcpu_thread()]
2012-10-31 01:02:39 +01:00
Avi Kivity
a8170e5e97 Rename target_phys_addr_t to hwaddr
target_phys_addr_t is unwieldly, violates the C standard (_t suffixes are
reserved) and its purpose doesn't match the name (most target_phys_addr_t
addresses are not target specific).  Replace it with a finger-friendly,
standards conformant hwaddr.

Outstanding patchsets can be fixed up with the command

  git rebase -i --exec 'find -name "*.[ch]"
                        | xargs s/target_phys_addr_t/hwaddr/g' origin

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-10-23 08:58:25 -05:00
Eduardo Habkost
8fad4b44a0 i386: kvm: use a #define for the set of alias feature bits
Instea of using a hardcoded hex constant, define CPUID_EXT2_AMD_ALIASES
as the set of CPUID[8000_0001].EDX bits that on AMD are the same as the
bits of CPUID[1].EDX.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-By: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Don Slutz <Don@CloudSwitch.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-09-30 11:11:00 +00:00
Eduardo Habkost
b1f4679392 i386: kvm: bit 10 of CPUID[8000_0001].EDX is reserved
Bit 10 of CPUID[8000_0001].EDX is not defined as an alias of
CPUID[1].EDX[10], so do not duplicate it on
kvm_arch_get_supported_cpuid().

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-By: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Don Slutz <Don@CloudSwitch.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-09-30 11:10:38 +00:00
Jan Kiszka
b139bd300f kvm: i386: Add services required for PCI device assignment
These helpers abstract the interaction of upcoming pci-assign with the
KVM kernel services. Put them under i386 only as other archs will
implement device pass-through via VFIO and not this classic interface.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Acked-by: Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-09-10 15:29:59 +03:00
Michael S. Tsirkin
bc9a839d56 kvm: get/set PV EOI MSR
Support get/set of new PV EOI MSR, for migration.
Add an optional section for MSR value - send it
out in case MSR was changed from the default value (0).

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-08-29 10:51:28 -05:00
Peter Maydell
f3e1bed8da kvm: Decouple 'GSI routing' from 'kernel irqchip'
Don't assume having an in-kernel irqchip means that GSI
routing is enabled.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-08-09 16:16:57 +03:00
Peter Maydell
614e41bc26 kvm: Decouple 'MSI routing via irqfds' from 'kernel irqchip'
Decouple another x86-specific assumption about what irqchips imply.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-08-09 16:16:56 +03:00
Peter Maydell
cc7e0ddf5a kvm: Decouple 'irqfds usable' from 'kernel irqchip'
Instead of assuming that we can use irqfds if and only if
kvm_irqchip_in_kernel(), add a bool to the KVMState which
indicates this, and is set only on x86 and only if the
irqchip is in the kernel.

The kernel documentation implies that the only thing
you need to use KVM_IRQFD is that KVM_CAP_IRQFD is
advertised, but this seems to be untrue. In particular
the kernel does not (alas) return a sensible error if you
try to set up an irqfd when you haven't created an irqchip.
If it did we could remove all this nonsense and let the
kernel return the error code.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-08-09 16:16:56 +03:00
Peter Maydell
1d31f66bbc kvm: Move kvm_allows_irq0_override() to target-i386, fix return type
kvm_allows_irq0_override() is a totally x86 specific concept:
move it to the target-specific source file where it belongs.
This means we need a new header file for the prototype:
kvm_i386.h, in line with the existing kvm_ppc.h.

While we are moving it, fix the return type to be 'bool' rather
than 'int'.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-08-09 16:16:55 +03:00
Igor Mammedov
dd673288a8 target-i386: move cpu halted decision into x86_cpu_reset
MP initialization protocol differs between cpu families, and for P6 and
onward models it is up to CPU to decide if it will be BSP using this
protocol, so try to model this. However there is no point in implementing
MP initialization protocol in qemu. Thus first CPU is always marked as BSP.

This patch:
 - moves decision to designate BSP from board into cpu, making cpu
self-sufficient in this regard. Later it will allow to cleanup hw/pc.c
and remove cpu_reset and wrappers from there.
 - stores flag that CPU is BSP in IA32_APIC_BASE to model behavior
described in Inted SDM vol 3a part 1 chapter 8.4.1
 - uses MSR_IA32_APICBASE_BSP flag in apic_base for checking if cpu is BSP

patch is based on Jan Kiszka's proposal:
    http://thread.gmane.org/gmane.comp.emulators.qemu/100806

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-08-01 08:45:06 -05:00
Jan Kiszka
5d62c43a17 apic: Defer interrupt updates to VCPU thread
KVM performs TPR raising asynchronously to QEMU, specifically outside
QEMU's global lock. When an interrupt is injected into the APIC and TPR
is checked to decide if this can be delivered, a stale TPR value may be
used, causing spurious interrupts in the end.

Fix this by deferring apic_update_irq to the context of the target VCPU.
We introduce a new interrupt flag for this, CPU_INTERRUPT_POLL. When it
is set, the VCPU calls apic_poll_irq before checking for further pending
interrupts. To avoid special-casing KVM, we also implement this logic
for TCG mode.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-07-10 11:31:09 +03:00
Liu, Jinsong
a75b3e0f64 kvm: expose tsc deadline timer feature to guest
This patch exposes tsc deadline timer feature to guest if
1). in-kernel irqchip is used, and
2). kvm has emulated tsc deadline timer, and
3). user authorize the feature exposing via -cpu or +/- tsc-deadline

Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-07-03 17:36:32 -03:00
Andreas Färber
232fc23bed target-i386: Pass X86CPU to do_cpu_{init,sipi}()
Allows to use cpu_reset() in place of cpu_state_reset().

Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2012-06-04 23:00:42 +02:00
Andreas Färber
317ac6201a target-i386: Don't overuse CPUState
Scripted conversion:
  sed -i "s/CPUState/CPUX86State/g" target-i386/*.[hc]
  sed -i "s/#define CPUX86State/#define CPUState/" target-i386/cpu.h

Signed-off-by: Andreas Färber <afaerber@suse.de>
Acked-by: Anthony Liguori <aliguori@us.ibm.com>
2012-03-14 22:20:25 +01:00
Michael S. Tsirkin
7e680753cf kvm: fill in padding to help valgrind
valgrind warns about padding fields which are passed
to vcpu ioctls uninitialized.
This is not an error in practice because kvm ignored padding.
Since the ioctls in question are off data path and
the cost is zero anyway, initialize padding to 0
to suppress these errors.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-03-07 12:31:48 +02:00
Jan Kiszka
d362e757d3 target-i386: Add infrastructure for reporting TPR MMIO accesses
This will allow the APIC core to file a TPR access report. Depending on
the accelerator and kernel irqchip mode, it will either be delivered
right away or queued for later reporting.

In TCG mode, we can restart the triggering instruction and can therefore
forward the event directly. KVM does not allows us to restart, so we
postpone the delivery of events recording in the user space APIC until
the current instruction is completed.

Note that KVM without in-kernel irqchip will report the address after
the instruction that triggered the access.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-02-18 12:15:55 +02:00
Gleb Natapov
d1f8663660 kvm: Synchronize cpu state in kvm_arch_stop_on_emulation_error()
Call to kvm_cpu_synchronize_state() is missing.
kvm_arch_stop_on_emulation_error may look at outdated registers here.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
2012-02-16 18:21:46 +02:00
Jan Kiszka
3d4b26494f kvm: Implement kvm_irqchip_in_kernel like kvm_enabled
To both avoid that kvm_irqchip_in_kernel always has to be paired with
kvm_enabled and that the former ends up in a function call, implement it
like the latter. This means keeping the state in a global variable and
defining kvm_irqchip_in_kernel as a preprocessor macro.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-02-08 15:57:50 -02:00
Jan Kiszka
39d6960aab kvm: Allow to set shadow MMU size
Introduce the KVM-specific machine option kvm_shadow_mem. It allows to
set a custom shadow MMU size for the virtual machine. This is useful for
stress testing e.g.

Only x86 supports this for now, but it is in principle a generic
concept for all targets with shadow MMUs.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-02-08 15:57:50 -02:00
Jan Kiszka
680c1c6fd7 kvm: x86: Add user space part for in-kernel APIC
This introduces the alternative APIC device which makes use of KVM's
in-kernel device model. External NMI injection via LINT1 is emulated by
checking the current state of the in-kernel APIC, only injecting a NMI
into the VCPU if LINT1 is unmasked and configured to DM_NMI.

MSI is not yet supported, so we disable this when the in-kernel model is
in use.

CC: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
2012-01-19 12:14:42 +01:00
Jan Kiszka
84b058d7df kvm: Introduce core services for in-kernel irqchip support
Add the basic infrastructure to active in-kernel irqchip support, inject
interrupts into these models, and maintain IRQ routes.

Routing is optional and depends on the host arch supporting
KVM_CAP_IRQ_ROUTING. When it's not available on x86, we looe the HPET as
we can't route GSI0 to IOAPIC pin 2.

In-kernel irqchip support will once be controlled by the machine
property 'kernel_irqchip', but this is not yet wired up.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
2012-01-19 12:14:42 +01:00
Vadim Rozenfeld
eab70139a6 hyper-v: initialize Hyper-V CPUID leaves.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-01-19 08:33:12 -02:00
Anthony Liguori
f3c6a169a3 Merge remote-tracking branch 'qemu-kvm/memory/page_desc' into staging
* qemu-kvm/memory/page_desc: (22 commits)
  Remove cpu_get_physical_page_desc()
  sparc: avoid cpu_get_physical_page_desc()
  virtio-balloon: avoid cpu_get_physical_page_desc()
  vhost: avoid cpu_get_physical_page_desc()
  kvm: avoid cpu_get_physical_page_desc()
  memory: remove CPUPhysMemoryClient
  xen: convert to MemoryListener API
  memory: temporarily add memory_region_get_ram_addr()
  xen, vga: add API for registering the framebuffer
  vhost: convert to MemoryListener API
  kvm: convert to MemoryListener API
  kvm: switch kvm slots to use host virtual address instead of ram_addr_t
  memory: add API for observing updates to the physical memory map
  memory: replace cpu_physical_sync_dirty_bitmap() with a memory API
  framebuffer: drop use of cpu_physical_sync_dirty_bitmap()
  loader: remove calls to cpu_get_physical_page_desc()
  framebuffer: drop use of cpu_get_physical_page_desc()
  memory: introduce memory_region_find()
  memory: add memory_region_is_logging()
  memory: add memory_region_is_rom()
  ...
2012-01-03 14:39:05 -06:00
Jan Kiszka
cce47516cd kvm: x86: Drop redundant apic base and tpr update from kvm_get_sregs
The latter was already commented out, the former is redundant as well.
We always get the latest changes after return from the guest via
kvm_arch_post_run.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-12-22 14:53:01 -02:00
Jan Kiszka
fabacc0f79 kvm: x86: Avoid runtime allocation of xsave buffer
Keep a per-VCPU xsave buffer for kvm_put/get_xsave instead of
continuously allocating and freeing it on state sync.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-12-22 14:53:01 -02:00
Jan Kiszka
6b42494b21 kvm: x86: Use symbols for all xsave field
Field 0 (FCW+FSW) and 1 (FTW+FOP) were hard-coded so far.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-12-22 14:53:00 -02:00
Avi Kivity
9f213ed92c kvm: switch kvm slots to use host virtual address instead of ram_addr_t
This simplifies a later switch to the memory API in slot management.

Signed-off-by: Avi Kivity <avi@redhat.com>
2011-12-20 14:14:07 +02:00
Vagrant Cascadian
126197214e Fix typo: runnning -> running
One n too many for running, need we say more.

Signed-Off-By: Vagrant Cascadian <vagrant@freegeek.org>

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-11-17 12:57:41 +00:00
Avi Kivity
21e87c4625 i386: wire up MSR_IA32_MISC_ENABLE
It's needed for its default value - bit 0 specifies that "rep movs" is
good enough for memcpy, and Linux may use a slower memcpu if it is not set,
depending on cpu family/model.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-10-24 21:33:32 -02:00
Liu, Jinsong
aa82ba549a kvm: support TSC deadline MSR with subsection
KVM add emulation of lapic tsc deadline timer for guest.
This patch is co-operation work at qemu side.

Use subsections to save/restore the field (mtosatti).

Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-10-24 21:29:07 -02:00
Marcelo Tosatti
38d2c27ea6 Revert "kvm: support TSC deadline MSR"
This reverts commit bfc2455ddb.
New patch with subsections will follow.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-10-24 21:27:16 -02:00
Liu, Jinsong
bfc2455ddb kvm: support TSC deadline MSR
KVM add emulation of lapic tsc deadline timer for guest.
This patch is co-operation work at qemu side.

Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-10-03 13:53:14 -03:00
Luiz Capitulino
1354869c38 Drop the vm_running global variable
Use runstate_is_running() instead, which is introduced by this commit.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2011-09-15 16:39:32 -03:00
Luiz Capitulino
1dfb4dd993 Replace the VMSTOP macros with a proper state type
Today, when notifying a VM state change with vm_state_notify(),
we pass a VMSTOP macro as the 'reason' argument. This is not ideal
because the VMSTOP macros tell why qemu stopped and not exactly
what the current VM state is.

One example to demonstrate this problem is that vm_start() calls
vm_state_notify() with reason=0, which turns out to be VMSTOP_USER.

This commit fixes that by replacing the VMSTOP macros with a proper
state type called RunState.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2011-09-15 16:39:32 -03:00
Blue Swirl
6115c0a82d kvm: remove unnecessary assignments
Avoid these warnings from clang analyzer:
/src/qemu/target-i386/kvm.c:772:5: warning: Value stored to 'cwd' is never read
    cwd = swd = twd = 0;
/src/qemu/target-i386/kvm.c:772:11: warning: Although the value stored to 'swd' is used in the enclosing expression, the value is never actually read from 'swd'
    cwd = swd = twd = 0;

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2011-09-10 14:50:00 +00:00
Stefan Weil
541dc0d47f Use new macro QEMU_PACKED for packed structures
Most changes were made using these commands:

git grep -la '__attribute__((packed))'|xargs perl -pi -e 's/__attribute__\(\(packed\)\)/QEMU_PACKED/'
git grep -la '__attribute__ ((packed))'|xargs perl -pi -e 's/__attribute__ \(\(packed\)\)/QEMU_PACKED/'
git grep -la '__attribute__((__packed__))'|xargs perl -pi -e 's/__attribute__\(\(__packed__\)\)/QEMU_PACKED/'
git grep -la '__attribute__ ((__packed__))'|xargs perl -pi -e 's/__attribute__ \(\(__packed__\)\)/QEMU_PACKED/'
git grep -la '__attribute((packed))'|xargs perl -pi -e 's/__attribute\(\(packed\)\)/QEMU_PACKED/'

Whitespace in linux-user/syscall_defs.h was fixed manually
to avoid warnings from scripts/checkpatch.pl.

Manual changes were also applied to hw/pc.c.

I did not fix indentation with tabs in block/vvfat.c.
The patch will show 4 errors with scripts/checkpatch.pl.

Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2011-09-03 10:45:59 +00:00
Jan Kiszka
fdc9c41a5a Fix up some style nits of last uq/master merge
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-08-25 09:21:35 +01:00
Anthony Liguori
7267c0947d Use glib memory allocation and free functions
qemu_malloc/qemu_free no longer exist after this commit.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-08-20 23:01:08 -05:00
Marcelo Tosatti
ac363c9042 Remove #ifdef KVM_CAP_TSC_CONTROL
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-08-05 12:10:55 -03:00
Joerg Roedel
e7429073ed qemu-x86: Set tsc_khz in kvm when supported
Make use of the KVM_TSC_CONTROL feature if available.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-08-05 12:04:27 -03:00
Andre Przywara
31e8c69697 KVM: Fix XSAVE feature bit enumeration
When iterating through the XSAVE feature enumeration CPUID leaf (0xD)
we should not stop at the first zero EAX, but instead keep scanning
since there are gaps in the enumeration (ECX=1 for instance).
This fixes the proper usage of AVX in KVM guests.

Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-06-20 15:26:21 -03:00
Jan Kiszka
ba9bc59e1f kvm: x86: Pass KVMState to kvm_arch_get_supported_cpuid
kvm_arch_get_supported_cpuid checks for global cpuid restrictions, it
does not require any CPUState reference. Changing its interface allows
to call it before any VCPU is initialized.

CC: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-06-20 15:24:00 -03:00
Jan Kiszka
e56ff191b6 kvm: x86: Drop KVM_CAP build dependencies
No longer needed with accompanied kernel headers.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-06-20 15:19:54 -03:00
Jan Kiszka
5802e066eb kvm: Drop CONFIG_KVM_PARA
The kvm_para.h header is now always available.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-06-20 15:16:45 -03:00
Jan Kiszka
42cc8fa620 kvm: x86: Save/restore FPU OP, IP and DP
These FPU states are properly maintained by KVM but not yet by TCG. So
far we unconditionally set them to 0 in the guest which may cause
state corruptions, though not with modern guests.

To avoid breaking backward migration, use a conditional subsection that
is only written if any of the three fields is non-zero. The guest's
FNINIT clears them frequently, and cleared IA32_MISC_ENABLE MSR[2]
reduces the probability of non-zero values further so that this
subsection is not expected to restrict migration in any common scenario.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-06-19 15:57:56 +03:00
brillywu@viatech.com.cn
b3baa152aa kvm: Add CPUID support for VIA CPU
When KVM is running on VIA CPU with host cpu's model, the
feautures of VIA CPU will be passed into kvm guest by calling
the CPUID instruction for Centaur.

Signed-off-by: BrillyWu<brillywu@viatech.com.cn>
Signed-off-by: KaryJin<karyjin@viatech.com.cn>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-06-14 09:34:35 -05:00
Paolo Bonzini
4a043713b3 kvm: use qemu_free consistently
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-05-02 09:51:59 -03:00
Glauber Costa
0c31b744f6 kvm: use kernel-provided para_features instead of statically coming up with new capabilities
Use the features provided by KVM_GET_SUPPORTED_CPUID directly to
mask out features from guest-visible cpuid.

The old get_para_features() mechanism is kept for older kernels that do not implement it.

Signed-off-by: Glauber Costa <glommer@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-02 08:24:42 -03:00
Jan Kiszka
f2574737f6 kvm: x86: Push kvm_arch_debug to kvm_arch_handle_exit
There are no generic bits remaining in the handling of KVM_EXIT_DEBUG.
So push its logic completely into arch hands, i.e. only x86 so far.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-16 17:11:06 -03:00
Jan Kiszka
2a4dac8350 kvm: x86: Reorder functions in kvm.c
Required for next patch which will access guest debug services from
kvm_arch_handle_exit. No functional changes.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-16 17:11:06 -03:00
Jan Kiszka
bb4ea39329 kvm: Align kvm_arch_handle_exit to kvm_cpu_exec changes
Make the return code of kvm_arch_handle_exit directly usable for
kvm_cpu_exec. This is straightforward for x86 and ppc, just s390
would require more work. Avoid this for now by pushing the return code
translation logic into s390's kvm_arch_handle_exit.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
CC: Alexander Graf <agraf@suse.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-16 17:11:06 -03:00
Jan Kiszka
0c03266a96 kvm: x86: Synchronize PAT MSR with the kernel
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-16 17:11:05 -03:00
Jan Kiszka
4601f7b04c kvm: x86: Do not leave halt if interrupts are disabled
When an external interrupt is pending but IF is cleared, we must not
leave the halt state prematurely.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-16 17:11:05 -03:00
Huang Ying
3c85e74fbf KVM, MCE, unpoison memory address across reboot
In Linux kernel HWPoison processing implementation, the virtual
address in processes mapping the error physical memory page is marked
as HWPoison.  So that, the further accessing to the virtual
address will kill corresponding processes with SIGBUS.

If the error physical memory page is used by a KVM guest, the SIGBUS
will be sent to QEMU, and QEMU will simulate a MCE to report that
memory error to the guest OS.  If the guest OS can not recover from
the error (for example, the page is accessed by kernel code), guest OS
will reboot the system.  But because the underlying host virtual
address backing the guest physical memory is still poisoned, if the
guest system accesses the corresponding guest physical memory even
after rebooting, the SIGBUS will still be sent to QEMU and MCE will be
simulated.  That is, guest system can not recover via rebooting.

In fact, across rebooting, the contents of guest physical memory page
need not to be kept.  We can allocate a new host physical page to
back the corresponding guest physical address.

This patch fixes this issue in QEMU-KVM via calling qemu_ram_remap()
to clear the corresponding page table entry, so that make it possible
to allocate a new page to recover the issue.

[ Jan: rebasing and tiny cleanups]

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-15 01:19:06 -03:00
Jan Kiszka
75d4949733 kvm: x86: Fail kvm_arch_init_vcpu if MCE initialization fails
There is no reason to continue if the kernel claims to support MCE but
then fails to process our request.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
CC: Huang Ying <ying.huang@intel.com>
CC: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
CC: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-15 01:19:06 -03:00
Jan Kiszka
32a420243c kvm: x86: Clean up kvm_setup_mce
There is nothing to abstract here. Fold kvm_setup_mce into its caller
and fix up the error reporting (return code of kvm_vcpu_ioctl holds the
error value).

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
CC: Huang Ying <ying.huang@intel.com>
CC: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
CC: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-15 01:19:06 -03:00
Jan Kiszka
c34d440a72 kvm: x86: Consolidate TCG and KVM MCE injection code
This switches KVM's MCE injection path to cpu_x86_inject_mce, both for
SIGBUS and monitor initiated events. This means we prepare the MCA MSRs
in the VCPUState also for KVM.

We have to drop the MSRs writeback restrictions for this purpose which
is now safe as every uncoordinated MSR injection is removed with this
patch.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
CC: Huang Ying <ying.huang@intel.com>
CC: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
CC: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-15 01:19:06 -03:00
Jan Kiszka
ab443475c9 kvm: x86: Inject pending MCE events on state writeback
The current way of injecting MCE events without updating of and
synchronizing with the CPUState is broken and causes spurious
corruptions of the MCE-related parts of the CPUState.

As a first step towards a fix, enhance the state writeback code with
support for injecting events that are pending in the CPUState. A pending
exception will then be signaled via cpu_interrupt(CPU_INTERRUPT_MCE).
And, just like for TCG, we need to leave the halt state when
CPU_INTERRUPT_MCE is pending (left broken for the to-be-removed old KVM
code).

This will also allow to unify TCG and KVM injection code.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
CC: Huang Ying <ying.huang@intel.com>
CC: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
CC: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-15 01:19:05 -03:00
Jan Kiszka
990368650f kvm: Rename kvm_arch_process_irqchip_events to async_events
We will broaden the scope of this function on x86 beyond irqchip events.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-15 01:19:05 -03:00
Jan Kiszka
419fb20a8e kvm: x86: Move MCE functions together
Pure function suffling to avoid multiple #ifdef KVM_CAP_MCE sections,
no functional changes. While at it, annotate some #ifdef sections.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
CC: Huang Ying <ying.huang@intel.com>
CC: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
CC: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-15 01:19:05 -03:00
Jan Kiszka
b7680cb607 Refactor thread retrieval and check
We have qemu_cpu_self and qemu_thread_self. The latter is retrieving the
current thread, the former is checking for equality (using CPUState). We
also have qemu_thread_equal which is only used like qemu_cpu_self.

This refactors the interfaces, creating qemu_cpu_is_self and
qemu_thread_is_self as well ass qemu_thread_get_self.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2011-03-13 14:44:21 +00:00
Jan Kiszka
ce377af399 kvm: x86: Catch and report failing IRQ and NMI injections
We do not need to abort, but the user should be notified that weird
things go on.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-02-14 12:39:46 -02:00
Jan Kiszka
7a39fe5882 kvm: Drop return values from kvm_arch_pre/post_run
We do not check them, and the only arch with non-empty implementations
always returns 0 (this is also true for qemu-kvm).

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
CC: Alexander Graf <agraf@suse.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-02-14 12:39:46 -02:00
Jan Kiszka
db1669bcca kvm: x86: Prepare VCPU loop for in-kernel irqchip
Effectively no functional change yet as kvm_irqchip_in_kernel still only
returns 0, but this patch will allow qemu-kvm to adopt the VCPU loop of
upsteam KVM.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-02-14 12:39:46 -02:00
Jan Kiszka
6792a57bf1 kvm: Separate TCG from KVM cpu execution
Mixing up TCG bits with KVM already led to problems around eflags
emulation on x86. Moreover, quite some code that TCG requires on cpu
enty/exit is useless for KVM. So dispatch between tcg_cpu_exec and
kvm_cpu_exec as early as possible.

The core logic of cpu_halted from cpu_exec is added to
kvm_arch_process_irqchip_events. Moving away from cpu_exec makes
exception_index meaningless for KVM, we can simply pass the exit reason
directly (only "EXCP_DEBUG vs. rest" is relevant).

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-02-14 12:39:46 -02:00
Glauber Costa
b8cc45d6a6 kvm: make tsc stable over migration and machine start
If the machine is stopped, we should not record two different tsc values
upon a save operation. The same problem happens with kvmclock.

But kvmclock is taking a different diretion, being now seen as a separate
device. Since this is unlikely to happen with the tsc, I am taking the
approach here of simply registering a handler for state change, and
using a per-CPUState variable that prevents double updates for the TSC.

Signed-off-by: Glauber Costa <glommer@redhat.com>
CC: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-02-14 12:39:46 -02:00
Jan Kiszka
9ccfac9ea4 kvm: Unconditionally reenter kernel after IO exits
KVM requires to reenter the kernel after IO exits in order to complete
instruction emulation. Failing to do so will leave the kernel state
inconsistently behind. To ensure that we will get back ASAP, we issue a
self-signal that will cause KVM_RUN to return once the pending
operations are completed.

We can move kvm_arch_process_irqchip_events out of the inner VCPU loop.
The only state that mattered at its old place was a pending INIT
request. Catch it in kvm_arch_pre_run and also trigger a self-signal to
process the request on next kvm_cpu_exec.

This patch also fixes the missing exit_request check in kvm_cpu_exec in
the CONFIG_IOTHREAD case.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
CC: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-02-14 12:39:45 -02:00
Jan Kiszka
a1b87fe046 kvm: Provide sigbus services arch-independently
Provide arch-independent kvm_on_sigbus* stubs to remove the #ifdef'ery
from cpus.c. This patch also fixes --disable-kvm build by providing the
missing kvm_on_sigbus_vcpu kvm-stub.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-02-14 12:39:45 -02:00
Anthony Liguori
b3a98367ee Merge remote branch 'qemu-kvm/uq/master' into staging
aliguori: fix build with !defined(KVM_CAP_ASYNC_PF)

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-02-01 15:32:18 -06:00
Jan Kiszka
c5999bfcfd kvm: x86: Only read/write MSR_KVM_ASYNC_PF_EN if supported
If the kernel does not support KVM_CAP_ASYNC_PF, it also does not know
about the related MSR. So skip it during state synchronization in that
case. Fixes annoying kernel warnings.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:22 -02:00
Jan Kiszka
d8f771d912 kvm: x86: Implicitly clear nmi_injected/pending on reset
All CPUX86State variables before CPU_COMMON are automatically cleared on
reset. Reorder nmi_injected and nmi_pending to avoid having to touch
them explicitly.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:22 -02:00
Jan Kiszka
110761987d kvm: x86: Rework identity map and TSS setup for larger BIOS sizes
In order to support loading BIOSes > 256K, reorder the code, adjusting
the base if the kernel supports moving the identity map.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:22 -02:00
Jan Kiszka
94a8d39afd kvm: Consolidate must-have capability checks
Instead of splattering the code with #ifdefs and runtime checks for
capabilities we cannot work without anyway, provide central test
infrastructure for verifying their availability both at build and
runtime.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:22 -02:00
Jan Kiszka
cad1e2827b kvm: Drop smp_cpus argument from init functions
No longer used.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:21 -02:00
Jan Kiszka
521f0798d7 kvm: x86: Fix !CONFIG_KVM_PARA build
If we lack kvm_para.h, MSR_KVM_ASYNC_PF_EN is not defined. The change in
kvm_arch_init_vcpu is just for consistency reasons.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:21 -02:00
Jan Kiszka
ff5c186b8b kvm: x86: Reset paravirtual MSRs
Make sure to write the cleared MSR_KVM_SYSTEM_TIME, MSR_KVM_WALL_CLOCK,
and MSR_KVM_ASYNC_PF_EN to the kernel state so that a freshly booted
guest cannot be disturbed by old values.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
CC: Glauber Costa <glommer@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:21 -02:00
Jan Kiszka
c3a3a7d356 kvm: x86: Refactor msr_star/hsave_pa setup and checks
Simplify kvm_has_msr_star/hsave_pa to booleans and push their one-time
initialization into kvm_arch_init. Also handle potential errors of that
setup procedure.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:21 -02:00
Jan Kiszka
1a5e9d2faf kvm: x86: Fix xcr0 reset mismerge
For unknown reasons, xcr0 reset ended up in kvm_arch_update_guest_debug
on upstream merge. Fix this and also remove the misleading comment (1 is
THE reset value).

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:21 -02:00
Jan Kiszka
3390e7f797 kvm: x86: Remove redundant mp_state initialization
kvm_arch_reset_vcpu initializes mp_state, and that function is invoked
right after kvm_arch_init_vcpu.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:21 -02:00
Jan Kiszka
c14750e8ad kvm: x86: Prepare kvm_get_mp_state for in-kernel irqchip
This code path will not yet be taken as we still lack in-kernel irqchip
support. But qemu-kvm can already make use of it and drop its own
mp_state access services.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:21 -02:00
Jan Kiszka
0d75a9ecd7 kvm: x86: Align kvm_arch_put_registers code with comment
The ordering doesn't matter in this case, but better keep it consistent.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:21 -02:00
Jan Kiszka
bb44e0d12d kvm: Improve reporting of fatal errors
Report KVM_EXIT_UNKNOWN, KVM_EXIT_FAIL_ENTRY, and KVM_EXIT_EXCEPTION
with more details to stderr. The latter two are so far x86-only, so move
them into the arch-specific handler. Integrate the Intel real mode
warning on KVM_EXIT_FAIL_ENTRY that qemu-kvm carries, but actually
restrict it to Intel CPUs. Moreover, always dump the CPU state in case
we fail.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:20 -02:00
Jan Kiszka
73aaec4a39 kvm: Stop on all fatal exit reasons
Ensure that we stop the guest whenever we face a fatal or unknown exit
reason. If we stop, we also have to enforce a cpu loop exit.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:20 -02:00
Jan Kiszka
646042e1ab kvm: x86: Swallow KVM_EXIT_SET_TPR
This exit only triggers activity in the common exit path, but we should
accept it in order to be able to detect unknown exit types.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:20 -02:00
Jan Kiszka
b9bec74bcb kvm: x86: Fix a few coding style violations
No functional changes.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-21 14:05:22 -02:00
Jan Kiszka
95c077c919 kvm: x86: Prevent sign extension of DR7 in guest debugging mode
This unbreaks guest debugging when the 4th hardware breakpoint used for
guest debugging is a watchpoint of 4 or 8 byte lenght. The 31st bit of
DR7 is set in that case and used to cause a sign extension to the high
word which was breaking the guest state (vm entry failure).

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-21 14:05:22 -02:00
Jan Kiszka
3980e3024b kvm: x86: Remove obsolete SS.RPL/DPL aligment
This seems to date back to the days KVM didn't support real mode. The
check is no longer needed and, even worse, is corrupting the guest state
in case SS.RPL != DPL.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-21 14:05:22 -02:00
Jan Kiszka
acaa75507b kvm: x86: Fix DPL write back of segment registers
The DPL is stored in the flags and not in the selector. In fact, the RPL
may differ from the DPL at some point in time, and so we were corrupting
the guest state so far.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-21 14:05:22 -02:00
Jin Dongming
7cc2cc3e26 kvm: introduce kvm_inject_x86_mce_on
Pass a table instead of multiple args.

Note:

    kvm_inject_x86_mce(env, bank, status, mcg_status, addr, misc,
                       abort_on_error);

is equal to:

    struct kvm_x86_mce mce = {
        .bank = bank,
        .status = status,
        .mcg_status = mcg_status,
        .addr = addr,
        .misc = misc,
    };
    kvm_inject_x86_mce_on(env, &mce, abort_on_error);

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-21 14:05:22 -02:00
Jin Dongming
e387c33892 kvm: kvm_mce_inj_* subroutines for templated error injections
Refactor codes for maintainability.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-21 14:05:22 -02:00
Jin Dongming
6643e2f001 kvm: introduce kvm_mce_in_progress
Share same error handing, and rename this function after
MCIP (Machine Check In Progress) flag.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-21 14:05:22 -02:00
Jin Dongming
2bd3e04c3b Add function for checking mca broadcast of CPU
Add function for checking whether current CPU support mca broadcast.

Signed-off-by: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-21 14:05:22 -02:00
Jin Dongming
31ce5e0c49 Add "broadcast" option for mce command
When the following test case is injected with mce command, maybe user could not
get the expected result.
    DATA
               command cpu bank status             mcg_status  addr   misc
        (qemu) mce     1   1    0xbd00000000000000 0x05        0x1234 0x8c

    Expected Result
           panic type: "Fatal Machine check"

That is because each mce command can only inject the given cpu and could not
inject mce interrupt to other cpus. So user will get the following result:
    panic type: "Fatal machine check on current CPU"

"broadcast" option is used for injecting dummy data into other cpus. Injecting
mce with this option the expected result could be gotten.

Usage:
    Broadcast[on]
           command broadcast cpu bank status             mcg_status  addr   misc
    (qemu) mce     -b        1   1    0xbd00000000000000 0x05        0x1234 0x8c

    Broadcast[off]
           command cpu bank status             mcg_status  addr   misc
    (qemu) mce     1   1    0xbd00000000000000 0x05        0x1234 0x8c

Signed-off-by: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-21 14:05:22 -02:00
Lai Jiangshan
14a095184c kvm: convert kvm_ioctl(KVM_CHECK_EXTENSION) to kvm_check_extension()
simple cleanup and use existing helper: kvm_check_extension().

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-21 14:05:21 -02:00
Lai Jiangshan
276ce81563 kvm: Enable user space NMI injection for kvm guest
Make use of the new KVM_NMI IOCTL to send NMIs into the KVM guest if the
user space raised them. (example: qemu monitor's "nmi" command)

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-21 14:05:21 -02:00
Gleb Natapov
f6584ee203 Add support for async page fault to qemu
Add save/restore of MSR for migration and cpuid bit.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-11-16 08:40:07 -06:00
Marcelo Tosatti
25d2e3613d kvm: save/restore x86-64 MSRs on x86-64 kernels
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-10-21 13:37:41 -02:00
Marcelo Tosatti
384331a61a kvm: writeback SMP TSCs on migration only
commit 6389c45441269baa2873e6feafebd17105ddeaf6
Author: Jan Kiszka <jan.kiszka@siemens.com>
Date:   Mon Mar 1 18:17:26 2010 +0100

    qemu-kvm: Cleanup/fix TSC and PV clock writeback

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-10-21 13:37:40 -02:00
Marcelo Tosatti
75b10c4336 kvm: factor out kvm_has_msr_star
And add kvm_has_msr_hsave_pa(), to avoid warnings on older
kernels without support.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-10-21 13:37:29 -02:00
Marcelo Tosatti
aa851e365b kvm: add save/restore of MSR_VM_HSAVE_PA
commit 2bba4446746add456ceeb0e8359a43032a2ea333
Author: Alexander Graf <agraf@suse.de>
Date:   Thu Dec 18 15:38:32 2008 +0100

    Enable nested SVM support in userspace

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-10-21 13:37:17 -02:00
Hidetoshi Seto
d8da8574b1 Fix build on !KVM_CAP_MCE
This patch removes following warnings:

target-i386/kvm.c: In function 'kvm_put_msrs':
target-i386/kvm.c:782: error: unused variable 'i'
target-i386/kvm.c: In function 'kvm_get_msrs':
target-i386/kvm.c:1083: error: label at end of compound statement

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-10-21 13:12:14 -02:00
Hidetoshi Seto
f71ac88fe9 x86, mce: broadcast mce depending on the cpu version
There is no reason why SRAO event received by the main thread
is the only one that being broadcasted.

According to the x86 ASDM vol.3A 15.10.4.1,
MCE signal is broadcast on processor version 06H_EH or later.

This change is required to handle SRAR in smp guests.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-10-21 13:11:38 -02:00
Hidetoshi Seto
f8502cfbbf x86, mce: ignore SRAO only when MCG_SER_P is available
And restruct this block to call kvm_mce_in_exception() only when it is
required.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-10-21 13:09:06 -02:00
Marcelo Tosatti
0f53994f7a Fix memory leak in register save load due to xsave support
From: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-10-20 16:15:04 -05:00
Marcelo Tosatti
577804958a Add savevm/loadvm support for MCE
Port qemu-kvm's

commit 1bab5d11545d8de5facf46c28630085a2f9651ae
Author: Huang Ying <ying.huang@intel.com>
Date:   Wed Mar 3 16:52:46 2010 +0800

    Add savevm/loadvm support for MCE

    MCE registers are saved/load into/from CPUState in
    kvm_arch_save/load_regs. To simulate the MCG_STATUS clearing upon
    reset, MSR_MCG_STATUS is set to 0 for KVM_PUT_RESET_STATE.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-20 16:15:04 -05:00
Marcelo Tosatti
c0532a76b4 MCE: Relay UCR MCE to guest
Port qemu-kvm's

commit 4b62fff1101a7ad77553147717a8bd3bf79df7ef
Author: Huang Ying <ying.huang@intel.com>
Date:   Mon Sep 21 10:43:25 2009 +0800

    MCE: Relay UCR MCE to guest

    UCR (uncorrected recovery) MCE is supported in recent Intel CPUs,
    where some hardware error such as some memory error can be reported
    without PCC (processor context corrupted). To recover from such MCE,
    the corresponding memory will be unmapped, and all processes accessing
    the memory will be killed via SIGBUS.

    For KVM, if QEMU/KVM is killed, all guest processes will be killed
    too. So we relay SIGBUS from host OS to guest system via a UCR MCE
    injection. Then guest OS can isolate corresponding memory and kill
    necessary guest processes only. SIGBUS sent to main thread (not VCPU
    threads) will be broadcast to all VCPU threads as UCR MCE.

aliguori: fix build

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-10-20 16:15:04 -05:00
Marcelo Tosatti
e7701825e1 kvm: x86: add mce support
Port qemu-kvm's MCE support

commit c68b2374c9048812f488e00ffb95db66c0bc07a7
Author: Huang Ying <ying.huang@intel.com>
Date:   Mon Jul 20 10:00:53 2009 +0800

    Add MCE simulation support to qemu/kvm

    KVM ioctls are used to initialize MCE simulation and inject MCE. The
    real MCE simulation is implemented in Linux kernel. The Kernel part
    has been merged.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-20 16:15:04 -05:00
Joerg Roedel
296acb643b Add svm cpuid features
This patch adds the svm cpuid feature flags to the qemu
intialization path. It also adds the svm features available
on phenom to its cpu-definition and extends the host cpu
type to support all svm features KVM can provide.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-20 16:15:04 -05:00
Sheng Yang
408392b30d kvm: Fix cpu_is_bsp() compilation warning
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-06-28 13:06:03 -03:00
Marcelo Tosatti
ddced198cd kvm: init mp_state
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-06-28 13:06:03 -03:00
Sheng Yang
f1665b21f1 kvm: Enable XSAVE live migration support
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-06-28 13:06:03 -03:00
Sheng Yang
c958a8bd9b kvm: Extend kvm_arch_get_supported_cpuid() to support index
Would use it later for XSAVE related CPUID.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-06-28 13:06:03 -03:00
Blue Swirl
4a942ceac7 apic: avoid passing CPUState from CPU code
Pass only APICState when accessing APIC from CPU code.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-06-19 10:42:31 +03:00
Gleb Natapov
c1667e4097 kvm: fix 80000001.EDX supported bit filtering
On AMD some bits from 1.EDX are reported in 80000001.EDX. The mask used
to copy bits from 1.EDX to 80000001.EDX is incorrect resulting in
unsupported features passed into a guest.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-11 17:07:36 -03:00
Gleb Natapov
4513d9232b Do not stop VM if emulation failed in userspace.
Continue vcpu execution in case emulation failure happened while vcpu
was in userspace. In this case #UD will be injected into the guest
allowing guest OS to kill offending process and continue.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-11 14:03:44 -03:00
Marcelo Tosatti
0af691d779 kvm: enable smp > 1
Process INIT/SIPI requests and enable -smp > 1.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-11 14:02:22 -03:00
Jan Kiszka
dbaa07c404 kvm: validate context for kvm cpu get/put operations
Validate that KVM vcpu state is only read/written from cpu thread itself
or that cpu is stopped.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-11 14:02:22 -03:00
Sheng Yang
204204308b kvm: allow qemu to set EPT identity mapping address
If we use larger BIOS image than current 256KB, we would need move reserved
TSS and EPT identity mapping pages. Currently TSS support this, but not
EPT.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-04-26 11:28:35 -03:00
Jan Kiszka
ff44f1a373 KVM: x86: Add debug register saving and restoring
Make use of the new KVM_GET/SET_DEBUGREGS to save/restore the x86 debug
registers.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-04-26 11:28:35 -03:00
Blue Swirl
8c0d577eb5 kvm: avoid collision with dprintf macro in stdio.h, spotted by clang
Fixes clang errors:
  CC    i386-softmmu/kvm.o
/src/qemu/target-i386/kvm.c:40:9: error: 'dprintf' macro redefined
In file included from /src/qemu/target-i386/kvm.c:21:
In file included from /src/qemu/qemu-common.h:27:
In file included from /usr/include/stdio.h:910:
/usr/include/bits/stdio2.h:189:12: note: previous definition is here
  CC    i386-softmmu/kvm-all.o
/src/qemu/kvm-all.c:39:9: error: 'dprintf' macro redefined
In file included from /src/qemu/kvm-all.c:23:
In file included from /src/qemu/qemu-common.h:27:
In file included from /usr/include/stdio.h:910:
/usr/include/bits/stdio2.h:189:12: note: previous definition is here

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-04-18 14:22:14 +00:00
Andre Przywara
457dfed614 x86/cpuid: remove unnecessary kvm_trim function
Correct me if I am wrong, but kvm_trim looks like a really bloated
implementation of a bitwise AND. So remove this function and replace
it with the real stuff(TM).

Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2010-03-13 16:50:54 +01:00
Jan Kiszka
ea64305139 KVM: x86: Restrict writeback of VCPU state
Do not write nmi_pending, sipi_vector, and mpstate unless we at least go
through a reset. And TSC as well as KVM wallclocks should only be
written on full sync, otherwise we risk to drop some time on state
read-modify-write.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-03-04 00:29:30 -03:00
Jan Kiszka
ea375f9ab8 KVM: Rework VCPU state writeback API
This grand cleanup drops all reset and vmsave/load related
synchronization points in favor of four(!) generic hooks:

- cpu_synchronize_all_states in qemu_savevm_state_complete
  (initial sync from kernel before vmsave)
- cpu_synchronize_all_post_init in qemu_loadvm_state
  (writeback after vmload)
- cpu_synchronize_all_post_init in main after machine init
- cpu_synchronize_all_post_reset in qemu_system_reset
  (writeback after system reset)

These writeback points + the existing one of VCPU exec after
cpu_synchronize_state map on three levels of writeback:

- KVM_PUT_RUNTIME_STATE (during runtime, other VCPUs continue to run)
- KVM_PUT_RESET_STATE   (on synchronous system reset, all VCPUs stopped)
- KVM_PUT_FULL_STATE    (on init or vmload, all VCPUs stopped as well)

This level is passed to the arch-specific VCPU state writing function
that will decide which concrete substates need to be written. That way,
no writer of load, save or reset functions that interact with in-kernel
KVM states will ever have to worry about synchronization again. That
also means that a lot of reasons for races, segfaults and deadlocks are
eliminated.

cpu_synchronize_state remains untouched, just as Anthony suggested. We
continue to need it before reading or writing of VCPU states that are
also tracked by in-kernel KVM subsystems.

Consequently, this patch removes many cpu_synchronize_state calls that
are now redundant, just like remaining explicit register syncs.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-03-04 00:29:28 -03:00
Jan Kiszka
b0b1d69079 KVM: Rework of guest debug state writing
So far we synchronized any dirty VCPU state back into the kernel before
updating the guest debug state. This was a tribute to a deficite in x86
kernels before 2.6.33. But as this is an arch-dependent issue, it is
better handle in the x86 part of KVM and remove the writeback point for
generic code. This also avoids overwriting the flushed state later on if
user space decides to change some more registers before resuming the
guest.

We furthermore need to reinject guest exceptions via the appropriate
mechanism. That is KVM_SET_GUEST_DEBUG for older kernels and
KVM_SET_VCPU_EVENTS for recent ones. Using both mechanisms at the same
time will cause state corruptions.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-03-04 00:29:26 -03:00
Paul Brook
35bed8ee91 Move ioport.h out of cpu-all.h
Only include ioport.h where it is actually needed.

Signed-off-by: Paul Brook <paul@codesourcery.com>
2010-03-01 03:29:21 +00:00
Anthony Liguori
724c689357 Merge remote branch 'qemu-kvm/uq/master' into staging 2010-02-22 16:16:22 -06:00
Jes Sorensen
62f734a0d5 kvm: Kill CR3_CACHE feature references
Remove all references to KVM_CR3_CACHE as it was never implemented.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-02-21 11:41:07 +02:00
Jes Sorensen
4c5b10b7b6 QEMU e820 reservation patch
Hi,

Kevin and I have agreed on the approach for this one now. So here is
the latest version of the patch for QEMU, submitting e820 reservation
entries via fw_cfg.

Cheers,
Jes

Use qemu-cfg to provide the BIOS with an optional table of e820 entries.

Notify the BIOS of the location of the TSS+EPT range to by reserving
it via the e820 table.

This matches a corresponding patch for Seabios, however older versions
of Seabios will default to the hardcoded address range and stay
compatible with current QEMU.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-02-19 15:53:54 -06:00
Jan Kiszka
19ccb8ea17 KVM: x86: Fix up misreported CPU features
From qemu-kvm: Kernels before 2.6.30 misreported some essential CPU
features via KVM_GET_SUPPORTED_CPUID. Fix them up.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
2010-02-03 19:47:34 -02:00
Jan Kiszka
aee028b95d KVM: Request setting of nmi_pending and sipi_vector
The final version of VCPU events in 2.6.33 will allow to skip
nmi_pending and sipi_vector on KVM_SET_VCPU_EVENTS. For now let's write
them unconditionally, which is unproblematic for upstream due to missing
SMP support. Future version which enable SMP will write them only on
reset.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-02-03 19:47:33 -02:00
Gleb Natapov
bb0300dc57 Add KVM paravirt cpuid leaf
Initialize KVM paravirt cpuid leaf and allow user to control guest
visible PV features through -cpu flag.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-13 17:14:16 -06:00
Gleb Natapov
e73223a584 remove pending exception on vcpu reset.
Without this qemu can even start on kvm modules with events support
since default value of exception_injected in zero and this is #DE
exception.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-11 09:56:30 -06:00
Jan Kiszka
31827373f0 kvm: x86: Use separate exception_injected CPUState field
Marcelo correctly remarked that there are usage conflicts between QEMU
core code and KVM /wrt exception_index. So spend a separate field and
also save/restore it properly.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-12-18 11:26:25 -06:00
Glauber Costa
1a03675db1 v2: properly save kvm system time msr registers
Currently, the msrs involved in setting up pvclock are not saved over
migration and/or save/restore. This patch puts their value in special
fields in our CPUState, and deal with them using vmstate.

kvm also has to account for it, by including them in the msr list
for the ioctls.

This is a backport from qemu-kvm.git

[v2: sucessfully build without kerneldir ]

Signed-off-by: Glauber Costa <glommer@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-12-12 07:59:49 -06:00
Jan Kiszka
6fb6d24554 kvm: x86: Fix initial kvm_has_msr_star
KVM_GET_MSR_INDEX_LIST returns -E2BIG when the provided space is too
small for all MSRs. But this is precisely the error we trigger with the
initial request in order to obtain that size. Do not fail in that case.

This caused a subtle corruption of the guest state as MSR_STAR was not
properly saved/restored. The corruption became visible with latest kvm
optimizing the MSR updates.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-12-07 08:45:47 -06:00
Jan Kiszka
a0fb002c64 kvm: x86: Add support for VCPU event states
This patch extends the qemu-kvm state sync logic with support for
KVM_GET/SET_VCPU_EVENTS, giving access to yet missing exception,
interrupt and NMI states.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-12-03 15:25:57 -06:00
Jan Kiszka
5a2e3c2e11 kvm: x86: Fix merge artifact of f8d926e9 about mp_state
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-12-03 15:25:56 -06:00
Jan Kiszka
0e607a80d3 kvm: x86: Refactor use of interrupt_bitmap
Drop interrupt_bitmap from the cpustate and solely rely on the integer
interupt_injected. This prepares us for the new injected-interrupt
interface, which will deprecate the bitmap, while preserving
compatibility.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-11-17 08:49:37 -06:00
Jan Kiszka
caa5af0ff3 kvm: Add arch reset handler
Will be required by succeeding changes.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-11-17 08:49:37 -06:00
Hollis Blanchard
9bdbe550f0 kvm: Move KVM mp_state accessors to i386-specific code
Unbreaks PowerPC and S390 KVM builds.

Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-11-12 11:23:55 -06:00
Juan Quintela
3897293825 gcc wants 1st static and then const
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2009-09-25 19:52:06 +00:00
Jan Kiszka
d9db889faf kvm: Work around borken MSR_GET_INDEX_LIST
Allocate enough memory for KVM_GET_MSR_INDEX_LIST as older kernels shot
far beyond their limits, corrupting user space memory.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-07-10 13:44:30 -05:00
Andre Przywara
6c0d7ee895 preserve the hypervisor bit while KVM trims the CPUID bits
The KVM kernel will disable all bits in CPUID which are not present in
the host. As this is mostly true for the hypervisor bit (1.ecx),
preserve its value before the trim and restore it afterwards.

Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-06-29 13:52:12 -05:00
Andre Przywara
6c1f42fe83 fix KVMs GET_SUPPORTED_CPUID feature usage
If we want to trim the user provided CPUID bits for KVM to be not greater
than that of the host, we should not remove the bits _after_ we sent
them to the kernel.
This fixes the masking of features that are not present on the host by
moving the trim function and it's call from helper.c to kvm.c.
It helps to use -cpu host.

Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-06-29 13:52:11 -05:00
Jan Kiszka
f8d926e9cd kvm: x86: Save/restore KVM-specific CPU states
Save and restore all so far neglected KVM-specific CPU states. Handling
the TSC stabilizes migration in KVM mode. The interrupt_bitmap and
mp_state are currently unused, but will become relevant for in-kernel
irqchip support. By including proper saving/restoring already, we avoid
having to increment CPU_SAVE_VERSION later on once again.

v2:
 - initialize mp_state runnable (for the boot CPU)

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-05-22 10:50:34 -05:00
Mark McLoughlin
76ae317f7c kvm: work around supported cpuid ioctl() brokenness
KVM_GET_SUPPORTED_CPUID has been known to fail to return -E2BIG
when it runs out of entries. Detect this by always trying again
with a bigger table if the ioctl() fills the table.

Signed-off-by: Mark McLoughlin <markmc@redhat.com>
2009-05-20 09:12:57 -05:00
Avi Kivity
b827df585b kvm: Add support for querying supported cpu features
kvm does not support all cpu features; add support for dunamically querying
the supported feature set.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-05-08 15:42:52 -05:00
aliguori
a33609cae0 kvm: Fix cpuid initialization (Jan Kiszka)
Fix (more or less) spurious guest boot failures due to corrupted cpuid
states. The reason was insufficient initialization of cpuid entries
before passing them to the kernel.

At this chance also fix improper entry pointer progression and simplify
the code a bit.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>


git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@7167 c046a42c-6fe2-441c-8c8c-71466251a162
2009-04-17 20:50:54 +00:00
aliguori
64bf3f4eec kvm: Drop kvm_patch_opcode_byte (Jan Kiszka)
As cpu_memory_rw_debug is now capable of modifying ROM, we can drop our
own patch function.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>


git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6906 c046a42c-6fe2-441c-8c8c-71466251a162
2009-03-28 17:51:40 +00:00
aliguori
e22a25c936 Guest debugging support for KVM (Jan Kiszka)
This is a backport of the guest debugging support for the KVM
accelerator that is now part of the KVM tree. It implements the reworked
KVM kernel API for guest debugging (KVM_CAP_SET_GUEST_DEBUG) which is
not yet part of any mainline kernel but will probably be 2.6.30 stuff.
So far supported is x86, but PPC is expected to catch up soon.

Core features are:
 - unlimited soft-breakpoints via code patching
 - hardware-assisted x86 breakpoints and watchpoints

Changes in this version:
 - use generic hook cpu_synchronize_state to transfer registers between
   user space and kvm
 - push kvm_sw_breakpoints into KVMState

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>


git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6825 c046a42c-6fe2-441c-8c8c-71466251a162
2009-03-12 20:12:48 +00:00
aliguori
b136a4c6ef KVM: cpuid function 2: store all values (Amit Shah)
Incrementing the array index was missed in the previous series which causes
us to not store all the values.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>


git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6617 c046a42c-6fe2-441c-8c8c-71466251a162
2009-02-11 21:00:43 +00:00
aliguori
a36b1029e2 KVM: Get all cpuid values from function 2 (Amit Shah)
cpuid function 2 can have multiple values to describe cache behaviour.
Loop till we have fetched all the values.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>


git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6567 c046a42c-6fe2-441c-8c8c-71466251a162
2009-02-09 15:50:36 +00:00
aliguori
486bd5a2f2 KVM: Fetch sub-leaf cpuid values for functions 4, 0xb, 0xd. (Amit Shah)
CPUID functions 4, 0xb and 0xd have sub-leaf values which depend on the
input value of ECX. Store these values as well.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>


git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6566 c046a42c-6fe2-441c-8c8c-71466251a162
2009-02-09 15:50:31 +00:00
aliguori
e00b6f8099 KVM: CPUID takes ecx as input value for some functions (Amit Shah)
The CPUID instruction takes the value of ECX as an input parameter
in addition to the value of EAX as the count for functions 4, 0xb
and 0xd. Make sure we pass the value to the instruction.

Also convert to the qemu-style whitespace for the surrounding code.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>


git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6565 c046a42c-6fe2-441c-8c8c-71466251a162
2009-02-09 15:50:08 +00:00
aliguori
0d0266a53b targets: remove error handling from qemu_malloc() callers (Avi Kivity)
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>


git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6530 c046a42c-6fe2-441c-8c8c-71466251a162
2009-02-05 22:06:11 +00:00
aliguori
8c6c919e28 kvm-x86: Remove eflags conversion into emulator format (Jan Kiszka)
It seems that the conversion of the kernel-delivered eflags state into
qemu's internal split representation was once needed in an older kvm
design (register read-back may have taken place from inside cpu_exec).
Today it is plain wrong and causes incorrect cpu state reporting (gdb,
monitor) and should also corrupt its saving (savevm, migration). Drop
the related lines.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>


git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6461 c046a42c-6fe2-441c-8c8c-71466251a162
2009-01-26 20:32:22 +00:00
aliguori
55308450d4 Initialize msr list size properly in KVM
Hollis Blanchard noticed that the last commit was not sufficient.  We also need
to initialize the msr size in our newly allocated list.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>



git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6018 c046a42c-6fe2-441c-8c8c-71466251a162
2008-12-13 20:49:31 +00:00
aliguori
4c9f7372fc Correctly initialize msr list in KVM
I believe this was spotted by Gerd Hoffman but I can't find his patch 
now.  This will cause very subtle corruption on the heap because we 
don't allocate the appropriately sized buffer.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>



git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6017 c046a42c-6fe2-441c-8c8c-71466251a162
2008-12-13 20:41:58 +00:00
aliguori
40a2d7050a Fix crash in kvm.c (Stefan Weil)
Fix crash with kvm enabled.

Signed-off-by: Stefan Weil <weil@mail.berlios.de> 
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>



git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5984 c046a42c-6fe2-441c-8c8c-71466251a162
2008-12-11 21:03:10 +00:00
aliguori
984b518147 Define kvm_ioctl in the same way as ioctl
The third argument to ioctl is a ... which allows any value to be passed.  In
practice, glibc always treats the argument as a void *.

Do the same thing for the kvm ioctls to keep things consistent with a
traditional ioctl.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>



git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5715 c046a42c-6fe2-441c-8c8c-71466251a162
2008-11-13 19:21:00 +00:00
aliguori
053304480a Add missing files to KVM commit.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>



git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5629 c046a42c-6fe2-441c-8c8c-71466251a162
2008-11-05 16:29:27 +00:00