Commit Graph

297 Commits

Author SHA1 Message Date
Igor Mammedov
f8bb056564 target-i386: kvm: prevent buffer overflow if -cpu foo, [x]level is too big
Stack corruption may occur if too big 'level' or 'xlevel' values passed
on command line with KVM enabled, due to limited size of cpuid_data
in kvm_arch_init_vcpu().

reproduces with:
 qemu -enable-kvm -cpu qemu64,level=4294967295
or
 qemu -enable-kvm -cpu qemu64,xlevel=4294967295

Check if there is space in cpuid_data before passing it to cpu_x86_cpuid()
or abort() if there is not space.

Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Andreas Faerber <afaerber@suse.de>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-01-29 08:57:56 +02:00
Eduardo Habkost
83b17af5e6 target-i386: kvm: Set vcpu_id to APIC ID instead of CPU index
The CPU ID in KVM is supposed to be the APIC ID, so change the
KVM_CREATE_VCPU call to match it. The current behavior didn't break
anything yet because today the APIC ID is assumed to be equal to the CPU
index, but this won't be true in the future.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Acked-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-01-27 14:34:26 +01:00
Eduardo Habkost
b164e48ed1 kvm: Create kvm_arch_vcpu_id() function
This will allow each architecture to define how the VCPU ID is set on
the KVM_CREATE_VCPU ioctl call.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-01-27 14:34:26 +01:00
Anthony Liguori
34daffa048 Merge remote-tracking branch 'qemu-kvm/uq/master' into staging
* qemu-kvm/uq/master:
  qemu-kvm/pci-assign: 64 bits bar emulation
  target-i386: Enabling IA32_TSC_ADJUST for QEMU KVM guest VMs

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-01-02 08:01:54 -06:00
Andreas Färber
501a7ce727 Merge branch 'master' of git://git.qemu.org/qemu into qom-cpu
Adapt header include paths.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2012-12-23 00:40:49 +01:00
Andreas Färber
f7575c96c6 cpu: Move kvm_run into CPUState
Pass CPUState / {X86,S390}CPU to helper functions.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2012-12-19 14:09:32 +01:00
Andreas Färber
a60f24b56b cpu: Move kvm_state field into CPUState
Adapt some functions to take CPUState / {PowerPC,S390}CPU argument.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2012-12-19 14:09:32 +01:00
Andreas Färber
1bc22652d6 kvm: Pass CPUState to kvm_vcpu_ioctl()
Adapt helper functions to pass X86CPU / PowerPCCPU / S390CPU.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2012-12-19 14:09:31 +01:00
Andreas Färber
20d695a925 kvm: Pass CPUState to kvm_arch_*
Move kvm_vcpu_dirty field into CPUState to simplify things and change
its type to bool while at it.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2012-12-19 14:09:31 +01:00
Paolo Bonzini
9c17d615a6 softmmu: move include files to include/sysemu/
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19 08:32:45 +01:00
Paolo Bonzini
1de7afc984 misc: move include files to include/qemu/
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19 08:32:39 +01:00
Paolo Bonzini
022c62cbbc exec: move include files to include/exec/
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19 08:31:31 +01:00
Paolo Bonzini
077805fa92 janitor: do not rely on indirect inclusions of or from qemu-char.h
Various header files rely on qemu-char.h including qemu-config.h or
main-loop.h, but they really do not need qemu-char.h at all (particularly
interesting is the case of the block layer!).  Clean this up, and also
add missing inclusions of qemu-char.h itself.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19 08:29:52 +01:00
Michael S. Tsirkin
a2cb15b0dd pci: update all users to look in pci/
update all users so we can remove the makefile hack.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-12-17 13:02:26 +02:00
Will Auld
f28558d3d3 target-i386: Enabling IA32_TSC_ADJUST for QEMU KVM guest VMs
CPUID.7.0.EBX[1]=1 indicates IA32_TSC_ADJUST MSR 0x3b is supported

Basic design is to emulate the MSR by allowing reads and writes to the
hypervisor vcpu specific locations to store the value of the emulated MSRs.
In this way the IA32_TSC_ADJUST value will be included in all reads to
the TSC MSR whether through rdmsr or rdtsc.

As this is a new MSR that the guest may access and modify its value needs
to be migrated along with the other MRSs. The changes here are specifically
for recognizing when IA32_TSC_ADJUST is enabled in CPUID and code added
for migrating its value.

Signed-off-by: Will Auld <will.auld@intel.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-12-14 18:17:36 -02:00
Blue Swirl
4840552601 kvm: avoid using cpu_single_env
Pass around CPUArchState instead of using global cpu_single_env.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
2012-11-10 13:49:20 +00:00
Anthony Liguori
43552994c4 Merge remote-tracking branch 'qemu-kvm/uq/master' into staging
* qemu-kvm/uq/master: (28 commits)
  update-linux-headers.sh: Handle new kernel uapi/ directories
  target-i386: kvm_cpu_fill_host: use GET_SUPPORTED_CPUID
  target-i386: cpu: make -cpu host/check/enforce code KVM-specific
  target-i386: make cpu_x86_fill_host() void
  Emulate qemu-kvms -no-kvm option
  Issue warning when deprecated -tdf option is used
  Issue warning when deprecated drive parameter boot=on|off is used
  Use global properties to emulate -no-kvm-pit-reinjection
  Issue warning when deprecated -no-kvm-pit is used
  Use machine options to emulate -no-kvm-irqchip
  cirrus_vga: allow configurable vram size
  target-i386: Add missing kvm cpuid feature name
  i386: cpu: add missing CPUID[EAX=7,ECX=0] flag names
  i386: kvm: filter CPUID leaf 7 based on GET_SUPPORTED_CPUID, too
  i386: kvm: reformat filter_features_for_kvm() code
  i386: kvm: filter CPUID feature words earlier, on cpu.c
  i386: kvm: mask cpuid_ext4_features bits earlier
  i386: kvm: mask cpuid_kvm_features earlier
  i386: kvm: x2apic is not supported without in-kernel irqchip
  i386: kvm: set CPUID_EXT_TSC_DEADLINE_TIMER on kvm_arch_get_supported_cpuid()
  ...

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-11-01 11:12:50 -05:00
Andreas Färber
839b5630cd target-i386: Pass X86CPU to kvm_handle_halt()
Needed for moving interrupt_request and halted fields to CPUState.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2012-10-31 04:12:23 +01:00
Andreas Färber
23d02d9b4b target-i386: Pass X86CPU to kvm_get_mp_state()
Needed for moving halted field to CPUState.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2012-10-31 04:12:23 +01:00
Andreas Färber
8c5cf3b621 target-i386: Pass X86CPU to cpu_x86_inject_mce()
Needed for changing run_on_cpu() argument to CPUState.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2012-10-31 04:12:23 +01:00
Andreas Färber
bee615d4b9 target-i386: Pass X86CPU to kvm_mce_inject()
Needed for changing cpu_x86_inject_mce() argument to X86CPU.

Signed-off-by: Andreas Färber <afaerber@suse.de>
[AF: Rebased onto hwaddr]
2012-10-31 04:12:23 +01:00
Eduardo Habkost
bc74b7db86 i386: kvm: filter CPUID feature words earlier, on cpu.c
cpu.c contains the code that will check if all requested CPU features
are available, so the filtering of KVM features must be there, so we can
implement "check" and "enforce" properly.

The only point where kvm_arch_init_vcpu() is called on i386 is:

- cpu_x86_init()
  - x86_cpu_realize() (after cpu_x86_register() is called)
    - qemu_init_vcpu()
      - qemu_kvm_start_vcpu()
        - qemu_kvm_thread_fn() (on a new thread)
          - kvm_init_vcpu()
            - kvm_arch_init_vcpu()

With this patch, the filtering will be done earlier, at:
- cpu_x86_init()
  - cpu_x86_register() (before x86_cpu_realize() is called)

Also, the KVM CPUID filtering will now be done at the same place where
the TCG CPUID feature filtering is done. Later, the code can be changed
to use the same filtering code for the "check" and "enforce" modes, as
now the cpu.c code knows exactly which CPU features are going to be
exposed to the guest (and much earlier).

One thing I was worrying about when doing this is that
kvm_arch_get_supported_cpuid() depends on kvm_irqchip_in_kernel(), and
maybe the 'kvm_kernel_irqchip' global variable wasn't initialized yet at
CPU creation time. But kvm_kernel_irqchip is initialized during
kvm_init(), that is called very early (much earlier than the machine
init function), and kvm_init() is already a requirement to run the
GET_SUPPORTED_CPUID ioctl() (as kvm_init() initializes the kvm_state
global variable).

Side note: it would be nice to keep KVM-specific code inside kvm.c. The
problem is that properly implementing -cpu check/enforce code (that's
inside cpu.c) depends directly on the feature bit filtering done using
kvm_arch_get_supported_cpuid(). Currently -cpu check/enforce is broken
because it simply uses the host CPU feature bits instead of
GET_SUPPORTED_CPUID, and we need to fix that.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-10-30 23:39:49 -02:00
Eduardo Habkost
c9da8382c1 i386: kvm: mask cpuid_ext4_features bits earlier
This way all the filtering by GET_SUPPORTED_CPUID is being done at the
same place in the code.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-10-30 23:39:49 -02:00
Eduardo Habkost
ea85c9e456 i386: kvm: mask cpuid_kvm_features earlier
Instead of masking the KVM feature bits very late (while building the
KVM_SET_CPUID2 data), mask it out on env->cpuid_kvm_features, at the
same point where the other feature words are masked out.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-10-30 23:39:48 -02:00
Eduardo Habkost
41e5e76db0 i386: kvm: x2apic is not supported without in-kernel irqchip
This is necessary so that x2apic is not improperly enabled when the
in-kernel irqchip is disabled.

This won't generate a warning with "-cpu ...,check" because the current
check/enforce code is broken (it checks the host CPU data directly,
instead of using kvm_arch_get_supported_cpuid()), but it will be
eventually fixed to properly report the missing x2apic flag.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-10-30 23:39:48 -02:00
Eduardo Habkost
ac67ee260a i386: kvm: set CPUID_EXT_TSC_DEADLINE_TIMER on kvm_arch_get_supported_cpuid()
This moves the CPUID_EXT_TSC_DEADLINE_TIMER CPUID flag hacking from
kvm_arch_init_vcpu() to kvm_arch_get_supported_cpuid().

Full git grep for kvm_arch_get_supported_cpuid:

   kvm.h:uint32_t kvm_arch_get_supported_cpuid(KVMState *env, uint32_t function,
   target-i386/cpu.c:        x86_cpu_def->cpuid_7_0_ebx_features = kvm_arch_get_supported_cpuid(kvm_state, 0x7, 0, R_EBX);
   target-i386/cpu.c:            *eax = kvm_arch_get_supported_cpuid(s, 0xA, count, R_EAX);
   target-i386/cpu.c:            *ebx = kvm_arch_get_supported_cpuid(s, 0xA, count, R_EBX);
   target-i386/cpu.c:            *ecx = kvm_arch_get_supported_cpuid(s, 0xA, count, R_ECX);
   target-i386/cpu.c:            *edx = kvm_arch_get_supported_cpuid(s, 0xA, count, R_EDX);
   target-i386/cpu.c:            *eax = kvm_arch_get_supported_cpuid(s, 0xd, count, R_EAX);
   target-i386/cpu.c:            *ebx = kvm_arch_get_supported_cpuid(s, 0xd, count, R_EBX);
   target-i386/cpu.c:            *ecx = kvm_arch_get_supported_cpuid(s, 0xd, count, R_ECX);
   target-i386/cpu.c:            *edx = kvm_arch_get_supported_cpuid(s, 0xd, count, R_EDX);
   target-i386/kvm.c:uint32_t kvm_arch_get_supported_cpuid(KVMState *s, uint32_t function,
   target-i386/kvm.c:        cpuid_1_edx = kvm_arch_get_supported_cpuid(s, 1, 0, R_EDX);
   target-i386/kvm.c:    env->cpuid_features &= kvm_arch_get_supported_cpuid(s, 1, 0, R_EDX);
 * target-i386/kvm.c:    env->cpuid_ext_features &= kvm_arch_get_supported_cpuid(s, 1, 0, R_ECX);
   target-i386/kvm.c:    env->cpuid_ext2_features &= kvm_arch_get_supported_cpuid(s, 0x80000001,
   target-i386/kvm.c:    env->cpuid_ext3_features &= kvm_arch_get_supported_cpuid(s, 0x80000001,
   target-i386/kvm.c:    env->cpuid_svm_features  &= kvm_arch_get_supported_cpuid(s, 0x8000000A,
   target-i386/kvm.c:        kvm_arch_get_supported_cpuid(s, KVM_CPUID_FEATURES, 0, R_EAX);
   target-i386/kvm.c:            kvm_arch_get_supported_cpuid(s, 0xC0000001, 0, R_EDX);

Note that there is only one call for CPUID[1].ECX above (*), and it is
the one that gets hacked to include CPUID_EXT_TSC_DEADLINE_TIMER, so we
can simply make kvm_arch_get_supported_cpuid() set it, to let the rest
of the code know the flag can be safely set by QEMU.

One thing I was worrying about when doing this is that now
kvm_arch_get_supported_cpuid() depends on kvm_irqchip_in_kernel(). But
the 'kvm_kernel_irqchip' global variable is initialized during
kvm_init(), that is called very early, and kvm_init() is already a
requirement to run the GET_SUPPORTED_CPUID ioctl() (as kvm_init() is the
function that initializes the 'kvm_state' global variable).

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-10-30 23:39:48 -02:00
Eduardo Habkost
84bd945cf2 i386: kvm: set CPUID_EXT_HYPERVISOR on kvm_arch_get_supported_cpuid()
Full grep for kvm_arch_get_supported_cpuid:

   kvm.h:uint32_t kvm_arch_get_supported_cpuid(KVMState *env, uint32_t function,
   target-i386/cpu.c:        x86_cpu_def->cpuid_7_0_ebx_features = kvm_arch_get_supported_cpuid(kvm_state, 0x7, 0, R_EBX);
   target-i386/cpu.c:            *eax = kvm_arch_get_supported_cpuid(s, 0xA, count, R_EAX);
   target-i386/cpu.c:            *ebx = kvm_arch_get_supported_cpuid(s, 0xA, count, R_EBX);
   target-i386/cpu.c:            *ecx = kvm_arch_get_supported_cpuid(s, 0xA, count, R_ECX);
   target-i386/cpu.c:            *edx = kvm_arch_get_supported_cpuid(s, 0xA, count, R_EDX);
   target-i386/cpu.c:            *eax = kvm_arch_get_supported_cpuid(s, 0xd, count, R_EAX);
   target-i386/cpu.c:            *ebx = kvm_arch_get_supported_cpuid(s, 0xd, count, R_EBX);
   target-i386/cpu.c:            *ecx = kvm_arch_get_supported_cpuid(s, 0xd, count, R_ECX);
   target-i386/cpu.c:            *edx = kvm_arch_get_supported_cpuid(s, 0xd, count, R_EDX);
   target-i386/kvm.c:uint32_t kvm_arch_get_supported_cpuid(KVMState *s, uint32_t function,
   target-i386/kvm.c:        cpuid_1_edx = kvm_arch_get_supported_cpuid(s, 1, 0, R_EDX);
   target-i386/kvm.c:    env->cpuid_features &= kvm_arch_get_supported_cpuid(s, 1, 0, R_EDX);
 * target-i386/kvm.c:    env->cpuid_ext_features &= kvm_arch_get_supported_cpuid(s, 1, 0, R_ECX);
   target-i386/kvm.c:    env->cpuid_ext2_features &= kvm_arch_get_supported_cpuid(s, 0x80000001,
   target-i386/kvm.c:    env->cpuid_ext3_features &= kvm_arch_get_supported_cpuid(s, 0x80000001,
   target-i386/kvm.c:    env->cpuid_svm_features  &= kvm_arch_get_supported_cpuid(s, 0x8000000A,
   target-i386/kvm.c:        kvm_arch_get_supported_cpuid(s, KVM_CPUID_FEATURES, 0, R_EAX);
   target-i386/kvm.c:            kvm_arch_get_supported_cpuid(s, 0xC0000001, 0, R_EDX);

Note that there is only one call for CPUID[1].ECX above (*), and it is
the one that gets hacked to include CPUID_EXT_HYPERVISOR, so we can
simply make kvm_arch_get_supported_cpuid() set it, to let the rest of
the code automatically know that the flag can be safely set by QEMU.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-10-30 23:39:47 -02:00
Eduardo Habkost
c2acb022c8 i386: kvm: kvm_arch_get_supported_cpuid: replace if+switch with single 'if'
Additional fixups will be added, and making them a single 'if/else if'
chain makes it clearer than two nested switch statements.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-10-30 23:39:47 -02:00
Eduardo Habkost
dd87f8a690 i386: kvm: extract try_get_cpuid() loop to get_supported_cpuid() function
No behavior change, just code movement.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-10-30 23:39:46 -02:00
Eduardo Habkost
4fb73f1d3b i386: kvm: extract CPUID entry lookup to cpuid_find_entry() function
No behavior change, just code movement.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-10-30 23:39:46 -02:00
Eduardo Habkost
829ae2f9fa i386: kvm: extract register switch to cpuid_entry_get_reg() function
No behavior change: just code movement.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-10-30 23:39:46 -02:00
Eduardo Habkost
47111e2cfa i386: kvm: kvm_arch_get_supported_cpuid: use 'entry' variable
The reg switch will be moved to a separate function, so store the entry
pointer in a variable.

No behavior change, just code movement.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-10-30 23:39:46 -02:00
Eduardo Habkost
8c723b7958 i386: kvm: kvm_arch_get_supported_cpuid: clean up has_kvm_features check
Instead of a function-specific has_kvm_features variable, simply use a
"found" variable that will be checked in case we have to use the legacy
get_para_features() interface.

No behavior change, just code cleanup.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-10-30 23:39:46 -02:00
Eduardo Habkost
7b46e5ce81 i386: kvm: kvm_arch_get_supported_cpuid: move R_EDX hack outside of for loop
The for loop will become a separate function, so clean it up so it can
become independent from the bit hacking for R_EDX.

No behavior change[1], just code movement.

[1] Well, only if the kernel returned CPUID leafs 1 or 0x80000001 as
    unsupported, but there's no kernel version that does that.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-10-30 23:39:45 -02:00
Andreas Färber
2fa45344a9 cpus: Pass CPUState to cpu_is_stopped()
CPUArchState is no longer needed there.

Also change the return type to bool.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2012-10-31 01:02:44 +01:00
Andreas Färber
60e82579c7 cpus: Pass CPUState to qemu_cpu_is_self()
Change return type to bool, move to include/qemu/cpu.h and
add documentation.

Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
[AF: Updated new caller qemu_in_vcpu_thread()]
2012-10-31 01:02:39 +01:00
Avi Kivity
a8170e5e97 Rename target_phys_addr_t to hwaddr
target_phys_addr_t is unwieldly, violates the C standard (_t suffixes are
reserved) and its purpose doesn't match the name (most target_phys_addr_t
addresses are not target specific).  Replace it with a finger-friendly,
standards conformant hwaddr.

Outstanding patchsets can be fixed up with the command

  git rebase -i --exec 'find -name "*.[ch]"
                        | xargs s/target_phys_addr_t/hwaddr/g' origin

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-10-23 08:58:25 -05:00
Eduardo Habkost
8fad4b44a0 i386: kvm: use a #define for the set of alias feature bits
Instea of using a hardcoded hex constant, define CPUID_EXT2_AMD_ALIASES
as the set of CPUID[8000_0001].EDX bits that on AMD are the same as the
bits of CPUID[1].EDX.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-By: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Don Slutz <Don@CloudSwitch.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-09-30 11:11:00 +00:00
Eduardo Habkost
b1f4679392 i386: kvm: bit 10 of CPUID[8000_0001].EDX is reserved
Bit 10 of CPUID[8000_0001].EDX is not defined as an alias of
CPUID[1].EDX[10], so do not duplicate it on
kvm_arch_get_supported_cpuid().

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-By: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Don Slutz <Don@CloudSwitch.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-09-30 11:10:38 +00:00
Jan Kiszka
b139bd300f kvm: i386: Add services required for PCI device assignment
These helpers abstract the interaction of upcoming pci-assign with the
KVM kernel services. Put them under i386 only as other archs will
implement device pass-through via VFIO and not this classic interface.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Acked-by: Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-09-10 15:29:59 +03:00
Michael S. Tsirkin
bc9a839d56 kvm: get/set PV EOI MSR
Support get/set of new PV EOI MSR, for migration.
Add an optional section for MSR value - send it
out in case MSR was changed from the default value (0).

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-08-29 10:51:28 -05:00
Peter Maydell
f3e1bed8da kvm: Decouple 'GSI routing' from 'kernel irqchip'
Don't assume having an in-kernel irqchip means that GSI
routing is enabled.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-08-09 16:16:57 +03:00
Peter Maydell
614e41bc26 kvm: Decouple 'MSI routing via irqfds' from 'kernel irqchip'
Decouple another x86-specific assumption about what irqchips imply.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-08-09 16:16:56 +03:00
Peter Maydell
cc7e0ddf5a kvm: Decouple 'irqfds usable' from 'kernel irqchip'
Instead of assuming that we can use irqfds if and only if
kvm_irqchip_in_kernel(), add a bool to the KVMState which
indicates this, and is set only on x86 and only if the
irqchip is in the kernel.

The kernel documentation implies that the only thing
you need to use KVM_IRQFD is that KVM_CAP_IRQFD is
advertised, but this seems to be untrue. In particular
the kernel does not (alas) return a sensible error if you
try to set up an irqfd when you haven't created an irqchip.
If it did we could remove all this nonsense and let the
kernel return the error code.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-08-09 16:16:56 +03:00
Peter Maydell
1d31f66bbc kvm: Move kvm_allows_irq0_override() to target-i386, fix return type
kvm_allows_irq0_override() is a totally x86 specific concept:
move it to the target-specific source file where it belongs.
This means we need a new header file for the prototype:
kvm_i386.h, in line with the existing kvm_ppc.h.

While we are moving it, fix the return type to be 'bool' rather
than 'int'.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-08-09 16:16:55 +03:00
Igor Mammedov
dd673288a8 target-i386: move cpu halted decision into x86_cpu_reset
MP initialization protocol differs between cpu families, and for P6 and
onward models it is up to CPU to decide if it will be BSP using this
protocol, so try to model this. However there is no point in implementing
MP initialization protocol in qemu. Thus first CPU is always marked as BSP.

This patch:
 - moves decision to designate BSP from board into cpu, making cpu
self-sufficient in this regard. Later it will allow to cleanup hw/pc.c
and remove cpu_reset and wrappers from there.
 - stores flag that CPU is BSP in IA32_APIC_BASE to model behavior
described in Inted SDM vol 3a part 1 chapter 8.4.1
 - uses MSR_IA32_APICBASE_BSP flag in apic_base for checking if cpu is BSP

patch is based on Jan Kiszka's proposal:
    http://thread.gmane.org/gmane.comp.emulators.qemu/100806

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-08-01 08:45:06 -05:00
Jan Kiszka
5d62c43a17 apic: Defer interrupt updates to VCPU thread
KVM performs TPR raising asynchronously to QEMU, specifically outside
QEMU's global lock. When an interrupt is injected into the APIC and TPR
is checked to decide if this can be delivered, a stale TPR value may be
used, causing spurious interrupts in the end.

Fix this by deferring apic_update_irq to the context of the target VCPU.
We introduce a new interrupt flag for this, CPU_INTERRUPT_POLL. When it
is set, the VCPU calls apic_poll_irq before checking for further pending
interrupts. To avoid special-casing KVM, we also implement this logic
for TCG mode.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-07-10 11:31:09 +03:00
Liu, Jinsong
a75b3e0f64 kvm: expose tsc deadline timer feature to guest
This patch exposes tsc deadline timer feature to guest if
1). in-kernel irqchip is used, and
2). kvm has emulated tsc deadline timer, and
3). user authorize the feature exposing via -cpu or +/- tsc-deadline

Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-07-03 17:36:32 -03:00
Andreas Färber
232fc23bed target-i386: Pass X86CPU to do_cpu_{init,sipi}()
Allows to use cpu_reset() in place of cpu_state_reset().

Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2012-06-04 23:00:42 +02:00
Andreas Färber
317ac6201a target-i386: Don't overuse CPUState
Scripted conversion:
  sed -i "s/CPUState/CPUX86State/g" target-i386/*.[hc]
  sed -i "s/#define CPUX86State/#define CPUState/" target-i386/cpu.h

Signed-off-by: Andreas Färber <afaerber@suse.de>
Acked-by: Anthony Liguori <aliguori@us.ibm.com>
2012-03-14 22:20:25 +01:00
Michael S. Tsirkin
7e680753cf kvm: fill in padding to help valgrind
valgrind warns about padding fields which are passed
to vcpu ioctls uninitialized.
This is not an error in practice because kvm ignored padding.
Since the ioctls in question are off data path and
the cost is zero anyway, initialize padding to 0
to suppress these errors.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-03-07 12:31:48 +02:00
Jan Kiszka
d362e757d3 target-i386: Add infrastructure for reporting TPR MMIO accesses
This will allow the APIC core to file a TPR access report. Depending on
the accelerator and kernel irqchip mode, it will either be delivered
right away or queued for later reporting.

In TCG mode, we can restart the triggering instruction and can therefore
forward the event directly. KVM does not allows us to restart, so we
postpone the delivery of events recording in the user space APIC until
the current instruction is completed.

Note that KVM without in-kernel irqchip will report the address after
the instruction that triggered the access.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-02-18 12:15:55 +02:00
Gleb Natapov
d1f8663660 kvm: Synchronize cpu state in kvm_arch_stop_on_emulation_error()
Call to kvm_cpu_synchronize_state() is missing.
kvm_arch_stop_on_emulation_error may look at outdated registers here.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
2012-02-16 18:21:46 +02:00
Jan Kiszka
3d4b26494f kvm: Implement kvm_irqchip_in_kernel like kvm_enabled
To both avoid that kvm_irqchip_in_kernel always has to be paired with
kvm_enabled and that the former ends up in a function call, implement it
like the latter. This means keeping the state in a global variable and
defining kvm_irqchip_in_kernel as a preprocessor macro.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-02-08 15:57:50 -02:00
Jan Kiszka
39d6960aab kvm: Allow to set shadow MMU size
Introduce the KVM-specific machine option kvm_shadow_mem. It allows to
set a custom shadow MMU size for the virtual machine. This is useful for
stress testing e.g.

Only x86 supports this for now, but it is in principle a generic
concept for all targets with shadow MMUs.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-02-08 15:57:50 -02:00
Jan Kiszka
680c1c6fd7 kvm: x86: Add user space part for in-kernel APIC
This introduces the alternative APIC device which makes use of KVM's
in-kernel device model. External NMI injection via LINT1 is emulated by
checking the current state of the in-kernel APIC, only injecting a NMI
into the VCPU if LINT1 is unmasked and configured to DM_NMI.

MSI is not yet supported, so we disable this when the in-kernel model is
in use.

CC: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
2012-01-19 12:14:42 +01:00
Jan Kiszka
84b058d7df kvm: Introduce core services for in-kernel irqchip support
Add the basic infrastructure to active in-kernel irqchip support, inject
interrupts into these models, and maintain IRQ routes.

Routing is optional and depends on the host arch supporting
KVM_CAP_IRQ_ROUTING. When it's not available on x86, we looe the HPET as
we can't route GSI0 to IOAPIC pin 2.

In-kernel irqchip support will once be controlled by the machine
property 'kernel_irqchip', but this is not yet wired up.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
2012-01-19 12:14:42 +01:00
Vadim Rozenfeld
eab70139a6 hyper-v: initialize Hyper-V CPUID leaves.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-01-19 08:33:12 -02:00
Anthony Liguori
f3c6a169a3 Merge remote-tracking branch 'qemu-kvm/memory/page_desc' into staging
* qemu-kvm/memory/page_desc: (22 commits)
  Remove cpu_get_physical_page_desc()
  sparc: avoid cpu_get_physical_page_desc()
  virtio-balloon: avoid cpu_get_physical_page_desc()
  vhost: avoid cpu_get_physical_page_desc()
  kvm: avoid cpu_get_physical_page_desc()
  memory: remove CPUPhysMemoryClient
  xen: convert to MemoryListener API
  memory: temporarily add memory_region_get_ram_addr()
  xen, vga: add API for registering the framebuffer
  vhost: convert to MemoryListener API
  kvm: convert to MemoryListener API
  kvm: switch kvm slots to use host virtual address instead of ram_addr_t
  memory: add API for observing updates to the physical memory map
  memory: replace cpu_physical_sync_dirty_bitmap() with a memory API
  framebuffer: drop use of cpu_physical_sync_dirty_bitmap()
  loader: remove calls to cpu_get_physical_page_desc()
  framebuffer: drop use of cpu_get_physical_page_desc()
  memory: introduce memory_region_find()
  memory: add memory_region_is_logging()
  memory: add memory_region_is_rom()
  ...
2012-01-03 14:39:05 -06:00
Jan Kiszka
cce47516cd kvm: x86: Drop redundant apic base and tpr update from kvm_get_sregs
The latter was already commented out, the former is redundant as well.
We always get the latest changes after return from the guest via
kvm_arch_post_run.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-12-22 14:53:01 -02:00
Jan Kiszka
fabacc0f79 kvm: x86: Avoid runtime allocation of xsave buffer
Keep a per-VCPU xsave buffer for kvm_put/get_xsave instead of
continuously allocating and freeing it on state sync.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-12-22 14:53:01 -02:00
Jan Kiszka
6b42494b21 kvm: x86: Use symbols for all xsave field
Field 0 (FCW+FSW) and 1 (FTW+FOP) were hard-coded so far.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-12-22 14:53:00 -02:00
Avi Kivity
9f213ed92c kvm: switch kvm slots to use host virtual address instead of ram_addr_t
This simplifies a later switch to the memory API in slot management.

Signed-off-by: Avi Kivity <avi@redhat.com>
2011-12-20 14:14:07 +02:00
Vagrant Cascadian
126197214e Fix typo: runnning -> running
One n too many for running, need we say more.

Signed-Off-By: Vagrant Cascadian <vagrant@freegeek.org>

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-11-17 12:57:41 +00:00
Avi Kivity
21e87c4625 i386: wire up MSR_IA32_MISC_ENABLE
It's needed for its default value - bit 0 specifies that "rep movs" is
good enough for memcpy, and Linux may use a slower memcpu if it is not set,
depending on cpu family/model.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-10-24 21:33:32 -02:00
Liu, Jinsong
aa82ba549a kvm: support TSC deadline MSR with subsection
KVM add emulation of lapic tsc deadline timer for guest.
This patch is co-operation work at qemu side.

Use subsections to save/restore the field (mtosatti).

Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-10-24 21:29:07 -02:00
Marcelo Tosatti
38d2c27ea6 Revert "kvm: support TSC deadline MSR"
This reverts commit bfc2455ddb.
New patch with subsections will follow.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-10-24 21:27:16 -02:00
Liu, Jinsong
bfc2455ddb kvm: support TSC deadline MSR
KVM add emulation of lapic tsc deadline timer for guest.
This patch is co-operation work at qemu side.

Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-10-03 13:53:14 -03:00
Luiz Capitulino
1354869c38 Drop the vm_running global variable
Use runstate_is_running() instead, which is introduced by this commit.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2011-09-15 16:39:32 -03:00
Luiz Capitulino
1dfb4dd993 Replace the VMSTOP macros with a proper state type
Today, when notifying a VM state change with vm_state_notify(),
we pass a VMSTOP macro as the 'reason' argument. This is not ideal
because the VMSTOP macros tell why qemu stopped and not exactly
what the current VM state is.

One example to demonstrate this problem is that vm_start() calls
vm_state_notify() with reason=0, which turns out to be VMSTOP_USER.

This commit fixes that by replacing the VMSTOP macros with a proper
state type called RunState.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2011-09-15 16:39:32 -03:00
Blue Swirl
6115c0a82d kvm: remove unnecessary assignments
Avoid these warnings from clang analyzer:
/src/qemu/target-i386/kvm.c:772:5: warning: Value stored to 'cwd' is never read
    cwd = swd = twd = 0;
/src/qemu/target-i386/kvm.c:772:11: warning: Although the value stored to 'swd' is used in the enclosing expression, the value is never actually read from 'swd'
    cwd = swd = twd = 0;

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2011-09-10 14:50:00 +00:00
Stefan Weil
541dc0d47f Use new macro QEMU_PACKED for packed structures
Most changes were made using these commands:

git grep -la '__attribute__((packed))'|xargs perl -pi -e 's/__attribute__\(\(packed\)\)/QEMU_PACKED/'
git grep -la '__attribute__ ((packed))'|xargs perl -pi -e 's/__attribute__ \(\(packed\)\)/QEMU_PACKED/'
git grep -la '__attribute__((__packed__))'|xargs perl -pi -e 's/__attribute__\(\(__packed__\)\)/QEMU_PACKED/'
git grep -la '__attribute__ ((__packed__))'|xargs perl -pi -e 's/__attribute__ \(\(__packed__\)\)/QEMU_PACKED/'
git grep -la '__attribute((packed))'|xargs perl -pi -e 's/__attribute\(\(packed\)\)/QEMU_PACKED/'

Whitespace in linux-user/syscall_defs.h was fixed manually
to avoid warnings from scripts/checkpatch.pl.

Manual changes were also applied to hw/pc.c.

I did not fix indentation with tabs in block/vvfat.c.
The patch will show 4 errors with scripts/checkpatch.pl.

Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2011-09-03 10:45:59 +00:00
Jan Kiszka
fdc9c41a5a Fix up some style nits of last uq/master merge
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-08-25 09:21:35 +01:00
Anthony Liguori
7267c0947d Use glib memory allocation and free functions
qemu_malloc/qemu_free no longer exist after this commit.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-08-20 23:01:08 -05:00
Marcelo Tosatti
ac363c9042 Remove #ifdef KVM_CAP_TSC_CONTROL
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-08-05 12:10:55 -03:00
Joerg Roedel
e7429073ed qemu-x86: Set tsc_khz in kvm when supported
Make use of the KVM_TSC_CONTROL feature if available.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-08-05 12:04:27 -03:00
Andre Przywara
31e8c69697 KVM: Fix XSAVE feature bit enumeration
When iterating through the XSAVE feature enumeration CPUID leaf (0xD)
we should not stop at the first zero EAX, but instead keep scanning
since there are gaps in the enumeration (ECX=1 for instance).
This fixes the proper usage of AVX in KVM guests.

Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-06-20 15:26:21 -03:00
Jan Kiszka
ba9bc59e1f kvm: x86: Pass KVMState to kvm_arch_get_supported_cpuid
kvm_arch_get_supported_cpuid checks for global cpuid restrictions, it
does not require any CPUState reference. Changing its interface allows
to call it before any VCPU is initialized.

CC: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-06-20 15:24:00 -03:00
Jan Kiszka
e56ff191b6 kvm: x86: Drop KVM_CAP build dependencies
No longer needed with accompanied kernel headers.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-06-20 15:19:54 -03:00
Jan Kiszka
5802e066eb kvm: Drop CONFIG_KVM_PARA
The kvm_para.h header is now always available.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-06-20 15:16:45 -03:00
Jan Kiszka
42cc8fa620 kvm: x86: Save/restore FPU OP, IP and DP
These FPU states are properly maintained by KVM but not yet by TCG. So
far we unconditionally set them to 0 in the guest which may cause
state corruptions, though not with modern guests.

To avoid breaking backward migration, use a conditional subsection that
is only written if any of the three fields is non-zero. The guest's
FNINIT clears them frequently, and cleared IA32_MISC_ENABLE MSR[2]
reduces the probability of non-zero values further so that this
subsection is not expected to restrict migration in any common scenario.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-06-19 15:57:56 +03:00
brillywu@viatech.com.cn
b3baa152aa kvm: Add CPUID support for VIA CPU
When KVM is running on VIA CPU with host cpu's model, the
feautures of VIA CPU will be passed into kvm guest by calling
the CPUID instruction for Centaur.

Signed-off-by: BrillyWu<brillywu@viatech.com.cn>
Signed-off-by: KaryJin<karyjin@viatech.com.cn>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-06-14 09:34:35 -05:00
Paolo Bonzini
4a043713b3 kvm: use qemu_free consistently
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-05-02 09:51:59 -03:00
Glauber Costa
0c31b744f6 kvm: use kernel-provided para_features instead of statically coming up with new capabilities
Use the features provided by KVM_GET_SUPPORTED_CPUID directly to
mask out features from guest-visible cpuid.

The old get_para_features() mechanism is kept for older kernels that do not implement it.

Signed-off-by: Glauber Costa <glommer@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-02 08:24:42 -03:00
Jan Kiszka
f2574737f6 kvm: x86: Push kvm_arch_debug to kvm_arch_handle_exit
There are no generic bits remaining in the handling of KVM_EXIT_DEBUG.
So push its logic completely into arch hands, i.e. only x86 so far.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-16 17:11:06 -03:00
Jan Kiszka
2a4dac8350 kvm: x86: Reorder functions in kvm.c
Required for next patch which will access guest debug services from
kvm_arch_handle_exit. No functional changes.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-16 17:11:06 -03:00
Jan Kiszka
bb4ea39329 kvm: Align kvm_arch_handle_exit to kvm_cpu_exec changes
Make the return code of kvm_arch_handle_exit directly usable for
kvm_cpu_exec. This is straightforward for x86 and ppc, just s390
would require more work. Avoid this for now by pushing the return code
translation logic into s390's kvm_arch_handle_exit.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
CC: Alexander Graf <agraf@suse.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-16 17:11:06 -03:00
Jan Kiszka
0c03266a96 kvm: x86: Synchronize PAT MSR with the kernel
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-16 17:11:05 -03:00
Jan Kiszka
4601f7b04c kvm: x86: Do not leave halt if interrupts are disabled
When an external interrupt is pending but IF is cleared, we must not
leave the halt state prematurely.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-16 17:11:05 -03:00
Huang Ying
3c85e74fbf KVM, MCE, unpoison memory address across reboot
In Linux kernel HWPoison processing implementation, the virtual
address in processes mapping the error physical memory page is marked
as HWPoison.  So that, the further accessing to the virtual
address will kill corresponding processes with SIGBUS.

If the error physical memory page is used by a KVM guest, the SIGBUS
will be sent to QEMU, and QEMU will simulate a MCE to report that
memory error to the guest OS.  If the guest OS can not recover from
the error (for example, the page is accessed by kernel code), guest OS
will reboot the system.  But because the underlying host virtual
address backing the guest physical memory is still poisoned, if the
guest system accesses the corresponding guest physical memory even
after rebooting, the SIGBUS will still be sent to QEMU and MCE will be
simulated.  That is, guest system can not recover via rebooting.

In fact, across rebooting, the contents of guest physical memory page
need not to be kept.  We can allocate a new host physical page to
back the corresponding guest physical address.

This patch fixes this issue in QEMU-KVM via calling qemu_ram_remap()
to clear the corresponding page table entry, so that make it possible
to allocate a new page to recover the issue.

[ Jan: rebasing and tiny cleanups]

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-15 01:19:06 -03:00
Jan Kiszka
75d4949733 kvm: x86: Fail kvm_arch_init_vcpu if MCE initialization fails
There is no reason to continue if the kernel claims to support MCE but
then fails to process our request.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
CC: Huang Ying <ying.huang@intel.com>
CC: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
CC: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-15 01:19:06 -03:00
Jan Kiszka
32a420243c kvm: x86: Clean up kvm_setup_mce
There is nothing to abstract here. Fold kvm_setup_mce into its caller
and fix up the error reporting (return code of kvm_vcpu_ioctl holds the
error value).

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
CC: Huang Ying <ying.huang@intel.com>
CC: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
CC: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-15 01:19:06 -03:00
Jan Kiszka
c34d440a72 kvm: x86: Consolidate TCG and KVM MCE injection code
This switches KVM's MCE injection path to cpu_x86_inject_mce, both for
SIGBUS and monitor initiated events. This means we prepare the MCA MSRs
in the VCPUState also for KVM.

We have to drop the MSRs writeback restrictions for this purpose which
is now safe as every uncoordinated MSR injection is removed with this
patch.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
CC: Huang Ying <ying.huang@intel.com>
CC: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
CC: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-15 01:19:06 -03:00
Jan Kiszka
ab443475c9 kvm: x86: Inject pending MCE events on state writeback
The current way of injecting MCE events without updating of and
synchronizing with the CPUState is broken and causes spurious
corruptions of the MCE-related parts of the CPUState.

As a first step towards a fix, enhance the state writeback code with
support for injecting events that are pending in the CPUState. A pending
exception will then be signaled via cpu_interrupt(CPU_INTERRUPT_MCE).
And, just like for TCG, we need to leave the halt state when
CPU_INTERRUPT_MCE is pending (left broken for the to-be-removed old KVM
code).

This will also allow to unify TCG and KVM injection code.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
CC: Huang Ying <ying.huang@intel.com>
CC: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
CC: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-15 01:19:05 -03:00
Jan Kiszka
990368650f kvm: Rename kvm_arch_process_irqchip_events to async_events
We will broaden the scope of this function on x86 beyond irqchip events.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-15 01:19:05 -03:00
Jan Kiszka
419fb20a8e kvm: x86: Move MCE functions together
Pure function suffling to avoid multiple #ifdef KVM_CAP_MCE sections,
no functional changes. While at it, annotate some #ifdef sections.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
CC: Huang Ying <ying.huang@intel.com>
CC: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
CC: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-15 01:19:05 -03:00
Jan Kiszka
b7680cb607 Refactor thread retrieval and check
We have qemu_cpu_self and qemu_thread_self. The latter is retrieving the
current thread, the former is checking for equality (using CPUState). We
also have qemu_thread_equal which is only used like qemu_cpu_self.

This refactors the interfaces, creating qemu_cpu_is_self and
qemu_thread_is_self as well ass qemu_thread_get_self.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2011-03-13 14:44:21 +00:00
Jan Kiszka
ce377af399 kvm: x86: Catch and report failing IRQ and NMI injections
We do not need to abort, but the user should be notified that weird
things go on.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-02-14 12:39:46 -02:00
Jan Kiszka
7a39fe5882 kvm: Drop return values from kvm_arch_pre/post_run
We do not check them, and the only arch with non-empty implementations
always returns 0 (this is also true for qemu-kvm).

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
CC: Alexander Graf <agraf@suse.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-02-14 12:39:46 -02:00
Jan Kiszka
db1669bcca kvm: x86: Prepare VCPU loop for in-kernel irqchip
Effectively no functional change yet as kvm_irqchip_in_kernel still only
returns 0, but this patch will allow qemu-kvm to adopt the VCPU loop of
upsteam KVM.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-02-14 12:39:46 -02:00
Jan Kiszka
6792a57bf1 kvm: Separate TCG from KVM cpu execution
Mixing up TCG bits with KVM already led to problems around eflags
emulation on x86. Moreover, quite some code that TCG requires on cpu
enty/exit is useless for KVM. So dispatch between tcg_cpu_exec and
kvm_cpu_exec as early as possible.

The core logic of cpu_halted from cpu_exec is added to
kvm_arch_process_irqchip_events. Moving away from cpu_exec makes
exception_index meaningless for KVM, we can simply pass the exit reason
directly (only "EXCP_DEBUG vs. rest" is relevant).

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-02-14 12:39:46 -02:00
Glauber Costa
b8cc45d6a6 kvm: make tsc stable over migration and machine start
If the machine is stopped, we should not record two different tsc values
upon a save operation. The same problem happens with kvmclock.

But kvmclock is taking a different diretion, being now seen as a separate
device. Since this is unlikely to happen with the tsc, I am taking the
approach here of simply registering a handler for state change, and
using a per-CPUState variable that prevents double updates for the TSC.

Signed-off-by: Glauber Costa <glommer@redhat.com>
CC: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-02-14 12:39:46 -02:00
Jan Kiszka
9ccfac9ea4 kvm: Unconditionally reenter kernel after IO exits
KVM requires to reenter the kernel after IO exits in order to complete
instruction emulation. Failing to do so will leave the kernel state
inconsistently behind. To ensure that we will get back ASAP, we issue a
self-signal that will cause KVM_RUN to return once the pending
operations are completed.

We can move kvm_arch_process_irqchip_events out of the inner VCPU loop.
The only state that mattered at its old place was a pending INIT
request. Catch it in kvm_arch_pre_run and also trigger a self-signal to
process the request on next kvm_cpu_exec.

This patch also fixes the missing exit_request check in kvm_cpu_exec in
the CONFIG_IOTHREAD case.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
CC: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-02-14 12:39:45 -02:00
Jan Kiszka
a1b87fe046 kvm: Provide sigbus services arch-independently
Provide arch-independent kvm_on_sigbus* stubs to remove the #ifdef'ery
from cpus.c. This patch also fixes --disable-kvm build by providing the
missing kvm_on_sigbus_vcpu kvm-stub.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-02-14 12:39:45 -02:00
Anthony Liguori
b3a98367ee Merge remote branch 'qemu-kvm/uq/master' into staging
aliguori: fix build with !defined(KVM_CAP_ASYNC_PF)

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-02-01 15:32:18 -06:00
Jan Kiszka
c5999bfcfd kvm: x86: Only read/write MSR_KVM_ASYNC_PF_EN if supported
If the kernel does not support KVM_CAP_ASYNC_PF, it also does not know
about the related MSR. So skip it during state synchronization in that
case. Fixes annoying kernel warnings.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:22 -02:00
Jan Kiszka
d8f771d912 kvm: x86: Implicitly clear nmi_injected/pending on reset
All CPUX86State variables before CPU_COMMON are automatically cleared on
reset. Reorder nmi_injected and nmi_pending to avoid having to touch
them explicitly.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:22 -02:00
Jan Kiszka
110761987d kvm: x86: Rework identity map and TSS setup for larger BIOS sizes
In order to support loading BIOSes > 256K, reorder the code, adjusting
the base if the kernel supports moving the identity map.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:22 -02:00
Jan Kiszka
94a8d39afd kvm: Consolidate must-have capability checks
Instead of splattering the code with #ifdefs and runtime checks for
capabilities we cannot work without anyway, provide central test
infrastructure for verifying their availability both at build and
runtime.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:22 -02:00
Jan Kiszka
cad1e2827b kvm: Drop smp_cpus argument from init functions
No longer used.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:21 -02:00
Jan Kiszka
521f0798d7 kvm: x86: Fix !CONFIG_KVM_PARA build
If we lack kvm_para.h, MSR_KVM_ASYNC_PF_EN is not defined. The change in
kvm_arch_init_vcpu is just for consistency reasons.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:21 -02:00
Jan Kiszka
ff5c186b8b kvm: x86: Reset paravirtual MSRs
Make sure to write the cleared MSR_KVM_SYSTEM_TIME, MSR_KVM_WALL_CLOCK,
and MSR_KVM_ASYNC_PF_EN to the kernel state so that a freshly booted
guest cannot be disturbed by old values.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
CC: Glauber Costa <glommer@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:21 -02:00
Jan Kiszka
c3a3a7d356 kvm: x86: Refactor msr_star/hsave_pa setup and checks
Simplify kvm_has_msr_star/hsave_pa to booleans and push their one-time
initialization into kvm_arch_init. Also handle potential errors of that
setup procedure.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:21 -02:00
Jan Kiszka
1a5e9d2faf kvm: x86: Fix xcr0 reset mismerge
For unknown reasons, xcr0 reset ended up in kvm_arch_update_guest_debug
on upstream merge. Fix this and also remove the misleading comment (1 is
THE reset value).

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:21 -02:00
Jan Kiszka
3390e7f797 kvm: x86: Remove redundant mp_state initialization
kvm_arch_reset_vcpu initializes mp_state, and that function is invoked
right after kvm_arch_init_vcpu.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:21 -02:00
Jan Kiszka
c14750e8ad kvm: x86: Prepare kvm_get_mp_state for in-kernel irqchip
This code path will not yet be taken as we still lack in-kernel irqchip
support. But qemu-kvm can already make use of it and drop its own
mp_state access services.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:21 -02:00
Jan Kiszka
0d75a9ecd7 kvm: x86: Align kvm_arch_put_registers code with comment
The ordering doesn't matter in this case, but better keep it consistent.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:21 -02:00
Jan Kiszka
bb44e0d12d kvm: Improve reporting of fatal errors
Report KVM_EXIT_UNKNOWN, KVM_EXIT_FAIL_ENTRY, and KVM_EXIT_EXCEPTION
with more details to stderr. The latter two are so far x86-only, so move
them into the arch-specific handler. Integrate the Intel real mode
warning on KVM_EXIT_FAIL_ENTRY that qemu-kvm carries, but actually
restrict it to Intel CPUs. Moreover, always dump the CPU state in case
we fail.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:20 -02:00
Jan Kiszka
73aaec4a39 kvm: Stop on all fatal exit reasons
Ensure that we stop the guest whenever we face a fatal or unknown exit
reason. If we stop, we also have to enforce a cpu loop exit.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:20 -02:00
Jan Kiszka
646042e1ab kvm: x86: Swallow KVM_EXIT_SET_TPR
This exit only triggers activity in the common exit path, but we should
accept it in order to be able to detect unknown exit types.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:20 -02:00
Jan Kiszka
b9bec74bcb kvm: x86: Fix a few coding style violations
No functional changes.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-21 14:05:22 -02:00
Jan Kiszka
95c077c919 kvm: x86: Prevent sign extension of DR7 in guest debugging mode
This unbreaks guest debugging when the 4th hardware breakpoint used for
guest debugging is a watchpoint of 4 or 8 byte lenght. The 31st bit of
DR7 is set in that case and used to cause a sign extension to the high
word which was breaking the guest state (vm entry failure).

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-21 14:05:22 -02:00
Jan Kiszka
3980e3024b kvm: x86: Remove obsolete SS.RPL/DPL aligment
This seems to date back to the days KVM didn't support real mode. The
check is no longer needed and, even worse, is corrupting the guest state
in case SS.RPL != DPL.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-21 14:05:22 -02:00
Jan Kiszka
acaa75507b kvm: x86: Fix DPL write back of segment registers
The DPL is stored in the flags and not in the selector. In fact, the RPL
may differ from the DPL at some point in time, and so we were corrupting
the guest state so far.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-21 14:05:22 -02:00
Jin Dongming
7cc2cc3e26 kvm: introduce kvm_inject_x86_mce_on
Pass a table instead of multiple args.

Note:

    kvm_inject_x86_mce(env, bank, status, mcg_status, addr, misc,
                       abort_on_error);

is equal to:

    struct kvm_x86_mce mce = {
        .bank = bank,
        .status = status,
        .mcg_status = mcg_status,
        .addr = addr,
        .misc = misc,
    };
    kvm_inject_x86_mce_on(env, &mce, abort_on_error);

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-21 14:05:22 -02:00
Jin Dongming
e387c33892 kvm: kvm_mce_inj_* subroutines for templated error injections
Refactor codes for maintainability.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-21 14:05:22 -02:00
Jin Dongming
6643e2f001 kvm: introduce kvm_mce_in_progress
Share same error handing, and rename this function after
MCIP (Machine Check In Progress) flag.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-21 14:05:22 -02:00
Jin Dongming
2bd3e04c3b Add function for checking mca broadcast of CPU
Add function for checking whether current CPU support mca broadcast.

Signed-off-by: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-21 14:05:22 -02:00
Jin Dongming
31ce5e0c49 Add "broadcast" option for mce command
When the following test case is injected with mce command, maybe user could not
get the expected result.
    DATA
               command cpu bank status             mcg_status  addr   misc
        (qemu) mce     1   1    0xbd00000000000000 0x05        0x1234 0x8c

    Expected Result
           panic type: "Fatal Machine check"

That is because each mce command can only inject the given cpu and could not
inject mce interrupt to other cpus. So user will get the following result:
    panic type: "Fatal machine check on current CPU"

"broadcast" option is used for injecting dummy data into other cpus. Injecting
mce with this option the expected result could be gotten.

Usage:
    Broadcast[on]
           command broadcast cpu bank status             mcg_status  addr   misc
    (qemu) mce     -b        1   1    0xbd00000000000000 0x05        0x1234 0x8c

    Broadcast[off]
           command cpu bank status             mcg_status  addr   misc
    (qemu) mce     1   1    0xbd00000000000000 0x05        0x1234 0x8c

Signed-off-by: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-21 14:05:22 -02:00
Lai Jiangshan
14a095184c kvm: convert kvm_ioctl(KVM_CHECK_EXTENSION) to kvm_check_extension()
simple cleanup and use existing helper: kvm_check_extension().

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-21 14:05:21 -02:00
Lai Jiangshan
276ce81563 kvm: Enable user space NMI injection for kvm guest
Make use of the new KVM_NMI IOCTL to send NMIs into the KVM guest if the
user space raised them. (example: qemu monitor's "nmi" command)

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-21 14:05:21 -02:00
Gleb Natapov
f6584ee203 Add support for async page fault to qemu
Add save/restore of MSR for migration and cpuid bit.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-11-16 08:40:07 -06:00
Marcelo Tosatti
25d2e3613d kvm: save/restore x86-64 MSRs on x86-64 kernels
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-10-21 13:37:41 -02:00
Marcelo Tosatti
384331a61a kvm: writeback SMP TSCs on migration only
commit 6389c45441269baa2873e6feafebd17105ddeaf6
Author: Jan Kiszka <jan.kiszka@siemens.com>
Date:   Mon Mar 1 18:17:26 2010 +0100

    qemu-kvm: Cleanup/fix TSC and PV clock writeback

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-10-21 13:37:40 -02:00
Marcelo Tosatti
75b10c4336 kvm: factor out kvm_has_msr_star
And add kvm_has_msr_hsave_pa(), to avoid warnings on older
kernels without support.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-10-21 13:37:29 -02:00
Marcelo Tosatti
aa851e365b kvm: add save/restore of MSR_VM_HSAVE_PA
commit 2bba4446746add456ceeb0e8359a43032a2ea333
Author: Alexander Graf <agraf@suse.de>
Date:   Thu Dec 18 15:38:32 2008 +0100

    Enable nested SVM support in userspace

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-10-21 13:37:17 -02:00
Hidetoshi Seto
d8da8574b1 Fix build on !KVM_CAP_MCE
This patch removes following warnings:

target-i386/kvm.c: In function 'kvm_put_msrs':
target-i386/kvm.c:782: error: unused variable 'i'
target-i386/kvm.c: In function 'kvm_get_msrs':
target-i386/kvm.c:1083: error: label at end of compound statement

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-10-21 13:12:14 -02:00
Hidetoshi Seto
f71ac88fe9 x86, mce: broadcast mce depending on the cpu version
There is no reason why SRAO event received by the main thread
is the only one that being broadcasted.

According to the x86 ASDM vol.3A 15.10.4.1,
MCE signal is broadcast on processor version 06H_EH or later.

This change is required to handle SRAR in smp guests.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-10-21 13:11:38 -02:00
Hidetoshi Seto
f8502cfbbf x86, mce: ignore SRAO only when MCG_SER_P is available
And restruct this block to call kvm_mce_in_exception() only when it is
required.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-10-21 13:09:06 -02:00
Marcelo Tosatti
0f53994f7a Fix memory leak in register save load due to xsave support
From: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-10-20 16:15:04 -05:00
Marcelo Tosatti
577804958a Add savevm/loadvm support for MCE
Port qemu-kvm's

commit 1bab5d11545d8de5facf46c28630085a2f9651ae
Author: Huang Ying <ying.huang@intel.com>
Date:   Wed Mar 3 16:52:46 2010 +0800

    Add savevm/loadvm support for MCE

    MCE registers are saved/load into/from CPUState in
    kvm_arch_save/load_regs. To simulate the MCG_STATUS clearing upon
    reset, MSR_MCG_STATUS is set to 0 for KVM_PUT_RESET_STATE.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-20 16:15:04 -05:00
Marcelo Tosatti
c0532a76b4 MCE: Relay UCR MCE to guest
Port qemu-kvm's

commit 4b62fff1101a7ad77553147717a8bd3bf79df7ef
Author: Huang Ying <ying.huang@intel.com>
Date:   Mon Sep 21 10:43:25 2009 +0800

    MCE: Relay UCR MCE to guest

    UCR (uncorrected recovery) MCE is supported in recent Intel CPUs,
    where some hardware error such as some memory error can be reported
    without PCC (processor context corrupted). To recover from such MCE,
    the corresponding memory will be unmapped, and all processes accessing
    the memory will be killed via SIGBUS.

    For KVM, if QEMU/KVM is killed, all guest processes will be killed
    too. So we relay SIGBUS from host OS to guest system via a UCR MCE
    injection. Then guest OS can isolate corresponding memory and kill
    necessary guest processes only. SIGBUS sent to main thread (not VCPU
    threads) will be broadcast to all VCPU threads as UCR MCE.

aliguori: fix build

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-10-20 16:15:04 -05:00
Marcelo Tosatti
e7701825e1 kvm: x86: add mce support
Port qemu-kvm's MCE support

commit c68b2374c9048812f488e00ffb95db66c0bc07a7
Author: Huang Ying <ying.huang@intel.com>
Date:   Mon Jul 20 10:00:53 2009 +0800

    Add MCE simulation support to qemu/kvm

    KVM ioctls are used to initialize MCE simulation and inject MCE. The
    real MCE simulation is implemented in Linux kernel. The Kernel part
    has been merged.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-20 16:15:04 -05:00
Joerg Roedel
296acb643b Add svm cpuid features
This patch adds the svm cpuid feature flags to the qemu
intialization path. It also adds the svm features available
on phenom to its cpu-definition and extends the host cpu
type to support all svm features KVM can provide.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-20 16:15:04 -05:00
Sheng Yang
408392b30d kvm: Fix cpu_is_bsp() compilation warning
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-06-28 13:06:03 -03:00
Marcelo Tosatti
ddced198cd kvm: init mp_state
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-06-28 13:06:03 -03:00
Sheng Yang
f1665b21f1 kvm: Enable XSAVE live migration support
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-06-28 13:06:03 -03:00
Sheng Yang
c958a8bd9b kvm: Extend kvm_arch_get_supported_cpuid() to support index
Would use it later for XSAVE related CPUID.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-06-28 13:06:03 -03:00
Blue Swirl
4a942ceac7 apic: avoid passing CPUState from CPU code
Pass only APICState when accessing APIC from CPU code.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-06-19 10:42:31 +03:00
Gleb Natapov
c1667e4097 kvm: fix 80000001.EDX supported bit filtering
On AMD some bits from 1.EDX are reported in 80000001.EDX. The mask used
to copy bits from 1.EDX to 80000001.EDX is incorrect resulting in
unsupported features passed into a guest.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-11 17:07:36 -03:00