Commit Graph

148 Commits

Author SHA1 Message Date
Jan Kiszka
db1669bcca kvm: x86: Prepare VCPU loop for in-kernel irqchip
Effectively no functional change yet as kvm_irqchip_in_kernel still only
returns 0, but this patch will allow qemu-kvm to adopt the VCPU loop of
upsteam KVM.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-02-14 12:39:46 -02:00
Jan Kiszka
6792a57bf1 kvm: Separate TCG from KVM cpu execution
Mixing up TCG bits with KVM already led to problems around eflags
emulation on x86. Moreover, quite some code that TCG requires on cpu
enty/exit is useless for KVM. So dispatch between tcg_cpu_exec and
kvm_cpu_exec as early as possible.

The core logic of cpu_halted from cpu_exec is added to
kvm_arch_process_irqchip_events. Moving away from cpu_exec makes
exception_index meaningless for KVM, we can simply pass the exit reason
directly (only "EXCP_DEBUG vs. rest" is relevant).

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-02-14 12:39:46 -02:00
Glauber Costa
b8cc45d6a6 kvm: make tsc stable over migration and machine start
If the machine is stopped, we should not record two different tsc values
upon a save operation. The same problem happens with kvmclock.

But kvmclock is taking a different diretion, being now seen as a separate
device. Since this is unlikely to happen with the tsc, I am taking the
approach here of simply registering a handler for state change, and
using a per-CPUState variable that prevents double updates for the TSC.

Signed-off-by: Glauber Costa <glommer@redhat.com>
CC: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-02-14 12:39:46 -02:00
Jan Kiszka
9ccfac9ea4 kvm: Unconditionally reenter kernel after IO exits
KVM requires to reenter the kernel after IO exits in order to complete
instruction emulation. Failing to do so will leave the kernel state
inconsistently behind. To ensure that we will get back ASAP, we issue a
self-signal that will cause KVM_RUN to return once the pending
operations are completed.

We can move kvm_arch_process_irqchip_events out of the inner VCPU loop.
The only state that mattered at its old place was a pending INIT
request. Catch it in kvm_arch_pre_run and also trigger a self-signal to
process the request on next kvm_cpu_exec.

This patch also fixes the missing exit_request check in kvm_cpu_exec in
the CONFIG_IOTHREAD case.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
CC: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-02-14 12:39:45 -02:00
Jan Kiszka
a1b87fe046 kvm: Provide sigbus services arch-independently
Provide arch-independent kvm_on_sigbus* stubs to remove the #ifdef'ery
from cpus.c. This patch also fixes --disable-kvm build by providing the
missing kvm_on_sigbus_vcpu kvm-stub.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-02-14 12:39:45 -02:00
Anthony Liguori
b3a98367ee Merge remote branch 'qemu-kvm/uq/master' into staging
aliguori: fix build with !defined(KVM_CAP_ASYNC_PF)

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-02-01 15:32:18 -06:00
Jan Kiszka
c5999bfcfd kvm: x86: Only read/write MSR_KVM_ASYNC_PF_EN if supported
If the kernel does not support KVM_CAP_ASYNC_PF, it also does not know
about the related MSR. So skip it during state synchronization in that
case. Fixes annoying kernel warnings.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:22 -02:00
Jan Kiszka
d8f771d912 kvm: x86: Implicitly clear nmi_injected/pending on reset
All CPUX86State variables before CPU_COMMON are automatically cleared on
reset. Reorder nmi_injected and nmi_pending to avoid having to touch
them explicitly.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:22 -02:00
Jan Kiszka
110761987d kvm: x86: Rework identity map and TSS setup for larger BIOS sizes
In order to support loading BIOSes > 256K, reorder the code, adjusting
the base if the kernel supports moving the identity map.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:22 -02:00
Jan Kiszka
94a8d39afd kvm: Consolidate must-have capability checks
Instead of splattering the code with #ifdefs and runtime checks for
capabilities we cannot work without anyway, provide central test
infrastructure for verifying their availability both at build and
runtime.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:22 -02:00
Jan Kiszka
cad1e2827b kvm: Drop smp_cpus argument from init functions
No longer used.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:21 -02:00
Jan Kiszka
521f0798d7 kvm: x86: Fix !CONFIG_KVM_PARA build
If we lack kvm_para.h, MSR_KVM_ASYNC_PF_EN is not defined. The change in
kvm_arch_init_vcpu is just for consistency reasons.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:21 -02:00
Jan Kiszka
ff5c186b8b kvm: x86: Reset paravirtual MSRs
Make sure to write the cleared MSR_KVM_SYSTEM_TIME, MSR_KVM_WALL_CLOCK,
and MSR_KVM_ASYNC_PF_EN to the kernel state so that a freshly booted
guest cannot be disturbed by old values.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
CC: Glauber Costa <glommer@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:21 -02:00
Jan Kiszka
c3a3a7d356 kvm: x86: Refactor msr_star/hsave_pa setup and checks
Simplify kvm_has_msr_star/hsave_pa to booleans and push their one-time
initialization into kvm_arch_init. Also handle potential errors of that
setup procedure.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:21 -02:00
Jan Kiszka
1a5e9d2faf kvm: x86: Fix xcr0 reset mismerge
For unknown reasons, xcr0 reset ended up in kvm_arch_update_guest_debug
on upstream merge. Fix this and also remove the misleading comment (1 is
THE reset value).

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:21 -02:00
Jan Kiszka
3390e7f797 kvm: x86: Remove redundant mp_state initialization
kvm_arch_reset_vcpu initializes mp_state, and that function is invoked
right after kvm_arch_init_vcpu.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:21 -02:00
Jan Kiszka
c14750e8ad kvm: x86: Prepare kvm_get_mp_state for in-kernel irqchip
This code path will not yet be taken as we still lack in-kernel irqchip
support. But qemu-kvm can already make use of it and drop its own
mp_state access services.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:21 -02:00
Jan Kiszka
0d75a9ecd7 kvm: x86: Align kvm_arch_put_registers code with comment
The ordering doesn't matter in this case, but better keep it consistent.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:21 -02:00
Jan Kiszka
bb44e0d12d kvm: Improve reporting of fatal errors
Report KVM_EXIT_UNKNOWN, KVM_EXIT_FAIL_ENTRY, and KVM_EXIT_EXCEPTION
with more details to stderr. The latter two are so far x86-only, so move
them into the arch-specific handler. Integrate the Intel real mode
warning on KVM_EXIT_FAIL_ENTRY that qemu-kvm carries, but actually
restrict it to Intel CPUs. Moreover, always dump the CPU state in case
we fail.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:20 -02:00
Jan Kiszka
73aaec4a39 kvm: Stop on all fatal exit reasons
Ensure that we stop the guest whenever we face a fatal or unknown exit
reason. If we stop, we also have to enforce a cpu loop exit.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:20 -02:00
Jan Kiszka
646042e1ab kvm: x86: Swallow KVM_EXIT_SET_TPR
This exit only triggers activity in the common exit path, but we should
accept it in order to be able to detect unknown exit types.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:20 -02:00
Jan Kiszka
b9bec74bcb kvm: x86: Fix a few coding style violations
No functional changes.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-21 14:05:22 -02:00
Jan Kiszka
95c077c919 kvm: x86: Prevent sign extension of DR7 in guest debugging mode
This unbreaks guest debugging when the 4th hardware breakpoint used for
guest debugging is a watchpoint of 4 or 8 byte lenght. The 31st bit of
DR7 is set in that case and used to cause a sign extension to the high
word which was breaking the guest state (vm entry failure).

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-21 14:05:22 -02:00
Jan Kiszka
3980e3024b kvm: x86: Remove obsolete SS.RPL/DPL aligment
This seems to date back to the days KVM didn't support real mode. The
check is no longer needed and, even worse, is corrupting the guest state
in case SS.RPL != DPL.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-21 14:05:22 -02:00
Jan Kiszka
acaa75507b kvm: x86: Fix DPL write back of segment registers
The DPL is stored in the flags and not in the selector. In fact, the RPL
may differ from the DPL at some point in time, and so we were corrupting
the guest state so far.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-21 14:05:22 -02:00
Jin Dongming
7cc2cc3e26 kvm: introduce kvm_inject_x86_mce_on
Pass a table instead of multiple args.

Note:

    kvm_inject_x86_mce(env, bank, status, mcg_status, addr, misc,
                       abort_on_error);

is equal to:

    struct kvm_x86_mce mce = {
        .bank = bank,
        .status = status,
        .mcg_status = mcg_status,
        .addr = addr,
        .misc = misc,
    };
    kvm_inject_x86_mce_on(env, &mce, abort_on_error);

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-21 14:05:22 -02:00
Jin Dongming
e387c33892 kvm: kvm_mce_inj_* subroutines for templated error injections
Refactor codes for maintainability.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-21 14:05:22 -02:00
Jin Dongming
6643e2f001 kvm: introduce kvm_mce_in_progress
Share same error handing, and rename this function after
MCIP (Machine Check In Progress) flag.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-21 14:05:22 -02:00
Jin Dongming
2bd3e04c3b Add function for checking mca broadcast of CPU
Add function for checking whether current CPU support mca broadcast.

Signed-off-by: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-21 14:05:22 -02:00
Jin Dongming
31ce5e0c49 Add "broadcast" option for mce command
When the following test case is injected with mce command, maybe user could not
get the expected result.
    DATA
               command cpu bank status             mcg_status  addr   misc
        (qemu) mce     1   1    0xbd00000000000000 0x05        0x1234 0x8c

    Expected Result
           panic type: "Fatal Machine check"

That is because each mce command can only inject the given cpu and could not
inject mce interrupt to other cpus. So user will get the following result:
    panic type: "Fatal machine check on current CPU"

"broadcast" option is used for injecting dummy data into other cpus. Injecting
mce with this option the expected result could be gotten.

Usage:
    Broadcast[on]
           command broadcast cpu bank status             mcg_status  addr   misc
    (qemu) mce     -b        1   1    0xbd00000000000000 0x05        0x1234 0x8c

    Broadcast[off]
           command cpu bank status             mcg_status  addr   misc
    (qemu) mce     1   1    0xbd00000000000000 0x05        0x1234 0x8c

Signed-off-by: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-21 14:05:22 -02:00
Lai Jiangshan
14a095184c kvm: convert kvm_ioctl(KVM_CHECK_EXTENSION) to kvm_check_extension()
simple cleanup and use existing helper: kvm_check_extension().

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-21 14:05:21 -02:00
Lai Jiangshan
276ce81563 kvm: Enable user space NMI injection for kvm guest
Make use of the new KVM_NMI IOCTL to send NMIs into the KVM guest if the
user space raised them. (example: qemu monitor's "nmi" command)

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-21 14:05:21 -02:00
Gleb Natapov
f6584ee203 Add support for async page fault to qemu
Add save/restore of MSR for migration and cpuid bit.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-11-16 08:40:07 -06:00
Marcelo Tosatti
25d2e3613d kvm: save/restore x86-64 MSRs on x86-64 kernels
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-10-21 13:37:41 -02:00
Marcelo Tosatti
384331a61a kvm: writeback SMP TSCs on migration only
commit 6389c45441269baa2873e6feafebd17105ddeaf6
Author: Jan Kiszka <jan.kiszka@siemens.com>
Date:   Mon Mar 1 18:17:26 2010 +0100

    qemu-kvm: Cleanup/fix TSC and PV clock writeback

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-10-21 13:37:40 -02:00
Marcelo Tosatti
75b10c4336 kvm: factor out kvm_has_msr_star
And add kvm_has_msr_hsave_pa(), to avoid warnings on older
kernels without support.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-10-21 13:37:29 -02:00
Marcelo Tosatti
aa851e365b kvm: add save/restore of MSR_VM_HSAVE_PA
commit 2bba4446746add456ceeb0e8359a43032a2ea333
Author: Alexander Graf <agraf@suse.de>
Date:   Thu Dec 18 15:38:32 2008 +0100

    Enable nested SVM support in userspace

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-10-21 13:37:17 -02:00
Hidetoshi Seto
d8da8574b1 Fix build on !KVM_CAP_MCE
This patch removes following warnings:

target-i386/kvm.c: In function 'kvm_put_msrs':
target-i386/kvm.c:782: error: unused variable 'i'
target-i386/kvm.c: In function 'kvm_get_msrs':
target-i386/kvm.c:1083: error: label at end of compound statement

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-10-21 13:12:14 -02:00
Hidetoshi Seto
f71ac88fe9 x86, mce: broadcast mce depending on the cpu version
There is no reason why SRAO event received by the main thread
is the only one that being broadcasted.

According to the x86 ASDM vol.3A 15.10.4.1,
MCE signal is broadcast on processor version 06H_EH or later.

This change is required to handle SRAR in smp guests.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-10-21 13:11:38 -02:00
Hidetoshi Seto
f8502cfbbf x86, mce: ignore SRAO only when MCG_SER_P is available
And restruct this block to call kvm_mce_in_exception() only when it is
required.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-10-21 13:09:06 -02:00
Marcelo Tosatti
0f53994f7a Fix memory leak in register save load due to xsave support
From: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-10-20 16:15:04 -05:00
Marcelo Tosatti
577804958a Add savevm/loadvm support for MCE
Port qemu-kvm's

commit 1bab5d11545d8de5facf46c28630085a2f9651ae
Author: Huang Ying <ying.huang@intel.com>
Date:   Wed Mar 3 16:52:46 2010 +0800

    Add savevm/loadvm support for MCE

    MCE registers are saved/load into/from CPUState in
    kvm_arch_save/load_regs. To simulate the MCG_STATUS clearing upon
    reset, MSR_MCG_STATUS is set to 0 for KVM_PUT_RESET_STATE.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-20 16:15:04 -05:00
Marcelo Tosatti
c0532a76b4 MCE: Relay UCR MCE to guest
Port qemu-kvm's

commit 4b62fff1101a7ad77553147717a8bd3bf79df7ef
Author: Huang Ying <ying.huang@intel.com>
Date:   Mon Sep 21 10:43:25 2009 +0800

    MCE: Relay UCR MCE to guest

    UCR (uncorrected recovery) MCE is supported in recent Intel CPUs,
    where some hardware error such as some memory error can be reported
    without PCC (processor context corrupted). To recover from such MCE,
    the corresponding memory will be unmapped, and all processes accessing
    the memory will be killed via SIGBUS.

    For KVM, if QEMU/KVM is killed, all guest processes will be killed
    too. So we relay SIGBUS from host OS to guest system via a UCR MCE
    injection. Then guest OS can isolate corresponding memory and kill
    necessary guest processes only. SIGBUS sent to main thread (not VCPU
    threads) will be broadcast to all VCPU threads as UCR MCE.

aliguori: fix build

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-10-20 16:15:04 -05:00
Marcelo Tosatti
e7701825e1 kvm: x86: add mce support
Port qemu-kvm's MCE support

commit c68b2374c9048812f488e00ffb95db66c0bc07a7
Author: Huang Ying <ying.huang@intel.com>
Date:   Mon Jul 20 10:00:53 2009 +0800

    Add MCE simulation support to qemu/kvm

    KVM ioctls are used to initialize MCE simulation and inject MCE. The
    real MCE simulation is implemented in Linux kernel. The Kernel part
    has been merged.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-20 16:15:04 -05:00
Joerg Roedel
296acb643b Add svm cpuid features
This patch adds the svm cpuid feature flags to the qemu
intialization path. It also adds the svm features available
on phenom to its cpu-definition and extends the host cpu
type to support all svm features KVM can provide.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-20 16:15:04 -05:00
Sheng Yang
408392b30d kvm: Fix cpu_is_bsp() compilation warning
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-06-28 13:06:03 -03:00
Marcelo Tosatti
ddced198cd kvm: init mp_state
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-06-28 13:06:03 -03:00
Sheng Yang
f1665b21f1 kvm: Enable XSAVE live migration support
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-06-28 13:06:03 -03:00
Sheng Yang
c958a8bd9b kvm: Extend kvm_arch_get_supported_cpuid() to support index
Would use it later for XSAVE related CPUID.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-06-28 13:06:03 -03:00
Blue Swirl
4a942ceac7 apic: avoid passing CPUState from CPU code
Pass only APICState when accessing APIC from CPU code.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-06-19 10:42:31 +03:00
Gleb Natapov
c1667e4097 kvm: fix 80000001.EDX supported bit filtering
On AMD some bits from 1.EDX are reported in 80000001.EDX. The mask used
to copy bits from 1.EDX to 80000001.EDX is incorrect resulting in
unsupported features passed into a guest.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-11 17:07:36 -03:00
Gleb Natapov
4513d9232b Do not stop VM if emulation failed in userspace.
Continue vcpu execution in case emulation failure happened while vcpu
was in userspace. In this case #UD will be injected into the guest
allowing guest OS to kill offending process and continue.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-11 14:03:44 -03:00
Marcelo Tosatti
0af691d779 kvm: enable smp > 1
Process INIT/SIPI requests and enable -smp > 1.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-11 14:02:22 -03:00
Jan Kiszka
dbaa07c404 kvm: validate context for kvm cpu get/put operations
Validate that KVM vcpu state is only read/written from cpu thread itself
or that cpu is stopped.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-11 14:02:22 -03:00
Sheng Yang
204204308b kvm: allow qemu to set EPT identity mapping address
If we use larger BIOS image than current 256KB, we would need move reserved
TSS and EPT identity mapping pages. Currently TSS support this, but not
EPT.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-04-26 11:28:35 -03:00
Jan Kiszka
ff44f1a373 KVM: x86: Add debug register saving and restoring
Make use of the new KVM_GET/SET_DEBUGREGS to save/restore the x86 debug
registers.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-04-26 11:28:35 -03:00
Blue Swirl
8c0d577eb5 kvm: avoid collision with dprintf macro in stdio.h, spotted by clang
Fixes clang errors:
  CC    i386-softmmu/kvm.o
/src/qemu/target-i386/kvm.c:40:9: error: 'dprintf' macro redefined
In file included from /src/qemu/target-i386/kvm.c:21:
In file included from /src/qemu/qemu-common.h:27:
In file included from /usr/include/stdio.h:910:
/usr/include/bits/stdio2.h:189:12: note: previous definition is here
  CC    i386-softmmu/kvm-all.o
/src/qemu/kvm-all.c:39:9: error: 'dprintf' macro redefined
In file included from /src/qemu/kvm-all.c:23:
In file included from /src/qemu/qemu-common.h:27:
In file included from /usr/include/stdio.h:910:
/usr/include/bits/stdio2.h:189:12: note: previous definition is here

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-04-18 14:22:14 +00:00
Andre Przywara
457dfed614 x86/cpuid: remove unnecessary kvm_trim function
Correct me if I am wrong, but kvm_trim looks like a really bloated
implementation of a bitwise AND. So remove this function and replace
it with the real stuff(TM).

Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2010-03-13 16:50:54 +01:00
Jan Kiszka
ea64305139 KVM: x86: Restrict writeback of VCPU state
Do not write nmi_pending, sipi_vector, and mpstate unless we at least go
through a reset. And TSC as well as KVM wallclocks should only be
written on full sync, otherwise we risk to drop some time on state
read-modify-write.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-03-04 00:29:30 -03:00
Jan Kiszka
ea375f9ab8 KVM: Rework VCPU state writeback API
This grand cleanup drops all reset and vmsave/load related
synchronization points in favor of four(!) generic hooks:

- cpu_synchronize_all_states in qemu_savevm_state_complete
  (initial sync from kernel before vmsave)
- cpu_synchronize_all_post_init in qemu_loadvm_state
  (writeback after vmload)
- cpu_synchronize_all_post_init in main after machine init
- cpu_synchronize_all_post_reset in qemu_system_reset
  (writeback after system reset)

These writeback points + the existing one of VCPU exec after
cpu_synchronize_state map on three levels of writeback:

- KVM_PUT_RUNTIME_STATE (during runtime, other VCPUs continue to run)
- KVM_PUT_RESET_STATE   (on synchronous system reset, all VCPUs stopped)
- KVM_PUT_FULL_STATE    (on init or vmload, all VCPUs stopped as well)

This level is passed to the arch-specific VCPU state writing function
that will decide which concrete substates need to be written. That way,
no writer of load, save or reset functions that interact with in-kernel
KVM states will ever have to worry about synchronization again. That
also means that a lot of reasons for races, segfaults and deadlocks are
eliminated.

cpu_synchronize_state remains untouched, just as Anthony suggested. We
continue to need it before reading or writing of VCPU states that are
also tracked by in-kernel KVM subsystems.

Consequently, this patch removes many cpu_synchronize_state calls that
are now redundant, just like remaining explicit register syncs.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-03-04 00:29:28 -03:00
Jan Kiszka
b0b1d69079 KVM: Rework of guest debug state writing
So far we synchronized any dirty VCPU state back into the kernel before
updating the guest debug state. This was a tribute to a deficite in x86
kernels before 2.6.33. But as this is an arch-dependent issue, it is
better handle in the x86 part of KVM and remove the writeback point for
generic code. This also avoids overwriting the flushed state later on if
user space decides to change some more registers before resuming the
guest.

We furthermore need to reinject guest exceptions via the appropriate
mechanism. That is KVM_SET_GUEST_DEBUG for older kernels and
KVM_SET_VCPU_EVENTS for recent ones. Using both mechanisms at the same
time will cause state corruptions.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-03-04 00:29:26 -03:00
Paul Brook
35bed8ee91 Move ioport.h out of cpu-all.h
Only include ioport.h where it is actually needed.

Signed-off-by: Paul Brook <paul@codesourcery.com>
2010-03-01 03:29:21 +00:00
Anthony Liguori
724c689357 Merge remote branch 'qemu-kvm/uq/master' into staging 2010-02-22 16:16:22 -06:00
Jes Sorensen
62f734a0d5 kvm: Kill CR3_CACHE feature references
Remove all references to KVM_CR3_CACHE as it was never implemented.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-02-21 11:41:07 +02:00
Jes Sorensen
4c5b10b7b6 QEMU e820 reservation patch
Hi,

Kevin and I have agreed on the approach for this one now. So here is
the latest version of the patch for QEMU, submitting e820 reservation
entries via fw_cfg.

Cheers,
Jes

Use qemu-cfg to provide the BIOS with an optional table of e820 entries.

Notify the BIOS of the location of the TSS+EPT range to by reserving
it via the e820 table.

This matches a corresponding patch for Seabios, however older versions
of Seabios will default to the hardcoded address range and stay
compatible with current QEMU.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-02-19 15:53:54 -06:00
Jan Kiszka
19ccb8ea17 KVM: x86: Fix up misreported CPU features
From qemu-kvm: Kernels before 2.6.30 misreported some essential CPU
features via KVM_GET_SUPPORTED_CPUID. Fix them up.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
2010-02-03 19:47:34 -02:00
Jan Kiszka
aee028b95d KVM: Request setting of nmi_pending and sipi_vector
The final version of VCPU events in 2.6.33 will allow to skip
nmi_pending and sipi_vector on KVM_SET_VCPU_EVENTS. For now let's write
them unconditionally, which is unproblematic for upstream due to missing
SMP support. Future version which enable SMP will write them only on
reset.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-02-03 19:47:33 -02:00
Gleb Natapov
bb0300dc57 Add KVM paravirt cpuid leaf
Initialize KVM paravirt cpuid leaf and allow user to control guest
visible PV features through -cpu flag.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-13 17:14:16 -06:00
Gleb Natapov
e73223a584 remove pending exception on vcpu reset.
Without this qemu can even start on kvm modules with events support
since default value of exception_injected in zero and this is #DE
exception.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-11 09:56:30 -06:00
Jan Kiszka
31827373f0 kvm: x86: Use separate exception_injected CPUState field
Marcelo correctly remarked that there are usage conflicts between QEMU
core code and KVM /wrt exception_index. So spend a separate field and
also save/restore it properly.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-12-18 11:26:25 -06:00
Glauber Costa
1a03675db1 v2: properly save kvm system time msr registers
Currently, the msrs involved in setting up pvclock are not saved over
migration and/or save/restore. This patch puts their value in special
fields in our CPUState, and deal with them using vmstate.

kvm also has to account for it, by including them in the msr list
for the ioctls.

This is a backport from qemu-kvm.git

[v2: sucessfully build without kerneldir ]

Signed-off-by: Glauber Costa <glommer@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-12-12 07:59:49 -06:00
Jan Kiszka
6fb6d24554 kvm: x86: Fix initial kvm_has_msr_star
KVM_GET_MSR_INDEX_LIST returns -E2BIG when the provided space is too
small for all MSRs. But this is precisely the error we trigger with the
initial request in order to obtain that size. Do not fail in that case.

This caused a subtle corruption of the guest state as MSR_STAR was not
properly saved/restored. The corruption became visible with latest kvm
optimizing the MSR updates.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-12-07 08:45:47 -06:00
Jan Kiszka
a0fb002c64 kvm: x86: Add support for VCPU event states
This patch extends the qemu-kvm state sync logic with support for
KVM_GET/SET_VCPU_EVENTS, giving access to yet missing exception,
interrupt and NMI states.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-12-03 15:25:57 -06:00
Jan Kiszka
5a2e3c2e11 kvm: x86: Fix merge artifact of f8d926e9 about mp_state
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-12-03 15:25:56 -06:00
Jan Kiszka
0e607a80d3 kvm: x86: Refactor use of interrupt_bitmap
Drop interrupt_bitmap from the cpustate and solely rely on the integer
interupt_injected. This prepares us for the new injected-interrupt
interface, which will deprecate the bitmap, while preserving
compatibility.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-11-17 08:49:37 -06:00
Jan Kiszka
caa5af0ff3 kvm: Add arch reset handler
Will be required by succeeding changes.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-11-17 08:49:37 -06:00
Hollis Blanchard
9bdbe550f0 kvm: Move KVM mp_state accessors to i386-specific code
Unbreaks PowerPC and S390 KVM builds.

Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-11-12 11:23:55 -06:00
Juan Quintela
3897293825 gcc wants 1st static and then const
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2009-09-25 19:52:06 +00:00
Jan Kiszka
d9db889faf kvm: Work around borken MSR_GET_INDEX_LIST
Allocate enough memory for KVM_GET_MSR_INDEX_LIST as older kernels shot
far beyond their limits, corrupting user space memory.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-07-10 13:44:30 -05:00
Andre Przywara
6c0d7ee895 preserve the hypervisor bit while KVM trims the CPUID bits
The KVM kernel will disable all bits in CPUID which are not present in
the host. As this is mostly true for the hypervisor bit (1.ecx),
preserve its value before the trim and restore it afterwards.

Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-06-29 13:52:12 -05:00
Andre Przywara
6c1f42fe83 fix KVMs GET_SUPPORTED_CPUID feature usage
If we want to trim the user provided CPUID bits for KVM to be not greater
than that of the host, we should not remove the bits _after_ we sent
them to the kernel.
This fixes the masking of features that are not present on the host by
moving the trim function and it's call from helper.c to kvm.c.
It helps to use -cpu host.

Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-06-29 13:52:11 -05:00
Jan Kiszka
f8d926e9cd kvm: x86: Save/restore KVM-specific CPU states
Save and restore all so far neglected KVM-specific CPU states. Handling
the TSC stabilizes migration in KVM mode. The interrupt_bitmap and
mp_state are currently unused, but will become relevant for in-kernel
irqchip support. By including proper saving/restoring already, we avoid
having to increment CPU_SAVE_VERSION later on once again.

v2:
 - initialize mp_state runnable (for the boot CPU)

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-05-22 10:50:34 -05:00
Mark McLoughlin
76ae317f7c kvm: work around supported cpuid ioctl() brokenness
KVM_GET_SUPPORTED_CPUID has been known to fail to return -E2BIG
when it runs out of entries. Detect this by always trying again
with a bigger table if the ioctl() fills the table.

Signed-off-by: Mark McLoughlin <markmc@redhat.com>
2009-05-20 09:12:57 -05:00
Avi Kivity
b827df585b kvm: Add support for querying supported cpu features
kvm does not support all cpu features; add support for dunamically querying
the supported feature set.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-05-08 15:42:52 -05:00
aliguori
a33609cae0 kvm: Fix cpuid initialization (Jan Kiszka)
Fix (more or less) spurious guest boot failures due to corrupted cpuid
states. The reason was insufficient initialization of cpuid entries
before passing them to the kernel.

At this chance also fix improper entry pointer progression and simplify
the code a bit.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>


git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@7167 c046a42c-6fe2-441c-8c8c-71466251a162
2009-04-17 20:50:54 +00:00
aliguori
64bf3f4eec kvm: Drop kvm_patch_opcode_byte (Jan Kiszka)
As cpu_memory_rw_debug is now capable of modifying ROM, we can drop our
own patch function.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>


git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6906 c046a42c-6fe2-441c-8c8c-71466251a162
2009-03-28 17:51:40 +00:00
aliguori
e22a25c936 Guest debugging support for KVM (Jan Kiszka)
This is a backport of the guest debugging support for the KVM
accelerator that is now part of the KVM tree. It implements the reworked
KVM kernel API for guest debugging (KVM_CAP_SET_GUEST_DEBUG) which is
not yet part of any mainline kernel but will probably be 2.6.30 stuff.
So far supported is x86, but PPC is expected to catch up soon.

Core features are:
 - unlimited soft-breakpoints via code patching
 - hardware-assisted x86 breakpoints and watchpoints

Changes in this version:
 - use generic hook cpu_synchronize_state to transfer registers between
   user space and kvm
 - push kvm_sw_breakpoints into KVMState

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>


git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6825 c046a42c-6fe2-441c-8c8c-71466251a162
2009-03-12 20:12:48 +00:00
aliguori
b136a4c6ef KVM: cpuid function 2: store all values (Amit Shah)
Incrementing the array index was missed in the previous series which causes
us to not store all the values.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>


git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6617 c046a42c-6fe2-441c-8c8c-71466251a162
2009-02-11 21:00:43 +00:00
aliguori
a36b1029e2 KVM: Get all cpuid values from function 2 (Amit Shah)
cpuid function 2 can have multiple values to describe cache behaviour.
Loop till we have fetched all the values.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>


git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6567 c046a42c-6fe2-441c-8c8c-71466251a162
2009-02-09 15:50:36 +00:00
aliguori
486bd5a2f2 KVM: Fetch sub-leaf cpuid values for functions 4, 0xb, 0xd. (Amit Shah)
CPUID functions 4, 0xb and 0xd have sub-leaf values which depend on the
input value of ECX. Store these values as well.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>


git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6566 c046a42c-6fe2-441c-8c8c-71466251a162
2009-02-09 15:50:31 +00:00
aliguori
e00b6f8099 KVM: CPUID takes ecx as input value for some functions (Amit Shah)
The CPUID instruction takes the value of ECX as an input parameter
in addition to the value of EAX as the count for functions 4, 0xb
and 0xd. Make sure we pass the value to the instruction.

Also convert to the qemu-style whitespace for the surrounding code.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>


git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6565 c046a42c-6fe2-441c-8c8c-71466251a162
2009-02-09 15:50:08 +00:00
aliguori
0d0266a53b targets: remove error handling from qemu_malloc() callers (Avi Kivity)
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>


git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6530 c046a42c-6fe2-441c-8c8c-71466251a162
2009-02-05 22:06:11 +00:00
aliguori
8c6c919e28 kvm-x86: Remove eflags conversion into emulator format (Jan Kiszka)
It seems that the conversion of the kernel-delivered eflags state into
qemu's internal split representation was once needed in an older kvm
design (register read-back may have taken place from inside cpu_exec).
Today it is plain wrong and causes incorrect cpu state reporting (gdb,
monitor) and should also corrupt its saving (savevm, migration). Drop
the related lines.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>


git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6461 c046a42c-6fe2-441c-8c8c-71466251a162
2009-01-26 20:32:22 +00:00
aliguori
55308450d4 Initialize msr list size properly in KVM
Hollis Blanchard noticed that the last commit was not sufficient.  We also need
to initialize the msr size in our newly allocated list.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>



git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6018 c046a42c-6fe2-441c-8c8c-71466251a162
2008-12-13 20:49:31 +00:00
aliguori
4c9f7372fc Correctly initialize msr list in KVM
I believe this was spotted by Gerd Hoffman but I can't find his patch 
now.  This will cause very subtle corruption on the heap because we 
don't allocate the appropriately sized buffer.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>



git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6017 c046a42c-6fe2-441c-8c8c-71466251a162
2008-12-13 20:41:58 +00:00
aliguori
40a2d7050a Fix crash in kvm.c (Stefan Weil)
Fix crash with kvm enabled.

Signed-off-by: Stefan Weil <weil@mail.berlios.de> 
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>



git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5984 c046a42c-6fe2-441c-8c8c-71466251a162
2008-12-11 21:03:10 +00:00
aliguori
984b518147 Define kvm_ioctl in the same way as ioctl
The third argument to ioctl is a ... which allows any value to be passed.  In
practice, glibc always treats the argument as a void *.

Do the same thing for the kvm ioctls to keep things consistent with a
traditional ioctl.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>



git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5715 c046a42c-6fe2-441c-8c8c-71466251a162
2008-11-13 19:21:00 +00:00
aliguori
053304480a Add missing files to KVM commit.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>



git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5629 c046a42c-6fe2-441c-8c8c-71466251a162
2008-11-05 16:29:27 +00:00