Commit Graph

748 Commits

Author SHA1 Message Date
Jan Kiszka
f2574737f6 kvm: x86: Push kvm_arch_debug to kvm_arch_handle_exit
There are no generic bits remaining in the handling of KVM_EXIT_DEBUG.
So push its logic completely into arch hands, i.e. only x86 so far.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-16 17:11:06 -03:00
Jan Kiszka
2a4dac8350 kvm: x86: Reorder functions in kvm.c
Required for next patch which will access guest debug services from
kvm_arch_handle_exit. No functional changes.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-16 17:11:06 -03:00
Jan Kiszka
bb4ea39329 kvm: Align kvm_arch_handle_exit to kvm_cpu_exec changes
Make the return code of kvm_arch_handle_exit directly usable for
kvm_cpu_exec. This is straightforward for x86 and ppc, just s390
would require more work. Avoid this for now by pushing the return code
translation logic into s390's kvm_arch_handle_exit.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
CC: Alexander Graf <agraf@suse.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-16 17:11:06 -03:00
Jan Kiszka
0c03266a96 kvm: x86: Synchronize PAT MSR with the kernel
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-16 17:11:05 -03:00
Jan Kiszka
c995b495b9 x86: Save/restore PAT MSR
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-16 17:11:05 -03:00
Jan Kiszka
ebda377f8c x86: Properly reset PAT MSR
Conforming to the Intel spec, set the power-on value of PAT also on
reset, but save it across INIT.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-16 17:11:05 -03:00
Jan Kiszka
4601f7b04c kvm: x86: Do not leave halt if interrupts are disabled
When an external interrupt is pending but IF is cleared, we must not
leave the halt state prematurely.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-16 17:11:05 -03:00
Huang Ying
3c85e74fbf KVM, MCE, unpoison memory address across reboot
In Linux kernel HWPoison processing implementation, the virtual
address in processes mapping the error physical memory page is marked
as HWPoison.  So that, the further accessing to the virtual
address will kill corresponding processes with SIGBUS.

If the error physical memory page is used by a KVM guest, the SIGBUS
will be sent to QEMU, and QEMU will simulate a MCE to report that
memory error to the guest OS.  If the guest OS can not recover from
the error (for example, the page is accessed by kernel code), guest OS
will reboot the system.  But because the underlying host virtual
address backing the guest physical memory is still poisoned, if the
guest system accesses the corresponding guest physical memory even
after rebooting, the SIGBUS will still be sent to QEMU and MCE will be
simulated.  That is, guest system can not recover via rebooting.

In fact, across rebooting, the contents of guest physical memory page
need not to be kept.  We can allocate a new host physical page to
back the corresponding guest physical address.

This patch fixes this issue in QEMU-KVM via calling qemu_ram_remap()
to clear the corresponding page table entry, so that make it possible
to allocate a new page to recover the issue.

[ Jan: rebasing and tiny cleanups]

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-15 01:19:06 -03:00
Jan Kiszka
75d4949733 kvm: x86: Fail kvm_arch_init_vcpu if MCE initialization fails
There is no reason to continue if the kernel claims to support MCE but
then fails to process our request.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
CC: Huang Ying <ying.huang@intel.com>
CC: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
CC: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-15 01:19:06 -03:00
Jan Kiszka
32a420243c kvm: x86: Clean up kvm_setup_mce
There is nothing to abstract here. Fold kvm_setup_mce into its caller
and fix up the error reporting (return code of kvm_vcpu_ioctl holds the
error value).

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
CC: Huang Ying <ying.huang@intel.com>
CC: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
CC: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-15 01:19:06 -03:00
Jan Kiszka
c34d440a72 kvm: x86: Consolidate TCG and KVM MCE injection code
This switches KVM's MCE injection path to cpu_x86_inject_mce, both for
SIGBUS and monitor initiated events. This means we prepare the MCA MSRs
in the VCPUState also for KVM.

We have to drop the MSRs writeback restrictions for this purpose which
is now safe as every uncoordinated MSR injection is removed with this
patch.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
CC: Huang Ying <ying.huang@intel.com>
CC: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
CC: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-15 01:19:06 -03:00
Jan Kiszka
d5bfda334a x86: Run qemu_inject_x86_mce on target VCPU
We will use the current TCG-only MCE injection path for KVM as well, and
then this read-modify-write of the target VCPU state has to be performed
synchronously in the corresponding thread.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-15 01:19:05 -03:00
Jan Kiszka
ab443475c9 kvm: x86: Inject pending MCE events on state writeback
The current way of injecting MCE events without updating of and
synchronizing with the CPUState is broken and causes spurious
corruptions of the MCE-related parts of the CPUState.

As a first step towards a fix, enhance the state writeback code with
support for injecting events that are pending in the CPUState. A pending
exception will then be signaled via cpu_interrupt(CPU_INTERRUPT_MCE).
And, just like for TCG, we need to leave the halt state when
CPU_INTERRUPT_MCE is pending (left broken for the to-be-removed old KVM
code).

This will also allow to unify TCG and KVM injection code.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
CC: Huang Ying <ying.huang@intel.com>
CC: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
CC: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-15 01:19:05 -03:00
Jan Kiszka
990368650f kvm: Rename kvm_arch_process_irqchip_events to async_events
We will broaden the scope of this function on x86 beyond irqchip events.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-15 01:19:05 -03:00
Jan Kiszka
419fb20a8e kvm: x86: Move MCE functions together
Pure function suffling to avoid multiple #ifdef KVM_CAP_MCE sections,
no functional changes. While at it, annotate some #ifdef sections.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
CC: Huang Ying <ying.huang@intel.com>
CC: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
CC: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-15 01:19:05 -03:00
Jan Kiszka
747461c76b x86: Optionally avoid injecting AO MCEs while others are pending
Allow to tell cpu_x86_inject_mce that it should ignore Action Optional
MCE events when the target VCPU is still processing another one. This
will be used by KVM soon.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
CC: Huang Ying <ying.huang@intel.com>
CC: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
CC: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-15 01:19:05 -03:00
Jan Kiszka
316378e4d0 x86: Refine error reporting of MCE injection services
As this service is used by the human monitor, make sure that errors get
reported to the right channel, and also raise the verbosity.

This requires to move Monitor typedef in qemu-common.h to resolve the
include dependency.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
CC: Huang Ying <ying.huang@intel.com>
CC: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
CC: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-15 01:19:05 -03:00
Jan Kiszka
2fa11da0c3 x86: Small cleanups of MCE helpers
Fix some code style issues, use proper headers, and align to cpu_x86
naming scheme. No functional changes.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
CC: Huang Ying <ying.huang@intel.com>
CC: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
CC: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-15 01:19:05 -03:00
Jan Kiszka
185592324f x86: Perform implicit mcg_status reset
Reorder mcg_status in CPUState to achieve automatic clearing on reset.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
CC: Huang Ying <ying.huang@intel.com>
CC: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
CC: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-15 01:19:05 -03:00
Jan Kiszka
ac09878158 x86: Account for MCE in cpu_has_work
MCEs can be injected asynchronously, so they can also terminate the halt
state.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
CC: Huang Ying <ying.huang@intel.com>
CC: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
CC: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-15 01:19:05 -03:00
Paolo Bonzini
eda48c344f inline cpu_halted into sole caller
All implementations are now the same, and there is only one caller,
so inline the function there.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2011-03-13 14:44:21 +00:00
Jan Kiszka
b7680cb607 Refactor thread retrieval and check
We have qemu_cpu_self and qemu_thread_self. The latter is retrieving the
current thread, the former is checking for equality (using CPUState). We
also have qemu_thread_equal which is only used like qemu_cpu_self.

This refactors the interfaces, creating qemu_cpu_is_self and
qemu_thread_is_self as well ass qemu_thread_get_self.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2011-03-13 14:44:21 +00:00
Jan Kiszka
ce377af399 kvm: x86: Catch and report failing IRQ and NMI injections
We do not need to abort, but the user should be notified that weird
things go on.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-02-14 12:39:46 -02:00
Jan Kiszka
7a39fe5882 kvm: Drop return values from kvm_arch_pre/post_run
We do not check them, and the only arch with non-empty implementations
always returns 0 (this is also true for qemu-kvm).

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
CC: Alexander Graf <agraf@suse.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-02-14 12:39:46 -02:00
Jan Kiszka
db1669bcca kvm: x86: Prepare VCPU loop for in-kernel irqchip
Effectively no functional change yet as kvm_irqchip_in_kernel still only
returns 0, but this patch will allow qemu-kvm to adopt the VCPU loop of
upsteam KVM.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-02-14 12:39:46 -02:00
Jan Kiszka
6792a57bf1 kvm: Separate TCG from KVM cpu execution
Mixing up TCG bits with KVM already led to problems around eflags
emulation on x86. Moreover, quite some code that TCG requires on cpu
enty/exit is useless for KVM. So dispatch between tcg_cpu_exec and
kvm_cpu_exec as early as possible.

The core logic of cpu_halted from cpu_exec is added to
kvm_arch_process_irqchip_events. Moving away from cpu_exec makes
exception_index meaningless for KVM, we can simply pass the exit reason
directly (only "EXCP_DEBUG vs. rest" is relevant).

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-02-14 12:39:46 -02:00
Glauber Costa
b8cc45d6a6 kvm: make tsc stable over migration and machine start
If the machine is stopped, we should not record two different tsc values
upon a save operation. The same problem happens with kvmclock.

But kvmclock is taking a different diretion, being now seen as a separate
device. Since this is unlikely to happen with the tsc, I am taking the
approach here of simply registering a handler for state change, and
using a per-CPUState variable that prevents double updates for the TSC.

Signed-off-by: Glauber Costa <glommer@redhat.com>
CC: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-02-14 12:39:46 -02:00
Jan Kiszka
9ccfac9ea4 kvm: Unconditionally reenter kernel after IO exits
KVM requires to reenter the kernel after IO exits in order to complete
instruction emulation. Failing to do so will leave the kernel state
inconsistently behind. To ensure that we will get back ASAP, we issue a
self-signal that will cause KVM_RUN to return once the pending
operations are completed.

We can move kvm_arch_process_irqchip_events out of the inner VCPU loop.
The only state that mattered at its old place was a pending INIT
request. Catch it in kvm_arch_pre_run and also trigger a self-signal to
process the request on next kvm_cpu_exec.

This patch also fixes the missing exit_request check in kvm_cpu_exec in
the CONFIG_IOTHREAD case.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
CC: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-02-14 12:39:45 -02:00
Jan Kiszka
a1b87fe046 kvm: Provide sigbus services arch-independently
Provide arch-independent kvm_on_sigbus* stubs to remove the #ifdef'ery
from cpus.c. This patch also fixes --disable-kvm build by providing the
missing kvm_on_sigbus_vcpu kvm-stub.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-02-14 12:39:45 -02:00
Jan Kiszka
2905749287 x86: Fix MCA broadcast parameters for TCG case
When broadcasting MCEs, we need to set MCIP and RIPV in mcg_status like
it is done for KVM. Use the symbolic constants at this chance.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2011-02-08 12:36:48 +01:00
Anthony Liguori
b3a98367ee Merge remote branch 'qemu-kvm/uq/master' into staging
aliguori: fix build with !defined(KVM_CAP_ASYNC_PF)

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-02-01 15:32:18 -06:00
Jan Kiszka
c5999bfcfd kvm: x86: Only read/write MSR_KVM_ASYNC_PF_EN if supported
If the kernel does not support KVM_CAP_ASYNC_PF, it also does not know
about the related MSR. So skip it during state synchronization in that
case. Fixes annoying kernel warnings.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:22 -02:00
Jan Kiszka
d8f771d912 kvm: x86: Implicitly clear nmi_injected/pending on reset
All CPUX86State variables before CPU_COMMON are automatically cleared on
reset. Reorder nmi_injected and nmi_pending to avoid having to touch
them explicitly.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:22 -02:00
Jan Kiszka
110761987d kvm: x86: Rework identity map and TSS setup for larger BIOS sizes
In order to support loading BIOSes > 256K, reorder the code, adjusting
the base if the kernel supports moving the identity map.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:22 -02:00
Jan Kiszka
94a8d39afd kvm: Consolidate must-have capability checks
Instead of splattering the code with #ifdefs and runtime checks for
capabilities we cannot work without anyway, provide central test
infrastructure for verifying their availability both at build and
runtime.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:22 -02:00
Jan Kiszka
cad1e2827b kvm: Drop smp_cpus argument from init functions
No longer used.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:21 -02:00
Jan Kiszka
521f0798d7 kvm: x86: Fix !CONFIG_KVM_PARA build
If we lack kvm_para.h, MSR_KVM_ASYNC_PF_EN is not defined. The change in
kvm_arch_init_vcpu is just for consistency reasons.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:21 -02:00
Jan Kiszka
ff5c186b8b kvm: x86: Reset paravirtual MSRs
Make sure to write the cleared MSR_KVM_SYSTEM_TIME, MSR_KVM_WALL_CLOCK,
and MSR_KVM_ASYNC_PF_EN to the kernel state so that a freshly booted
guest cannot be disturbed by old values.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
CC: Glauber Costa <glommer@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:21 -02:00
Jan Kiszka
c3a3a7d356 kvm: x86: Refactor msr_star/hsave_pa setup and checks
Simplify kvm_has_msr_star/hsave_pa to booleans and push their one-time
initialization into kvm_arch_init. Also handle potential errors of that
setup procedure.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:21 -02:00
Jan Kiszka
1a5e9d2faf kvm: x86: Fix xcr0 reset mismerge
For unknown reasons, xcr0 reset ended up in kvm_arch_update_guest_debug
on upstream merge. Fix this and also remove the misleading comment (1 is
THE reset value).

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:21 -02:00
Jan Kiszka
3390e7f797 kvm: x86: Remove redundant mp_state initialization
kvm_arch_reset_vcpu initializes mp_state, and that function is invoked
right after kvm_arch_init_vcpu.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:21 -02:00
Jan Kiszka
c14750e8ad kvm: x86: Prepare kvm_get_mp_state for in-kernel irqchip
This code path will not yet be taken as we still lack in-kernel irqchip
support. But qemu-kvm can already make use of it and drop its own
mp_state access services.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:21 -02:00
Jan Kiszka
0d75a9ecd7 kvm: x86: Align kvm_arch_put_registers code with comment
The ordering doesn't matter in this case, but better keep it consistent.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:21 -02:00
Jan Kiszka
f5c848eed7 x86: Optionally dump code bytes on cpu_dump_state
Introduce the cpu_dump_state flag CPU_DUMP_CODE and implement it for
x86. This writes out the code bytes around the current instruction
pointer. Make use of this feature in KVM to help debugging fatal vm
exits.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:20 -02:00
Jan Kiszka
bb44e0d12d kvm: Improve reporting of fatal errors
Report KVM_EXIT_UNKNOWN, KVM_EXIT_FAIL_ENTRY, and KVM_EXIT_EXCEPTION
with more details to stderr. The latter two are so far x86-only, so move
them into the arch-specific handler. Integrate the Intel real mode
warning on KVM_EXIT_FAIL_ENTRY that qemu-kvm carries, but actually
restrict it to Intel CPUs. Moreover, always dump the CPU state in case
we fail.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:20 -02:00
Jan Kiszka
73aaec4a39 kvm: Stop on all fatal exit reasons
Ensure that we stop the guest whenever we face a fatal or unknown exit
reason. If we stop, we also have to enforce a cpu loop exit.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:20 -02:00
Jan Kiszka
646042e1ab kvm: x86: Swallow KVM_EXIT_SET_TPR
This exit only triggers activity in the common exit path, but we should
accept it in order to be able to detect unknown exit types.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:20 -02:00
Jan Kiszka
b9bec74bcb kvm: x86: Fix a few coding style violations
No functional changes.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-21 14:05:22 -02:00
Jan Kiszka
95c077c919 kvm: x86: Prevent sign extension of DR7 in guest debugging mode
This unbreaks guest debugging when the 4th hardware breakpoint used for
guest debugging is a watchpoint of 4 or 8 byte lenght. The 31st bit of
DR7 is set in that case and used to cause a sign extension to the high
word which was breaking the guest state (vm entry failure).

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-21 14:05:22 -02:00
Jan Kiszka
3980e3024b kvm: x86: Remove obsolete SS.RPL/DPL aligment
This seems to date back to the days KVM didn't support real mode. The
check is no longer needed and, even worse, is corrupting the guest state
in case SS.RPL != DPL.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-21 14:05:22 -02:00
Jan Kiszka
acaa75507b kvm: x86: Fix DPL write back of segment registers
The DPL is stored in the flags and not in the selector. In fact, the RPL
may differ from the DPL at some point in time, and so we were corrupting
the guest state so far.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-21 14:05:22 -02:00
Jin Dongming
7cc2cc3e26 kvm: introduce kvm_inject_x86_mce_on
Pass a table instead of multiple args.

Note:

    kvm_inject_x86_mce(env, bank, status, mcg_status, addr, misc,
                       abort_on_error);

is equal to:

    struct kvm_x86_mce mce = {
        .bank = bank,
        .status = status,
        .mcg_status = mcg_status,
        .addr = addr,
        .misc = misc,
    };
    kvm_inject_x86_mce_on(env, &mce, abort_on_error);

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-21 14:05:22 -02:00
Jin Dongming
e387c33892 kvm: kvm_mce_inj_* subroutines for templated error injections
Refactor codes for maintainability.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-21 14:05:22 -02:00
Jin Dongming
6643e2f001 kvm: introduce kvm_mce_in_progress
Share same error handing, and rename this function after
MCIP (Machine Check In Progress) flag.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-21 14:05:22 -02:00
Jin Dongming
2bd3e04c3b Add function for checking mca broadcast of CPU
Add function for checking whether current CPU support mca broadcast.

Signed-off-by: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-21 14:05:22 -02:00
Jin Dongming
31ce5e0c49 Add "broadcast" option for mce command
When the following test case is injected with mce command, maybe user could not
get the expected result.
    DATA
               command cpu bank status             mcg_status  addr   misc
        (qemu) mce     1   1    0xbd00000000000000 0x05        0x1234 0x8c

    Expected Result
           panic type: "Fatal Machine check"

That is because each mce command can only inject the given cpu and could not
inject mce interrupt to other cpus. So user will get the following result:
    panic type: "Fatal machine check on current CPU"

"broadcast" option is used for injecting dummy data into other cpus. Injecting
mce with this option the expected result could be gotten.

Usage:
    Broadcast[on]
           command broadcast cpu bank status             mcg_status  addr   misc
    (qemu) mce     -b        1   1    0xbd00000000000000 0x05        0x1234 0x8c

    Broadcast[off]
           command cpu bank status             mcg_status  addr   misc
    (qemu) mce     1   1    0xbd00000000000000 0x05        0x1234 0x8c

Signed-off-by: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-21 14:05:22 -02:00
Jin Dongming
b3cd24e04a Clean up cpu_inject_x86_mce()
Clean up cpu_inject_x86_mce() for later patch.

Signed-off-by: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-21 14:05:22 -02:00
Lai Jiangshan
14a095184c kvm: convert kvm_ioctl(KVM_CHECK_EXTENSION) to kvm_check_extension()
simple cleanup and use existing helper: kvm_check_extension().

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-21 14:05:21 -02:00
Lai Jiangshan
276ce81563 kvm: Enable user space NMI injection for kvm guest
Make use of the new KVM_NMI IOCTL to send NMIs into the KVM guest if the
user space raised them. (example: qemu monitor's "nmi" command)

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-21 14:05:21 -02:00
Richard Henderson
c832e3de64 target-i386: Use deposit operation.
Use this for assignment to the low byte or low word of a register.

Acked-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
2011-01-20 12:16:12 +01:00
Jan Kiszka
4058fd98fd x86: Filter out garbage from segment flags dump
Only bits 8..23 of the segment flags contain valid data, so only dump
those when printing the CPU state.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2010-12-27 22:02:52 +01:00
Andreas Färber
ac6c41204f target-i386: Fix accidental use of SoftFloat uint64 type
softfloat.h's uint64 type has least-width semantics.
Use uint64_t instead since that is used in helpers.

v4:
* Summary change.

v3:
* Split off.

Signed-off-by: Andreas Färber <andreas.faerber@web.de>
Acked-by: Huang Ying <ying.huang@intel.com>
Acked-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-12-20 21:03:19 +00:00
Gleb Natapov
f6584ee203 Add support for async page fault to qemu
Add save/restore of MSR for migration and cpuid bit.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-11-16 08:40:07 -06:00
Stefan Weil
9a78eead0c target-xxx: Use fprintf_function (format checking)
fprintf_function uses format checking with GCC_FMT_ATTR.

Format errors were fixed in
* target-i386/helper.c
* target-mips/translate.c
* target-ppc/translate.c

Cc: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-10-30 08:01:59 +00:00
Marcelo Tosatti
25d2e3613d kvm: save/restore x86-64 MSRs on x86-64 kernels
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-10-21 13:37:41 -02:00
Marcelo Tosatti
384331a61a kvm: writeback SMP TSCs on migration only
commit 6389c45441269baa2873e6feafebd17105ddeaf6
Author: Jan Kiszka <jan.kiszka@siemens.com>
Date:   Mon Mar 1 18:17:26 2010 +0100

    qemu-kvm: Cleanup/fix TSC and PV clock writeback

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-10-21 13:37:40 -02:00
Marcelo Tosatti
75b10c4336 kvm: factor out kvm_has_msr_star
And add kvm_has_msr_hsave_pa(), to avoid warnings on older
kernels without support.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-10-21 13:37:29 -02:00
Marcelo Tosatti
aa851e365b kvm: add save/restore of MSR_VM_HSAVE_PA
commit 2bba4446746add456ceeb0e8359a43032a2ea333
Author: Alexander Graf <agraf@suse.de>
Date:   Thu Dec 18 15:38:32 2008 +0100

    Enable nested SVM support in userspace

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-10-21 13:37:17 -02:00
Hidetoshi Seto
d8da8574b1 Fix build on !KVM_CAP_MCE
This patch removes following warnings:

target-i386/kvm.c: In function 'kvm_put_msrs':
target-i386/kvm.c:782: error: unused variable 'i'
target-i386/kvm.c: In function 'kvm_get_msrs':
target-i386/kvm.c:1083: error: label at end of compound statement

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-10-21 13:12:14 -02:00
Hidetoshi Seto
f71ac88fe9 x86, mce: broadcast mce depending on the cpu version
There is no reason why SRAO event received by the main thread
is the only one that being broadcasted.

According to the x86 ASDM vol.3A 15.10.4.1,
MCE signal is broadcast on processor version 06H_EH or later.

This change is required to handle SRAR in smp guests.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-10-21 13:11:38 -02:00
Hidetoshi Seto
f8502cfbbf x86, mce: ignore SRAO only when MCG_SER_P is available
And restruct this block to call kvm_mce_in_exception() only when it is
required.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-10-21 13:09:06 -02:00
Marcelo Tosatti
0f53994f7a Fix memory leak in register save load due to xsave support
From: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-10-20 16:15:04 -05:00
Marcelo Tosatti
577804958a Add savevm/loadvm support for MCE
Port qemu-kvm's

commit 1bab5d11545d8de5facf46c28630085a2f9651ae
Author: Huang Ying <ying.huang@intel.com>
Date:   Wed Mar 3 16:52:46 2010 +0800

    Add savevm/loadvm support for MCE

    MCE registers are saved/load into/from CPUState in
    kvm_arch_save/load_regs. To simulate the MCG_STATUS clearing upon
    reset, MSR_MCG_STATUS is set to 0 for KVM_PUT_RESET_STATE.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-20 16:15:04 -05:00
Marcelo Tosatti
c0532a76b4 MCE: Relay UCR MCE to guest
Port qemu-kvm's

commit 4b62fff1101a7ad77553147717a8bd3bf79df7ef
Author: Huang Ying <ying.huang@intel.com>
Date:   Mon Sep 21 10:43:25 2009 +0800

    MCE: Relay UCR MCE to guest

    UCR (uncorrected recovery) MCE is supported in recent Intel CPUs,
    where some hardware error such as some memory error can be reported
    without PCC (processor context corrupted). To recover from such MCE,
    the corresponding memory will be unmapped, and all processes accessing
    the memory will be killed via SIGBUS.

    For KVM, if QEMU/KVM is killed, all guest processes will be killed
    too. So we relay SIGBUS from host OS to guest system via a UCR MCE
    injection. Then guest OS can isolate corresponding memory and kill
    necessary guest processes only. SIGBUS sent to main thread (not VCPU
    threads) will be broadcast to all VCPU threads as UCR MCE.

aliguori: fix build

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-10-20 16:15:04 -05:00
Marcelo Tosatti
e7701825e1 kvm: x86: add mce support
Port qemu-kvm's MCE support

commit c68b2374c9048812f488e00ffb95db66c0bc07a7
Author: Huang Ying <ying.huang@intel.com>
Date:   Mon Jul 20 10:00:53 2009 +0800

    Add MCE simulation support to qemu/kvm

    KVM ioctls are used to initialize MCE simulation and inject MCE. The
    real MCE simulation is implemented in Linux kernel. The Kernel part
    has been merged.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-20 16:15:04 -05:00
Joerg Roedel
296acb643b Add svm cpuid features
This patch adds the svm cpuid feature flags to the qemu
intialization path. It also adds the svm features available
on phenom to its cpu-definition and extends the host cpu
type to support all svm features KVM can provide.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-20 16:15:04 -05:00
Joerg Roedel
db0ad1ba04 Set cpuid definition to 0 before initializing it
This patch cleans the (stack-allocated) cpuid definition to
0 before actually initializing it.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-20 16:15:04 -05:00
Blue Swirl
4581cbcdc3 i386: avoid a write only variable
Compiling with GCC 4.6.0 20100925 produced warnings:
/src/qemu/target-i386/op_helper.c: In function 'switch_tss':
/src/qemu/target-i386/op_helper.c:283:53: error: variable 'new_trap' set but not used [-Werror=unused-but-set-variable]

Fix by adding a dummy cast so that the variable is not unused. Add also
pointer to docs.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-10-13 18:43:19 +00:00
Gerd Hoffmann
3329f07b7a QemuOpts: make most qemu_*_opts static
Switch tree to lookup-by-name using qemu_find_opts().
Also hook up virtfs options so qemu_find_opts works for them too.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-08-22 17:11:06 -05:00
Adam Lackorzynski
583cd3cb93 target-i386: svm: Fix MSRPM check
Correct the calculation of the offset in the msrpm
for the MSR range 0 - 0x1fff.

Signed-off-by: Adam Lackorzynski <adam@os.inf.tu-dresden.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-08-22 16:19:00 -05:00
Blue Swirl
66fe09eebb Use ARRAY_SIZE macro
Replace array size calculations with ARRAY_SIZE macro.

Implemented with this Coccinelle semantic patch, adapted from
Linux kernel:
@@
type T;
T[] E;
@@

- (sizeof(E)/sizeof(*E))
+ ARRAY_SIZE(E)

@@
type T;
T[] E;
@@

- (sizeof(E)/sizeof(E[...]))
+ ARRAY_SIZE(E)

@@
type T;
T[] E;
@@

- (sizeof(E)/sizeof(T))
+ ARRAY_SIZE(E)

Some files (*-dis.c, tests/*) had to be filtered out.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-08-20 21:03:24 +00:00
Jun Koi
728d803baa target-i386: use gen_update_cc_op()
This patch simplifies target-i386/translate.c a bit by replacing some
code with gen_update_cc_op()

Signed-off-by: Jun Koi <junkoi2004@gmail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2010-07-25 17:04:07 +02:00
Jun Koi
5779406a24 use symbol for DisasContext->is_jmp
This patch replaces constant value assigned for (DisasContext
*)->is_jmp with DISAS_TB_JUMP.

Signed-off-by: Jun Koi <junkoi2004@gmail.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
2010-07-24 14:34:15 +02:00
Paolo Bonzini
a88790a14f remove exec-all.h inclusion from cpu.h
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-07-03 09:48:24 +03:00
Paolo Bonzini
10eb0cc03c move cpu_pc_from_tb to target-*/exec.h
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-07-03 09:48:12 +03:00
Paolo Bonzini
83dae09588 remove unused stuff from */exec.h
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-07-03 06:47:13 +00:00
Andi Kleen
8713f8ffb8 Don't declare XSAVE as supported
i386 cpuid.c currently claims XSAVE is supported in the CPUID filter,
but that's not true: Only FXSAVE is supported. Remove that bit
from the filter.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2010-07-02 01:55:38 +02:00
Andi Kleen
c045af25a5 Add more boundary checking to sse3/4 parsing
ssse3 uses tables with only two entries per op, but it is indexed
with b1 which can contain variables upto 3. This happens when ssse3
or sse4 are used with REP* prefixes.

Add boundary checking for this case.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2010-07-02 01:53:37 +02:00
Richard Henderson
7418027ea4 target-i386: fix xchg rax,r8
We were ignoring REX_B while special-casing NOP, i.e. xchg eax,eax.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2010-07-01 23:45:28 +02:00
Jan Kiszka
ab5ea55805 x86: svm: Always clear event_inj on vmexit
We currently only clear SVM_EVTINJ_VALID after successful interrupt
delivery. This apparently does not match real hardware which clears the
whole event_inj field on every vmexit, including unsuccessful interrupt
delivery.

Reported-by: Erik van der Kouwe <vdkouwe@cs.vu.nl>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2010-06-30 20:59:53 +02:00
Sheng Yang
408392b30d kvm: Fix cpu_is_bsp() compilation warning
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-06-28 13:06:03 -03:00
Marcelo Tosatti
ddced198cd kvm: init mp_state
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-06-28 13:06:03 -03:00
Sheng Yang
f1665b21f1 kvm: Enable XSAVE live migration support
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-06-28 13:06:03 -03:00
Sheng Yang
51e49430c0 Enable XSAVE related CPUID
We can support it in KVM now. The 0xd leaf is queried from KVM.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-06-28 13:06:03 -03:00
Sheng Yang
c958a8bd9b kvm: Extend kvm_arch_get_supported_cpuid() to support index
Would use it later for XSAVE related CPUID.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-06-28 13:06:03 -03:00
Andre Przywara
8935499831 fix CPUID vendor override
the meaning of vendor_override is actually the opposite of how it
is currently used :-(
Fix it to allow KVM to export the non-native CPUID vendor if
explicitly requested by the user.

The intended behavior is:
With TCG:
  - always inject the configured vendor (either hard-coded, in config
    files or via ",vendor=" commandline)
With KVM:
  - by default inject the host's vendor
  - if the user specifies ",vendor=" on the commandline, use this
    instead of the host's vendor
  - all pre-configured vendors (hard-coded, config file) are ignored

Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-06-28 13:06:03 -03:00
Jan Kiszka
052be86bc3 x86: Fix INIT processing
This fixes a regression of 0e26b7b892: Reset halted also on INIT.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-06-27 19:07:45 +03:00
Blue Swirl
92a16d7a9a apic: qdev conversion cleanup
Make APICState completely private to apic.c by using DeviceState
in external APIs.

Move apic_init() to pc.c.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-06-19 07:47:42 +00:00
Blue Swirl
0e26b7b892 apic: avoid using CPUState internals
Move the actual CPUState contents handling to cpu.h and cpuid.c.

Handle CPU reset and set env->halted in pc.c.

Add a function to get the local APIC state of the current
CPU for the MMIO.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-06-19 10:42:34 +03:00
Blue Swirl
4a942ceac7 apic: avoid passing CPUState from CPU code
Pass only APICState when accessing APIC from CPU code.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-06-19 10:42:31 +03:00