Commit Graph

35 Commits

Author SHA1 Message Date
Maxiwell S. Garcia
4173324946 migration: Do not re-read the clock on pre_save in case of paused guest
The clock move makes the guest knows about the paused time between the
'stop' and 'migrate' commands. This is an issue in an already-paused
VM because some side effects, like process stalls, could happen
after migration.

So, this patch checks the runstate of guest in the pre_save handler and
do not re-reads the clock in case of paused state (cold migration).

Signed-off-by: Maxiwell S. Garcia <maxiwell@linux.ibm.com>
Message-Id: <20190829210711.6570-1-maxiwell@linux.ibm.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-09-03 14:39:46 -03:00
Markus Armbruster
54d31236b9 sysemu: Split sysemu/runstate.h off sysemu/sysemu.h
sysemu/sysemu.h is a rather unfocused dumping ground for stuff related
to the system-emulator.  Evidence:

* It's included widely: in my "build everything" tree, changing
  sysemu/sysemu.h still triggers a recompile of some 1100 out of 6600
  objects (not counting tests and objects that don't depend on
  qemu/osdep.h, down from 5400 due to the previous two commits).

* It pulls in more than a dozen additional headers.

Split stuff related to run state management into its own header
sysemu/runstate.h.

Touching sysemu/sysemu.h now recompiles some 850 objects.  qemu/uuid.h
also drops from 1100 to 850, and qapi/qapi-types-run-state.h from 4400
to 4200.  Touching new sysemu/runstate.h recompiles some 500 objects.

Since I'm touching MAINTAINERS to add sysemu/runstate.h anyway, also
add qemu/main-loop.h.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190812052359.30071-30-armbru@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
[Unbreak OS-X build]
2019-08-16 13:37:36 +02:00
Markus Armbruster
a27bd6c779 Include hw/qdev-properties.h less
In my "build everything" tree, changing hw/qdev-properties.h triggers
a recompile of some 2700 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h).

Many places including hw/qdev-properties.h (directly or via hw/qdev.h)
actually need only hw/qdev-core.h.  Include hw/qdev-core.h there
instead.

hw/qdev.h is actually pointless: all it does is include hw/qdev-core.h
and hw/qdev-properties.h, which in turn includes hw/qdev-core.h.
Replace the remaining uses of hw/qdev.h by hw/qdev-properties.h.

While there, delete a few superfluous inclusions of hw/qdev-core.h.

Touching hw/qdev-properties.h now recompiles some 1200 objects.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Daniel P. Berrangé" <berrange@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190812052359.30071-22-armbru@redhat.com>
2019-08-16 13:31:53 +02:00
Markus Armbruster
d645427057 Include migration/vmstate.h less
In my "build everything" tree, changing migration/vmstate.h triggers a
recompile of some 2700 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h).

hw/hw.h supposedly includes it for convenience.  Several other headers
include it just to get VMStateDescription.  The previous commit made
that unnecessary.

Include migration/vmstate.h only where it's still needed.  Touching it
now recompiles only some 1600 objects.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190812052359.30071-16-armbru@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-08-16 13:31:52 +02:00
Markus Armbruster
0b8fa32f55 Include qemu/module.h where needed, drop it from qemu-common.h
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190523143508.25387-4-armbru@redhat.com>
[Rebased with conflicts resolved automatically, except for
hw/usb/dev-hub.c hw/misc/exynos4210_rng.c hw/misc/bcm2835_rng.c
hw/misc/aspeed_scu.c hw/display/virtio-vga.c hw/arm/stm32f205_soc.c;
ui/cocoa.m fixed up]
2019-06-12 13:18:33 +02:00
Yongji Xie
4c3e250627 kvmclock: run KVM_KVMCLOCK_CTRL ioctl in vcpu thread
According to KVM API Documentation, we should only
run vcpu ioctls from the same thread that was used
to create the vcpu. This patch makes KVM_KVMCLOCK_CTRL
ioctl consistent with the Documentation.

No functional change.

Signed-off-by: Yongji Xie <xieyongji@baidu.com>
Signed-off-by: Chai Wen <chaiwen@baidu.com>
Message-Id: <1531315364-2551-1-git-send-email-xieyongji@baidu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Yongji Xie <elohimes@gmail.com>
2018-10-02 19:09:13 +02:00
Michael S. Tsirkin
1814eab673 x86/cpu: use standard-headers/asm-x86.kvm_para.h
Switch to the header we imported from Linux,
this allows us to drop a hack in kvm_i386.h.
More code will be dropped in the next patch.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-05-23 03:14:41 +03:00
Michael Chapman
c2b01cfec1 kvmclock: fix clock_is_reliable on migration from QEMU < 2.9
When migrating from a pre-2.9 QEMU, no clock_is_reliable flag is
transferred. We should assume that the source host has an unreliable
KVM_GET_CLOCK, rather than using whatever was determined locally, to
ensure that any drift from the TSC-based value calculated by the guest
is corrected.

Signed-off-by: Michael Chapman <mike@very.puzzling.org>
Message-Id: <20180406053406.774-1-mike@very.puzzling.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-04-09 16:36:39 +02:00
Jim Somerville
346b1215b1 kvmclock: use the updated system_timer_msr
Fixes e2b6c17 (kvmclock: update system_time_msr address forcibly)
which makes a call to get the latest value of the address
stored in system_timer_msr, but then uses the old address anyway.

Signed-off-by: Jim Somerville <Jim.Somerville@windriver.com>
Message-Id: <59b67db0bd15a46ab47c3aa657c81a4c11f168ea.1506702472.git.Jim.Somerville@windriver.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-10-02 14:39:51 +02:00
Dr. David Alan Gilbert
44b1ff319c migration: pre_save return int
Modify the pre_save method on VMStateDescription to return an int
rather than void so that it potentially can fail.

Changed zillions of devices to make them return 0; the only
case I've made it return non-0 is hw/intc/s390_flic_kvm.c that already
had an error_report/return case.

Note: If you add an error exit in your pre_save you must emit
an error_report to say why.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20170925112917.21340-2-dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-09-27 11:35:59 +01:00
Denis Plotnikov
e2b6c1712e kvmclock: update system_time_msr address forcibly
Do an update of system_time_msr address every time before reading
the value of tsc_timestamp from guest's kvmclock page.

There is no other code paths which ensure that qemu has an up-to-date
value of system_time_msr. So, force this update on guest's tsc_timestamp
reading.

This bug causes effect on those nested setups which turn off TPR access
interception for L2 guests and that access being intercepted by L0 doesn't
show up in L1.
Linux bootstrap initiate kvmclock before APIC initializing causing TPR access.
That's why on L1 guests, having TPR interception turned on for L2, the effect
of the bug is not revealed.

This patch fixes this problem by making sure it knows the correct
system_time_msr address every time it is needed.

Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
Message-Id: <1496054944-25623-1-git-send-email-dplotnikov@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-07 18:22:02 +02:00
Eduardo Habkost
ca2edcd35c kvmclock: Don't crash QEMU if KVM is disabled
Most machines don't allow sysbus devices like "kvmclock" to be
created from the command-line, but some of them do (the ones with
has_dynamic_sysbus=true). In those cases, it's possible to
manually create a kvmclock device without KVM being enabled,
making QEMU crash:

  $ qemu-system-x86_64 -machine q35,accel=tcg -device kvmclock
  Segmentation fault (core dumped)

This changes kvmclock's realize method to return an error if KVM
is disabled, to ensure it won't crash QEMU.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170309185046.17555-1-ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-14 13:26:36 +01:00
Marcelo Tosatti
6053a86fe7 kvmclock: reduce kvmclock difference on migration
Check for KVM_CAP_ADJUST_CLOCK capability KVM_CLOCK_TSC_STABLE, which
indicates that KVM_GET_CLOCK returns a value as seen by the guest at
that moment.

For new machine types, use this value rather than reading
from guest memory.

This reduces kvmclock difference on migration from 5s to 0.1s
(when max_downtime == 5s).

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Message-Id: <20161121105052.598267440@redhat.com>
[Add comment explaining what is going on. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-12-22 16:00:56 +01:00
Paolo Bonzini
33c11879fd qemu-common: push cpu.h inclusion out of qemu-common.h
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:29 +02:00
Peter Maydell
b6a0aa0537 x86: Clean up includes
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1453832250-766-11-git-send-email-peter.maydell@linaro.org
2016-01-29 15:07:22 +00:00
Liang Li
0fd7e098db kvmclock: add a new function to update env->tsc.
The commit 317b0a6d8 fixed an issue which caused by the outdated
env->tsc value, but the fix lead to 'cpu_synchronize_all_states()'
called twice during live migration. The 'cpu_synchronize_all_states()'
takes about 130us for a VM which has 4 vcpus, it's a bit expensive.

Synchronize the whole CPU context just for updating env->tsc is too
wasting, this patch use a new function to update the env->tsc.
Comparing to 'cpu_synchronize_all_states()', it only takes about 20us.

Signed-off-by: Liang Li <liang.z.li@intel.com>
Message-Id: <1446695464-27116-2-git-send-email-liang.z.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-11-05 11:28:10 +01:00
Christian Borntraeger
5e0b7d8869 valgrind/i386: avoid false positives on KVM_SET_CLOCK ioctl
kvm_clock_data contains pad fields. Let's use a designated
initializer to avoid false positives from valgrind/memcheck.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-15 12:21:01 +01:00
Eduardo Habkost
1154d84dcc kvmclock: Add comment explaining why we need cpu_clean_all_dirty()
Try to explain why commit 317b0a6d8b
needed a cpu_clean_all_dirty() call just after calling
cpu_synchronize_all_states().

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Cc: Andrey Korolyov <andrey@xdel.ru>
Cc: Marcin Gibuła <m.gibula@beyond.pl>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-13 16:13:28 +01:00
Alexander Graf
9a48bcd1b8 kvmclock: Ensure time in migration never goes backward
When we migrate we ask the kernel about its current belief on what the guest
time would be. However, I've seen cases where the kvmclock guest structure
indicates a time more recent than the kvm returned time.

To make sure we never go backwards, calculate what the guest would have seen as time at the point of migration and use that value instead of the kernel returned one when it's more recent.
This bases the view of the kvmclock after migration on the
same foundation in host as well as guest.

Signed-off-by: Alexander Graf <agraf@suse.de>
Cc: qemu-stable@nongnu.org
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-16 11:11:38 +02:00
Marcelo Tosatti
317b0a6d8b kvmclock: Ensure proper env->tsc value for kvmclock_current_nsec calculation
Ensure proper env->tsc value for kvmclock_current_nsec calculation.

Reported-by: Marcin Gibuła <m.gibula@beyond.pl>
Analyzed-by: Marcin Gibuła <m.gibula@beyond.pl>
Cc: qemu-stable@nongnu.org
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-16 11:11:24 +02:00
Paolo Bonzini
fa666c10f2 Revert "kvmclock: Ensure time in migration never goes backward"
This reverts commit a096b3a673.

This patch caused a hang that was fixed by commit 9b17868 (kvmclock:
Ensure proper env->tsc value for kvmclock_current_nsec calculation,
2014-06-03), and we just had to revert that commit.  Drop this one
too.

Cc: agraf@suse.de
Cc: mtosatti@redhat.com
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-18 15:28:03 +02:00
Paolo Bonzini
108e4c3871 Revert "kvmclock: Ensure proper env->tsc value for kvmclock_current_nsec calculation"
This reverts commit 9b1786829a.

This patch fixed a hang introduced by commit a096b3a (kvmclock: Ensure
time in migration never goes backward, 2014-05-16), but it causes
a regression in migration whose cause is not quite clear.

Because of this, I'm choosing to revert both patches.  This trades a
2.1 regression for a bug that's been there forever.

Cc: agraf@suse.de
Cc: mtosatti@redhat.com
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-18 15:15:14 +02:00
Juan Quintela
d49805aeea savevm: Remove all the unneeded version_minimum_id_old (x86)
After previous Peter patch, they are redundant.  This way we don't
assign them except when needed.  Once there, there were lots of case
where the ".fields" indentation was wrong:

     .fields = (VMStateField []) {
and
     .fields =      (VMStateField []) {

Change all the combinations to:

     .fields = (VMStateField[]){

The biggest problem (appart from aesthetics) was that checkpatch complained
when we copy&pasted the code from one place to another.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Acked-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-16 04:55:26 +02:00
Marcelo Tosatti
9b1786829a kvmclock: Ensure proper env->tsc value for kvmclock_current_nsec calculation
Ensure proper env->tsc value for kvmclock_current_nsec calculation.

Reported-by: Marcin Gibuła <m.gibula@beyond.pl>
Cc: qemu-stable@nongnu.org
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-03 18:40:48 +02:00
Alexander Graf
a096b3a673 kvmclock: Ensure time in migration never goes backward
When we migrate we ask the kernel about its current belief on what the guest
time would be. However, I've seen cases where the kvmclock guest structure
indicates a time more recent than the kvm returned time.

To make sure we never go backwards, calculate what the guest would have seen
as time at the point of migration and use that value instead of the kernel
returned one when it's more recent.  This bases the view of the kvmclock
after migration on the same foundation in host as well as guest.

Signed-off-by: Alexander Graf <agraf@suse.de>
Cc: qemu-stable@nongnu.org
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-05-21 12:01:45 +02:00
Markus Armbruster
837d37167d sysbus: Set cannot_instantiate_with_device_add_yet
device_add plugs devices into suitable bus.  For "real" buses, that
actually connects the device.  For sysbus, the connections need to be
made separately, and device_add can't do that.  The device would be
left unconnected, and could not possibly work.

Quite a few, but not all sysbus devices already set
cannot_instantiate_with_device_add_yet in their class init function.

Set it in their abstract base's class init function
sysbus_device_class_init(), and remove the now redundant assignments
from device class init functions.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel.a@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-12-23 00:27:22 +01:00
Markus Armbruster
efec3dd631 qdev: Replace no_user by cannot_instantiate_with_device_add_yet
In an ideal world, machines can be built by wiring devices together
with configuration, not code.  Unfortunately, that's not the world we
live in right now.  We still have quite a few devices that need to be
wired up by code.  If you try to device_add such a device, it'll fail
in sometimes mysterious ways.  If you're lucky, you get an
unmysterious immediate crash.

To protect users from such badness, DeviceClass member no_user used to
make device models unavailable with -device / device_add, but that
regressed in commit 18b6dad.  The device model is still omitted from
help, but is available anyway.

Attempts to fix the regression have been rejected with the argument
that the purpose of no_user isn't clear, and it's prone to misuse.

This commit clarifies no_user's purpose.  Anthony suggested to rename
it cannot_instantiate_with_device_add_yet_due_to_internal_bugs, which
I shorten somewhat to keep checkpatch happy.  While there, make it
bool.

Every use of cannot_instantiate_with_device_add_yet gets a FIXME
comment asking for rationale.  The next few commits will clean them
all up, either by providing a rationale, or by getting rid of the use.

With that done, the regression fix is hopefully acceptable.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel.a@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-12-23 00:27:22 +01:00
Stefan Weil
e76d05c2b5 kvm: Fix compiler warning (clang)
Report from clang analyzer:

clock.c:42:15: warning:
Value stored to 'cpu' during its initialization is never read

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-09-20 20:11:32 +04:00
Andreas Färber
bdc44640cb cpu: Use QTAILQ for CPU list
Introduce CPU_FOREACH(), CPU_FOREACH_SAFE() and CPU_NEXT() shorthand
macros.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-09-03 12:25:55 +02:00
Hu Tao
913bc63844 kvm/clock: Use QOM realize for kvmclock
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-07-23 00:37:35 +02:00
Hu Tao
98bdc0d7ff kvm/clock: QOM'ify some more
Introduce type constant and avoid FROM_SYSBUS().

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
[AF: Renamed parent field]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-07-23 00:37:35 +02:00
Andreas Färber
182735efaf cpu: Make first_cpu and next_cpu CPUState
Move next_cpu from CPU_COMMON to CPUState.
Move first_cpu variable to qom/cpu.h.

gdbstub needs to use CPUState::env_ptr for now.
cpu_copy() no longer needs to save and restore cpu_next.

Acked-by: Paolo Bonzini <pbonzini@redhat.com>
[AF: Rebased, simplified cpu_copy()]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-07-09 21:32:54 +02:00
Marcelo Tosatti
00f4d64ee7 kvmclock: clock should count only if vm is running
kvmclock should not count while vm is paused, because:

1) if the vm is paused for long periods, timekeeping
math can overflow while converting the (large) clocksource
delta to nanoseconds.

2) Users rely on CLOCK_MONOTONIC to count run time, that is,
time which OS has been in a runnable state (see CLOCK_BOOTTIME).

Change kvmclock driver so as to save clock value when vm transitions
from runnable to stopped state, and to restore clock value from stopped
to runnable transition.

Cc: qemu-stable@nongnu.org
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-03 10:41:32 +02:00
Eduardo Habkost
0514ef2fbb target-i386: Replace cpuid_*features fields with a feature word array
This replaces the feature-bit fields on both X86CPU and x86_def_t
structs with an array.

With this, we will be able to simplify code that simply does the same
operation on all feature words (e.g. kvm_check_features_against_host(),
filter_features_for_kvm(), add_flagname_to_bitmaps(), CPU feature-bit
property lookup/registration, and the proposed "feature-words" property)

The following field replacements were made on X86CPU and x86_def_t:

  (cpuid_)features         -> features[FEAT_1_EDX]
  (cpuid_)ext_features     -> features[FEAT_1_ECX]
  (cpuid_)ext2_features    -> features[FEAT_8000_0001_EDX]
  (cpuid_)ext3_features    -> features[FEAT_8000_0001_ECX]
  (cpuid_)ext4_features    -> features[FEAT_C000_0001_EDX]
  (cpuid_)kvm_features     -> features[FEAT_KVM]
  (cpuid_)svm_features     -> features[FEAT_SVM]
  (cpuid_)7_0_ebx_features -> features[FEAT_7_0_EBX]

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-05-02 00:27:55 +02:00
Paolo Bonzini
54976b75fb hw: move hw/kvm/ to hw/i386/kvm
Peter requested the KVM GIC to be in hw/intc.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-04-08 18:13:16 +02:00