We split it into 2 functions, foo_live_iterate, and foo_live_complete.
At this point, we only remove the bits that are for the other stage,
functionally this is equivalent to previous code.
Signed-off-by: Juan Quintela <quintela@redhat.com>
This patch splits stage 1 to its own function for both save_live
users, ram and block. It is just a copy of the function, removing the
parts of the other stages. Optimizations would came later.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Enable the creation of a method to tell migration if that section is
active and should be migrate. We use it for blk-migration, that is
normally not active. We don't create the method for RAM, as setups
without RAM are very strange O:-)
Signed-off-by: Juan Quintela <quintela@redhat.com>
Notice that the live migration users never unregister, so no problem
about freeing the ops structure.
Signed-off-by: Juan Quintela <quintela@redhat.com>
This allows to know how long each section takes to save.
An awk script like this tells us sections that takes more that 10ms
$1 ~ /savevm_state_iterate_end/ {
/* Print savevm_section_end line when > 10ms duration */
if ($2 > 10000) {
printf("%s times_missing=%u\n", $0, times_missing++);
}
}
Signed-off-by: Juan Quintela <quintela@redhat.com>
fix ws tracepoints
Signed-off-by: Juan Quintela <quintela@redhat.com>
* afaerber-or/qom-next-2: (22 commits)
qom: Push error reporting to object_property_find()
qdev: Remove qdev_prop_exists()
qbus: Initialize in standard way
qbus: Make child devices links
qdev: Connect busses with their parent devices
qdev: Convert busses to QEMU Object Model
qdev: Move SysBus initialization to sysbus.c
qdev: Use wrapper for qdev_get_path
qdev: Remove qdev_prop_set_defaults
qdev: Clean up global properties
qdev: Move bus properties to abstract superclasses
qdev: Move bus properties to a separate global
qdev: Push "type" property up to Object
arm_l2x0: Rename "type" property to "cache-type"
m48t59: Rename "type" property to "model"
qom: Assert that public types have a non-NULL parent field
qom: Drop type_register_static_alias() macro
qom: Make Object a type
qom: Add class_base_init
qom: Add object_child_foreach()
...
This makes it easier to remove it from BusInfo.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[AF: Drop now unnecessary NULL initialization in scsibus_get_dev_path()]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Writing vm state uses bdrv_pwrite, so it will automatically get flushes
in writethrough mode. But doing a flush at the end in writeback mode
is probably a good idea anyway.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
tb.time is a time value, but not necessarily of the same size as time_t:
while time_t is 64 bit for w64, tb.time still is 32 bit only.
Therefore we need en explicit conversion.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* sstabellini/saverestore-8:
xen: do not allocate RAM during INMIGRATE runstate
xen mapcache: check if memory region has moved.
xen: record physmap changes to xenstore
Set runstate to INMIGRATE earlier
Introduce "xen-save-devices-state"
cirrus_vga: do not reset videoram
Conflicts:
qapi-schema.json
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
- add an "is_ram" flag to SaveStateEntry;
- register_savevm_live sets is_ram for live_savevm devices;
- introduce a "xen-save-devices-state" QAPI command that can be used to save
the state of all devices, but not the RAM or the block devices of the
VM.
Changes in v8:
- rename save-devices-state to xen-save-devices-state.
Changes in v7:
- rename save_devices to save-devices-state.
Changes in v6:
- remove the is_ram parameter from register_savevm_live and sets is_ram
if the device is a live_savevm device;
- introduce save_devices as a QAPI command, write a better description
for it;
- fix CODING_STYLE;
- introduce a new doc to explain the save format used by save_devices.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Acked-by: Luiz Capitulino <lcapitulino@redhat.com>
* qemu-kvm/memory/urgent: (42 commits)
memory: check for watchpoints when getting code ram_addr
exec: fix write tlb entry misused as iotlb
Sparc: avoid AREG0 wrappers for memory access helpers
Sparc: avoid AREG0 for memory access helpers
TCG: add 5 arg helpers to def-helper.h
softmmu templates: optionally pass CPUState to memory access functions
i386: Remove REGPARM
sparc64: implement PCI and ISA irqs
sparc: reset CPU state on reset
apb: use normal PCI device header for PBM device
w64: Fix data type of next_tb and tcg_qemu_tb_exec
softfloat: fix for C99
vmstate: fix varrays with uint32_t indexes
Fix large memory chunks allocation with tcg_malloc.
hw/pxa2xx.c: Fix handling of pxa2xx_i2c variable offset within region
hw/pxa2xx_lcd.c: drop target_phys_addr_t usage in device state
hw/pxa2xx_dma.c: drop target_phys_addr_t usage in device state
ARM: Remove unnecessary subpage workarounds
malta: Fix display for LED array
malta: Use symbolic hardware addresses
...
VMSTATE_VARRAY_UINT32() is used in hw/ds1225y.c, and we checked
VMS_VARRAY_UINT32 bit of field->flags in vmstate_load_state(),
but we don't check this bit in vmstate_save_state().
Signed-off-by: Amos Kong <akong@redhat.com>
Acked-by: Juan Quintela <quintela@redhat.com>
Acked-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
The migrate command is one of those commands where HMP and QMP completely
mix up together. This made the conversion to the QAPI (which separates the
command into QMP and HMP parts) a bit difficult.
The first important change to be noticed is that this commit completes the
removal of the Monitor object from migration code, started by the previous
commit.
Another important and tricky change is about supporting the non-detached
mode. That is, if the user doesn't pass '-d' the migrate command will lock
the monitor and will only release it when migration is finished.
To support this in the new HMP command (hmp_migrate()), it is necessary
to create a timer which runs every second and checks if the migration is
still active. If it is, the timer callback will re-schedule itself to run
one second in the future. If the migration has already finished, the
monitor lock is released and the user can use it normally.
All these changes should be transparent to the user.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
The Monitor object is passed back and forth within the migration/savevm
code so that it can print errors and progress to the user.
However, that approach assumes a HMP monitor, being completely invalid
in QMP.
This commit drops almost every single usage of the Monitor object, all
monitor_printf() calls have been converted into DPRINTF() ones.
There are a few remaining Monitor objects, those are going to be dropped
by the next commit.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
rom/device regions have a ram_addr that is composed of both an I/O handler
(low bits) and RAM region (high bits); but qemu_ram_set_idstr() expects just
a RAM region. Mask the I/O handler to make it happy.
Tested-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
Currently creating a memory region automatically registers it for
live migration. This differs from other state (which is enumerated
in a VMStateDescription structure) and ties the live migration code
into the memory core.
Decouple the two by introducing a separate API, vmstate_register_ram(),
for registering a RAM block for migration. Currently the same
implementation is reused, but later it can be moved into a separate list,
and registrations can be moved to VMStateDescription blocks.
Signed-off-by: Avi Kivity <avi@redhat.com>
This is what qemu_fclose() expects.
Changes v1 -> v2:
- Add braces to if statement to match coding style
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This is what qemu_fclose() expects.
Changes v1 -> v2:
- On success, keep returning pclose() return value, instead of always 0.
Changes v2 -> v3:
- Add braces on if statements to match coding style
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This will make sure no error will be missed as long as callers always
check for qemu_fclose() return value. For reference, this is the
complete list of qemu_fclose() callers:
- exec_close(): already fixed to check for negative values, not -1
- migrate_fd_cleanup(): already fixed to consider only negative values
as error, not any non-zero value
- exec_accept_incoming_migration(): no return value check (yet)
- fd_accept_incoming_migration(): no return value check (yet)
- tcp_accept_incoming_migration(): no return value check (yet)
- unix_accept_incoming_migration(): no return value check (yet)
- do_savevm(): no return value check (yet)
- load_vmstate(): no return value check (yet)
Changes v1 -> v2:
- Add small comment about the need to return previously-spotted errors
Changes v2 -> v3:
- Add braces to "if" statements to match coding style
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Some code uses qemu_file_set_error() already, so use it everywhere
when setting last_error, for consistency.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Many places in QEMU call qemu_aio_flush() to complete all pending
asynchronous I/O. Most of these places actually want to drain all block
requests but there is no block layer API to do so.
This patch introduces the bdrv_drain_all() API to wait for requests
across all BlockDriverStates to complete. As a bonus we perform checks
after qemu_aio_wait() to ensure that requests really have finished.
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Now when you try to migrate with ivshmem, you get a proper QMP error:
(qemu) migrate tcp:localhost:1025
Migration is disabled when using feature 'peer mode' in device 'ivshmem'
(qemu)
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Make *save_live() return negative values when there is one error, and
updates all callers to check for the error.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Now the field contains the last error name, so rename acordingly.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Now the function returned errno, so it is better the new name.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
make functions propagate errno, instead of just using -EIO. Add a
comment about what are the return value of qemu_savevm_state_iterate().
Signed-off-by: Juan Quintela <quintela@redhat.com>
This reverts commit eb60260de0.
Conflicts:
savevm.c
We changed qemu_peek_byte() prototype, just fixed the rejects.
Signed-off-by: Juan Quintela<quintela@redhat.com>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
We add qemu_peek_buffer, that is identical to qemu_get_buffer, just
that it don't update f->buf_index.
We add a paramenter to qemu_peek_byte() to be able to peek more than
one byte.
Once this is done, to see if we have a subsection we look:
- 1st byte is QEMU_VM_SUBSECTION
- 2nd byte is a length, and is bigger than section name
- 3rd element is a string that starts with section_name
So, we shouldn't have false positives (yes, content could still get us
wrong but probabilities are really low).
v2:
- Alex Williamsom found that we could get negative values on index.
- Rework code to fix that part.
- Rewrite qemu_get_buffer() using qemu_peek_buffer()
v3:
- return "done" on error case
v4:
- fix qemu_file_skip() off by one.
Signed-off-by: Juan Quintela <quintela@redhat.com>
This patch will make moving code on next patches and having checkpatch
happy easier.
Signed-off-by: Juan Quintela<quintela@redhat.com>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
We will need on next patch to be able to lookahead on next patch
v2: rename "used" to "pending" (Alex Williams)
Signed-off-by: Juan Quintela<quintela@redhat.com>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
qemu_savevm_state() has some logic to stop the VM and to (or not to)
resume it. But this seems to be a big noop, as qemu_savevm_state()
is only called by do_savevm() when the VM is already stopped.
So, let's drop qemu_savevm_state()'s stop VM logic.
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Next commit will convert the query-status command to use the
RunState type as generated by the QAPI.
In order to "transparently" replace the current enum by the QAPI
one, we have to make some changes to some enum values.
As the changes are simple renames, I'll do them in one shot. The
changes are:
- Rename the prefix from RSTATE_ to RUN_STATE_
- RUN_STATE_SAVEVM to RUN_STATE_SAVE_VM
- RUN_STATE_IN_MIGRATE to RUN_STATE_INMIGRATE
- RUN_STATE_PANICKED to RUN_STATE_INTERNAL_ERROR
- RUN_STATE_POST_MIGRATE to RUN_STATE_POSTMIGRATE
- RUN_STATE_PRE_LAUNCH to RUN_STATE_PRELAUNCH
- RUN_STATE_PRE_MIGRATE to RUN_STATE_PREMIGRATE
- RUN_STATE_RESTORE to RUN_STATE_RESTORE_VM
- RUN_STATE_PRE_MIGRATE to RUN_STATE_FINISH_MIGRATE
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Today, when notifying a VM state change with vm_state_notify(),
we pass a VMSTOP macro as the 'reason' argument. This is not ideal
because the VMSTOP macros tell why qemu stopped and not exactly
what the current VM state is.
One example to demonstrate this problem is that vm_start() calls
vm_state_notify() with reason=0, which turns out to be VMSTOP_USER.
This commit fixes that by replacing the VMSTOP macros with a proper
state type called RunState.
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
savevm and loadvm silently ignore block devices with removable media,
such as floppies and SD cards. Rolling back a VM to a previous
checkpoint will *not* roll back writes to block devices with removable
media.
Moreover, bdrv_is_removable() is a confused mess, and wrong in at
least one case: it considers "-drive if=xen,media=cdrom -M xenpv"
removable. It'll be cleaned up later in this series.
Read-only block devices are also ignored, but that's okay.
Fix by ignoring only read-only block devices and empty block devices.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Avoid warnings like these by wrapping recv():
CC slirp/ip_icmp.o
/src/qemu/slirp/ip_icmp.c: In function 'icmp_receive':
/src/qemu/slirp/ip_icmp.c:418:5: error: passing argument 2 of 'recv' from incompatible pointer type [-Werror]
/usr/local/lib/gcc/i686-mingw32msvc/4.6.0/../../../../i686-mingw32msvc/include/winsock2.h:547:32: note: expected 'char *' but argument is of type 'struct icmp *'
Remove also casts used to avoid warnings.
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
This allows to easily tag devices as non-migratable,
so any attempt to migrate a virtual machine with the
device in question active will make migration fail.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
In case we load the vmstate during incoming migration, we start from a
clean, default machine state as we went through system reset before. But
if we load from a snapshot, the machine can be in any state. That can
cause troubles if loading an older image which does not carry all state
information the executing QEMU requires. Hardly any device takes care of
this scenario.
However, fixing this is trivial. We just need to issue a system reset
during loadvm as well.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
This patch removes all references to signal.h when qemu-common.h is included
as they become redundant.
Signed-off-by: Alexandre Raymond <cerbere@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
helpfull -> helpful
usefull -> useful
cotrol -> control
and a grammar fix.
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
commit 82fa39b751
only contains half of the fix. It forgots the save state fix for
UINT8 indexes.
Anthony, please apply, without this migration using hpet is broken.
(only current user).
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
* 'for-anthony' of git://github.com/bonzini/qemu:
remove qemu_get_clock
add a generic scaling mechanism for timers
change all other clock references to use nanosecond resolution accessors
change all rt_clock references to use millisecond resolution accessors
add more helper functions with explicit milli/nanosecond resolution
This was done with:
sed -i 's/qemu_get_clock\>/qemu_get_clock_ns/' \
$(git grep -l 'qemu_get_clock\>' )
sed -i 's/qemu_new_timer\>/qemu_new_timer_ns/' \
$(git grep -l 'qemu_new_timer\>' )
after checking that get_clock and new_timer never occur twice
on the same line. There were no missed occurrences; however, even
if there had been, they would have been caught by the compiler.
There was exactly one false positive in qemu_run_timers:
- current_time = qemu_get_clock (clock);
+ current_time = qemu_get_clock_ns (clock);
which is of course not in this patch.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This was done with:
sed -i '/get_clock\>.*rt_clock/s/get_clock\>/get_clock_ms/' \
$(git grep -l 'get_clock\>.*rt_clock' )
sed -i '/new_timer\>.*rt_clock/s/new_timer\>/new_timer_ms/' \
$(git grep -l 'new_timer\>.*rt_clock' )
after checking that get_clock and new_timer never occur twice
on the same line. There were no missed occurrences; however, even
if there had been, they would have been caught by the compiler.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Define and use dedicated constants for vm_stop reasons, they actually
have nothing to do with the EXCP_* defines used so far. At this chance,
specify more detailed reasons so that VM state change handlers can
evaluate them.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Although it's rare to happen in live migration, when the head of a
byte stream contains 0x05 which is the marker of subsection, the
loader gets corrupted because vmstate_subsection_load() continues even
the device doesn't require it. This patch adds a checker whether
subsection is needed, and skips following routines if not needed.
Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Before commit 622b520f, index=12 meant bus=1,unit=5.
Since the commit, it means bus=0,unit=12. The drive is created, but
not the guest device. That's because the controllers we use with
if=scsi drives (lsi53c895a and esp) support only 7 units, and
scsi_bus_legacy_handle_cmdline() ignores drives with unit numbers
exceeding that limit.
Changing the mapping of index to bus, unit is a regression. Breaking
-drive invocations that used to work just makes it worse.
Revert the part of commit 622b520f that causes this, and clean up
some.
Note that the fix only affects if=scsi. You can still put more than 7
units on a SCSI bus with -device & friends.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The no_migrate save state flag is currently only checked in the
last phase of migration. This means that we potentially waste
a lot of time and bandwidth with the live state handlers before
we ever check the no_migrate flags. The error message printed
when we catch a non-migratable device doesn't get printed for
a detached migration. And, no_migrate does nothing to prevent
an incoming migration to a target that includes a non-migratable
device. This attempts to fix all of these.
One notable difference in behavior is that an outgoing migration
now checks for non-migratable devices before ever connecting to
the target system. This means the target will remain listening
rather than exit from failure.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
There's no need to flush requests after vmstop
as vmstop does it for us automatically now.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Jason Wang <jasowang@redhat.com>
I'd like to disable bandwidth limit or make it very high,
Use int64_t all over to make values >= 4g work.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Jason Wang <jasowang@redhat.com>
Compiling with GCC 4.6.0 20100925 produced warnings like:
/src/qemu/net/tap-win32.c: In function 'tap_win32_open':
/src/qemu/net/tap-win32.c:582:12: error: variable 'hThread' set but not used [-Werror=unused-but-set-variable]
Fix by removing the unused variables.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Fix this warning:
CC savevm.o
/src/qemu/savevm.c: In function `do_savevm':
/src/qemu/savevm.c:1900: warning: passing arg 1 of `localtime_r' from incompatible pointer type
It looks like on OpenBSD the type of tv_sec in struct timeval is still
'long' instead of time_t as in most other OS. Fix by adding a cast.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
When savevm is run without a name, the name stays blank and the snapshot is
saved anyway.
The new behavior is when savevm is run without parameters a name will be
created automaticaly, so the snapshot is accessible to the user without needing
the id when loadvm is run.
(qemu) savevm
(qemu) info snapshots
ID TAG VM SIZE DATE VM CLOCK
1 vm-20100728134640 978K 2010-07-28 13:46:40 00:00:08.603
We use a name with the format 'vm-YYYYMMDDHHMMSS'.
This is a first step to hide the internal id, because I don't see a reason to
expose this kind of internals to the user.
Signed-off-by: Miguel Di Ciurcio Filho <miguel.filho@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The output generated by 'info snapshots' shows only snapshots that exist on the
block device that saves the VM state. This output can cause an user to
erroneously try to load an snapshot that is not available on all block devices.
$ qemu-img snapshot -l xxtest.qcow2
Snapshot list:
ID TAG VM SIZE DATE VM CLOCK
1 1.5M 2010-07-26 16:51:52 00:00:08.599
2 1.5M 2010-07-26 16:51:53 00:00:09.719
3 1.5M 2010-07-26 17:26:49 00:00:13.245
4 1.5M 2010-07-26 19:01:00 00:00:46.763
$ qemu-img snapshot -l xxtest2.qcow2
Snapshot list:
ID TAG VM SIZE DATE VM CLOCK
3 0 2010-07-26 17:26:49 00:00:13.245
4 0 2010-07-26 19:01:00 00:00:46.763
Current output:
$ qemu -hda xxtest.qcow2 -hdb xxtest2.qcow2 -monitor stdio -vnc :0
QEMU 0.12.4 monitor - type 'help' for more information
(qemu) info snapshots
Snapshot devices: ide0-hd0
Snapshot list (from ide0-hd0):
ID TAG VM SIZE DATE VM CLOCK
1 1.5M 2010-07-26 16:51:52 00:00:08.599
2 1.5M 2010-07-26 16:51:53 00:00:09.719
3 1.5M 2010-07-26 17:26:49 00:00:13.245
4 1.5M 2010-07-26 19:01:00 00:00:46.763
Snapshots 1 and 2 do not exist on xxtest2.qcow, but they are displayed anyway.
This patch sumarizes the output to only show fully available snapshots.
New output:
(qemu) info snapshots
ID TAG VM SIZE DATE VM CLOCK
3 1.5M 2010-07-26 17:26:49 00:00:13.245
4 1.5M 2010-07-26 19:01:00 00:00:46.763
Signed-off-by: Miguel Di Ciurcio Filho <miguel.filho@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
A non-migratable device should be removed before migration and re-added after.
Signed-off-by: Cam Macdonell <cam@cs.ualberta.ca>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patch improves the resilience of the load_vmstate() function, doing
further and better ordered tests.
In load_vmstate(), if there is any error on bdrv_snapshot_goto(), except if the
error is on VM state device, load_vmstate() will return zero and the VM will be
started with major corruption chances.
The current process:
- test if there is any writable device without snapshot support
- if exists return -error
- get the device that saves the VM state, possible return -error but unlikely
because it was tested earlier
- flush I/O
- run bdrv_snapshot_goto() on devices
- if fails, give an warning and goes to the next (not good!)
- if fails on the VM state device, return zero (not good!)
- check if the requested snapshot exists on the device that saves the VM state
and the state is not zero
- if fails return -error
- open the file with the VM state
- if fails return -error
- load the VM state
- if fails return -error
- return zero
New behavior:
- get the device that saves the VM state
- if fails return -error
- check if the requested snapshot exists on the device that saves the VM state
and the state is not zero
- if fails return -error
- test if there is any writable device without snapshot support
- if exists return -error
- test if the devices with snapshot support have the requested snapshot
- if anyone fails, return -error
- flush I/O
- run snapshot_goto() on devices
- if anyone fails, return -error
- open the file with the VM state
- if fails return -error
- load the VM state
- if fails return -error
- return zero
do_loadvm must not call vm_start if any error has occurred in load_vmstate.
Signed-off-by: Miguel Di Ciurcio Filho <miguel.filho@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Forgot to check for and free these.
Found-by: Zachary Amsden <zamsden@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
This commit adds subsections for each device section.
Subsections is the way to handle information that don't need to be sent
to de destination of a migration because its values are not needed. It is
the way to handle optional information. Notice that only the source can
decide if the information is optional or not. The destination needs to
understand all subsections that it receives to have a sucessful load.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
For callers that pass a device we can traverse up the qdev tree and
make use of the BusInfo.get_dev_path information for creating unique
savevm id strings. This avoids needing to rely on the instance number,
which can cause problems with device initialization order and hotplug.
For compatibility, we also store away the old id string and instance
so we can accept migrations from VMs as we add new get_dev_path
implementations.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
When available, we'd like to be able to access the DeviceState
when registering a savevm. For buses with a get_dev_path()
function, this will allow us to create more unique savevm
id strings.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
savevm.c keeps a pointer to the snapshot block device. If you manage
to get that device deleted, the pointer dangles, and the next snapshot
operation will crash & burn. Unplugging a guest device that uses it
does the trick:
$ MALLOC_PERTURB_=234 qemu-system-x86_64 [...]
QEMU 0.12.50 monitor - type 'help' for more information
(qemu) info snapshots
No available block device supports snapshots
(qemu) drive_add auto if=none,file=tmp.qcow2
OK
(qemu) device_add usb-storage,id=foo,drive=none1
(qemu) info snapshots
Snapshot devices: none1
Snapshot list (from none1):
ID TAG VM SIZE DATE VM CLOCK
(qemu) device_del foo
(qemu) info snapshots
Snapshot devices:
Segmentation fault (core dumped)
Move management of that pointer to block.c, and zap it when the device
it points becomes unusable.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We find snapshots by iterating over the list of drives defined with
drive_init(). This misses host block devices defined by other means.
Such means don't exist now, but will be introduced later in this
series.
Iterate over all host block devices instead, with bdrv_next().
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Both bdrv_can_snapshot() and bdrv_has_snapshot() does not work as advertized.
First issue: Their names implies different porpouses, but they do the same thing
and have exactly the same code. Maybe copied and pasted and forgotten?
bdrv_has_snapshot() is called in various places for actually checking if there
is snapshots or not.
Second issue: the way bdrv_can_snapshot() verifies if a block driver supports or
not snapshots does not catch all cases. E.g.: a raw image.
So when do_savevm() is called, first thing it does is to set a global
BlockDriverState to save the VM memory state calling get_bs_snapshots().
static BlockDriverState *get_bs_snapshots(void)
{
BlockDriverState *bs;
DriveInfo *dinfo;
if (bs_snapshots)
return bs_snapshots;
QTAILQ_FOREACH(dinfo, &drives, next) {
bs = dinfo->bdrv;
if (bdrv_can_snapshot(bs))
goto ok;
}
return NULL;
ok:
bs_snapshots = bs;
return bs;
}
bdrv_can_snapshot() may return a BlockDriverState that does not support
snapshots and do_savevm() goes on.
Later on in do_savevm(), we find:
QTAILQ_FOREACH(dinfo, &drives, next) {
bs1 = dinfo->bdrv;
if (bdrv_has_snapshot(bs1)) {
/* Write VM state size only to the image that contains the state */
sn->vm_state_size = (bs == bs1 ? vm_state_size : 0);
ret = bdrv_snapshot_create(bs1, sn);
if (ret < 0) {
monitor_printf(mon, "Error while creating snapshot on '%s'\n",
bdrv_get_device_name(bs1));
}
}
}
bdrv_has_snapshot(bs1) is not checking if the device does support or has
snapshots as explained above. Only in bdrv_snapshot_create() the device is
actually checked for snapshot support.
So, in cases where the first device supports snapshots, and the second does not,
the snapshot on the first will happen anyways. I believe this is not a good
behavior. It should be an all or nothing process.
This patch addresses these issues by making bdrv_can_snapshot() actually do
what it must do and enforces better tests to avoid errors in the middle of
do_savevm(). bdrv_has_snapshot() is removed and replaced by bdrv_can_snapshot()
where appropriate.
bdrv_can_snapshot() was moved from savevm.c to block.c. It makes more sense to me.
The loadvm_state() function was updated too to enforce that when loading a VM at
least all writable devices must support snapshots too.
Signed-off-by: Miguel Di Ciurcio Filho <miguel.filho@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Anything that moves hundreds of lines out of vl.c can't be all bad.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This patch makes sure that if the exec: process exits with a non-zero return
status, we treat the migration as failed.
This fixes https://bugs.launchpad.net/qemu/+bug/391879
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Some legacy users (mostly PC devices) of vmstate_register manage
instance IDs on their own, and that unfortunately in a way that is
incompatible with automatically generated ones. This so far prevents
switching those users to vmstates that are registered by qdev.
To establish a migration path, this patch introduces the concept of
alias IDs. They can be passed to an extended vmstate registration
service, and qdev provides a set service to be used during device init.
find_se will consider the alias in addition to the default ID. We can
then start generating the default ID automatically and writing it on
vmsave, thus converting that format without breaking support for upward
migration.
The user is required specify the highest vmstate version for which the
alias is required. Once this version falls behind the minimum required
for a specific vmstate, an assertion triggers to motivate cleaning up
the obsolete alias.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
The packet(s) sent out after migration are supposed to be RAPR type of
packets. If they are supposed to go anywhere useful, the RAPR ethernet
identifier needs to be fix.
Also see http://www.iana.org/assignments/ethernet-numbers for 0x8035 for
RARP.
Signed-off-by: Stefan Berger <stefanb@us.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
error_report() terminates the message with a newline. Strip it it
from its arguments.
This fixes a few error messages lacking a newline:
net_handle_fd_param()'s "No file descriptor named %s found", and
tap_open()'s "vnet_hdr=1 requested, but no kernel support for
IFF_VNET_HDR available" (all three versions).
There's one place that passes arguments without newlines
intentionally: load_vmstate(). Fix it up.
This grand cleanup drops all reset and vmsave/load related
synchronization points in favor of four(!) generic hooks:
- cpu_synchronize_all_states in qemu_savevm_state_complete
(initial sync from kernel before vmsave)
- cpu_synchronize_all_post_init in qemu_loadvm_state
(writeback after vmload)
- cpu_synchronize_all_post_init in main after machine init
- cpu_synchronize_all_post_reset in qemu_system_reset
(writeback after system reset)
These writeback points + the existing one of VCPU exec after
cpu_synchronize_state map on three levels of writeback:
- KVM_PUT_RUNTIME_STATE (during runtime, other VCPUs continue to run)
- KVM_PUT_RESET_STATE (on synchronous system reset, all VCPUs stopped)
- KVM_PUT_FULL_STATE (on init or vmload, all VCPUs stopped as well)
This level is passed to the arch-specific VCPU state writing function
that will decide which concrete substates need to be written. That way,
no writer of load, save or reset functions that interact with in-kernel
KVM states will ever have to worry about synchronization again. That
also means that a lot of reasons for races, segfaults and deadlocks are
eliminated.
cpu_synchronize_state remains untouched, just as Anthony suggested. We
continue to need it before reading or writing of VCPU states that are
also tracked by in-kernel KVM subsystems.
Consequently, this patch removes many cpu_synchronize_state calls that
are now redundant, just like remaining explicit register syncs.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
savevm without id or tag segfaults in:
(gdb) bt
#0 0x00007f600a83bf8a in __strcmp_sse42 () from /lib64/libc.so.6
#1 0x00000000004745b6 in bdrv_snapshot_find (bs=<value optimized out>,
sn_info=0x7fff996be280, name=0x0) at savevm.c:1631
#2 0x0000000000475c80 in del_existing_snapshots (name=<value optimized out>,
mon=<value optimized out>) at savevm.c:1654
#3 do_savevm (name=<value optimized out>, mon=<value optimized out>)
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
CC savevm.o
cc1: warnings being treated as errors
savevm.c: In function 'file_put_buffer':
savevm.c:342: error: ignoring return value of 'fwrite', declared with attribute warn_unused_result
make: *** [savevm.o] Error 1
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
The effect of this patch with current block migration is that its stage
2, ie. the first full walk-through of the block devices will be
performed completely before RAM migration starts. This ensures that
continuously changing RAM pages are not re-synchronized all the time
while block migration is not completed.
Future versions of block migration which will respect the specified
downtime will generate a different pattern: After RAM migration has
started as well, block migration may also continue to inject dirty
blocks into the RAM stream once it detects that the number of pending
blocks would extend the downtime unacceptably.
Note that all this relies on the current registration order: block
before RAM migration.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
In order to allow proper progress reporting to the monitor that
initiated the migration, forward the monitor reference through the
migration layer down to SaveLiveStateHandler.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Introduce qemu_savevm_state_cancel and inject a stage -1 to cancel a
live migration. This gives the involved subsystems a chance to clean up
dynamically allocated resources. Namely, the block migration layer can
now free its device descriptors and pending blocks.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
When the size that we want to transmit is in another field, but in an
unit different that bytes
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Support for buffer that are pointed by a pointer (i.e. not embedded)
where the size that we want to use is a field in the state.
We also need a new place to store where to start in the middle of the
buffer, as now it is a pointer, not the offset of the 1st field.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Seeking on vmstate save/load does not work if the underlying file is a
stream. We could try to make all QEMUFile* forward-seek-aware, but first
attempts in this direction indicated that it's saner to convert the few
qemu_fseek-on-vmstates users to plain reads/writes.
This fixes various subtle vmstate corruptions where unused fields were
involved.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patch introduces block migration called during live migration. Block
are being copied to the destination in an async way. First the code will
transfer the whole disk and then transfer all dirty blocks accumulted during
the migration.
Still need to improve transition from the iterative phase of migration to the
end phase. For now transition will take place when all blocks transfered once,
all the dirty blocks will be transfered during the end phase (guest is
suspended).
Changes from v4:
- Global variabels moved to a global state structure allocated dynamically.
- Minor coding style issues.
- Poll block.c for tracking of dirty blocks instead of manage it here.
Signed-off-by: Liran Schour <lirans@il.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
When creating a snapshot we can run into the situation that the first disk
doesn't have a snapshot, but the second one does have one with the same name as
the new snapshot.
In this case, qemu doesn't recognize that there is a snapshot to be
overwritten, so it starts to save the new snapshot and errors out later when it
tries to snapshot the second image. With this patch, snapshots on secondary
images are overwritten just like on the first image.
Reported-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
commit b04c4134d6
broke incoming migration. After talking with Gleb, code was intended
to be the way is in this fix. This fixes migration here.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Use qemu_send_packet_raw to send gratuitous arp. This will ensure that
vnet header is handled properly.
Also, avoid sending the gratuitous packet to the guest. There doesn't
appear to be any reason for doing that and the code will currently just
crash if the NIC is not associated with a vlan.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Looks like these are just artifacts of vl.c being split up.
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
It allows to have 'things' in savevm format not backed in the device state
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Currently, after a migration qemu sends a broadcast packet to update
switches' MAC->port mappings.
Unfortunately, it picks a random (constant) ethertype and crosses its
fingers that no one else is using it.
This patch causes it to send a RARP packet instead. RARP was chosen for
2 reasons. One, it is always harmless, and will continue to be so even
as new ethertypes are allocated. Two, it is what VMware ESX sends, so
people who write filtering rules for switches already know about it.
I also changed the code to send SELF_ANNOUNCE_ROUNDS packets, instead of
SELF_ANNOUNCE_ROUNDS + 1, and added a simple backoff scheme.
Signed-off-by: Nolan Leake <nolan <at> sigbus.net>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
In a later patch, we introduce pre_save() and post_save() functions.
The whole point of that operation is to change things in the state.
Without this patch, we have to remove the const qualifier in each
use with a cast
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Problem: Our file sys-queue.h is a copy of the BSD file, but there are
some additions and it's not entirely compatible. Because of that, there have
been conflicts with system headers on BSD systems. Some hacks have been
introduced in the commits 15cc923584,
f40d753718,
96555a96d7 and
3990d09adf but the fixes were fragile.
Solution: Avoid the conflict entirely by renaming the functions and the
file. Revert the previous hacks.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
This naming was used in kvm tree, and is easier to remember
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
vmsd alone is not enugh, because we can have several structs saved with the same description (vmsd).
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
In previosu series I remove v2 support for RAM (that was the version that was
supported when SaveVM v3 appeared). Now we can't load RAM for any image saved in SaveVM v2, we can as well remove SaveVM v2 entirely.
Note: That SaveVM RAM was at v2 when General SaveVM support went from v2 to v3 makes talking about versions confusing at least
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This commit ports command handlers that receive one argument to use
the new monitor's dictionary.
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
We can't check the version in a substruct, it is not stored anywhere
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
We read the saved value and check that it is less or equal than the one
stored in the structure.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patch adds support for static sized buffer and typecheks that the buffer is right.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patch add supports for variable sized arrays whose size is
another field of the state.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
We read the saved value and check that it is the same that the one
is stored in the structure.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patch adds support for saving one VMStateDescription from other
VMStateDescription.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patch adds support for saving arrays inside the struct
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patch adds support for saving pointers to values
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patch introduces VMState infrastructure, to convert the save/load
functions of devices to a table approach. This new approach has the
following advantages:
- it is type-safe
- you can't have load/save functions out of sync
- will allows us to have new interesting commands, like dump <device>, that
shows all its internal state.
- Just now, the only added type is arrays, but we can add structures.
- Uses old load_state() function for loading old state.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
do_loadvm() is now called from the monitor.
load_vmstate() is called by do_loadvm() and when -loadvm command line is used.
Command line don't have to play games with vmstop()/vmstart()
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
While reading Chris's code for fd migration I noticed the duplication
between QEMUFilePopen and QEMUFileStdio. This fixes it, and makes
qemu_fopen more similar qemu_popen.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>