migrate_fd_cleanup will usually close the file descriptor via
buffered_file_close's call to migrate_fd_close. However, in the case
of s->file == NULL it is "inlining" migrate_fd_close (almost: there is a
direct close() instead of using s->close(s)). To fix the inconsistency
and clean up the code, allow multiple calls to migrate_fd_close and use
the function in migrate_fd_cleanup.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* 'queue/qmp' of git://repo.or.cz/qemu/qmp-unstable:
migration: go to paused state after finishing incoming migration with -S
qmp: handle stop/cont in INMIGRATE state
hmp: fix info cpus for sparc targets
At the end of migration the machine has started already, and cannot be
destroyed without losing the guest's data. Hence, prelaunch is the
wrong state. Go to the paused state instead. QEMU would reach that
state anyway (after running the guest for the blink of an eye) if the
"stop" command had been received after the start of migration.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
And remove the superfluous integer return value.
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Error propagation is already there for socket backends. Add it to other
protocols, simplifying code that tests for errors that will never happen.
With all protocols understanding Error, the code can be simplified
further by removing the return value.
Unfortunately, the quality of error messages varies depending
on where the error is detected, because no Error is passed to the
NonBlockingConnectHandler. Thus, the exact error message still cannot
be sent to the user if the OS reports it asynchronously via SO_ERROR.
If NonBlockingConnectHandler received an Error**, we could for
example report the error class and/or message via a new field of the
query-migration command even if it is reported asynchronously.
Before:
(qemu) migrate fd:ffff
migrate: An undefined error has occurred
(qemu) info migrate
(qemu)
After:
(qemu) migrate fd:ffff
migrate: File descriptor named 'ffff' has not been found
(qemu) info migrate
capabilities: xbzrle: off
Migration status: failed
total time: 0 milliseconds
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This makes migration-unix.c again a cut-and-paste job from migration-tcp.c,
exactly as it was in the beginning. :)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The call to migrate_fd_error() was missing for non-socket backends, so
centralize it in qmp_migrate().
Before:
(qemu) migrate fd:ffff
migrate: An undefined error has occurred
(qemu) info migrate
(qemu)
After:
(qemu) migrate fd:ffff
migrate: An undefined error has occurred
(qemu) info migrate
capabilities: xbzrle: off
Migration status: failed
total time: 0 milliseconds
(The awful error message will be fixed later in the series).
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The migration code is using errp to detect "internal" errors, this means
that it relies on errp being non-NULL.
No impact so far because our only QMP clients (the QMP marshaller and HMP)
never pass a NULL Error **. But if we had others, this patch would make
sure that migration can work with a NULL Error **.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We only use it once, just remove the callback indirection.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
We only used it once, just remove the callback indirection
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Change XBZRLE cache size in bytes (the size should be a power of 2, it will be
rounded down to the nearest power of 2).
If XBZRLE cache size is too small there will be many cache miss.
New query-migrate-cache-size QMP command and 'info migrate_cache_size' HMP
command to query cache value.
Signed-off-by: Benoit Hudzia <benoit.hudzia@sap.com>
Signed-off-by: Petter Svard <petters@cs.umu.se>
Signed-off-by: Aidan Shribman <aidan.shribman@sap.com>
Signed-off-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
In the outgoing migration check to see if the page is cached and
changed, then send compressed page by using save_xbrle_page function.
In the incoming migration check to see if RAM_SAVE_FLAG_XBZRLE is set
and decompress the page (by using load_xbrle function).
Signed-off-by: Benoit Hudzia <benoit.hudzia@sap.com>
Signed-off-by: Petter Svard <petters@cs.umu.se>
Signed-off-by: Aidan Shribman <aidan.shribman@sap.com>
Signed-off-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
The management can enable/disable a capability for the next migration by using
migrate-set-capabilities QMP command.
The user can use migrate_set_capability HMP command.
Signed-off-by: Orit Wasserman <owasserm@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
The management can query the current migration capabilities using
query-migrate-capabilities QMP command.
The user can use 'info migrate_capabilities' HMP command.
Currently only XBZRLE capability is available.
Signed-off-by: Orit Wasserman <owasserm@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
We add time spent for migration to the output of "info migrate"
command. 'total_time' means time since the start fo migration if
migration is 'active', and total time of migration if migration is
completed. As we are also interested in transferred ram when
migration completes, adding all ram statistics
Signed-off-by: Juan Quintela <quintela@redhat.com>
Use help functions in qemu-socket.c for tcp migration,
which already support ipv6 addresses.
Currently errp will be set to UNDEFINED_ERROR when migration fails,
qemu would output "migration failed: ...", and current user can
see a message("An undefined error has occurred") in monitor.
This patch changed tcp_start_outgoing_migration()/inet_connect()
/inet_connect_opts(), socket error would be passed back,
then current user can see a meaningful err message in monitor.
Qemu will exit if listening fails, so output socket error
to qemu stderr.
For IPv6 brackets must be mandatory if you require a port.
Referencing to RFC5952, the recommended format is:
[2312::8274]:5200
test status: Successed
listen side: qemu-kvm .... -incoming tcp:[2312::8274]:5200
client side: qemu-kvm ...
(qemu) migrate -d tcp:[2312::8274]:5200
Signed-off-by: Amos Kong <akong@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Wakeup the guest when the live part of the migation is finished.
This avoids being in suspended state on migration, so we don't
have to save the is_suspended bit.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
The migrate command is one of those commands where HMP and QMP completely
mix up together. This made the conversion to the QAPI (which separates the
command into QMP and HMP parts) a bit difficult.
The first important change to be noticed is that this commit completes the
removal of the Monitor object from migration code, started by the previous
commit.
Another important and tricky change is about supporting the non-detached
mode. That is, if the user doesn't pass '-d' the migrate command will lock
the monitor and will only release it when migration is finished.
To support this in the new HMP command (hmp_migrate()), it is necessary
to create a timer which runs every second and checks if the migration is
still active. If it is, the timer callback will re-schedule itself to run
one second in the future. If the migration has already finished, the
monitor lock is released and the user can use it normally.
All these changes should be transparent to the user.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
The Monitor object is passed back and forth within the migration/savevm
code so that it can print errors and progress to the user.
However, that approach assumes a HMP monitor, being completely invalid
in QMP.
This commit drops almost every single usage of the Monitor object, all
monitor_printf() calls have been converted into DPRINTF() ones.
There are a few remaining Monitor objects, those are going to be dropped
by the next commit.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Notifiers do not need to access both ends of the list, and using
a QLIST also simplifies the API.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
All files under GPLv2 will get GPLv2+ changes starting tomorrow.
event_notifier.c and exec-obsolete.h were only ever touched by Red Hat
employees and can be relicensed now.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Also, we now return the qemu_fclose() value unchanged to the caller. For
reference, the migrate_fd_cleanup() callers are the following:
- migrate_fd_completed(): any negative value is considered an
error, so the change is OK.
- migrate_fd_error(): doesn't check the migrate_fd_cleanup() return value
- migrate_fd_cancel(): doesn't check the migrate_fd_cleanup() return
value
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Image files have two types of data: immutable data that describes things like
image size, backing files, etc. and mutable data that includes offset and
reference count tables.
Today, image formats aggressively cache mutable data to improve performance. In
some cases, this happens before a guest even starts. When dealing with live
migration, since a file is open on two machines, the caching of meta data can
lead to data corruption.
This patch addresses this by introducing a mechanism to invalidate any cached
mutable data a block driver may have which is then used by the live migration
code.
NB, this still requires coherent shared storage. Addressing migration without
coherent shared storage (i.e. NFS) requires additional work.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This lets different subsystems register an Error that is thrown whenever
migration is attempted. This works nicely because it gracefully supports
things like hotplug.
Right now, if multiple errors are registered, only one of them is reported.
I expect that for 1.1, we'll extend query-migrate to return all of the reasons
why migration is disabled at any given point in time.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Migration with fd uses s->mon to pass the fd. But we only assign the
s->mon for !detached migration. Fix it. Once there add a comment
indicating that s->mon has two uses.
Bug reported by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
CC: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
A simple migration reproduces it:
1. Start the source VM with:
# qemu [...] -S
2. Start the destination VM with:
# qemu <source VM cmd-line> -incoming tcp:0:4444
3. In the source VM:
(qemu) migrate -d tcp:0:4444
4. The source VM will segfault as soon as migration completes (might not
happen in the first try)
What is happening here is that qemu_file_put_notify() can end up closing
's->file' (in which case it's also set to NULL). The call stack is rather
complex, but Eduardo helped tracking it to:
select loop -> migrate_fd_put_notify() -> qemu_file_put_notify() ->
buffered_put_buffer() -> migrate_fd_put_ready() ->
migrate_fd_completed() -> migrate_fd_cleanup().
To be honest, it's not completely clear to me in which cases 's->file'
is not closed (on error maybe)? But I doubt this fix will make anything
worse.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>