Prior to servicing userfault requests we must ensure we've not got
huge pages in the area that might include non-transferred memory,
since a hugepage could incorrectly mark the whole huge page as present.
We mark the area as non-huge page (nhp) just before we perform
discards; the discard code now tells us to discard any areas
that haven't been sent (as well as any that are redirtied);
any already formed transparent-huge-pages get fragmented
by this discard process if they cotnain any discards.
Transparent huge pages that have been entirely transferred
and don't contain any discards are not broken by this mechanism;
they stay as huge pages.
By starting postcopy after a full precopy pass, many of the pages
then stay as huge pages; this is important for maintaining performance
after the end of the migration.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
The loading of a device state (during postcopy) may access guest
memory that's still on the source machine and thus might need
a page fill; split off a separate thread that handles the incoming
page data so that the original incoming migration code can finish
off the device data.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
userfaultfd is a Linux syscall that gives an fd that receives a stream
of notifications of accesses to pages registered with it and allows
the program to acknowledge those stalls and tell the accessing
thread to carry on.
We convert the requests from the kernel into messages back to the
source asking for the pages.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
postcopy_place_page (etc) provide a way for postcopy to place a page
into guests memory atomically (using the copy ioctl on the ufd).
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
On receiving MIG_RPCOMM_REQ_PAGES look up the address and
queue the page.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Add MIG_RP_MSG_REQ_PAGES command on Return path for the postcopy
destination to request a page from the source.
Two versions exist:
MIG_RP_MSG_REQ_PAGES_ID that includes a RAMBlock name and start/len
MIG_RP_MSG_REQ_PAGES that just has start/len for use with the same
RAMBlock as a previous MIG_RP_MSG_REQ_PAGES_ID
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Rework the migration thread to setup and start postcopy.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Mark the area of RAM as 'userfault'
Start up a fault-thread to handle any userfaults we might receive
from it (to be filled in later)
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Soon we'll be in either ACTIVE or POSTCOPY_ACTIVE when we
complete migration, and we need to know which we expect to be
in to change state safely.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Add qemu_savevm_state_complete_postcopy to complement
qemu_savevm_state_complete_precopy together with a new
save_live_complete_postcopy method on devices.
The save_live_complete_precopy method is called on
all devices during a precopy migration, and all non-postcopy
devices during a postcopy migration at the transition.
The save_live_complete_postcopy method is called at
the end of postcopy for all postcopiable devices.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
'MIGRATION_STATUS_POSTCOPY_ACTIVE' is entered after migrate_start_postcopy
'migration_in_postcopy' is provided for other sections to know if
they're in postcopy.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Once postcopy is enabled (with migrate_set_capability), the migration
will still start on precopy mode. To cause a transition into postcopy
the:
migrate_start_postcopy
command must be issued. Postcopy will start sometime after this
(when it's next checked in the migration loop).
Issuing the command before migration has started will error,
and issuing after it has finished is ignored.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Provide a check to see if the OS we're running on has all the bits
needed for postcopy.
Creates postcopy-ram.c which will get most of the other helpers we need.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Modify save_live_pending to return separate postcopiable and
non-postcopiable counts.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
The state of the postcopy process is managed via a series of messages;
* Add wrappers and handlers for sending/receiving these messages
* Add state variable that track the current state of postcopy
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
The 'postcopy ram' capability allows postcopy migration of RAM;
note that the migration starts off in precopy mode until
postcopy mode is triggered (see the migrate_start_postcopy
patch later in the series).
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Postcopy needs to have two migration streams loading concurrently;
one from memory (with the device state) and the other from the fd
with the memory transactions.
Split the core of qemu_loadvm_state out so we can use it for both.
Allow the inner loadvm loop to quit and cause the parent loops to
exit as well.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Open a return path, and handle messages that are received upon it.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Add migrate_send_rp_message to send a message from destination to source along the return path.
(It uses a mutex to let it be called from multiple threads)
Add migrate_send_rp_shut to send a 'shut' message to indicate
the destination is finished with the RP.
Add migrate_send_rp_ack to send a 'PONG' message in response to a PING
Use it in the MSG_RP_PING handler
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Add two src->dest commands:
* OPEN_RETURN_PATH - To request that the destination open the return path
* PING - Request an acknowledge from the destination
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Create QEMU_VM_COMMAND section type for sending commands from
source to destination. These commands are not intended to convey
guest state but to control the migration process.
For use in postcopy.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Postcopy needs a method to send messages from the destination back to
the source, this is the 'return path'.
Wire it up for 'socket' QEMUFile's.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
In postcopy we're going to need to perform the complete phase
for postcopiable devices at a different point, start out by
renaming all of the 'complete's to make the difference obvious.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Suspend to file is very much like a migrate, and it makes life
easier if we have the Migration state available, so initialise it
in the savevm.c code for suspending.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewd-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Useful for debugging the migration bitmap and other bitmaps
of the same format (including the sentmap in postcopy).
The bitmap is printed to stderr.
Lines that are all the expected value are excluded so the output
can be quite compact for many bitmaps.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Add a wrapper to change the blocking status on a QEMUFile
rather than having to use qemu_set_block(qemu_get_fd(f));
it seems best to avoid exposing the fd since not all QEMUFile's
really have one. With this wrapper we could move the implementation
down to be different on different transports.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
qemu_get_buffer always copies the data it reads to a users buffer,
however in many cases the file buffer inside qemu_file could be given
back to the caller, avoiding the copy. This isn't always possible
depending on the size and alignment of the data.
Thus 'qemu_get_buffer_in_place' either copies the data to a supplied
buffer or updates a pointer to the internal buffer if convenient.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
'file' becomes confusing when you have flows in each direction;
rename to make it clear.
This leaves just the main forward direction ms->file, which is used
in a lot of places and is probably not worth renaming given the churn.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
'cleanup' seems more appropriate than 'cancel'.
Signed-off-by: Liang Li <liang.z.li@intel.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>al3
Reviewed-by: Amit Shah <amit.shah@redhat.com>al3
Signed-off-by: Juan Quintela <quintela@redhat.com>al3
This time convert the external functions:
qemu_get_buffer, qemu_peek_buffer
qemu_put_buffer and qemu_put_buffer_async
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <1439463094-5394-6-git-send-email-dgilbert@redhat.com>
Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
This is a start on using size_t more in qemu-file and friends;
it fixes up QEMUFilePutBufferFunc and QEMUFileGetBufferFunc
to take size_t lengths and return ssize_t return values (like read(2))
and fixes up all the different implementations of them.
Note that I've not yet followed this deeply into bdrv_ implementations.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <1439463094-5394-5-git-send-email-dgilbert@redhat.com>
Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
The macro is defined twice in identical ways.
Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
Message-Id: <1439532987-16335-1-git-send-email-soren.brinkmann@xilinx.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
When doing migration via the QMP command xen_save_devices_state, the
current runstate is not store into the global state section. Also the
current runstate is not the one we want on the receiver side.
During migration, the Xen toolstack paused QEMU before save the devices
state. Also, the toolstack expect QEMU to autostart when the migration is
finished.
So this patch store "running" as it's current runstate.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Since 38e0735e, register_device_unmigratable() has been removed
Signed-off-by: Marc-André Lureau <marcandre.lureau@gmail.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Commit df4b102452 introduced global_state
section. But it only filled the state while doing migration. While
doing a savevm, we stored an empty string as state. So when we did a
loadvm, it complained that state was invalid.
Fedora 21, 4.1.1, qemu 2.4.0-rc0
> ../../configure --target-list="x86_64-softmmu"
068 2s ... - output mismatch (see 068.out.bad)
--- /home/bos/jhuston/src/qemu/tests/qemu-iotests/068.out 2015-07-08
17:56:18.588164979 -0400
+++ 068.out.bad 2015-07-09 17:39:58.636651317 -0400
@@ -6,6 +6,8 @@
QEMU X.Y.Z monitor - type 'help' for more information
(qemu) savevm 0
(qemu) quit
+qemu-system-x86_64: Unknown savevm section or instance 'globalstate' 0
+qemu-system-x86_64: Error -22 while loading VM state
QEMU X.Y.Z monitor - type 'help' for more information
(qemu) quit
*** done
Failures: 068
Failed 1 of 1 tests
Actually, there were two problems here:
- we registered global_state too late for load_vm (fixed on another
patch on the list)
- we didn't store a valid state for savevm (fixed by this patch).
Reported-by: John Snow <jsnow@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Make check fails with events. THis is due to the parser/lexer that it
uses. Just in case that they are more broken parsers, just only send
events when there are capabilities.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
It needs to be the first one and it is not optional, that is the reason
why it is opencoded. For new machine types, it is required that machine
type name is the same in both sides.
It is just done right now for pc's.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
To make sections optional, we need to do it at the beggining of the code.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
This section would be sent:
a- for all new machine types
b- for old machine types if section state is different form {running,paused}
that were the only giving us troubles.
So, in new qemus: it is alwasy there. In old qemus: they are only
there if it an error has happened, basically stoping on target.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
This includes a new section that for now just stores the current qemu state.
Right now, there are only one way to control what is the state of the
target after migration.
- If you run the target qemu with -S, it would start stopped.
- If you run the target qemu without -S, it would run just after migration finishes.
The problem here is what happens if we start the target without -S and
there happens one error during migration that puts current state as
-EIO. Migration would ends (notice that the error happend doing block
IO, network IO, i.e. nothing related with migration), and when
migration finish, we would just "continue" running on destination,
probably hanging the guest/corruption data, whatever.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
We need the names of RAMBlocks as they're loaded for RDMA,
reuse a slightly modified ram_control_load_hook:
a) Pass a 'data' parameter to use for the name in the block-reg
case
b) Only some hook types now require the presence of a hook function.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
There is no _TEST() variant of VMSTATE_BUFFER_UNSAFE_INFO() yet, but we'll
soon need it. Introduce it and rebase the original
VMSTATE_BUFFER_UNSAFE_INFO() on top.
The parameter order of the new function-like macro follows that of
VMSTATE_SINGLE_TEST(): "_test" is introduced between "_state" and
"_version".
Cc: Juan Quintela <quintela@redhat.com>
Cc: Amit Shah <amit.shah@redhat.com>
Cc: Marcel Apfelbaum <marcel@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Badly formatted migration streams can go undetected or produce
misleading errors due to a lock of checking at the end of sections.
In particular a section that adds an extra 0x00 at the end
causes what looks like a normal end of stream and thus doesn't produce
any errors, and something that ends in a 0x01..0x04 kind of look
like real section headers and then fail when the section parser tries
to figure out which section they are. This is made worse by the
choice of 0x00..0x04 being small numbers that are particularly common
in normal section data.
This patch adds a section footer consisting of a marker (0x7e - ~)
followed by the section-id that was also sent in the header. If
they mismatch then it throws an error explaining which section was
being loaded.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
The next patch adds section footers; but we don't want to
break migration compatibility so disable them on older
machine types
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
In postcopy we need the loadvm_handlers to be used in a couple
of different instances of the loadvm loop/routine, and thus
it can't be local any more.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
qemu_peek_buffer currently copies the data it reads into a buffer,
however a future patch wants access to the buffer without the copy,
hence rework to remove the copy to the layer above.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
There are currently lots of pieces of incoming migration state scattered
around, and postcopy is adding more, and it seems better to try and keep
it together.
allocate MIS in process_incoming_migration_co
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
and use it in loadvm_state and ram_load.
Where ever it's used, check the return and error if it failed.
Minor: ram_load was using a 257 byte array for its string, the
maximum length is 255 bytes + 0 terminator, so fix to 256
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
We create optional sections with this patch. But we already have
optional subsections. Instead of having two mechanism that do the
same, we can just generalize it.
For subsections we just change:
- Add a needed function to VMStateDescription
- Remove VMStateSubsection (after removal of the needed function
it is just a VMStateDescription)
- Adjust the whole tree, moving the needed function to the corresponding
VMStateDescription
Signed-off-by: Juan Quintela <quintela@redhat.com>