All callers were calling migrate_get_current(), so do it inside the function.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Merge them.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Once there, create populate_disk_info.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
--
- create populate_disk_info instead of "abusing" populate_ram_info
Assigning directly to *errp is not valid, as errp may be NULL,
&error_fatal, or &error_abort. Use error_propagate() instead.
Cc: Juan Quintela <quintela@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
There are some places that binded "return path" with postcopy. Let's be
prepared for its usage even without postcopy. This patch mainly did this
on source side.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
We need to release any block migrations BlockBackends on the source
before successfully completing the migration because otherwise
inactivating the images will fail (inactivation only tolerates device
BBs).
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Block migration may still access the image during its
.save_live_complete_precopy() implementation, so we should only
inactivate the image afterwards.
Another reason for the change is that inactivating an image fails when
there is still a non-device BlockBackend using it, which includes the
BBs used by block migration. We want to give block migration a chance to
release the BBs before trying to inactivate the image (this will be done
in another patch).
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
In load_snapshot, mis->from_src_file is freed twice, the first free is by
qemu_fclose, the second is by migration_incoming_state_destroy and
it causes Illegal instruction exception. The fix is just to remove the
first free.
This problem is found by qemu-iotests case 068 since commit
"660819b migration: shut src return path unconditionally". The error is:
068 1s ... - output mismatch (see 068.out.bad)
--- tests/qemu-iotests/068.out 2017-05-06 01:00:26.417270437 +0200
+++ 068.out.bad 2017-06-03 13:59:55.360274640 +0200
@@ -6,6 +6,8 @@
QEMU X.Y.Z monitor - type 'help' for more information
(qemu) savevm 0
(qemu) quit
+./common.config: line 107: 242472 Illegal instruction (core dumped) ( if [ -n "${QEMU_NEED_PID}" ]; then
+ echo $BASHPID > "${QEMU_TEST_DIR}/qemu-${_QEMU_HANDLE}.pid";
+fi; exec "$QEMU_PROG" $QEMU_OPTIONS "$@" )
QEMU X.Y.Z monitor - type 'help' for more information
-(qemu) quit
-*** done
+(qemu) *** done
Signed-off-by: QingFeng Hao <haoqf@linux.vnet.ibm.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
We create the variable while we are at migration and we remove it
after migration.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
RAM Statistics need to survive migration to make info migrate work, so we
need to store them outside of RAMState. As we already have an struct
with those fields, just used them. (MigrationStats and XBZRLECacheStats).
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
It was only used by XBZRLE anyways.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
We shouldn't be using memory later than that.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
As a rule, CPU internal state should never be updated when
!cpu->kvm_vcpu_dirty (or the HAX equivalent). If that is done, then
subsequent calls to cpu_synchronize_state() - usually safe and idempotent -
will clobber state.
However, we routinely do this during a loadvm or incoming migration.
Usually this is called shortly after a reset, which will clear all the cpu
dirty flags with cpu_synchronize_all_post_reset(). Nothing is expected
to set the dirty flags again before the cpu state is loaded from the
incoming stream.
This means that it isn't safe to call cpu_synchronize_state() from a
post_load handler, which is non-obvious and potentially inconvenient.
We could cpu_synchronize_all_state() before the loadvm, but that would be
overkill since a) we expect the state to already be synchronized from the
reset and b) we expect to completely rewrite the state with a call to
cpu_synchronize_all_post_init() at the end of qemu_loadvm_state().
To clear this up, this patch introduces cpu_synchronize_pre_loadvm() and
associated helpers, which simply marks the cpu state as dirty without
actually changing anything. i.e. it says we want to discard any existing
KVM (or HAX) state and replace it with what we're going to load.
Cc: Juan Quintela <quintela@redhat.com>
Cc: Dave Gilbert <dgilbert@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Juan Quintela <quintela@redhat.com>
We can replace the four remaining calls of register_savevm() by
calls to register_savevm_live(). So we can remove the function and
as we don't allocate anymore the ops pointer with g_new0()
we don't have to free it then.
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
All functions were internal, except blk_mig_init() that is exported in
misc.h now.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
All functions are internal except for ram_mig_init(). Create
migration/misc.h for this kind of functions.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Start removing migration code from sysemu/sysemu.h.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Just for the functions exported from tls.c. Notice that we can't
remove the migration/migration.h include from tls.c because it access
directly MigrationState for the tls params.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Split the file into public and internal interfaces. I have to rename
the external one because we can't have two include files with the same
name in the same directory. Build system gets confused. The only
exported functions are the ones that handle basic types.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
We were do the shutting off only for postcopy. Now we do this as long as
the source return path is there.
Moving the cleanup of from_src_file there too.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
The return path channel is possibly leaked. Fix it.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Everything else assumes that we always load a device from its own
savevm handler.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
So we remove all traces of them.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
There is no reason for having the loadvm_handlers at all. There is
only one use, and we can use the savevm handlers.
We will remove the loadvm handlers on a following patch.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
--
- Added load_version_id: version_id read from the stream (laurent)
- Added load_section_id: section_id read from the stream (dave)
The commit message from 070afca25 suggests that dirty_rate_high_cnt
should be used more aggressively to start throttling after two
iterations instead of four. The code, however, only changes the auto
convergence behaviour to throttle after three iterations. This makes the
behaviour more aggressive by kicking off throttling after two iterations
as originally intended.
Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
The bytes_xfer_now/prev counters are only used by the auto convergence
logic. However, they are used alongside the dirty_pages_rate counter,
which is calculated (and required) outside of this logic. The problem
with this approach is that if the auto convergence capability is changed
while a migration is ongoing, the relationship of the counters will be
broken.
This moves the management of bytes_xfer_now/prev counters outside of the
auto convergence logic to address this issue.
Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Currently, a "period" in the RAM migration logic is at least a second
long and accounts for what happened since the last period (or the
beginning of the migration). The dirty_pages_rate counter is calculated
at the end this logic.
If the auto convergence capability is enabled from the start of the
migration, it won't be able to use this counter the first time around.
This calculates dirty_pages_rate as soon as a period is deemed over,
which allows for it to be used immediately.
Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
The first time migration_bitmap_sync() is called, bytes_xfer_prev is set
to ram_state.bytes_transferred which is, at this point, zero. The next
time migration_bitmap_sync() is called, an iteration has happened and
bytes_xfer_prev is set to 'x' bytes. Most likely, more than one second
has passed, so the auto converge logic will be triggered and
bytes_xfer_now will also be set to 'x' bytes.
This condition is currently masked by dirty_rate_high_cnt, which will
wait for a few iterations before throttling. It would otherwise always
assume zero bytes have been copied and therefore throttle the guest
(possibly) prematurely.
Given bytes_xfer_prev is only used by the auto convergence logic, it
makes sense to only set its value after a check has been made against
bytes_xfer_now.
Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
This removes last trace of migration functions from sysemu/sysemu.h.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
We want to track why a guest was shutdown; in particular, being able
to tell the difference between a guest request (such as ACPI request)
and host request (such as SIGINT) will prove useful to libvirt.
Since all requests eventually end up changing shutdown_requested in
vl.c, the logical change is to make that value track the reason,
rather than its current 0/1 contents.
Since command-line options control whether a reset request is turned
into a shutdown request instead, the same treatment is given to
reset_requested.
This patch adds an internal enum ShutdownCause that describes reasons
that a shutdown can be requested, and changes qemu_system_reset() to
pass the reason through, although for now nothing is actually changed
with regards to what gets reported. The enum could be exported via
QAPI at a later date, if deemed necessary, but for now, there has not
been a request to expose that much detail to end clients.
For the most part, we turn 0 into SHUTDOWN_CAUSE_NONE, and 1 into
SHUTDOWN_CAUSE_HOST_ERROR; the only specific case where we have enough
information right now to use a different value is when we are reacting
to a host signal. It will take a further patch to edit all call-sites
that can trigger a reset or shutdown request to properly pass in any
other reasons; this patch includes TODOs to point such places out.
qemu_system_reset() trades its 'bool report' parameter for a
'ShutdownCause reason', with all non-zero values having the same
effect; this lets us get rid of the weird #defines for VMRESET_*
as synonyms for bools.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20170515214114.15442-3-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
It only needed TARGET_PAGE_SIZE/BITS/BITS_MIN values, so just export
them from exec.h
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
That is the only function that we need from exec.c, and having to
include the whole sysemu.h for this.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
/me leans to be less sloppy with copyright notices
thanks Dave
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
---
Minor rearrangements due to rebase
Now one just has the interperter, and the other has the basic types.
Once there, add copyright boilerplate.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Use GPL v2 or later. Detected by David.
Create an include for its exported functions.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
Add proper header
Many users now prefer to use drive_mirror over NBD as an
alternative to the older migrate -b option; drive_mirror is
more complex to setup but gives you more options (e.g. only
migrating some of the disks if some of them are shared).
Allow the large chunk of block migration code to be compiled
out for those who don't use it.
Based on a downstream-patch we've had for a while by Jeff Cody.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
--
- When compiled out, allow seting block only with false value (eric)
Not used anymore after moving block migration to use capabilities.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
We have change in the previous patch to use migration capabilities for
it. Notice that we continue using the old command line flags from
migrate command from the time being. Remove the set_params method as
now it is empty.
For savevm, one can't do a:
savevm -b/-i foo
but now one can do:
migrate_set_capability block on
savevm foo
And we can't use block migration. We could disable block capability
unconditionally, but it would not be much better.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
- Maintain shared/enabled dependency (Xu suggestion)
- Now we maintain the dependency on the setter functions
- improve error messages
Create one capability for block migration and one parameter for
incremental block migration.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
- address all Markus comments
- use Markus and Eric text descriptions
- change logic another time
- improve text messages
It turns out that it's legal to create a VM with RAMBlocks that aren't
a multiple of the pagesize in use; e.g. a 1025M main memory using
2M host pages. That breaks postcopy's atomic placement of pages,
so disallow it.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Unfortunately it's legal to create a VM with a RAM size that's
not a multiple of the underlying host page or huge page size.
Recently I'd changed things to always send host sized pages,
and that breaks if we have say a 1025MB guest on 2MB hugepages.
Unfortunately we can't just make that illegal since it would break
migration from/to existing oddly configured VMs.
Symptom: qemu-system-x86_64: Illegal RAM offset 40100000
as it transmits the fraction of the hugepage after the end
of the RAMBlock (may also cause a crash on the source
- possibly due to clearing bits after the bitmap)
Reported-by: Yumei Huang <yuhuang@redhat.com>
Red Hat bug: https://bugzilla.redhat.com/show_bug.cgi?id=1449037
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJZHJB7AAoJEAUWMx68W/3nSr8P/3uw04AxuCiEpwVYjtYaMGgs
C/1zNXFPRIWDx5kNxi7reH0aknYSjHVijxuGUiAqP9gM+VpGO485gEWDrmSo6tqy
9s3HbudTzL/Qfp7ZwJY+v8gFxjwCZYhDh3WI+3yIjWDLHpGhTdRuYIC4DEA1EkGY
zvG8fHISMqZUh1DYE7ttCX1zLlZLaAA1znjYbnXCYNjoRdeeY2+uJsaPhdjqi1la
5qhKWtKS68cfiKSkntBASKMs2+z1+4iyGVVzgd8E64O4JbdirB2JHnjSwm3fJ5yY
dhJLY6nSn7eNdBmVHSe0ZBDHfwoVbOlPC9+z0Ob+UxiGcXd0KSYaE3IaD2Ref2LS
Wju92rSLcADNYBQtNAxwKpDeixf+WbfetiKDPx71N2rcpdpZg1I3kwu4ippMwh6g
jpgjSV2Nkcnv4PQBD9kPSt6jJBmcyN3Y98VvaOFxyOv1u1Xm7ZbhkmxOQKQvG4jc
OlBbA20EzOC0axl9ktVF7GljiziCqDaMFljBxNlpdAqzjou7ztEEvXBwcl8I8eBH
ThocuOf25CPLrTAwggbEWpKfxzSQ9pEIPHpQL3Qz9Ip7STlcxB/FJ/9oEQvpZeJg
MHpgBLDzw1xqJw01n711qgGtEsCvTUoQMcnJxgeaI/WzEIaRd68breI2/qG22aBl
0XgH7THRl8WP+EC8ZYrB
=wJvl
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'dgilbert/tags/pull-hmp-20170517' into staging
HMP pull
# gpg: Signature made Wed 17 May 2017 07:03:39 PM BST
# gpg: using RSA key 0x0516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A 9FA9 0516 331E BC5B FDE7
* dgilbert/tags/pull-hmp-20170517:
ramblock: add new hmp command "info ramblock"
utils: provide size_to_str()
ramblock: add RAMBLOCK_FOREACH()
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
So that it can simplifies the iterators.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1494562661-9063-2-git-send-email-peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
The function is only used once, and nothing else in migration knows
about objects. Create the function vmstate_device_is_migratable() in
savem.c that really do the bit that is related with migration.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Yes, we don't have a good place to put that stuff.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
It is only used by migration, so move it there.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
This allows us to remove lots of includes of migration/migration.h
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reflects better what it does now, and avoid confussions with
RAM_SAVE_FLAG_COMPRESS_PAGE.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
This way we use the "normal" way of printing errors for hmp commands.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Compression threads got broken on commit
commit 2479569466
Author: Juan Quintela <quintela@redhat.com>
Date: Tue Mar 21 11:45:01 2017 +0100
ram: reorganize last_sent_block
On do_compress_ram_page() we use a different QEMUFile than the
migration one. We need to pass it there. The failure can be seen as:
(qemu) qemu-system-x86_64: Unknown combination of migration flags: 0
qemu-system-x86_64: error while loading state section id 3(ram)
qemu-system-x86_64: load of migration failed: Invalid argument
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Tested-by: Peter Xu <peterx@redhat.com>
Instead of manually calling blk_resume_after_migration() in migration
code after doing bdrv_invalidate_cache_all(), integrate the BlockBackend
activation with cache invalidation into a single function. This is
achieved with a new callback in BdrvChildRole that is called by
bdrv_invalidate_cache_all().
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Migration code activates all block driver nodes on the destination when
the migration completes. It does so by calling
bdrv_invalidate_cache_all() and blk_resume_after_migration(). There is
one code path for precopy and one for postcopy migration, resulting in
four function calls, which used to have three different failure modes.
This patch unifies the behaviour so that failure to activate all block
nodes is non-fatal, but the error message is logged and the VM isn't
automatically started. 'cont' will retry activating the block nodes.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
SocketAddressLegacy is a simple union, and simple unions are awkward:
they have their variant members wrapped in a "data" object on the
wire, and require additional indirections in C. SocketAddress is the
equivalent flat union. Convert all users of SocketAddressLegacy to
SocketAddress, except for existing external interfaces.
See also commit fce5d53..9445673 and 85a82e8..c5f1ae3.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1493192202-3184-7-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[Minor editing accident fixed, commit message and a comment tweaked]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
The next commit will rename SocketAddressFlat to SocketAddress, and
the commit after that will replace most uses of SocketAddressLegacy by
SocketAddress, replacing most of this commit's renames right back.
Note that checkpatch emits a few "line over 80 characters" warnings.
The long lines are all temporary; the SocketAddressLegacy replacement
will shorten them again.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1493192202-3184-5-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
I'm going to flatten SocketAddress: rename SocketAddress to
SocketAddressLegacy, SocketAddressFlat to SocketAddress, eliminate
SocketAddressLegacy except in external interfaces.
inet_parse() returns a newly allocated InetSocketAddress. Lift the
allocation from inet_parse() into its caller socket_parse() to prepare
for flattening SocketAddress.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1493192202-3184-3-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[Straightforward rebase]
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJZA6QBAAoJEH8JsnLIjy/WLmIQALlu36MvoETPGgzWAFPkaOah
awHSeyMOIIO9Lotc8SpLUzq97avXX8ZzMdQINu95npcThO7rhBCn9DaxHjKsxaGj
/fu3j3WHDCgbV2wQeX13G/LiBQdGMXWphuO0ww5lH7rRauRAEsgAxHdq3/KZmbeG
abV4s/6a8Z4mF+vPglrnODWA3jiOZ92U869xFuqSNYit2eWH/MJm2CbhOwj+UWXw
JtdtGl/a4LOJqcCxIxupWYkbjS1ugSCX7ZUco0UMG5R7yptXoFVaLRXwdJvx7dUk
RDUnvtjit4gAUViuDzXpTXC5pSAALpskMBB8OUsMQEjIDDq6Z77L/WGnF8guUdmX
xgfw+IRu3vur39zfLibxqUdfgsSeDXOfvu9Z91xCY5jNPaTmRYxfKPdw0ua7Lu9+
zzRfbejgxOOFZQGZAS9CNMU52zILpbnDFRxK2YfOBKMj++wS9vGX+qwj1tXnIu4z
35+ijhLvRMspZ3BhtsXGfP559BQ7+vXENKSIznDOgOnW1c75OMpRMl/pLq8CFGXk
UCMUE1S6qlNMnYUD6Ihm/TNjjVSO20RNrYM0DB/pakEV8MD9YhqrF65sXLr4i8Qk
nO+9ut/HBcWjbe7Ur83lbmLCYOB3fPBTxPKYRWBXMZbUngpsA5WbBXtyLlOUqXdN
z2Mc4FxzaPpvtFQSJNsP
=B+5d
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'kwolf/tags/for-upstream' into staging
Block layer patches
# gpg: Signature made Fri 28 Apr 2017 09:20:17 PM BST
# gpg: using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* kwolf/tags/for-upstream: (34 commits)
progress: Show current progress on SIGINFO
iotests: fix exclusion option
iotests: clarify help text
qemu-img: use blk_co_pwrite_zeroes for zero sectors when compressed
qemu-img: improve convert_iteration_sectors()
block: assert no image modification under BDRV_O_INACTIVE
block: fix obvious coding style mistakes in block_int.h
qcow2: Allow discard of final unaligned cluster
block: Add .bdrv_truncate() error messages
block: Add errp to BD.bdrv_truncate()
block: Add errp to b{lk,drv}_truncate()
block/vhdx: Make vhdx_create() always set errp
qemu-img: Document backing options
qemu-img/convert: Move bs_n > 1 && -B check down
qemu-img/convert: Use @opts for one thing only
block: fix alignment calculations in bdrv_co_do_zero_pwritev
block: Do not unref bs->file on error in BD's open
iotests: 109: Filter out "len" of failed jobs
iotests: Fix typo in 026
Issue a deprecation warning if the user specifies the "-hdachs" option.
...
Message-id: 1493411622-5343-1-git-send-email-kwolf@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
A couple more traces that would have made fixing that postcopy
bug a bit easier.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
It is internal to migration, not intended for other users.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
It only uses block/* functions, nothing from migration.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
It really uses block/* stuff, not migration one.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
It is a monitor command, and has nothing migration specific in it.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
load_vmstate() already use error_report, so be consistent. There is
an identical error message in load_vmstate() that ends in a
period. Remove it.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
We have just arrived as:
migration.c: qemu_migrate()
....
s = migrate_init() <- puts it to NULL
....
{tcp,unix}_start_outgoing_migration ->
socket_outgoing_migration
migration_channel_connect()
sets to_dst_file
if tls is enabled, we do another round through
migrate_channel_tls_connect(), but we only set it up if there is no
error. So we don't need the assignation. I am removing it to remove
in the follwing patches the knowledge about MigrationState in that two
files.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Historically the migration data channel has only needed to be
unidirectional. Thus the 'exec:' protocol was requesting an
I/O channel with O_RDONLY on incoming side, and O_WRONLY on
the outgoing side.
This is fine for classic migration, but if you then try to run
TLS over it, this fails because the TLS handshake requires a
bi-directional channel.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Both the ram bitmap and the unsent bitmap are split by RAMBlock.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Fix compilation when DEBUG_POSTCOPY is enabled (thanks Hailiang)
Commit d35ff5e6 ('block: Ignore guest dev permissions during incoming
migration') added blk_resume_after_migration() to the precopy migration
path, but neglected to add it to the duplicated code that is used for
postcopy migration. This means that the guest device doesn't request the
necessary permissions, which ultimately led to failing assertions.
Add the missing blk_resume_after_migration() to the postcopy path.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQIcBAABCAAGBQJY+d69AAoJEPSH7xhYctcj/4oQAIFFEyWaqrL9ve5ySiJgdtcY
zYtiIhZQ+nPuy2i1oDSX+vbMcmkJDDyfO5qLovxyHGkZHniR8HtxNHP+MkZQa07p
DiSIvd51HvcixIouhbGcoUCU63AYxqNL3o5/TyNpUI72nvsgwl3yfOot7PtutE/F
r384j8DrOJ9VwC5GGPg27mJvRPvyfDQWfxDCyMYVw153HTuwVYtgiu/layWojJDV
D2L1KV45ezBuGckZTHt9y6K4J5qz8qHb/dJc+whBBjj4j9T9XOILU9NPDAEuvjFZ
gHbrUyxj7EiApjHcDZoQm9Raez422ALU30yc9Kn7ik7vSqTxk2Ejq6Gz7y9MJrDn
KdMj75OETJNjBL+0T9MmbtWts28+aalpTUXtBpmi3eWQV5Hcox2NF1RP42jtD9Pa
lkrM6jv0nsdNfBPlQ+ZmBTJxysWECcMqy487nrzmPNC8vZfokjXL5be12puho9fh
ziU4gx9C6/k82S+/H6WD/AUtRiXJM7j4oTU2mnjrsSXQC1JNWqODBOFUo9zsDufl
vtcrxfPhSD1DwOInFSIBHf/RylcgTkPCL0rPoJ8npNDly6rHFYJ+oIbsn84Z4uYY
RWvH8xB9wgRlK9L1WdRgOd2q7PaeHQoSSdPOiS9YVEVMVvSW8Es5CRlhcAsw/M/T
1Tl65cNrjETAuZKL3dLH
=EsZ5
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/juanquintela/tags/migration/20170421' into staging
migration/next for 20170421
# gpg: Signature made Fri 21 Apr 2017 11:28:13 BST
# gpg: using RSA key 0xF487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>"
# gpg: aka "Juan Quintela <quintela@trasno.org>"
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03 4B82 F487 EF18 5872 D723
* remotes/juanquintela/tags/migration/20170421: (65 commits)
hmp: info migrate_parameters format tunes
hmp: info migrate_capability format tunes
migration: rename max_size to threshold_size
migration: set current_active_state once
virtio-rng: stop virtqueue while the CPU is stopped
migration: don't close a file descriptor while it can be in use
ram: Remove migration_bitmap_extend()
migration: Disable hotplug/unplug during migration
qdev: Move qdev_unplug() to qdev-monitor.c
qdev: Export qdev_hot_removed
qdev: qdev_hotplug is really a bool
migration: Remove MigrationState parameter from migration_is_idle()
ram: Use RAMBitmap type for coherence
ram: rename last_ram_offset() last_ram_pages()
ram: Use ramblock and page offset instead of absolute offset
ram: Change offset field in PageSearchStatus to page
ram: Remember last_page instead of last_offset
ram: Use page number instead of an address for the bitmap operations
ram: reorganize last_sent_block
ram: ram_discard_range() don't use the mis parameter
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In migration codes (especially in migration_thread()), max_size is used
in many place for the threshold value that we will start to do the final
flush and jump to the next stage to dump the whole rest things to
destination. However its name is confusing to first readers. Let's
rename it to "threshold_size" when proper and add a comment for it. No
functional change is made.
CC: Juan Quintela <quintela@redhat.com>
CC: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
We set it right above this one. No need to set it twice.
CC: Juan Quintela <quintela@redhat.com>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
If we close the QEMUFile descriptor in process_incoming_migration_co()
while it has been stopped by an error, the postcopy_ram_listen_thread()
can try to continue to use it. And as the memory has been freed
it is working with an invalid pointer and crashes.
Fix this by releasing the memory after having managed the error
case (which, in fact, calls exit())
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit@kernel.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
We have disabled memory hotplug, so we don't need to handle
migration_bitamp there.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Only user don't have a MigrationState handly.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
This removes the needto pass also the absolute offset.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
We are moving everything to work on pages, not addresses.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
--
Improve comment
Fix typo
We use an unsigned long for the page number. Notice that our bitmaps
already got that for the index, so we have that limit.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
--
rename page to page_abs everywhere.
fix trace types for pages
We were setting it far away of when we changed it. Now everything is
done inside save_page_header. Once there, reorganize code to pass
RAMState. We also set CONTINUE flag in a single place.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
We change the meaning of start to be the offset from the beggining of
the block.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
The number of dirty pages is output in 'pages' in the command
'info migrate', so add page-size to calculate the number of dirty
pages in bytes.
Signed-off-by: Chao Fan <fanc.fnst@cn.fujitsu.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
It was used as a size in all cases except one.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Remove it from callers and callees.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
We need to call for the migrate_get_current() in more that half of the
uses, so call that inside.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
We can calculate its value, so we don't create a variable for it.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
--
After Peter and Dave review, I dropped the variable and just inlined
the condition.
Fix typo
We receive the file from save_live operations and we don't use it
until 3 or 4 levels of calls down.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Treat it like the rest of ram stats counters. Export its value the
same way. As an added bonus, no more MigrationState used in
migration_bitmap_sync();
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Again, dave was the one reviewing it
It can be recalculated from dirty_pages_rate.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Dave was the one that reviewed it O:-)
This is a ram field that was inside MigrationState. Move it to
RAMState and make it the same that the other ram stats.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
This are the last postcopy fields still at MigrationState. Once there
Move MigrationSrcPageRequest to ram.c and remove MigrationState
parameters where appropiate.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
It was on MigrationState when it is only used inside ram.c for
postcopy. Problem is that we need to access it without being able to
pass it RAMState directly.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Just unfold it. Move ram_bytes_remaining() with the rest of exported
functions.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Somewhere it was passed by reference, just use it from RAMState.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Once there, rename the type to be shorter.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
And then init only things that are not zero by default.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Once there, remove the now unused AccountingInfo struct and var.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Comment why we need bytes and pages
Its value can be calculated by other exported.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
For compatibility, we need to still send a value, but just specify it
and comment the fact.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Once there rename it to its actual meaning, zero_pages.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Renamed start_time to time_last_bitmap_sync(peterx suggestion)
We need to add a parameter to several functions to make this work.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
We create a struct where to put all the ram state
Start with the following fields:
last_seen_block, last_sent_block, last_offset, last_version and
ram_bulk_stage are globals that are really related together.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Fix typo and warnings
So all places are consistent on the naming of a block name parameter.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
It reflects better what it does.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Added doc comments for existing functions comment and rewrite them in
a common style.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Fix Peter Xu comments
Improve postcopy comments as per reviews.
BLOCK_SIZE is (1 << 20), qcow2 cluster size is 65536 by default,
this may cause the qcow2 file size to be bigger after migration.
This patch checks each cluster, using blk_pwrite_zeroes for each
zero cluster.
[Initialize cluster_size to BLOCK_SIZE to prevent a gcc uninitialized
variable compiler warning. In reality we always initialize cluster_size
in a conditional but gcc doesn't know that.
--Stefan]
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Lidong Chen <lidongchen@tencent.com>
Message-id: 1492050868-16200-1-git-send-email-lidongchen@tencent.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Usually guest devices don't like other writers to the same image, so
they use blk_set_perm() to prevent this from happening. In the migration
phase before the VM is actually running, though, they don't have a
problem with writes to the image. On the other hand, storage migration
needs to be able to write to the image in this phase, so the restrictive
blk_set_perm() call of qdev devices breaks it.
This patch flags all BlockBackends with a qdev device as
blk->disable_perm during incoming migration, which means that the
requested permissions are stored in the BlockBackend, but not actually
applied to its root node yet.
Once migration has finished and the VM should be resumed, the
permissions are applied. If they cannot be applied (e.g. because the NBD
server used for block migration hasn't been shut down), resuming the VM
fails.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Tested-by: Kashyap Chamarthy <kchamart@redhat.com>
Postcopy doesn't support migration of RAM shared with another process
yet (we've got a bunch of things to understand).
Check for the case and don't allow postcopy to be enabled.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
This problem affects s390x only if we are running without KVM.
Basically, S390CPU.irqstate is unused if we do not use KVM,
and thus no buffer is allocated.
This causes size=0, first_elem=NULL and n_elems=1 in
vmstate_load_state and vmstate_save_state. And the assert fails.
With this fix we can go back to the old behavior and support
VMS_VBUFFER with size 0 and nullptr.
Signed-off-by: QingFeng Hao <haoqf@linux.vnet.ibm.com>
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Increase bmds->cur_dirty after submit io, so reduce the frequency
involve into blk_drain, and improve the performance obviously
when block migration.
The performance test result of this patch:
During the block dirty save phase, this patch improve guest os IOPS
from 4.0K to 9.5K. and improve the migration speed from
505856 rsec/s to 855756 rsec/s.
Signed-off-by: Lidong Chen <jemmy858585@gmail.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
The tls-creds parameter has a default value of NULL indicating
that TLS should not be used. Setting it to non-NULL enables
use of TLS. Once tls-creds are set to a non-NULL value via the
monitor, it isn't possible to set them back to NULL again, due
to current implementation limitations. The empty string is not
a valid QObject identifier, so this switches to use "" as the
default, indicating that TLS will not be used
The tls-hostname parameter has a default value of NULL indicating
the the hostname from the migrate connection URI should be used.
Again, once tls-hostname is set non-NULL, to override the default
hostname for x509 cert validation, it isn't possible to reset it
back to NULL via the monitor. The empty string is not a valid
hostname, so this switches to use "" as the default, indicating
that the migrate URI hostname should be used.
Using "" as the default for both, also means that the monitor
commands "info migrate_parameters" / "query-migrate-parameters"
will report existance of tls-creds/tls-parameters even when set
to their default values.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
In function cpu_physical_memory_sync_dirty_bitmap, file
include/exec/ram_addr.h:
if (src[idx][offset]) {
unsigned long bits = atomic_xchg(&src[idx][offset], 0);
unsigned long new_dirty;
new_dirty = ~dest[k];
dest[k] |= bits;
new_dirty &= bits;
num_dirty += ctpopl(new_dirty);
}
After these codes executed, only the pages not dirtied in bitmap(dest),
but dirtied in dirty_memory[DIRTY_MEMORY_MIGRATION] will be calculated.
For example:
When ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION] = 0b00001111,
and atomic_rcu_read(&migration_bitmap_rcu)->bmap = 0b00000011,
the new_dirty will be 0b00001100, and this function will return 2 but not
4 which is expected.
the dirty pages in dirty_memory[DIRTY_MEMORY_MIGRATION] are all new,
so these should be calculated also.
Signed-off-by: Chao Fan <fanc.fnst@cn.fujitsu.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Migration is the only code left in the tree that does not react
to bdrv_is_allocated() failures. But as there is no useful way
to react to the failure, and we are merely skipping unallocated
sectors on success, just document that our choice of handling
is intended.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Note: The 'postcopy: Update userfaultfd.h header' is part of
Paolo's header update and will disappear if applied after it.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJYtW9KAAoJEAUWMx68W/3nF/MQAIaMjoCkeVnCZsGCJ2VbJZcZ
Fb5gMNjnHUKwOeDhGuSBP7FGRMalYM2JNRufSiBZnUowCQMXi3Emjad6pGUiPTr0
B+L1czjw0FDdskQpE1U/StdruLiYBJ98oktnjlla00f+E9rylY0cMmrHpmqfRwDn
IXTa4qm77aw47Y2MYku1nce27gjA3JEko6Lg2fB7gTtwYzTi/uRrKa+ilbnTPoEZ
/ZzK8hcUYiV8oDAOtEmKSG3Azo+6ylzDG4r/ldwEecJPhZxeUk39AhDOoU0mx98N
OE8oOk2t/0Bo+mS7iOw9gZ8sr9p5L2myQkmoxxLuAXAcD9sHVlcp0eKi5lLYNmUa
oWnnYo3QeCvqrcZzhvSX0b4rLXoY4GP+qKpQo21eKIPEyq3v6EDhrk10UCTXaiBO
zxHblLgXSrX6VqYcEJGj2oUR/RjH9ouw3hjI5cDy/d/hRmNLCl8lwvPmVmv3tRer
6X1gcZSUs6hY/drs2/v6maJ0CqK/bx6/OBfkiUJUEN4Dg1ldgO2r1v8pBLukvM6c
De2aNRezl821HK487EvRlluUq0nO6L3LkqDTBql4/4Rf4HoTRXxoJ68sB0LBqym5
PwD/C3mQuvlWg8tKJtaHVtS0ESuSCSroaSk1FB648mSs8nJYYFjstc/XovuePqTl
6UT2OQbUdWITILoWSlI5
=PCYv
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/dgilbert/tags/pull-migration-20170228a' into staging
Migration pull
Note: The 'postcopy: Update userfaultfd.h header' is part of
Paolo's header update and will disappear if applied after it.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
# gpg: Signature made Tue 28 Feb 2017 12:38:34 GMT
# gpg: using RSA key 0x0516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A 9FA9 0516 331E BC5B FDE7
* remotes/dgilbert/tags/pull-migration-20170228a: (27 commits)
postcopy: Add extra check for COPY function
postcopy: Add doc about hugepages and postcopy
postcopy: Check for userfault+hugepage feature
postcopy: Update userfaultfd.h header
postcopy: Allow hugepages
postcopy: Send whole huge pages
postcopy: Mask fault addresses to huge page boundary
postcopy: Load huge pages in one go
postcopy: Use temporary for placing zero huge pages
postcopy: Plumb pagesize down into place helpers
postcopy: Record largest page size
postcopy: enhance ram_block_discard_range for hugepages
exec: ram_block_discard_range
postcopy: Chunk discards for hugepages
postcopy: Transmit and compare individual page sizes
postcopy: Transmit ram size summary word
migration: fix use-after-free of to_dst_file
migration: Update docs to discourage version bumps
migration: fix id leak regression
migrate: Introduce a 'dc->vmsd' check to avoid segfault for --only-migratable
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJYtd8UAAoJEH8JsnLIjy/WuK0P/i3yi29JOZTEeUIUc+QXHm4b
iXRj/3uLwLD5IJqfjLeDmOPUfI+w0SAkGTidSlKrS5maypW366ke/+CY8QZtbxiH
qTazKrXNumhvJuJVXQ0L8bAwDC/r8pPlrvLJ3bfBOK6t6fu//M/nUZP/N92UMX87
w1xA+PNq1TNZEKC6VwdY50fsrhOxIR1ZoS2rEKseEBuYqTm2q5Q0Y8EXhbReO0LE
Pb0vAOEmZg6K2Awv4Cg2L8BZlz+tGgUd+3eWV8zzCshHkAc2J/OaxjLCG/KUgXb4
MbdACnIHOUOGqhLfM9wjrVYJAhM/4F1sR0hDm8uUngFWlWsGwTdhH5bRlfat3QQs
E4oNt6ORHiEis0l7pXfwHTDC3ChockArxdJlOvmpRm6EUSBs132IJS3TrlKBCr+R
CBdoC3k1eXC6uHF4KCrsnW2u26D4Ju9V6Yb5h/RqYPJCc+o16BsD31/WOLhH4WTq
M7A9ZbH4jBDHTO3A5zho0x7AGbVGDVMzKssU9MUOkUuPB+yM2KMg/Kr0kUDhrl0k
aOdRfCF+OcsiXzM97U3msv0udbHr0/tUtE+/3tL3kcXjEHoaFECUdV8mU9F3KleZ
5MQP2vNcAZA96zuV1qMhPlDwWBoEwBebDnvM3LuTvkHdbBQZv+6A6XiCw/hXdh6o
zyUn+/2KKGoPiNv/B6PD
=g3ew
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches
# gpg: Signature made Tue 28 Feb 2017 20:35:32 GMT
# gpg: using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream: (46 commits)
block: Add Error parameter to bdrv_append()
block: Add Error parameter to bdrv_set_backing_hd()
block: Assertions for resize permission
block: Assertions for write permissions
block: Pass BdrvChild to bdrv_aligned_preadv/pwritev and copy-on-read
tests: Remove FIXME comments
nbd/server: Use real permissions for NBD exports
migration/block: Use real permissions
hmp: Request permissions in qemu-io
commit: Add filter-node-name to block-commit
mirror: Add filter-node-name to blockdev-mirror
stream: Use real permissions in streaming block job
mirror: Use real permissions in mirror/active commit block job
blockjob: Factor out block_job_remove_all_bdrv()
block: Allow backing file links in change_parent_backing_link()
block: BdrvChildRole.attach/detach() callbacks
block: Fix pending requests check in bdrv_append()
backup: Use real permissions in backup block job
commit: Use real permissions for HMP 'commit'
commit: Use real permissions in commit block job
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Request BLK_PERM_CONSISTENT_READ for the source of block migration, and
handle potential permission errors as good as we can in this place
(which is not very good, but it matches the other failure cases).
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Now that blk_insert_bs() requests the BlockBackend permissions for the
node it attaches to, it can fail. Instead of aborting, pass the errors
to the callers.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
We want every user to be specific about the permissions it needs, so
we'll pass the initial permissions as parameters to blk_new(). A user
only needs to call blk_set_perm() if it wants to change the permissions
after the fact.
The permissions are stored in the BlockBackend and applied whenever a
BlockDriverState should be attached in blk_insert_bs().
This does not include actually choosing the right set of permissions
everywhere yet. Instead, the usual FIXME comment is added to each place
and will be addressed in individual patches.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
We can call this qmp command to do checkpoint outside of qemu.
Xen colo will need this function.
Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wencongyang@gmail.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
We can call this qmp command to start/stop replication outside of qemu.
Like Xen colo need this function.
Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wencongyang@gmail.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
As an extra sanity check, make sure the region we're registering
can perform UFFDIO_COPY; the COPY will fail later but this
gives a cleaner failure.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20170224182844.32452-17-dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
We need extra Linux kernel support (~4.11) to support userfaults
on hugetlbfs; check for them.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20170224182844.32452-15-dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Allow huge pages in postcopy.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20170224182844.32452-13-dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
The RAM save code uses ram_save_host_page to send whole
host pages at a time; change this to use the host page size associated
with the RAM Block which may be a huge page.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20170224182844.32452-12-dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Currently the fault address received by userfault is rounded to
the host page boundary and a host page is requested from the source.
Use the current RAMBlock page size instead of the general host page
size so that for RAMBlocks backed by huge pages we request the whole
huge page.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20170224182844.32452-11-dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
The existing postcopy RAM load loop already ensures that it
glues together whole host-pages from the target page size chunks sent
over the wire. Modify the definition of host page that it uses
to be the RAM block page size and thus be huge pages where appropriate.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20170224182844.32452-10-dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
The kernel can't do UFFDIO_ZEROPAGE for huge pages, so we have
to allocate a temporary (always zero) page and use UFFDIO_COPYPAGE
on it.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20170224182844.32452-9-dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Now we deal with normal size pages and huge pages we need
to tell the place handlers the size we're dealing with
and make sure the temporary page is large enough.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20170224182844.32452-8-dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Record the largest page size in use; we'll need it soon for allocating
temporary buffers.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20170224182844.32452-7-dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Create ram_block_discard_range in exec.c to replace
postcopy_ram_discard_range and most of ram_discard_range.
Those two routines are a bit of a weird combination, and
ram_discard_range is about to get more complex for hugepages.
It's OS dependent code (so shouldn't be in migration/ram.c) but
it needs quite a bit of the innards of RAMBlock so doesn't belong in
the os*.c.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20170224182844.32452-5-dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
At the start of the postcopy phase, partially sent huge pages
must be discarded. The code for dealing with host page sizes larger
than the target page size can be reused for this case.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20170224182844.32452-4-dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
When using postcopy with hugepages, we require the source
and destination page sizes for any RAMBlock to match; note
that different RAMBlocks in the same VM can have different
page sizes.
Transmit them as part of the RAM information header and
fail if there's a difference.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20170224182844.32452-3-dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Replace the host page-size in the 'advise' command by a pagesize
summary bitmap; if the VM is just using normal RAM then
this will be exactly the same as before, however if they're using
huge pages they'll be different, and thus:
a) Migration from/to old qemu's that don't understand huge pages
will fail early.
b) Migrations with different size RAMBlocks will also fail early.
This catches it very early; earlier than the detailed per-block
check in the next patch.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20170224182844.32452-2-dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
hmp_savevm calls qemu_savevm_state(f), which sets to_dst_file=f in
global migration state. Then hmp_savevm closes f (g_free called).
Next access to to_dst_file in migration state (for example,
qmp_migrate_set_speed) will use it after it was freed.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20170225193155.447462-5-vsementsov@virtuozzo.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
This leak was introduced in commit
581f08bac2.
(it stands out quickly with ASAN once the rest of the leaks are also
removed from make check with this series)
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Juan Quintela <quintela@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20170221141451.28305-31-marcandre.lureau@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Commit a3a3d8c7 introduced a segfault bug while checking for
'dc->vmsd->unmigratable' which caused QEMU to crash when trying to add
devices which do no set their 'dc->vmsd' yet while initialization.
Place a 'dc->vmsd' check prior to it so that we do not segfault for
such devices.
NOTE: This doesn't compromise the functioning of --only-migratable
option as all the unmigratable devices do set their 'dc->vmsd'.
Introduce a new function check_migratable() and move the
only_migratable check inside it, also use stubs to avoid user-mode qemu
build failures.
Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
Message-Id: <1487009088-23891-1-git-send-email-ashijeetacharya@gmail.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Make VMS_ARRAY_OF_POINTER cope with null pointers. Previously the
reward for trying to migrate an array with some null pointers in it was
an illegal memory access, that is a swift and painless death of the
process. Let's make vmstate cope with this scenario.
The general approach is, when we encounter a null pointer (element),
instead of following the pointer to save/load the data behind it, we
save/load a placeholder. This way we can detect if we expected a null
pointer at the load side but not null data was saved instead.
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: Guenther Hutzl <hutzl@linux.vnet.ibm.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20170222160119.52771-4-pasic@linux.vnet.ibm.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Currently vmstate_base_addr does several things: it pinpoints the field
within the struct, possibly allocates memory and possibly does the first
pointer dereference. Obviously allocation is needed only for load.
Let us split up the functionality in vmstate_base_addr and move the
address manipulations (that is everything but the allocation logic) to
load and save so it becomes more obvious what is actually going on. Like
this all the address calculations (and the handling of the flags
controlling these) is in one place and the sequence is more obvious.
The newly introduced function vmstate_handle_alloc also fixes the
allocation for the unused VMS_VBUFFER|VMS_MULTIPLY|VMS_ALLOC scenario
and is substantially simpler than the original vmstate_base_addr.
In load and save some asserts are added so it's easier to debug
situations where we would end up with a null pointer dereference.
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20170222160119.52771-3-pasic@linux.vnet.ibm.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
The vmstate_(load|save)_state start out with an a void *opaque pointing
to some struct, and manipulate one or more elements of one field within
that struct.
First the field within the struct is pinpointed as opaque + offset, then
if this is a pointer the pointer is dereferenced to obtain a pointer to
the first element of the vmstate field. Pointers to further elements if
any are calculated as first_element + i * element_size (where i is the
zero based index of the element in question).
Currently base_addr and addr is used as a variable name for the pointer
to the first element and the pointer to the current element being
processed. This is suboptimal because base_addr is somewhat
counter-intuitive (because obtained as base + offset) and both base_addr
and addr not very descriptive (that we have a pointer should be clear
from the fact that it is declared as a pointer).
Let make things easier to understand by renaming base_addr to first_elem
and addr to curr_elem. This has the additional benefit of harmonizing
with other names within the scope (n_elems, vmstate_n_elems).
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20170222160119.52771-2-pasic@linux.vnet.ibm.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Using QMP, the error message of 'migrate_set_downtime' was displaying
the values in milliseconds, being misleading with the command that
accepts the value in seconds:
{ "execute": "migrate_set_downtime", "arguments": {"value": 3000}}
{"error": {"class": "GenericError", "desc": "Parameter 'downtime_limit'
expects an integer in the range of 0 to 2000000 milliseconds"}}
This message is also seen in HMP when trying to set the same
parameter:
(qemu) migrate_set_parameter downtime-limit 3000000
Parameter 'downtime_limit' expects an integer in the range of 0 to
2000000 milliseconds
To allow for a proper error message when using QMP, a validation
of the user input was added in 'qmp_migrate_set_downtime'.
Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Message-Id: <20170222151729.5812-1-danielhb@linux.vnet.ibm.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
VMSTATE_WITH_TMP is for handling structures where some calculation
or rearrangement of the data needs to be performed before the data
hits the wire.
For example, where the value on the wire is an offset from a
non-migrated base, but the data in the structure is the actual pointer.
To use it, a temporary type is created and a vmsd used on that type.
The first element of the type must be 'parent' a pointer back to the
type of the main structure. VMSTATE_WITH_TMP takes care of allocating
and freeing the temporary before running the child vmsd.
The post_load/pre_save on the child vmsd can copy things from the parent
to the temporary using the parent pointer and do any other calculations
needed; it can then use normal VMSD entries to do the actual data
storage without having to fiddle around with qemu_get_*/qemu_put_*
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20170203160651.19917-3-dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
We should not do failover work while the main thread is loading
VM's state. Otherwise the consistent of VM's memory and
device state will be broken.
We will restart the loading process after jump over the stage,
The new failover status 'RELAUNCH' will help to record if we
need to restart the process.
Cc: Eric Blake <eblake@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <1484657864-21708-4-git-send-email-zhang.zhanghailiang@huawei.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Added a missing '(Since 2.9)'
If the net connection between primary host and secondary host breaks
while COLO/COLO incoming threads are doing read() or write().
It will block until connection is timeout, and the failover process
will be blocked because of it.
So it is necessary to shutdown all the socket fds used by COLO
to avoid this situation. Besides, we should close the corresponding
file descriptors after failvoer BH shutdown them,
Or there will be an error.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <1484657864-21708-3-git-send-email-zhang.zhanghailiang@huawei.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
If we set checkpoint-delay through command 'migrate-set-parameters',
It will not take effect until we finish last sleep chekpoint-delay,
That's will be offensive espeically when we want to change its value
from an extreme big one to a proper value.
Fix it by using timer to realize checkpoint-delay.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Message-Id: <1484657864-21708-2-git-send-email-zhang.zhanghailiang@huawei.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
The member VMStateField.start is used for two things, partial data
migration for VBUFFER data (basically provide migration for a
sub-buffer) and for locating next in QTAILQ.
The implementation of the VBUFFER feature is broken when VMSTATE_ALLOC
is used. This however goes unnoticed because actually partial migration
for VBUFFER is not used at all.
Let's consolidate the usage of VMStateField.start by removing support
for partial migration for VBUFFER.
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Message-Id: <20170203175217.45562-1-pasic@linux.vnet.ibm.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Migration of a "none" machine with no RAM crashes abruptly as
bitmap_new() fails and thus aborts. Instead place zero RAM checks at
appropriate places to skip migration of RAM in this case and complete
migration successfully for devices only.
Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
Message-Id: <1486564125-31366-1-git-send-email-ashijeetacharya@gmail.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
After the start of postcopy migration there are some non-dirty pages which have
already been migrated. These pages are no longer needed on the source vm so that
we can free them and it doen't hurt to complete the migration.
Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Message-Id: <20170203152321.19739-4-pbutsykin@virtuozzo.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
This feature frees the migrated memory on the source during postcopy-ram
migration. In the second step of postcopy-ram migration when the source vm
is put on pause we can free unnecessary memory. It will allow, in particular,
to start relaxing the memory stress on the source host in a load-balancing
scenario.
Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Message-Id: <20170203152321.19739-3-pbutsykin@virtuozzo.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Manually merged in Pavel's 'migration: madvise error_report fixup!'
Cosmetic patch. The use of ms variable instead of migrate_get_current()
looks nicer, especially when there reuse.
Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Message-Id: <20170203152321.19739-2-pbutsykin@virtuozzo.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
An early postcopy failure can be recovered from as long as we know
we haven't sent the command to run the destination.
We have to undo the bdrv_inactivate_all by calling
bdrv_invalidate_cache_all
Note that I'm not using ms->block_inactive because once we've
sent the postcopy package we dont want anything else to try
and recover the block storage on the source; the destination
might have started writing to it.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20170202155909.31784-3-dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
On a destination host with no userfault support an incoming
postcopy would cause the state to enter ADVISE before
it realised there was no support, and because it was in ADVISE
state it would perform a cleanup at the end. Since there
was no support the cleanup function should be unreachable,
but ends up being called and asserting.
Reset the state when we realise we have no support, thus the
cleanup doesn't happen.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20170202155909.31784-2-dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
The qdev id of a device can be huge if it's on the end of a chain
of bridges; in reality such chains shouldn't occur but they can
be made to by chaining PCIe bridges together.
The migration format has a number of 256 character long format
limits; check we don't hit them (we already use pstrcat/cpy but
that just protects us from buffer overruns, we fairly quickly
hit an assert).
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20170202125956.21942-3-dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
I'll be adding an error to it in a subsequent patch.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20170202125956.21942-2-dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <1485207141-1941-3-git-send-email-quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
There are a number of unused trace events that
scripts/cleanup-trace-events.pl finds. The "hw/vfio/pci-quirks.c"
filename was typoed and "qapi/qapi-visit-core.c" was missing the qapi/
directory prefix.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20170126171613.1399-3-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This patch introduces save_vmstate function to allow saving and loading
vmstates from the replay module.
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20170124071741.4572.13714.stgit@PASHA-ISP>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add some tracing to vmstate_subsection_save and vmstate_save_state
to help in debugging when you're not sure if a conditional piece
of data is being saved.
In vmstate_subsection_save I renamed the inner vmsd to avoid the aliasing
and be able to print both names.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20161212125838.14425-1-dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
So we can remove DPRINTF() macro
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-Id: <1485207141-1941-2-git-send-email-quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Fixed up 'remained/remaining' as requested by Eric
Change the name of live migration thread from 'migration'
to 'live_migration' to identify it clearly. 'migration'
is a generic word and kernel also has tasks for process
migration with the name 'migration/cpu#'.
Signed-off-by: Pankaj Gupta <pagupta@redhat.com>
Message-Id: <1485178976-15225-1-git-send-email-pagupta@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
commit fe904ea824 fixed a case
which migration aborted QEMU because it didn't regain the control
of images while some errors happened.
Actually, there are another two cases can trigger the same error reports:
" bdrv_co_do_pwritev: Assertion `!(bs->open_flags & 0x0800)' failed",
Case 1, codes path:
migration_thread()
migration_completion()
bdrv_inactivate_all() ----------------> inactivate images
qemu_savevm_state_complete_precopy()
socket_writev_buffer() --------> error because destination fails
qemu_fflush() ----------------> set error on migration stream
-> qmp_migrate_cancel() ----------------> user cancelled migration concurrently
-> migrate_set_state() ------------------> set migrate CANCELLIN
migration_completion() -----------------> go on to fail_invalidate
if (s->state == MIGRATION_STATUS_ACTIVE) -> Jump this branch
Case 2, codes path:
migration_thread()
migration_completion()
bdrv_inactivate_all() ----------------> inactivate images
migreation_completion() finished
-> qmp_migrate_cancel() ---------------> user cancelled migration concurrently
qemu_mutex_lock_iothread();
qemu_bh_schedule (s->cleanup_bh);
As we can see from above, qmp_migrate_cancel can slip in whenever
migration_thread does not hold the global lock. If this happens after
bdrv_inactive_all() been called, the above error reports will appear.
To prevent this, we can call bdrv_invalidate_cache_all() in qmp_migrate_cancel()
directly if we find images become inactive.
Besides, bdrv_invalidate_cache_all() in migration_completion() doesn't have the
protection of big lock, fix it by add the missing qemu_mutex_lock_iothread();
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Message-Id: <1485244792-11248-1-git-send-email-zhang.zhanghailiang@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
migrate_add_blocker should rightly fail if the '--only-migratable'
option was specified and the device in use should not be able to
perform the action which results in an unmigratable VM.
Make migrate_add_blocker return -EACCES in this case.
Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
Message-Id: <1484566314-3987-6-git-send-email-ashijeetacharya@gmail.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
If a migration is already in progress and somebody attempts
to add a migration blocker, this should rightly fail.
Add an errp parameter and a retcode return value to migrate_add_blocker.
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
Message-Id: <1484566314-3987-5-git-send-email-ashijeetacharya@gmail.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Acked-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Merged with recent 'Allow invtsc migration' change
Added error_report where version_ids do not match in vmstate_load_state.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Jianjun Duan <duanj@linux.vnet.ibm.com>
Message-Id: <1484852453-12728-5-git-send-email-duanj@linux.vnet.ibm.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Currently we cannot directly transfer a QTAILQ instance because of the
limitation in the migration code. Here we introduce an approach to
transfer such structures. We created VMStateInfo vmstate_info_qtailq
for QTAILQ. Similar VMStateInfo can be created for other data structures
such as list.
When a QTAILQ is migrated from source to target, it is appended to the
corresponding QTAILQ structure, which is assumed to have been properly
initialized.
This approach will be used to transfer pending_events and ccs_list in spapr
state.
We also create some macros in qemu/queue.h to access a QTAILQ using pointer
arithmetic. This ensures that we do not depend on the implementation
details about QTAILQ in the migration code.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Jianjun Duan <duanj@linux.vnet.ibm.com>
Message-Id: <1484852453-12728-3-git-send-email-duanj@linux.vnet.ibm.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Current migration code cannot handle some data structures such as
QTAILQ in qemu/queue.h. Here we extend the signatures of put/get
in VMStateInfo so that customized handling is supported. put now
will return int type.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Jianjun Duan <duanj@linux.vnet.ibm.com>
Message-Id: <1484852453-12728-2-git-send-email-duanj@linux.vnet.ibm.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>