Commit Graph

1224 Commits

Author SHA1 Message Date
Keqian Zhu
fb6135807f migration: Count new_dirty instead of real_dirty
real_dirty_pages becomes equal to total ram size after dirty log sync
in ram_init_bitmaps, the reason is that the bitmap of ramblock is
initialized to be all set, so old path counts them as "real dirty" at
beginning.

This causes wrong dirty rate and false positive throttling.

Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Message-Id: <20200622032037.31112-1-zhukeqian1@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-07-03 16:23:05 +01:00
Dr. David Alan Gilbert
617a32f529 migration: postcopy take proper error return
This function returns a boolean success and we're returning -1;
lets just use the 'out' error path.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Fixes: 58b7c17e22 ("Disable mlock around incoming postcopy")
Buglink: https://bugs.launchpad.net/qemu/+bug/1885720
Message-Id: <20200701093557.130096-1-dgilbert@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-07-03 16:23:05 +01:00
Markus Armbruster
9cde9caa04 migration/rdma: Plug memory leaks in qemu_rdma_registration_stop()
qemu_rdma_registration_stop() uses the ERROR() macro to create, report
to stderr, and store an Error object.  The stored Error object is
never used, and its memory is leaked.

Even where ERROR() doesn't leak, it is ill-advised.  The whole point
of passing an Error to the caller is letting the caller handle the
error.  Error handling may report to stderr, to somewhere else, or not
at all.  Also reporting in the callee mixes up concerns that should be
kept separate.  Since I don't know what reporting to stderr is
supposed to accomplish, I'm not touching it.

Commit 2a1bc8bde7 "migration/rdma: rdma_accept_incoming_migration fix
error handling" plugged the same leak in
rdma_accept_incoming_migration().

Plug the memory leak the same way: keep the report part, delete the
store part.

The report part uses fprintf().  If it's truly an error, it should use
error_report() instead.  But I don't know, so I leave it alone, just
like commit 2a1bc8bde7 did.

Fixes: 2da776db48
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Juan Quintela <quintela@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20200630090351.1247703-27-armbru@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-07-02 11:54:48 +02:00
Eric Blake
f9919116b8 osdep: Make MIN/MAX evaluate arguments only once
I'm not aware of any immediate bugs in qemu where a second runtime
evaluation of the arguments to MIN() or MAX() causes a problem, but
proactively preventing such abuse is easier than falling prey to an
unintended case down the road.  At any rate, here's the conversation
that sparked the current patch:
https://lists.gnu.org/archive/html/qemu-devel/2018-12/msg05718.html

Update the MIN/MAX macros to only evaluate their argument once at
runtime; this uses typeof(1 ? (a) : (b)) to ensure that we are
promoting the temporaries to the same type as the final comparison (we
have to trigger type promotion, as typeof(bitfield) won't compile; and
we can't use typeof((a) + (b)) or even typeof((a) + 0), as some of our
uses of MAX are on void* pointers where such addition is undefined).

However, we are unable to work around gcc refusing to compile ({}) in
a constant context (such as the array length of a static variable),
even when only used in the dead branch of a __builtin_choose_expr(),
so we have to provide a second macro pair MIN_CONST and MAX_CONST for
use when both arguments are known to be compile-time constants and
where the result must also be usable as a constant; this second form
evaluates arguments multiple times but that doesn't matter for
constants.  By using a void expression as the expansion if a
non-constant is presented to this second form, we can enlist the
compiler to ensure the double evaluation is not attempted on
non-constants.

Alas, as both macros now rely on compiler intrinsics, they are no
longer usable in preprocessor #if conditions; those will just have to
be open-coded or the logic rewritten into #define or runtime 'if'
conditions (but where the compiler dead-code-elimination will probably
still apply).

I tested that both gcc 10.1.1 and clang 10.0.0 produce errors for all
forms of macro mis-use.  As the errors can sometimes be cryptic, I'm
demonstrating the gcc output:

Use of MIN when MIN_CONST is needed:

In file included from /home/eblake/qemu/qemu-img.c:25:
/home/eblake/qemu/include/qemu/osdep.h:249:5: error: braced-group within expression allowed only inside a function
  249 |     ({                                                  \
      |     ^
/home/eblake/qemu/qemu-img.c:92:12: note: in expansion of macro ‘MIN’
   92 | char array[MIN(1, 2)] = "";
      |            ^~~

Use of MIN_CONST when MIN is needed:

/home/eblake/qemu/qemu-img.c: In function ‘is_allocated_sectors’:
/home/eblake/qemu/qemu-img.c:1225:15: error: void value not ignored as it ought to be
 1225 |             i = MIN_CONST(i, n);
      |               ^

Use of MIN in the preprocessor:

In file included from /home/eblake/qemu/accel/tcg/translate-all.c:20:
/home/eblake/qemu/accel/tcg/translate-all.c: In function ‘page_check_range’:
/home/eblake/qemu/include/qemu/osdep.h:249:6: error: token "{" is not valid in preprocessor expressions
  249 |     ({                                                  \
      |      ^

Fix the resulting callsites that used #if or computed a compile-time
constant min or max to use the new macros.  cpu-defs.h is interesting,
as CPU_TLB_DYN_MAX_BITS is sometimes used as a constant and sometimes
dynamic.

It may be worth improving glib's MIN/MAX definitions to be saner, but
that is a task for another day.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200625162602.700741-1-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-26 09:39:39 -04:00
Peter Maydell
eefe34ea4b Migration (and HMP and virtiofs) pull 2020-06-17
Migration:
    HMP/migration and test changes from Mao Zhongyi
    multifd fix from Laurent Vivier
 HMP
    qom-set partial reversion/change from David Hildenbrand
       now you need -j to pass json format, but it's regained the
       old 100M type format.
   Memory leak fix from Pan Nengyuan
 
 Virtiofs
   fchmod seccomp fix from Max Reitz
 
 Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEERfXHG0oMt/uXep+pBRYzHrxb/ecFAl7qYlIACgkQBRYzHrxb
 /efT/A//eSmxz0m4LB3Vz3pmwKQosVlvLNYgLgfs/dthdM8f3YyhpnY/bCyEmgL2
 J/TJevJvzBinVjiGJZkp1nv3wShcf1gpDGKvuhPpStavEajZoxStfGdiGGSvv/sX
 XuzQW8efmphoah8qL7J9xE9eC+WOWDATGDlaYIQv4oDG9Ef/OpBOcoYzdwj90ZhR
 Gl5AqPf3Bn2Xt2TpWK2aIaBfQtRSTu7xGSB/G8BQNzrJOraoALQmCZKvQYbQtlaq
 Guqj7Ad/x7kic5RGG9I5vxy1rQnM0fQvHG57uNvj+Gvi3lLfsCqI77O3gZZmdTmt
 mxjHn3OcIiwFmlptM9lNyWaZlrIniB3Di3K7VobHpvnuXbfkXGTElhD2CPNzErNY
 QDit9I/ZJxHu9kauyoWBuhKKRfn4S9jpmrEiyIe51v3Gr9r2dkfZlFzYyk1wBJi8
 caR7g5QVQFm+3TI5LgnVRisv/FQlLsPXiBaRN2rv6dE+CZIVs/7lbwzwOshpbtxR
 NWv0FOvlIyQhdkZrCo+FIVbRK0vRJNZwpHOQ1ZwdlAbIbKBwyfqitMmHHCAJnpI4
 CqLg9Bao8VYf/mynaTYt6h+azVnByBgSpj2KJlDUstPxKFhv+0SDgVXdHCnbOEya
 6mknqBy4k/9To2g5Y3spJUinf5H4mcIuCWB0jEAioyZ6CjUfifg=
 =KYWs
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/dgilbert/tags/pull-migration-20200617a' into staging

Migration (and HMP and virtiofs) pull 2020-06-17

Migration:
   HMP/migration and test changes from Mao Zhongyi
   multifd fix from Laurent Vivier
HMP
   qom-set partial reversion/change from David Hildenbrand
      now you need -j to pass json format, but it's regained the
      old 100M type format.
  Memory leak fix from Pan Nengyuan

Virtiofs
  fchmod seccomp fix from Max Reitz

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

# gpg: Signature made Wed 17 Jun 2020 19:34:58 BST
# gpg:                using RSA key 45F5C71B4A0CB7FB977A9FA90516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>" [full]
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A  9FA9 0516 331E BC5B FDE7

* remotes/dgilbert/tags/pull-migration-20200617a:
  migration: fix multifd_send_pages() next channel
  docs/xbzrle: update 'cache miss rate' and 'encoding rate' to docs
  monitor/hmp-cmds: improvements for the 'info migrate'
  monitor/hmp-cmds: add 'goto end' to reduce duplicate code.
  monitor/hmp-cmds: delete redundant Error check before invoke hmp_handle_error()
  monitor/hmp-cmds: don't silently output when running 'migrate_set_downtime' fails
  monitor/hmp-cmds: add units for migrate_parameters
  tests/migration: fix unreachable path in stress test
  tests/migration: mem leak fix
  hmp: Make json format optional for qom-set
  qom-hmp-cmds: fix a memleak in hmp_qom_get
  virtiofsd: Whitelist fchmod

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-06-18 15:30:13 +01:00
Wei Wang
9227140217 migration: fix xbzrle encoding rate calculation
It's reported an error of implicit conversion from "unsigned long" to
"double" when compiling with Clang 10. Simply make the encoding rate 0
when the encoded_size is 0.

Fixes: e460a4b1a4
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reported-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200617201309.1640952-3-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-06-18 10:26:02 +01:00
Laurent Vivier
7e89a1401a migration: fix multifd_send_pages() next channel
multifd_send_pages() loops around the available channels,
the next channel to use between two calls to multifd_send_pages() is stored
inside a local static variable, next_channel.

It works well, except if the number of channels decreases between two calls
to multifd_send_pages(). In this case, the loop can try to access the
data of a channel that doesn't exist anymore.

The problem can be triggered if we start a migration with a given number of
channels and then we cancel the migration to restart it with a lower number.
This ends generally with an error like:
qemu-system-ppc64: .../util/qemu-thread-posix.c:77: qemu_mutex_lock_impl: Assertion `mutex->initialized' failed.

This patch fixes the error by capping next_channel with the current number
of channels before using it.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20200617113154.593233-1-lvivier@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-06-17 17:48:39 +01:00
Markus Armbruster
2194abd623 qdev: qdev_create(), qdev_try_create() are now unused, drop
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20200610053247.1583243-31-armbru@redhat.com>
2020-06-15 22:05:28 +02:00
Philippe Mathieu-Daudé
da278d58a0 accel: Move Xen accelerator code under accel/xen/
This code is not related to hardware emulation.
Move it under accel/ with the other hypervisors.

Reviewed-by: Paul Durrant <paul@xen.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200508100222.7112-1-philmd@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-10 12:09:56 -04:00
Lukas Straub
773861274a migration/migration.c: Fix hang in ram_save_host_page
migration_rate_limit will erroneously ratelimit a shutdown socket,
which causes the migration thread to hang in ram_save_host_page
if the socket is shutdown.

Fix this by explicitly testing if the socket has errors or was
shutdown in migration_rate_limit.

Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Message-Id: <e79085bbe2d46dfa007dd41820194d5e2d4fcd80.1590007004.git.lukasstraub2@web.de>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-06-01 18:44:27 +01:00
Lukas Straub
4fa8ed25b8 migration/colo.c: Move colo_notify_compares_event to the right place
If the secondary has to failover during checkpointing, it still is
in the old state (i.e. different state than primary). Thus we can't
expose the primary state until after the checkpoint is sent.

This fixes sporadic connection reset of client connections during
failover.

Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Message-Id: <d4555dd5146a54518c4d9d4efd996b7c745c6687.1589193382.git.lukasstraub2@web.de>
Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-06-01 18:44:27 +01:00
Lukas Straub
92c932de6c migration/colo.c: Relaunch failover even if there was an error
If vmstate_loading is true, secondary_vm_do_failover will set failover
status to FAILOVER_STATUS_RELAUNCH and return success without initiating
failover. However, if there is an error during the vmstate_loading
section, failover isn't relaunched. Instead we then wait for
failover on colo_incoming_sem.

Fix this by relaunching failover even if there was an error. Also,
to make this work properly, set vmstate_loading to false when
returning during the vmstate_loading section.

Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Message-Id: <f60b0a8e2fadaaec792e04819dfc46951842d6ba.1589193382.git.lukasstraub2@web.de>
Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-06-01 18:44:27 +01:00
Lukas Straub
24fa16f8cc migration/colo.c: Flush ram cache only after receiving device state
If we suceed in receiving ram state, but fail receiving the device
state, there will be a mismatch between the two.

Fix this by flushing the ram cache only after the vmstate has been
received.

Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Message-Id: <3289d007d494cb0e2f05b1cf4ae6a78d300fede3.1589193382.git.lukasstraub2@web.de>
Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-06-01 18:44:27 +01:00
Lukas Straub
786d8b8e38 migration/colo.c: Use cpu_synchronize_all_states()
cpu_synchronize_all_pre_loadvm() marks all vcpus as dirty, so the
registers are loaded from CPUState before we continue running
the vm. However if we failover during checkpoint, CPUState is not
initialized and the registers are loaded with garbage. This causes
guest hangs and crashes.

Fix this by using cpu_synchronize_all_states(), which initializes
CPUState from the current cpu registers additionally to marking
the vcpus as dirty.

Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Message-Id: <9675031ce557b73ebd10e7bd20ebbf57f30b177c.1589193382.git.lukasstraub2@web.de>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-06-01 18:44:27 +01:00
Lukas Straub
bb70b66ed7 migration/colo.c: Use event instead of semaphore
If multiple packets miscompare in a short timeframe, the semaphore
value will be increased multiple times. This causes multiple
checkpoints even if one would be sufficient.

Fix this by using a event instead of a semaphore for triggering
checkpoints. Now, checkpoint requests will be ignored until the
checkpoint event is sent to colo-compare (which releases the
miscompared packets).

Benchmark results (iperf3):
Client-to-server tcp:
without patch: ~66 Mbit/s
with patch: ~61 Mbit/s
Server-to-client tcp:
without patch: ~702 Kbit/s
with patch: ~16 Mbit/s

Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Message-Id: <fd601ba1beb524aada54ba66e87ebfc12cf4574b.1589193382.git.lukasstraub2@web.de>
Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-06-01 18:44:27 +01:00
Pan Nengyuan
2f0c285aaa migration/rdma: cleanup rdma context before g_free to avoid memleaks
When error happen in initializing 'rdma_return_path', we should cleanup rdma context
before g_free(rdma) to avoid some memleaks. This patch fix that.

Reported-by: Euler Robot <euler.robot@huawei.com>
Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com>
Message-Id: <20200508100755.7875-3-pannengyuan@huawei.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-06-01 18:44:27 +01:00
Pan Nengyuan
3b59ee7227 migration/rdma: fix potential nullptr access in rdma_start_incoming_migration
'rdma' is NULL when taking the first error branch in rdma_start_incoming_migration.
And it will cause a null pointer access in label 'err'. Fix that.

Fixes: 59c59c67ee
Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com>
Message-Id: <20200508100755.7875-2-pannengyuan@huawei.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
  Note this is CID 1428762
2020-06-01 18:44:27 +01:00
Vladimir Sementsov-Ogievskiy
4ff5cc121b migration: forbid bitmap migration by generated node-name
It actually never worked with libvirt, as auto-generated names are
different on source and destination.

It's unsafe and useless to migrate by auto-generated node-names, so
let's forbid it.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200521220648.3255-7-vsementsov@virtuozzo.com>
[eblake: shorter subject line]
Signed-off-by: Eric Blake <eblake@redhat.com>
2020-05-28 13:15:22 -05:00
Vladimir Sementsov-Ogievskiy
82640edb88 migration: add_bitmaps_to_list: check disk name once
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200521220648.3255-6-vsementsov@virtuozzo.com>
[eblake: shorter subject line]
Signed-off-by: Eric Blake <eblake@redhat.com>
2020-05-28 13:15:22 -05:00
Vladimir Sementsov-Ogievskiy
107cfb7254 migration: fix bitmaps pre-blockdev migration with mirror job
Important thing for bitmap migration is to select destination block
node to obtain the migrated bitmap.

Prepatch, on source we use bdrv_get_device_or_node_name() to identify
the node, and on target we do bdrv_lookup_bs.
bdrv_get_device_or_node_name() returns blk name only for direct
children of blk. So, bitmaps of direct children of blks are migrated by
blk name and others - by node name.

Old libvirt is unprepared to bitmap migration by node-name,
node-names are mostly auto-generated. So actually only migration by blk
name works for it.

Newer libvirt will use new interface (which will be added soon) to
specify node-mapping for bitmaps migration explicitly. Still, let's
improve the current behavior a bit.

Now, consider classic libvirt migrations assisted by mirror block job:
mirror block job inserts filter, so our source is not a direct child of
blk, and bitmaps are migrated by node-names. And this just doesn't work
with auto-generated node names.

Let's fix it by using blk-name even if some implicit filters are
inserted.

Note2: we, of course, can't skip filters and use blk name to migrate
bitmaps in filtered node by blk name for this blk if these filters have
named bitmaps which should be migrated.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1652424
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200521220648.3255-4-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: comment typo fix, shorter subject line]
Signed-off-by: Eric Blake <eblake@redhat.com>
2020-05-28 13:15:22 -05:00
Vladimir Sementsov-Ogievskiy
38908bbc65 migration: refactor init_dirty_bitmap_migration
Split out handling one bs, it is needed for the following commit, which
will handle BlockBackends separately.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200521220648.3255-2-vsementsov@virtuozzo.com>
[eblake: shorter subject line]
Signed-off-by: Eric Blake <eblake@redhat.com>
2020-05-28 13:15:22 -05:00
Pan Nengyuan
13f2cb21e5 migration/multifd: Do error_free after migrate_set_error to avoid memleaks
When error happen in multifd_send_thread, it use error_copy to set migrate error in
multifd_send_terminate_threads(). We should call error_free after it.

Similarly, fix another two places in multifd_recv_thread/multifd_save_cleanup.

The leak stack:
Direct leak of 48 byte(s) in 1 object(s) allocated from:
    #0 0x7f781af07cf0 in calloc (/lib64/libasan.so.5+0xefcf0)
    #1 0x7f781a2ce22d in g_malloc0 (/lib64/libglib-2.0.so.0+0x5322d)
    #2 0x55ee1d075c17 in error_setv /mnt/sdb/backup/qemu/util/error.c:61
    #3 0x55ee1d076464 in error_setg_errno_internal /mnt/sdb/backup/qemu/util/error.c:109
    #4 0x55ee1cef066e in qio_channel_socket_writev /mnt/sdb/backup/qemu/io/channel-socket.c:569
    #5 0x55ee1cee806b in qio_channel_writev /mnt/sdb/backup/qemu/io/channel.c:207
    #6 0x55ee1cee806b in qio_channel_writev_all /mnt/sdb/backup/qemu/io/channel.c:171
    #7 0x55ee1cee8248 in qio_channel_write_all /mnt/sdb/backup/qemu/io/channel.c:257
    #8 0x55ee1ca12c9a in multifd_send_thread /mnt/sdb/backup/qemu/migration/multifd.c:657
    #9 0x55ee1d0607fc in qemu_thread_start /mnt/sdb/backup/qemu/util/qemu-thread-posix.c:519
    #10 0x7f78159ae2dd in start_thread (/lib64/libpthread.so.0+0x82dd)
    #11 0x7f78156df4b2 in __GI___clone (/lib64/libc.so.6+0xfc4b2)

Indirect leak of 52 byte(s) in 1 object(s) allocated from:
    #0 0x7f781af07f28 in __interceptor_realloc (/lib64/libasan.so.5+0xeff28)
    #1 0x7f78156f07d9 in __GI___vasprintf_chk (/lib64/libc.so.6+0x10d7d9)
    #2 0x7f781a30ea6c in g_vasprintf (/lib64/libglib-2.0.so.0+0x93a6c)
    #3 0x7f781a2e7cd0 in g_strdup_vprintf (/lib64/libglib-2.0.so.0+0x6ccd0)
    #4 0x7f781a2e7d8c in g_strdup_printf (/lib64/libglib-2.0.so.0+0x6cd8c)
    #5 0x55ee1d075c86 in error_setv /mnt/sdb/backup/qemu/util/error.c:65
    #6 0x55ee1d076464 in error_setg_errno_internal /mnt/sdb/backup/qemu/util/error.c:109
    #7 0x55ee1cef066e in qio_channel_socket_writev /mnt/sdb/backup/qemu/io/channel-socket.c:569
    #8 0x55ee1cee806b in qio_channel_writev /mnt/sdb/backup/qemu/io/channel.c:207
    #9 0x55ee1cee806b in qio_channel_writev_all /mnt/sdb/backup/qemu/io/channel.c:171
    #10 0x55ee1cee8248 in qio_channel_write_all /mnt/sdb/backup/qemu/io/channel.c:257
    #11 0x55ee1ca12c9a in multifd_send_thread /mnt/sdb/backup/qemu/migration/multifd.c:657
    #12 0x55ee1d0607fc in qemu_thread_start /mnt/sdb/backup/qemu/util/qemu-thread-posix.c:519
    #13 0x7f78159ae2dd in start_thread (/lib64/libpthread.so.0+0x82dd)
    #14 0x7f78156df4b2 in __GI___clone (/lib64/libc.so.6+0xfc4b2)

Reported-by: Euler Robot <euler.robot@huawei.com>
Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com>
Message-Id: <20200506095416.26099-3-pannengyuan@huawei.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-05-07 17:40:24 +01:00
Pan Nengyuan
ad31b8af73 migration/multifd: fix memleaks in multifd_new_send_channel_async
When error happen in multifd_new_send_channel_async, 'sioc' will not be used
to create the multifd_send_thread. Let's free it to avoid a memleak. And also
do error_free after migrate_set_error() to avoid another leak in the same place.

The leak stack:
Direct leak of 2880 byte(s) in 8 object(s) allocated from:
    #0 0x7f20b5118ae8 in __interceptor_malloc (/lib64/libasan.so.5+0xefae8)
    #1 0x7f20b44df1d5 in g_malloc (/lib64/libglib-2.0.so.0+0x531d5)
    #2 0x564133bce18b in object_new_with_type /mnt/sdb/backup/qemu/qom/object.c:683
    #3 0x564133eea950 in qio_channel_socket_new /mnt/sdb/backup/qemu/io/channel-socket.c:56
    #4 0x5641339cfe4f in socket_send_channel_create /mnt/sdb/backup/qemu/migration/socket.c:37
    #5 0x564133a10328 in multifd_save_setup /mnt/sdb/backup/qemu/migration/multifd.c:772
    #6 0x5641339cebed in migrate_fd_connect /mnt/sdb/backup/qemu/migration/migration.c:3530
    #7 0x5641339d15e4 in migration_channel_connect /mnt/sdb/backup/qemu/migration/channel.c:92
    #8 0x5641339cf5b7 in socket_outgoing_migration /mnt/sdb/backup/qemu/migration/socket.c:108

Direct leak of 384 byte(s) in 8 object(s) allocated from:
    #0 0x7f20b5118cf0 in calloc (/lib64/libasan.so.5+0xefcf0)
    #1 0x7f20b44df22d in g_malloc0 (/lib64/libglib-2.0.so.0+0x5322d)
    #2 0x56413406fc17 in error_setv /mnt/sdb/backup/qemu/util/error.c:61
    #3 0x564134070464 in error_setg_errno_internal /mnt/sdb/backup/qemu/util/error.c:109
    #4 0x5641340851be in inet_connect_addr /mnt/sdb/backup/qemu/util/qemu-sockets.c:379
    #5 0x5641340851be in inet_connect_saddr /mnt/sdb/backup/qemu/util/qemu-sockets.c:458
    #6 0x5641340870ab in socket_connect /mnt/sdb/backup/qemu/util/qemu-sockets.c:1105
    #7 0x564133eeaabf in qio_channel_socket_connect_sync /mnt/sdb/backup/qemu/io/channel-socket.c:145
    #8 0x564133eeabf5 in qio_channel_socket_connect_worker /mnt/sdb/backup/qemu/io/channel-socket.c:168

Indirect leak of 360 byte(s) in 8 object(s) allocated from:
    #0 0x7f20b5118ae8 in __interceptor_malloc (/lib64/libasan.so.5+0xefae8)
    #1 0x7f20af901817 in __GI___vasprintf_chk (/lib64/libc.so.6+0x10d817)
    #2 0x7f20b451fa6c in g_vasprintf (/lib64/libglib-2.0.so.0+0x93a6c)
    #3 0x7f20b44f8cd0 in g_strdup_vprintf (/lib64/libglib-2.0.so.0+0x6ccd0)
    #4 0x7f20b44f8d8c in g_strdup_printf (/lib64/libglib-2.0.so.0+0x6cd8c)
    #5 0x56413406fc86 in error_setv /mnt/sdb/backup/qemu/util/error.c:65
    #6 0x564134070464 in error_setg_errno_internal /mnt/sdb/backup/qemu/util/error.c:109
    #7 0x5641340851be in inet_connect_addr /mnt/sdb/backup/qemu/util/qemu-sockets.c:379
    #8 0x5641340851be in inet_connect_saddr /mnt/sdb/backup/qemu/util/qemu-sockets.c:458
    #9 0x5641340870ab in socket_connect /mnt/sdb/backup/qemu/util/qemu-sockets.c:1105
    #10 0x564133eeaabf in qio_channel_socket_connect_sync /mnt/sdb/backup/qemu/io/channel-socket.c:145
    #11 0x564133eeabf5 in qio_channel_socket_connect_worker /mnt/sdb/backup/qemu/io/channel-socket.c:168

Reported-by: Euler Robot <euler.robot@huawei.com>
Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com>
Message-Id: <20200506095416.26099-2-pannengyuan@huawei.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-05-07 17:40:24 +01:00
Wei Wang
e460a4b1a4 migration/xbzrle: add encoding rate
Users may need to check the xbzrle encoding rate to know if the guest
memory is xbzrle encoding-friendly, and dynamically turn off the
encoding if the encoding rate is low.

Signed-off-by: Yi Sun <yi.y.sun@intel.com>
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Message-Id: <1588208375-19556-1-git-send-email-wei.w.wang@intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-05-07 17:40:24 +01:00
Pan Nengyuan
59c59c67ee migration/rdma: fix a memleak on error path in rdma_start_incoming_migration
'rdma->host' is malloced in qemu_rdma_data_init, but forgot to free on the error
path in rdma_start_incoming_migration(), this patch fix that.

The leak stack:
Direct leak of 2 byte(s) in 1 object(s) allocated from:
    #0 0x7fb7add18ae8 in __interceptor_malloc (/lib64/libasan.so.5+0xefae8)
    #1 0x7fb7ad0df1d5 in g_malloc (/lib64/libglib-2.0.so.0+0x531d5)
    #2 0x7fb7ad0f8b32 in g_strdup (/lib64/libglib-2.0.so.0+0x6cb32)
    #3 0x55a0464a0f6f in qemu_rdma_data_init /mnt/sdb/qemu/migration/rdma.c:2647
    #4 0x55a0464b0e76 in rdma_start_incoming_migration /mnt/sdb/qemu/migration/rdma.c:4020
    #5 0x55a0463f898a in qemu_start_incoming_migration /mnt/sdb/qemu/migration/migration.c:365
    #6 0x55a0458c75d3 in qemu_init /mnt/sdb/qemu/softmmu/vl.c:4438
    #7 0x55a046a3d811 in main /mnt/sdb/qemu/softmmu/main.c:48
    #8 0x7fb7a8417872 in __libc_start_main (/lib64/libc.so.6+0x23872)
    #9 0x55a04536b26d in _start (/mnt/sdb/qemu/build/x86_64-softmmu/qemu-system-x86_64+0x286926d)

Reported-by: Euler Robot <euler.robot@huawei.com>
Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com>
Message-Id: <20200420102727.17339-1-pannengyuan@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-05-07 17:40:24 +01:00
David Hildenbrand
ddf35bdf0a migration/ram: Consolidate variable reset after placement in ram_load_postcopy()
Let's consolidate resetting the variables.

Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Juan Quintela <quintela@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20200421085300.7734-10-david@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
  Fixup for context conflicts with 91ba442
2020-05-07 17:40:24 +01:00
Keqian Zhu
cbbf818224 migration/throttle: Add cpu-throttle-tailslow migration parameter
At the tail stage of throttling, the Guest is very sensitive to
CPU percentage while the @cpu-throttle-increment is excessive
usually at tail stage.

If this parameter is true, we will compute the ideal CPU percentage
used by the Guest, which may exactly make the dirty rate match the
dirty rate threshold. Then we will choose a smaller throttle increment
between the one specified by @cpu-throttle-increment and the one
generated by ideal CPU percentage.

Therefore, it is compatible to traditional throttling, meanwhile
the throttle increment won't be excessive at tail stage. This may
make migration time longer, and is disabled by default.

Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Message-Id: <20200413101508.54793-1-zhukeqian1@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-05-07 17:40:24 +01:00
Philippe Mathieu-Daudé
58602676df migration/colo: Add missing error-propagation code
Running the coccinelle script produced:

  $ spatch \
    --macro-file scripts/cocci-macro-file.h --include-headers \
    --sp-file scripts/coccinelle/find-missing-error_propagate.cocci \
    --keep-comments --smpl-spacing --dir .
  HANDLING: ./migration/colo.c
  [[manual check required: error_propagate() might be missing in migrate_set_block_enabled() ./migration/colo.c:439:4]]

Add the missing error_propagate() after review.

Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20200413205250.687-1-f4bug@amsat.org>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-05-07 17:40:24 +01:00
Mao Zhongyi
7ac5529afb migration: move the units of migrate parameters from milliseconds to ms
Signed-off-by: Mao Zhongyi <maozhongyi@cmss.chinamobile.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <474bb6cf67defb8be9de5035c11aee57a680557a.1585641083.git.maozhongyi@cmss.chinamobile.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-05-07 17:40:24 +01:00
Mao Zhongyi
2ee30cf078 migration/migration: improve error reporting for migrate parameters
use QERR_INVALID_PARAMETER_VALUE instead of
"Parameter '%s' expects" for consistency.

Signed-off-by: Mao Zhongyi <maozhongyi@cmss.chinamobile.com>
Message-Id: <4ce71da4a5f98ad6ead0806ec71043473dcb4c07.1585641083.git.maozhongyi@cmss.chinamobile.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-05-07 17:40:24 +01:00
Mao Zhongyi
ed8b2828cc migration: fix bad indentation in error_report()
bad indentation conflicts with CODING_STYLE doc.

Signed-off-by: Mao Zhongyi <maozhongyi@cmss.chinamobile.com>
Message-Id: <09f7529c665cac0c6a5e032ac6fdb6ca701f7e37.1585329482.git.maozhongyi@cmss.chinamobile.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-05-07 17:40:24 +01:00
Peter Maydell
a2261b2754 trivial patches (20200504)
Silent static analyzer warning
 Remove dead assignments
 Support -chardev serial on macOS
 Update MAINTAINERS
 Some cosmetic changes
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEzS913cjjpNwuT1Fz8ww4vT8vvjwFAl6wOI4SHGxhdXJlbnRA
 dml2aWVyLmV1AAoJEPMMOL0/L748p7UQAIFSNN0FrDV+K7i8qqq0X+JrS+dNOHNm
 DSpOf8IaGm/BezzL6XirXBVpFxg9iB5DQVLsjP1kUggO7rbBO0blx5H5eOPhnXZj
 xg60kLN16ty7NZ/WPS1G9jF4nDsjz0ZUtCXb0OXsuGJIOrsmN2r/lxdJwcjHZaqJ
 RzbcCSFXlvL0g7mOakJinMJH5r/nWCiUoEYsikhP10DcvuSBoCnjr+LYV6Ef02G0
 Y5lgKN2G0EAMgWTJaL3gIF27zS8QLDNll+eO+PIU5K4yo75/wRCKr4e3PpErZlf6
 B+hCAAPnXCpDKw+8sK2z+9OZXUGe1hQ8LHNgNNM921C66f+vLLXpIDTAECihM4K4
 0wThYlFDwT4j+PMHFNlzIobGMtb33ui8m40lepMt/YOVFqY4tr8u3MLhHkVDo2+8
 sNuOOWLXAoFOYyRqgTeVJvZvMUFQqtDiftghw1BR55TyIpDWjvLYRqae5CI+MGXs
 6YylZVHGzVjMVptxvivvIQ735Nq8LaKq7N8Cb7uvcbRaCki39BsxXVPZx4p6NdwN
 dMndUOz/y75dNlRMDjK8l/oRFPJa/p1Yz8mZhl0uVOO6JeJhBwYmk+WkQ7g/GHZb
 Rx15HnVWRu6C/Icbw4kqZYyqrgl5lykS8aAWURePdpjzKY77rY1H71FesMhjifRN
 ZGgfUdWI88M4
 =ibgH
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/vivier2/tags/trivial-branch-for-5.1-pull-request' into staging

trivial patches (20200504)

Silent static analyzer warning
Remove dead assignments
Support -chardev serial on macOS
Update MAINTAINERS
Some cosmetic changes

# gpg: Signature made Mon 04 May 2020 16:45:18 BST
# gpg:                using RSA key CD2F75DDC8E3A4DC2E4F5173F30C38BD3F2FBE3C
# gpg:                issuer "laurent@vivier.eu"
# gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>" [full]
# gpg:                 aka "Laurent Vivier <laurent@vivier.eu>" [full]
# gpg:                 aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>" [full]
# Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F  5173 F30C 38BD 3F2F BE3C

* remotes/vivier2/tags/trivial-branch-for-5.1-pull-request:
  hw/timer/pxa2xx_timer: Add assertion to silent static analyzer warning
  hw/timer/stm32f2xx_timer: Remove dead assignment
  hw/gpio/aspeed_gpio: Remove dead assignment
  hw/isa/i82378: Remove dead assignment
  hw/ide/sii3112: Remove dead assignment
  hw/input/adb-kbd: Remove dead assignment
  hw/i2c/pm_smbus: Remove dead assignment
  blockdev: Remove dead assignment
  block: Avoid dead assignment
  Compress lines for immediate return
  chardev: Add macOS to list of OSes that support -chardev serial
  MAINTAINERS: Update Keith Busch's email address
  elf_ops: Don't try to g_mapped_file_unref(NULL)
  hw/mem/pc-dimm: Fix line over 80 characters warning
  hw/mem/pc-dimm: Print slot number on error at pc_dimm_pre_plug()
  MAINTAINERS: Mark the LatticeMico32 target as orphan
  timer/exynos4210_mct: Remove redundant statement in exynos4210_mct_write()
  display/blizzard: use extract16() for fix clang analyzer warning in blizzard_draw_line16_32()
  scsi/esp-pci: add g_assert() for fix clang analyzer warning in esp_pci_io_write()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-05-05 14:03:28 +01:00
Daniel Brodsky
6e8a355de6 lockable: replaced locks with lock guard macros where appropriate
- ran regexp "qemu_mutex_lock\(.*\).*\n.*if" to find targets
- replaced result with QEMU_LOCK_GUARD if all unlocks at function end
- replaced result with WITH_QEMU_LOCK_GUARD if unlock not at end

Signed-off-by: Daniel Brodsky <dnbrdsky@gmail.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-id: 20200404042108.389635-3-dnbrdsky@gmail.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-05-04 16:07:43 +01:00
Simran Singhal
b3ac2b94cd Compress lines for immediate return
Compress two lines into a single line if immediate return statement is found.

It also remove variables progress, val, data, ret and sock
as they are no longer needed.

Remove space between function "mixer_load" and '(' to fix the
checkpatch.pl error:-
ERROR: space prohibited between function name and open parenthesis '('

Done using following coccinelle script:
@@
local idexpression ret;
expression e;
@@

-ret =
+return
     e;
-return ret;

Signed-off-by: Simran Singhal <singhalsimran0@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20200401165314.GA3213@simran-Inspiron-5558>
[lv: in handle_aiocb_write_zeroes_unmap() move "int ret" inside the #ifdef]
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2020-05-04 14:43:22 +02:00
Markus Armbruster
735527e179 migration/colo: Fix qmp_xen_colo_do_checkpoint() error handling
The Error ** argument must be NULL, &error_abort, &error_fatal, or a
pointer to a variable containing NULL.  Passing an argument of the
latter kind twice without clearing it in between is wrong: if the
first call sets an error, it no longer points to NULL for the second
call.

qmp_xen_colo_do_checkpoint() passes @errp first to
replication_do_checkpoint_all(), and then to
colo_notify_filters_event().  If both fail, this will trip the
assertion in error_setv().

Similar code in secondary_vm_do_failover() calls
colo_notify_filters_event() only after replication_do_checkpoint_all()
succeeded.  Do the same here.

Fixes: 0e8818f023
Cc: Zhang Chen <chen.zhang@intel.com>
Cc: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Zhang Chen <chen.zhang@intel.com>
Message-Id: <20200422130719.28225-12-armbru@redhat.com>
2020-04-29 08:01:52 +02:00
Marc-André Lureau
9cbc36497c migration: fix cleanup_bh leak on resume
Since commit 8c6b0356b5 ("util/async:
make bh_aio_poll() O(1)"), migration-test reveals a leak:

QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64
tests/qtest/migration-test  -p /x86_64/migration/postcopy/recovery
tests/qtest/libqtest.c:140: kill_qemu() tried to terminate QEMU
process but encountered exit status 1 (expected 0)

=================================================================
==2082571==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 40 byte(s) in 1 object(s) allocated from:
    #0 0x7f25971dfc58 in __interceptor_malloc (/lib64/libasan.so.5+0x10dc58)
    #1 0x7f2596d08358 in g_malloc (/lib64/libglib-2.0.so.0+0x57358)
    #2 0x560970d006f8 in qemu_bh_new /home/elmarco/src/qemu/util/main-loop.c:532
    #3 0x5609704afa02 in migrate_fd_connect
/home/elmarco/src/qemu/migration/migration.c:3407
    #4 0x5609704b6b6f in migration_channel_connect
/home/elmarco/src/qemu/migration/channel.c:92
    #5 0x5609704b2bfb in socket_outgoing_migration
/home/elmarco/src/qemu/migration/socket.c:108
    #6 0x560970b9bd6c in qio_task_complete /home/elmarco/src/qemu/io/task.c:196
    #7 0x560970b9aa97 in qio_task_thread_result
/home/elmarco/src/qemu/io/task.c:111
    #8 0x7f2596cfee3a  (/lib64/libglib-2.0.so.0+0x4de3a)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20200325184723.2029630-2-marcandre.lureau@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-04-02 14:55:45 -04:00
Mao Zhongyi
7cd75cbdb8 migration: use "" instead of (null) for tls-authz
run:
(qemu) info migrate_parameters
announce-initial: 50 ms
...
announce-max: 550 ms
multifd-compression: none
xbzrle-cache-size: 4194304
max-postcopy-bandwidth: 0
 tls-authz: '(null)'

Migration parameter 'tls-authz' is used to provide the QOM ID
of a QAuthZ subclass instance that provides the access control
check, default is NULL. But the empty string is not a valid
object ID, so use "" instead of the default. Although it will
fail when lookup an object with ID "", it is harmless, just
consistent with tls_creds.

As a bonus, this patch also fixed the bad indentation on the
last line and removed 'has_tls_authz' redundant check in
'hmp_info_migrate_parameters'.

Signed-off-by: Mao Zhongyi <maozhongyi@cmss.chinamobile.com>
Message-Id: <119f539a9f4d198bc3bcced46b8280520d60bc51.1585100802.git.maozhongyi@cmss.chinamobile.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-03-25 12:31:38 +00:00
Vladimir Sementsov-Ogievskiy
b4a1733c5e migration/ram: fix use after free of local_err
local_err is used again in migration_bitmap_sync_precopy() after
precopy_notify(), so we must zero it. Otherwise try to set
non-NULL local_err will crash.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200324153630.11882-6-vsementsov@virtuozzo.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-03-25 12:31:38 +00:00
Vladimir Sementsov-Ogievskiy
27d07fcfa7 migration/colo: fix use after free of local_err
local_err is used again in secondary_vm_do_failover() after
replication_stop_all(), so we must zero it. Otherwise try to set
non-NULL local_err will crash.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200324153630.11882-5-vsementsov@virtuozzo.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-03-25 12:31:38 +00:00
Mao Zhongyi
06b1c6f8b7 xbzrle: update xbzrle doc
Add new parameter description, also:
1. Remove unsociable space.
2. Nit picking: s/two/2 in report

Signed-off-by: Mao Zhongyi <maozhongyi@cmss.chinamobile.com>
Message-Id: <20200320143216.423374-1-maozhongyi@cmss.chinamobile.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-03-25 12:31:38 +00:00
zhanghailiang
19dd408a47 migration: recognize COLO as part of activating process
We will migrate parts of dirty pages backgroud lively during the gap time
of two checkpoints, without this modification, it will not work
because ram_save_iterate() will check it before send RAM_SAVE_FLAG_EOS
at the end of it.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Message-Id: <20200224065414.36524-7-zhang.zhanghailiang@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-03-13 09:36:30 +00:00
zhanghailiang
8af66371ed ram/colo: only record bitmap of dirty pages in COLO stage
It is only need to record bitmap of dirty pages while goes
into COLO stage.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Message-Id: <20200224065414.36524-6-zhang.zhanghailiang@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-03-13 09:36:30 +00:00
zhanghailiang
0393031a16 COLO: Optimize memory back-up process
This patch will reduce the downtime of VM for the initial process,
Previously, we copied all these memory in preparing stage of COLO
while we need to stop VM, which is a time-consuming process.
Here we optimize it by a trick, back-up every page while in migration
process while COLO is enabled, though it affects the speed of the
migration, but it obviously reduce the downtime of back-up all SVM'S
memory in COLO preparing stage.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Message-Id: <20200224065414.36524-5-zhang.zhanghailiang@huawei.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
  minor typo fixes
2020-03-13 09:36:30 +00:00
Keqian Zhu
dc14a47076 migration/throttle: Add throttle-trig-thres migration parameter
Currently, if the bytes_dirty_period is more than the 50% of
bytes_xfer_period, we start or increase throttling.

If we make this percentage higher, then we can tolerate higher
dirty rate during migration, which means less impact on guest.
The side effect of higher percentage is longer migration time.
We can make this parameter configurable to switch between mig-
ration time first or guest performance first.

The default value is 50 and valid range is 1 to 100.

Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Message-Id: <20200224023142.39360-1-zhukeqian1@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-03-13 09:36:30 +00:00
zhanghailiang
f51d0b4178 savevm: Don't call colo_init_ram_cache twice
This helper has been called twice which is wrong.
Left the one where called while get COLO enable message
from source side.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2020-02-28 10:13:54 +01:00
zhanghailiang
6ad8ad38d0 migration/colo: wrap incoming checkpoint process into new helper
Split checkpoint incoming process into a helper.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2020-02-28 10:13:54 +01:00
zhanghailiang
0306dae5ac migration: fix COLO broken caused by a previous commit
This commit "migration: Create migration_is_running()" broke
COLO. Becuase there is a process broken by this commit.

colo_process_checkpoint
 ->colo_do_checkpoint_transaction
   ->migrate_set_block_enabled
     ->qmp_migrate_set_capabilities

It can be fixed by make COLO process as an exception,
Maybe we need a better way to fix it.

Cc: Juan Quintela <quintela@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2020-02-28 10:13:54 +01:00
Stefan Hajnoczi
a152bd0093 migration/block: rename BLOCK_SIZE macro
Both <linux/fs.h> and <sys/mount.h> define BLOCK_SIZE macros.  Avoiding
using that name in block/migration.c.

I noticed this when including <liburing.h> (Linux io_uring) from
"block/aio.h" and compilation failed.  Although patches adding that
include haven't been sent yet, it makes sense to rename the macro now in
case someone else stumbles on it in the meantime.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2020-02-28 09:25:49 +01:00
Pan Nengyuan
26daeba4d6 migration/savevm: release gslist after dump_vmstate_json
'list' forgot to free at the end of dump_vmstate_json_to_file(), although it's called only once, but seems like a clean code.

Fix the leak as follow:
Direct leak of 16 byte(s) in 1 object(s) allocated from:
    #0 0x7fb946abd768 in __interceptor_malloc (/lib64/libasan.so.5+0xef768)
    #1 0x7fb945eca445 in g_malloc (/lib64/libglib-2.0.so.0+0x52445)
    #2 0x7fb945ee2066 in g_slice_alloc (/lib64/libglib-2.0.so.0+0x6a066)
    #3 0x7fb945ee3139 in g_slist_prepend (/lib64/libglib-2.0.so.0+0x6b139)
    #4 0x5585db591581 in object_class_get_list_tramp /mnt/sdb/qemu-new/qemu/qom/object.c:1084
    #5 0x5585db590f66 in object_class_foreach_tramp /mnt/sdb/qemu-new/qemu/qom/object.c:1028
    #6 0x7fb945eb35f7 in g_hash_table_foreach (/lib64/libglib-2.0.so.0+0x3b5f7)
    #7 0x5585db59110c in object_class_foreach /mnt/sdb/qemu-new/qemu/qom/object.c:1038
    #8 0x5585db5916b6 in object_class_get_list /mnt/sdb/qemu-new/qemu/qom/object.c:1092
    #9 0x5585db335ca0 in dump_vmstate_json_to_file /mnt/sdb/qemu-new/qemu/migration/savevm.c:638
    #10 0x5585daa5bcbf in main /mnt/sdb/qemu-new/qemu/vl.c:4420
    #11 0x7fb941204812 in __libc_start_main ../csu/libc-start.c:308
    #12 0x5585da29420d in _start (/mnt/sdb/qemu-new/qemu/build/x86_64-softmmu/qemu-system-x86_64+0x27f020d)

Indirect leak of 7472 byte(s) in 467 object(s) allocated from:
    #0 0x7fb946abd768 in __interceptor_malloc (/lib64/libasan.so.5+0xef768)
    #1 0x7fb945eca445 in g_malloc (/lib64/libglib-2.0.so.0+0x52445)
    #2 0x7fb945ee2066 in g_slice_alloc (/lib64/libglib-2.0.so.0+0x6a066)
    #3 0x7fb945ee3139 in g_slist_prepend (/lib64/libglib-2.0.so.0+0x6b139)
    #4 0x5585db591581 in object_class_get_list_tramp /mnt/sdb/qemu-new/qemu/qom/object.c:1084
    #5 0x5585db590f66 in object_class_foreach_tramp /mnt/sdb/qemu-new/qemu/qom/object.c:1028
    #6 0x7fb945eb35f7 in g_hash_table_foreach (/lib64/libglib-2.0.so.0+0x3b5f7)
    #7 0x5585db59110c in object_class_foreach /mnt/sdb/qemu-new/qemu/qom/object.c:1038
    #8 0x5585db5916b6 in object_class_get_list /mnt/sdb/qemu-new/qemu/qom/object.c:1092
    #9 0x5585db335ca0 in dump_vmstate_json_to_file /mnt/sdb/qemu-new/qemu/migration/savevm.c:638
    #10 0x5585daa5bcbf in main /mnt/sdb/qemu-new/qemu/vl.c:4420
    #11 0x7fb941204812 in __libc_start_main ../csu/libc-start.c:308
    #12 0x5585da29420d in _start (/mnt/sdb/qemu-new/qemu/build/x86_64-softmmu/qemu-system-x86_64+0x27f020d)

Reported-by: Euler Robot <euler.robot@huawei.com>
Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2020-02-28 09:25:49 +01:00
Chen Qun
600fe89d6e migration/vmstate: Remove redundant statement in vmstate_save_state_v()
The "ret" has been assigned in all branches. It didn't need to be
 assigned separately.

Clang static code analyzer show warning:
  migration/vmstate.c:365:17: warning: Value stored to 'ret' is never read
                ret = 0;
                ^     ~

Reported-by: Euler Robot <euler.robot@huawei.com>
Signed-off-by: Chen Qun <kuhn.chenqun@huawei.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2020-02-28 09:25:49 +01:00