Commit Graph

1622 Commits

Author SHA1 Message Date
Daniel P. Berrangé
fbfa6404e5 migration: rename qemu_ftell to qemu_file_total_transferred
The name 'ftell' gives the misleading impression that the QEMUFile
objects are seekable. This is not the case, as in general we just
have an opaque stream. The users of this method are only interested
in the total bytes processed. This switches to a new name that
reflects the intended usage.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
   dgilbert: Wrapped long line
2022-06-22 19:33:36 +01:00
Daniel P. Berrangé
154d87b4ef migration: rename 'pos' field in QEMUFile to 'bytes_processed'
The field name 'pos' gives the misleading impression that the QEMUFile
objects are seekable. This is not the case, as in general we just
have an opaque stream. The users of this method are only interested
in the total bytes processed. This switches to a new name that
reflects the intended usage.

Every QIOChannel backed impl of QEMUFile is currently ignoring the
'pos' field.

The only QEMUFile impl using 'pos' as an offset for I/O is the block
device vmstate. A later patch is introducing a QIOChannel impl for the
vmstate, and to handle this it is tracking a file offset itself
internally to the QIOChannel impl. So when we later eliminate the
QEMUFileOps callbacks later, the 'pos' field will no longer be used
from any I/O read/write methods.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
  dgilbert: Fixed long line
2022-06-22 18:59:38 +01:00
Daniel P. Berrangé
c7fc8d323a migration: rename rate limiting fields in QEMUFile
This renames the following QEMUFile fields

 * bytes_xfer -> rate_limit_used
 * xfer_limit -> rate_limit_max

The intent is to make it clear that 'bytes_xfer' is specifically related
to rate limiting of data and applies to data queued, which need not have
been transferred on the wire yet if a flush hasn't taken place.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-06-22 18:11:21 +01:00
Daniel P. Berrangé
246683c22f migration: remove unreachble RDMA code in save_hook impl
The QEMUFile 'save_hook' callback has a 'size_t size' parameter.

The RDMA impl of this has logic that takes different actions
depending on whether the value is zero or non-zero. It has
commented out logic that would have taken further actions
if the value was negative.

The only place where the 'save_hook' callback is invoked is
the ram_control_save_page() method, which passes 'size'
through from its caller. The only caller of this method is
in turn control_save_page(). This method unconditionally
passes the 'TARGET_PAGE_SIZE' constant for the 'size' parameter.

IOW, the only scenario for 'size' that can execute in the
qemu_rdma_save_page method is 'size > 0'. The remaining code
has been unreachable since RDMA support was first introduced
9 years ago.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-06-22 18:11:21 +01:00
Daniel P. Berrangé
c0e0825c98 migration: switch to use QIOChannelNull for dummy channel
This removes one further custom impl of QEMUFile, in favour of a
QIOChannel based impl.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-06-22 18:11:21 +01:00
Leonardo Bras
1abaec9a1b migration: Change zero_copy_send from migration parameter to migration capability
When originally implemented, zero_copy_send was designed as a Migration
paramenter.

But taking into account how is that supposed to work, and how
the difference between a capability and a parameter, it only makes sense
that zero-copy-send would work better as a capability.

Taking into account how recently the change got merged, it was decided
that it's still time to make it right, and convert zero_copy_send into
a Migration capability.

Signed-off-by: Leonardo Bras <leobras@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
  dgilbert: always define the capability, even on non-Linux but error if
set; avoids build problems with the capability
2022-06-22 18:11:17 +01:00
Juan Quintela
f6f213e4c7 migration: Remove RDMA_UNREGISTRATION_EXAMPLE
Nobody has ever showed up to unregister individual pages, and another
set of patches written by Daniel P. Berrangé <berrange@redhat.com>
just remove qemu_rdma_signal_unregister() function needed here.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-06-22 17:02:37 +01:00
Leonardo Bras
5b1d9bab2d multifd: Implement zero copy write in multifd migration (multifd-zero-copy)
Implement zero copy send on nocomp_send_write(), by making use of QIOChannel
writev + flags & flush interface.

Change multifd_send_sync_main() so flush_zero_copy() can be called
after each iteration in order to make sure all dirty pages are sent before
a new iteration is started. It will also flush at the beginning and at the
end of migration.

Also make it return -1 if flush_zero_copy() fails, in order to cancel
the migration process, and avoid resuming the guest in the target host
without receiving all current RAM.

This will work fine on RAM migration because the RAM pages are not usually freed,
and there is no problem on changing the pages content between writev_zero_copy() and
the actual sending of the buffer, because this change will dirty the page and
cause it to be re-sent on a next iteration anyway.

A lot of locked memory may be needed in order to use multifd migration
with zero-copy enabled, so disabling the feature should be necessary for
low-privileged users trying to perform multifd migrations.

Signed-off-by: Leonardo Bras <leobras@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20220513062836.965425-9-leobras@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-05-16 13:56:24 +01:00
Leonardo Bras
b7dbdd8e76 multifd: Send header packet without flags if zero-copy-send is enabled
Since d48c3a0445 ("multifd: Use a single writev on the send side"),
sending the header packet and the memory pages happens in the same
writev, which can potentially make the migration faster.

Using channel-socket as example, this works well with the default copying
mechanism of sendmsg(), but with zero-copy-send=true, it will cause
the migration to often break.

This happens because the header packet buffer gets reused quite often,
and there is a high chance that by the time the MSG_ZEROCOPY mechanism get
to send the buffer, it has already changed, sending the wrong data and
causing the migration to abort.

It means that, as it is, the buffer for the header packet is not suitable
for sending with MSG_ZEROCOPY.

In order to enable zero copy for multifd, send the header packet on an
individual write(), without any flags, and the remanining pages with a
writev(), as it was happening before. This only changes how a migration
with zero-copy-send=true works, not changing any current behavior for
migrations with zero-copy-send=false.

Signed-off-by: Leonardo Bras <leobras@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20220513062836.965425-8-leobras@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-05-16 13:56:24 +01:00
Leonardo Bras
33d70973a3 multifd: multifd_send_sync_main now returns negative on error
Even though multifd_send_sync_main() currently emits error_reports, it's
callers don't really check it before continuing.

Change multifd_send_sync_main() to return -1 on error and 0 on success.
Also change all it's callers to make use of this change and possibly fail
earlier.

(This change is important to next patch on  multifd zero copy
implementation, to make it sure an error in zero-copy flush does not go
unnoticed.

Signed-off-by: Leonardo Bras <leobras@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20220513062836.965425-7-leobras@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-05-16 13:56:24 +01:00
Leonardo Bras
d2fafb6a68 migration: Add migrate_use_tls() helper
A lot of places check parameters.tls_creds in order to evaluate if TLS is
in use, and sometimes call migrate_get_current() just for that test.

Add new helper function migrate_use_tls() in order to simplify testing
for TLS usage.

Signed-off-by: Leonardo Bras <leobras@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20220513062836.965425-6-leobras@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-05-16 13:56:24 +01:00
Leonardo Bras
abb6295b3a migration: Add zero-copy-send parameter for QMP/HMP for Linux
Add property that allows zero-copy migration of memory pages
on the sending side, and also includes a helper function
migrate_use_zero_copy_send() to check if it's enabled.

No code is introduced to actually do the migration, but it allow
future implementations to enable/disable this feature.

On non-Linux builds this parameter is compiled-out.

Signed-off-by: Leonardo Bras <leobras@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20220513062836.965425-5-leobras@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-05-16 13:56:24 +01:00
Leonardo Bras
b88651cb4d QIOChannel: Add flags on io_writev and introduce io_flush callback
Add flags to io_writev and introduce io_flush as optional callback to
QIOChannelClass, allowing the implementation of zero copy writes by
subclasses.

How to use them:
- Write data using qio_channel_writev*(...,QIO_CHANNEL_WRITE_FLAG_ZERO_COPY),
- Wait write completion with qio_channel_flush().

Notes:
As some zero copy write implementations work asynchronously, it's
recommended to keep the write buffer untouched until the return of
qio_channel_flush(), to avoid the risk of sending an updated buffer
instead of the buffer state during write.

As io_flush callback is optional, if a subclass does not implement it, then:
- io_flush will return 0 without changing anything.

Also, some functions like qio_channel_writev_full_all() were adapted to
receive a flag parameter. That allows shared code between zero copy and
non-zero copy writev, and also an easier implementation on new flags.

Signed-off-by: Leonardo Bras <leobras@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20220513062836.965425-3-leobras@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-05-16 13:56:24 +01:00
Paolo Bonzini
3730a7341f meson, configure: move RDMA options to meson
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-28 08:52:20 +02:00
Dr. David Alan Gilbert
552de79bfd migration: Read state once
The 'status' field for the migration is updated normally using
an atomic operation from the migration thread.
Most readers of it aren't that careful, and in most cases it doesn't
matter.

In query_migrate->fill_source_migration_info the 'state'
is read twice; the first time to decide which state fields to fill in,
and then secondly to copy the state to the status field; that can end up
with a status that's inconsistent; e.g. setting up the fields
for 'setup' and then having an 'active' status.  In that case
libvirt gets upset by the lack of ram info.
The symptom is:
   libvirt.libvirtError: internal error: migration was active, but no RAM info was set

Read the state exactly once in fill_source_migration_info.

This is a possible fix for:
https://bugzilla.redhat.com/show_bug.cgi?id=2074205

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20220413113329.103696-1-dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-04-21 19:36:46 +01:00
Dr. David Alan Gilbert
f912ec5b2d migration: Fix operator type
Clang spotted an & that should have been an &&; fix it.

Reported by: David Binderman / https://gitlab.com/dcb
Fixes: 65dacaa04f ("migration: introduce save_normal_page()")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/963
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20220406102515.96320-1-dgilbert@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-04-21 19:36:46 +01:00
Peter Xu
08401c0426 migration: Allow migrate-recover to run multiple times
Previously migration didn't have an easy way to cleanup the listening
transport, migrate recovery only allows to execute once.  That's done with a
trick flag in postcopy_recover_triggered.

Now the facility is already there.

Drop postcopy_recover_triggered and instead allows a new migrate-recover to
release the previous listener transport.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20220331150857.74406-8-peterx@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-04-21 19:36:46 +01:00
Peter Xu
a39e933962 migration: Move channel setup out of postcopy_try_recover()
We used to use postcopy_try_recover() to replace migration_incoming_setup() to
setup incoming channels.  That's fine for the old world, but in the new world
there can be more than one channels that need setup.  Better move the channel
setup out of it so that postcopy_try_recover() only handles the last phase of
switching to the recovery phase.

To do that in migration_fd_process_incoming(), move the postcopy_try_recover()
call to be after migration_incoming_setup(), which will setup the channels.
While in migration_ioc_process_incoming(), postpone the recover() routine right
before we'll jump into migration_incoming_process().

A side benefit is we don't need to pass in QEMUFile* to postcopy_try_recover()
anymore.  Remove it.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20220331150857.74406-7-peterx@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-04-21 19:36:46 +01:00
Peter Xu
929068ec2f migration: Export ram_load_postcopy()
Will be reused in postcopy fast load thread.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20220331150857.74406-6-peterx@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-04-21 19:36:46 +01:00
Peter Xu
f444eeda71 migration: Move migrate_allow_multifd and helpers into migration.c
This variable, along with its helpers, is used to detect whether multiple
channel will be supported for migration.  In follow up patches, there'll be
other capability that requires multi-channels.  Hence move it outside multifd
specific code and make it public.  Meanwhile rename it from "multifd" to
"multi_channels" to show its real meaning.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20220331150857.74406-5-peterx@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-04-21 19:36:46 +01:00
Peter Xu
ea2faf0c35 migration: Add pss.postcopy_requested status
This boolean flag shows whether the current page during migration is triggered
by postcopy or not.  Then in ram_save_host_page() and deeper stack we'll be
able to have a reference on the priority of this page.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20220331150857.74406-4-peterx@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-04-21 19:36:46 +01:00
Peter Xu
7f692ec79a migration: Drop multifd tls_hostname cache
The hostname is cached N times, N equals to the multifd channels.

Drop that cache because after previous patch we've got s->hostname
being alive for the whole lifecycle of migration procedure.

Cc: Juan Quintela <quintela@redhat.com>
Cc: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20220331150857.74406-3-peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-04-21 19:36:46 +01:00
Peter Xu
83174765da migration: Postpone releasing MigrationState.hostname
We used to release it right after migrate_fd_connect().  That's not good
enough when there're more than one socket pair required, because it'll be
needed to establish TLS connection for the rest channels.

One example is multifd, where we copied over the hostname for each channel
but that's actually not needed.

Keeping the hostname until the cleanup phase of migration.

Cc: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20220331150857.74406-2-peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
  dgilbert: Fixup checkpatch error; don't need to check for NULL
  around g_free
2022-04-21 19:36:46 +01:00
Daniel P. Berrangé
4b2bbca7a0 migration: fix use of TLS PSK credentials with a UNIX socket
The migration TLS code has a check mandating that a hostname be
available when starting a TLS session. This is expected when using
x509 credentials, but is bogus for PSK and anonymous credentials
as neither involve hostname validation.

The TLS crdentials object gained suitable error reporting in the
case of TLS with x509 credentials, so there is no longer any need
for the migration code to do its own (incorrect) validation.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20220310171821.3724080-7-berrange@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-04-21 19:36:46 +01:00
Marc-André Lureau
69242e7e7e Move CPU softfloat unions to cpu-float.h
The types are no longer used in bswap.h since commit
f930224fff ("bswap.h: Remove unused float-access functions"), there
isn't much sense in keeping it there and having a dependency on fpu/.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20220323155743.1585078-29-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-06 14:31:43 +02:00
Marc-André Lureau
8e3b0cbb72 Replace qemu_real_host_page variables with inlined functions
Replace the global variables with inlined helper functions. getpagesize() is very
likely annotated with a "const" function attribute (at least with glibc), and thus
optimization should apply even better.

This avoids the need for a constructor initialization too.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20220323155743.1585078-12-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-06 10:50:38 +02:00
Markus Armbruster
b21e238037 Use g_new() & friends where that makes obvious sense
g_new(T, n) is neater than g_malloc(sizeof(T) * n).  It's also safer,
for two reasons.  One, it catches multiplication overflowing size_t.
Two, it returns T * rather than void *, which lets the compiler catch
more type errors.

This commit only touches allocations with size arguments of the form
sizeof(T).

Patch created mechanically with:

    $ spatch --in-place --sp-file scripts/coccinelle/use-g_new-etc.cocci \
	     --macro-file scripts/cocci-macro-file.h FILES...

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20220315144156.1595462-4-armbru@redhat.com>
Reviewed-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>
2022-03-21 15:44:44 +01:00
Emanuele Giuseppe Esposito
6b573efec8 include/block/snapshot: global state API + assertions
Snapshots run also under the BQL, so they all are
in the global state API. The aiocontext lock that they hold
is currently an overkill and in future could be removed.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20220303151616.325444-23-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-03-04 18:18:25 +01:00
Emanuele Giuseppe Esposito
3b71719462 block: rename bdrv_invalidate_cache_all, blk_invalidate_cache and test_sync_op_invalidate_cache
Following the bdrv_activate renaming, change also the name
of the respective callers.

bdrv_invalidate_cache_all -> bdrv_activate_all
blk_invalidate_cache -> blk_activate
test_sync_op_invalidate_cache -> test_sync_op_activate

No functional change intended.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220209105452.1694545-5-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-03-04 18:14:40 +01:00
Peter Maydell
1862198702 migration: Remove load_state_old and minimum_version_id_old
There are no longer any VMStateDescription structs in the tree which
use the load_state_old support for custom handling of incoming
migration from very old QEMU.  Remove the mechanism entirely.

This includes removing one stray useless setting of
minimum_version_id_old in a VMStateDescription with no load_state_old
function, which crept in after the global weeding-out of them in
commit 17e3134061.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20220215175705.3846411-1-peter.maydell@linaro.org>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Francisco Iglesias <francisco.iglesias@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-03-02 18:20:45 +00:00
Peter Xu
e031149c78 migration: Add migration_incoming_transport_cleanup()
Add a helper to cleanup the transport listener.

When do it, we should also null-ify the cleanup hook and the data, then it's
even safe to call it multiple times.

Move the socket_address_list cleanup altogether, because that's a mirror of the
listener channels and only for the purpose of query-migrate.  Hence when
someone wants to cleanup the listener transport, it should also want to cleanup
the socket list too, always.

No functional change intended.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20220301083925.33483-15-peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-03-02 18:20:45 +00:00
Peter Xu
27dd21b461 migration: postcopy_pause_fault_thread() never fails
Per the title, remove the return code and simplify the callers as the errors
will never be triggered.  No functional change intended.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20220301083925.33483-12-peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-03-02 18:20:45 +00:00
Peter Xu
d5c8f2afe8 migration: Enlarge postcopy recovery to capture !-EIO too
We used to have quite a few places making sure -EIO happened and that's the
only way to trigger postcopy recovery.  That's based on the assumption that
we'll only return -EIO for channel issues.

It'll work in 99.99% cases but logically that won't cover some corner cases.
One example is e.g. ram_block_from_stream() could fail with an interrupted
network, then -EINVAL will be returned instead of -EIO.

I remembered Dave Gilbert pointed that out before, but somehow this is
overlooked.  Neither did I encounter anything outside the -EIO error.

However we'd better touch that up before it triggers a rare VM data loss during
live migrating.

To cover as much those cases as possible, remove the -EIO restriction on
triggering the postcopy recovery, because even if it's not a channel failure,
we can't do anything better than halting QEMU anyway - the corpse of the
process may even be used by a good hand to dig out useful memory regions, or
the admin could simply kill the process later on.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20220301083925.33483-11-peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-03-02 18:20:45 +00:00
Peter Xu
755e8d7cb6 migration: Move static var in ram_block_from_stream() into global
Static variable is very unfriendly to threading of ram_block_from_stream().
Move it into MigrationIncomingState.

Make the incoming state pointer to be passed over to ram_block_from_stream() on
both caller sites.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20220301083925.33483-8-peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-03-02 18:20:45 +00:00
Peter Xu
095c12a4a2 migration: Add postcopy_thread_create()
Postcopy create threads. A common manner is we init a sem and use it to sync
with the thread.  Namely, we have fault_thread_sem and listen_thread_sem and
they're only used for this.

Make it a shared infrastructure so it's easier to create yet another thread.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20220301083925.33483-7-peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-03-02 18:20:45 +00:00
Peter Xu
cfc7dc8abf migration: Dump ramblock and offset too when non-same-page detected
In ram_load_postcopy() we'll try to detect non-same-page case and dump error.
This error is very helpful for debugging.  Adding ramblock & offset into the
error log too.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20220301083925.33483-6-peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
  dgilbert: Fix up long line
2022-03-02 18:20:38 +00:00
Peter Xu
77dadc3f83 migration: Introduce postcopy channels on dest node
Postcopy handles huge pages in a special way that currently we can only have
one "channel" to transfer the page.

It's because when we install pages using UFFDIO_COPY, we need to have the whole
huge page ready, it also means we need to have a temp huge page when trying to
receive the whole content of the page.

Currently all maintainance around this tmp page is global: firstly we'll
allocate a temp huge page, then we maintain its status mostly within
ram_load_postcopy().

To enable multiple channels for postcopy, the first thing we need to do is to
prepare N temp huge pages as caching, one for each channel.

Meanwhile we need to maintain the tmp huge page status per-channel too.

To give some example, some local variables maintained in ram_load_postcopy()
are listed; they are responsible for maintaining temp huge page status:

  - all_zero:     this keeps whether this huge page contains all zeros
  - target_pages: this counts how many target pages have been copied
  - host_page:    this keeps the host ptr for the page to install

Move all these fields to be together with the temp huge pages to form a new
structure called PostcopyTmpPage.  Then for each (future) postcopy channel, we
need one structure to keep the state around.

For vanilla postcopy, obviously there's only one channel.  It contains both
precopy and postcopy pages.

This patch teaches the dest migration node to start realize the possible number
of postcopy channels by introducing the "postcopy_channels" variable.  Its
value is calculated when setup postcopy on dest node (during POSTCOPY_LISTEN
phase).

Vanilla postcopy will have channels=1, but when postcopy-preempt capability is
enabled (in the future), we will boost it to 2 because even during partial
sending of a precopy huge page we still want to preempt it and start sending
the postcopy requested page right away (so we start to keep two temp huge
pages; more if we want to enable multifd).  In this patch there's a TODO marked
for that; so far the channels is always set to 1.

We need to send one "host huge page" on one channel only and we cannot split
them, because otherwise the data upon the same huge page can locate on more
than one channel so we need more complicated logic to manage.  One temp host
huge page for each channel will be enough for us for now.

Postcopy will still always use the index=0 huge page even after this patch.
However it prepares for the latter patches where it can start to use multiple
channels (which needs src intervention, because only src knows which channel we
should use).

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20220301083925.33483-5-peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
  dgilbert: Fixed up long line
2022-03-02 18:19:31 +00:00
Peter Xu
b9a040b935 migration: Tracepoint change in postcopy-run bottom half
Remove the old two tracepoints and they're even near each other:

    trace_loadvm_postcopy_handle_run_cpu_sync()
    trace_loadvm_postcopy_handle_run_vmstart()

Add trace_loadvm_postcopy_handle_run_bh() with a finer granule trace.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20220301083925.33483-4-peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-03-02 18:17:56 +00:00
Peter Xu
c84f976e91 migration: Finer grained tracepoints for POSTCOPY_LISTEN
The enablement of postcopy listening has a few steps, add a few tracepoints to
be there ready for some basic measurements for them.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20220301083925.33483-3-peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-03-02 18:17:56 +00:00
Peter Xu
a7060ba3cc migration: Dump sub-cmd name in loadvm_process_command tp
It'll be easier to read the name rather than index of sub-cmd when debugging.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20220301083925.33483-2-peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-03-02 18:17:56 +00:00
Jack Wang
f736e414ee migration/rdma: set the REUSEADDR option for destination
We hit following error during testing RDMA transport:
in case of migration error, mgmt daemon pick one migration port,
incoming rdma:[::]:8089: RDMA ERROR: Error: could not rdma_bind_addr

Then try another -incoming rdma:[::]:8103, sometime it worked,
sometimes need another try with other ports number.

Set the REUSEADDR option for destination, This allow address could
be reused to avoid rdma_bind_addr error out.

Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Message-Id: <20220208085640.19702-1-jinpu.wang@ionos.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
  dgilbert: Fixed up some tabs
2022-03-02 18:17:50 +00:00
Peter Maydell
922268067f * More Meson conversions (0.59.x now required rather than suggested)
* UMIP support for TCG x86
 * Fix migration crash
 * Restore error output for check-block
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmITXP8UHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroOsdQf/Srx+8BImb+LtRpiKHhn4SiucGSe8
 EhEAPSnblbvIGK9BYfj953svDzlLN2JIADcmOI59QE2xsPEtxLlEmMlvg/JIUMQp
 jk07oxbVXdv4olTyECmO3hj2VbSG7VR3tP9TOuJA5Vi4a+VzYXc6zv1/mp/8rdnl
 pGW0pjBZTXSp2Z/Be9/aGN8IuW+GnQuVZDXWBuEJmz2UzcdPWaOUVDro7IaUXmqp
 eB4XcT0jPR5uKetA1R1cyHCUVd7P0v6UV8SLYj905H1a8sqxDWMiUzX6fKkoN0SJ
 r/y7kCuyNzpxoWRuA2KN6Q5f9kAlMI/j9H3ih0wUfEkauiPtTATAc1+s+Q==
 =sSBY
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini-gitlab/tags/for-upstream' into staging

* More Meson conversions (0.59.x now required rather than suggested)
* UMIP support for TCG x86
* Fix migration crash
* Restore error output for check-block

# gpg: Signature made Mon 21 Feb 2022 09:35:59 GMT
# gpg:                using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg:                issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini-gitlab/tags/for-upstream: (29 commits)
  configure, meson: move CONFIG_IASL to a Meson option
  meson, configure: move ntddscsi API check to meson
  meson: require dynamic linking for VSS support
  qga/vss-win32: require widl/midl, remove pre-built TLB file
  meson: do not make qga/vss-win32/meson.build conditional on C++ presence
  configure, meson: replace VSS SDK checks and options with --enable-vss-sdk
  qga/vss: use standard windows headers location
  qga/vss-win32: use widl if available
  meson: drop --with-win-sdk
  qga/vss-win32: fix midl arguments
  meson: refine check for whether to look for virglrenderer
  configure, meson: move guest-agent, tools to meson
  configure, meson: move smbd options to meson_options.txt
  configure, meson: move coroutine options to meson_options.txt
  configure, meson: move some default-disabled options to meson_options.txt
  meson: define qemu_cflags/qemu_ldflags
  configure, meson: move block layer options to meson_options.txt
  configure, meson: move image format options to meson_options.txt
  configure, meson: cleanup qemu-ga libraries
  configure, meson: move TPM check to meson
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-02-21 17:24:05 +00:00
Peter Maydell
b85ea5fa2f include: Move qemu_madvise() and related #defines to new qemu/madvise.h
The function qemu_madvise() and the QEMU_MADV_* constants associated
with it are used in only 10 files.  Move them out of osdep.h to a new
qemu/madvise.h header that is included where it is needed.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220208200856.3558249-2-peter.maydell@linaro.org
2022-02-21 13:30:20 +00:00
Paolo Bonzini
406523f6b3 configure, meson: move block layer options to meson_options.txt
Unlike image formats, these also require an entry in config-host.h.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-21 10:35:53 +01:00
Peter Xu
476ebf77fe migration: Move temp page setup and cleanup into separate functions
Temp pages will need to grow if we want to have multiple channels for postcopy,
because each channel will need its own temp page to cache huge page data.

Before doing that, cleanup the related code.  No functional change intended.

Since at it, touch up the errno handling a little bit on the setup side.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2022-01-28 15:38:23 +01:00
Peter Xu
cfd66f30fb migration: Simplify unqueue_page()
This patch simplifies unqueue_page() on both sides of it (itself, and caller).

Firstly, due to the fact that right after unqueue_page() returned true, we'll
definitely send a huge page (see ram_save_huge_page() call - it will _never_
exit before finish sending that huge page), so unqueue_page() does not need to
jump in small page size if huge page is enabled on the ramblock.  IOW, it's
destined that only the 1st 4K page will be valid, when unqueue the 2nd+ time
we'll notice the whole huge page has already been sent anyway.  Switching to
operating on huge page reduces a lot of the loops of redundant unqueue_page().

Meanwhile, drop the dirty check.  It's not helpful to call test_bit() every
time to jump over clean pages, as ram_save_host_page() has already done so,
while in a faster way (see commit ba1b7c812c ("migration/ram: Optimize
ram_save_host_page()", 2021-05-13)).  So that's not necessary too.

Drop the two tracepoints along the way - based on above analysis it's very
possible that no one is really using it..

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2022-01-28 15:38:23 +01:00
Peter Xu
a1fe28df75 migration: Add postcopy_has_request()
Add a helper to detect whether postcopy has pending request.

Since at it, cleanup the code a bit, e.g. in unqueue_page() we shouldn't need
to check it again on queue empty because we're the only one (besides cleanup
code, which should never run during this process) that will take a request off
the list, so the request list can only grow but not shrink under the hood.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2022-01-28 15:38:23 +01:00
Peter Xu
2d1c37c603 migration: Enable UFFD_FEATURE_THREAD_ID even without blocktime feat
This patch allows us to read the tid even without blocktime feature enabled.
It's useful when tracing postcopy fault thread on faulted pages to show thread
id too with the address.

Remove the comments - they're merely not helpful at all.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2022-01-28 15:38:23 +01:00
Peter Xu
258f5c9825 migration: No off-by-one for pss->page update in host page size
We used to do off-by-one fixup for pss->page when finished one host huge page
transfer.  That seems to be unnecesary at all.  Drop it.

Cc: Keqian Zhu <zhukeqian1@huawei.com>
Cc: Kunkun Jiang <jiangkunkun@huawei.com>
Cc: Andrey Gruzdev <andrey.gruzdev@virtuozzo.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2022-01-28 15:38:23 +01:00
David Edmondson
ae68066880 migration: Tally pre-copy, downtime and post-copy bytes independently
Provide information on the number of bytes copied in the pre-copy,
downtime and post-copy phases of migration.

Signed-off-by: David Edmondson <david.edmondson@oracle.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2022-01-28 15:38:23 +01:00
David Edmondson
4c2d0f6dca migration: Introduce ram_transferred_add()
Replace direct manipulation of ram_counters.transferred with a
function.

Signed-off-by: David Edmondson <david.edmondson@oracle.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2022-01-28 15:38:23 +01:00
Philippe Mathieu-Daudé
9e7d1223ac migration: Don't return for postcopy_send_discard_bm_ram()
postcopy_send_discard_bm_ram() always return zero. Since it can't
fail, simplify and do not return anything.

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: David Edmondson <david.edmondson@oracle.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2022-01-28 15:38:23 +01:00
Peter Xu
739fcc1b0e migration: Drop return code for disgard ram process
It will just never fail.  Drop those return values where they're constantly
zeros.

A tiny touch-up on the tracepoint so trace_ram_postcopy_send_discard_bitmap()
is called after the logic itself (which sounds more reasonable).

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2022-01-28 15:38:23 +01:00
Peter Xu
f30c2e5ba8 migration: Do chunk page in postcopy_each_ram_send_discard()
Right now we loop ramblocks for twice, the 1st time chunk the dirty bits with
huge page information; the 2nd time we send the discard ranges.  That's not
necessary - we can do them in a single loop.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2022-01-28 15:38:23 +01:00
Peter Xu
e3fbf76021 migration: Drop postcopy_chunk_hostpages()
This function calls three functions:

  - postcopy_discard_send_init(ms, block->idstr);
  - postcopy_chunk_hostpages_pass(ms, block);
  - postcopy_discard_send_finish(ms);

However only the 2nd function call is meaningful.  It's major role is to make
sure dirty bits are applied in host-page-size granule, so there will be no
partial dirty bits set for a whole host page if huge pages are used.

The 1st/3rd call are for latter when we want to send the disgard ranges.
They're mostly no-op here besides some tracepoints (which are misleading!).

Drop them, then we can directly drop postcopy_chunk_hostpages() as a whole
because we can call postcopy_chunk_hostpages_pass() directly.

There're still some nice comments above postcopy_chunk_hostpages() that explain
what it does.  Copy it over to the caller's site.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2022-01-28 15:38:23 +01:00
Peter Xu
dc57d6f2ec migration: Don't return for postcopy_chunk_hostpages()
It always return zero, because it just can't go wrong so far.  Simplify the
code with no functional change.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2022-01-28 15:38:23 +01:00
Peter Xu
53405ffb33 migration: Drop dead code of ram_debug_dump_bitmap()
I planned to add "#ifdef DEBUG_POSTCOPY" around the function too because
otherwise it'll be compiled into qemu binary even if it'll never be used.  Then
I found that maybe it's easier to just drop it for good..

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2022-01-28 15:38:23 +01:00
Xu Zheng
a6d1223b4a migration/ram: clean up unused comment.
Just a removal of an unused comment.
a0a8aa147a did many fixes and removed the parameter named "ms", but forget to remove the corresponding comment in function named "ram_save_host_page".

Signed-off-by: Xu Zheng <xuzheng@cmss.chinamobile.com>
Signed-off-by: Mao Zhongyi <maozhongyi@cmss.chinamobile.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
2022-01-28 15:38:23 +01:00
David Edmondson
9200265838 migration: Report the error returned when save_live_iterate fails
Should qemu_savevm_state_iterate() encounter a failure when calling a
particular save_live_iterate function, report the error code returned
by the function.

Signed-off-by: David Edmondson <david.edmondson@oracle.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2022-01-28 15:38:23 +01:00
Zhang Chen
01ee5e3556 migration/migration.c: Remove the MIGRATION_STATUS_ACTIVE when migration finished
The MIGRATION_STATUS_ACTIVE indicates that migration is running.
Remove it to be handled by the default operation,
It should be part of the unknown ending states.

Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2022-01-28 15:38:23 +01:00
Zhang Chen
eeeb48ee33 migration/migration.c: Avoid COLO boot in postcopy migration
COLO dose not support postcopy migration and remove the Fixme.

Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2022-01-28 15:38:23 +01:00
Zhang Chen
444252b96a migration/migration.c: Add missed default error handler for migration state
In the migration_completion() no other status is expected, for
example MIGRATION_STATUS_CANCELLING, MIGRATION_STATUS_CANCELLED, etc.

Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2022-01-28 15:38:23 +01:00
Juan Quintela
8c0ec0b2b0 multifd: Rename pages_used to normal_pages
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-01-28 15:38:23 +01:00
Juan Quintela
faf60935df multifd: recv side only needs the RAMBlock host address
So we can remove the MultiFDPages.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-01-28 15:38:23 +01:00
Juan Quintela
cf2d4aa8a2 multifd: Use normal pages array on the recv side
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

---

Rename num_normal_pages to total_normal_pages (peter)
2022-01-28 15:38:23 +01:00
Juan Quintela
815956f039 multifd: Use normal pages array on the send side
We are only sending normal pages through multifd channels.
Later on this series, we are going to also send zero pages.
We are going to detect if a page is zero or non zero in the multifd
channel thread, not on the main thread.

So we receive an array of pages page->offset[N]

And we will end with:

p->normal[N - zero_pages]
p->zero[zero_pages].

In this patch, we just copy all the pages in offset to normal.

for (i = 0; i < pages->num; i++) {
    p->narmal[p->normal_num] = pages->offset[i];
    p->normal_num++:
}

Later in the series this becomes:

for (i = 0; i < pages->num; i++) {
    if (buffer_is_zero(page->offset[i])) {
        p->zerol[p->zero_num] = pages->offset[i];
        p->zero_num++:
    } else {
        p->narmal[p->normal_num] = pages->offset[i];
        p->normal_num++:
    }
}

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

---

Improving comment (dave)
Renaming num_normal_pages to total_normal_pages (peter)
2022-01-28 15:38:23 +01:00
Juan Quintela
c27779a215 multifd: Unfold "used" variable by its value
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-01-28 15:38:23 +01:00
Juan Quintela
d48c3a0445 multifd: Use a single writev on the send side
Until now, we wrote the packet header with write(), and the rest of the
pages with writev().  Just increase the size of the iovec and do a
single writev().

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-01-28 15:38:23 +01:00
Juan Quintela
468fcb5dd0 multifd: Remove send_write() method
Everything use now iov's.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-01-28 15:38:23 +01:00
Juan Quintela
0a818b89eb multifd: Make zstd use iov's
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-01-28 15:38:23 +01:00
Juan Quintela
48a4a44c1c multifd: Make zlib use iov's
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-01-28 15:38:23 +01:00
Juan Quintela
226468ba3d multifd: Move iov from pages to params
This will allow us to reduce the number of system calls on the next patch.

Signed-off-by: Juan Quintela <quintela@redhat.com>
2022-01-28 15:38:23 +01:00
Juan Quintela
fc6705229c multifd: Use proper maximum compression values
It happens that there are functions to calculate the worst possible
compression size for a packet.  Use them.

Suggested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-01-28 15:38:23 +01:00
Juan Quintela
47fe16ff66 migration: Move ram_release_pages() call to save_zero_page_to_file()
We always need to call it when we find a zero page, so put it in a
single place.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2022-01-28 15:38:23 +01:00
Juan Quintela
e7f2e190e5 migration: simplify do_compress_ram_page
The goto is not needed at all.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-01-28 15:38:23 +01:00
Juan Quintela
20d549cb0b migration: Remove masking for compression
Remove the mask in the call to ram_release_pages().  Nothing else does
it, and if the offset has that bits set, we have a lot of trouble.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-01-28 15:38:23 +01:00
Juan Quintela
0189c72291 migration: ram_release_pages() always receive 1 page as argument
Remove the pages argument. And s/pages/page/

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>

---

- Use 1LL instead of casts (philmd)
- Change the whole 1ULL for TARGET_PAGE_SIZE
2022-01-28 15:38:22 +01:00
Juan Quintela
05931ec561 migration: We only need last_stage in two places
We only need last_stage in two places and we are passing it all
around.  Just add a field to RAMState that passes it.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>

---

Repeat subject (philmd suggestion)
2022-01-28 15:38:22 +01:00
Juan Quintela
04e1140494 migration: All this fields are unsigned
So printing it as %d is wrong.  Notice that for the channel id, that
is an uint8_t, but I changed it anyways for consistency.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2022-01-28 15:38:22 +01:00
Stefan Hajnoczi
826cc32423 aio-posix: split poll check from ready handler
Adaptive polling measures the execution time of the polling check plus
handlers called when a polled event becomes ready. Handlers can take a
significant amount of time, making it look like polling was running for
a long time when in fact the event handler was running for a long time.

For example, on Linux the io_submit(2) syscall invoked when a virtio-blk
device's virtqueue becomes ready can take 10s of microseconds. This
can exceed the default polling interval (32 microseconds) and cause
adaptive polling to stop polling.

By excluding the handler's execution time from the polling check we make
the adaptive polling calculation more accurate. As a result, the event
loop now stays in polling mode where previously it would have fallen
back to file descriptor monitoring.

The following data was collected with virtio-blk num-queues=2
event_idx=off using an IOThread. Before:

168k IOPS, IOThread syscalls:

  9837.115 ( 0.020 ms): IO iothread1/620155 io_submit(ctx_id: 140512552468480, nr: 16, iocbpp: 0x7fcb9f937db0)    = 16
  9837.158 ( 0.002 ms): IO iothread1/620155 write(fd: 103, buf: 0x556a2ef71b88, count: 8)                         = 8
  9837.161 ( 0.001 ms): IO iothread1/620155 write(fd: 104, buf: 0x556a2ef71b88, count: 8)                         = 8
  9837.163 ( 0.001 ms): IO iothread1/620155 ppoll(ufds: 0x7fcb90002800, nfds: 4, tsp: 0x7fcb9f1342d0, sigsetsize: 8) = 3
  9837.164 ( 0.001 ms): IO iothread1/620155 read(fd: 107, buf: 0x7fcb9f939cc0, count: 512)                        = 8
  9837.174 ( 0.001 ms): IO iothread1/620155 read(fd: 105, buf: 0x7fcb9f939cc0, count: 512)                        = 8
  9837.176 ( 0.001 ms): IO iothread1/620155 read(fd: 106, buf: 0x7fcb9f939cc0, count: 512)                        = 8
  9837.209 ( 0.035 ms): IO iothread1/620155 io_submit(ctx_id: 140512552468480, nr: 32, iocbpp: 0x7fca7d0cebe0)    = 32

174k IOPS (+3.6%), IOThread syscalls:

  9809.566 ( 0.036 ms): IO iothread1/623061 io_submit(ctx_id: 140539805028352, nr: 32, iocbpp: 0x7fd0cdd62be0)    = 32
  9809.625 ( 0.001 ms): IO iothread1/623061 write(fd: 103, buf: 0x5647cfba5f58, count: 8)                         = 8
  9809.627 ( 0.002 ms): IO iothread1/623061 write(fd: 104, buf: 0x5647cfba5f58, count: 8)                         = 8
  9809.663 ( 0.036 ms): IO iothread1/623061 io_submit(ctx_id: 140539805028352, nr: 32, iocbpp: 0x7fd0d0388b50)    = 32

Notice that ppoll(2) and eventfd read(2) syscalls are eliminated because
the IOThread stays in polling mode instead of falling back to file
descriptor monitoring.

As usual, polling is not implemented on Windows so this patch ignores
the new io_poll_read() callback in aio-win32.c.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20211207132336.36627-2-stefanha@redhat.com

[Fixed up aio_set_event_notifier() calls in
tests/unit/test-fdmon-epoll.c added after this series was queued.
--Stefan]

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-01-12 17:09:39 +00:00
Laurent Vivier
1b529d908d failover: Silence warning messages during qtest
virtio-net-failover test tries several device combinations that produces
some expected warnings.
These warning can be confusing, so we disable them during the qtest
sequence.

Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20211220145314.390697-1-lvivier@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
[thuth: Fix memory leak by using error_free()]
Signed-off-by: Thomas Huth <thuth@redhat.com>
2021-12-22 08:12:45 +01:00
Juan Quintela
a5ed229488 multifd: Make zlib compression method not use iovs
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-12-15 10:38:34 +01:00
Juan Quintela
f5ff548774 multifd: Make zstd compression method not use iovs
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-12-15 10:38:17 +01:00
Rao, Lei
9c5c8ff24e COLO: Move some trace code behind qemu_mutex_unlock_iothread()
There is no need to put some trace code in the critical section.
So, moving it behind qemu_mutex_unlock_iothread() can reduce the
lock time.

Signed-off-by: Lei Rao <lei.rao@intel.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-12-15 10:31:42 +01:00
Li Zhang
077fbb5942 multifd: Shut down the QIO channels to avoid blocking the send threads when they are terminated.
When doing live migration with multifd channels 8, 16 or larger number,
the guest hangs in the presence of the network errors such as missing TCP ACKs.

At sender's side:
The main thread is blocked on qemu_thread_join, migration_fd_cleanup
is called because one thread fails on qio_channel_write_all when
the network problem happens and other send threads are blocked on sendmsg.
They could not be terminated. So the main thread is blocked on qemu_thread_join
to wait for the threads terminated.

(gdb) bt
0  0x00007f30c8dcffc0 in __pthread_clockjoin_ex () at /lib64/libpthread.so.0
1  0x000055cbb716084b in qemu_thread_join (thread=0x55cbb881f418) at ../util/qemu-thread-posix.c:627
2  0x000055cbb6b54e40 in multifd_save_cleanup () at ../migration/multifd.c:542
3  0x000055cbb6b4de06 in migrate_fd_cleanup (s=0x55cbb8024000) at ../migration/migration.c:1808
4  0x000055cbb6b4dfb4 in migrate_fd_cleanup_bh (opaque=0x55cbb8024000) at ../migration/migration.c:1850
5  0x000055cbb7173ac1 in aio_bh_call (bh=0x55cbb7eb98e0) at ../util/async.c:141
6  0x000055cbb7173bcb in aio_bh_poll (ctx=0x55cbb7ebba80) at ../util/async.c:169
7  0x000055cbb715ba4b in aio_dispatch (ctx=0x55cbb7ebba80) at ../util/aio-posix.c:381
8  0x000055cbb7173ffe in aio_ctx_dispatch (source=0x55cbb7ebba80, callback=0x0, user_data=0x0) at ../util/async.c:311
9  0x00007f30c9c8cdf4 in g_main_context_dispatch () at /usr/lib64/libglib-2.0.so.0
10 0x000055cbb71851a2 in glib_pollfds_poll () at ../util/main-loop.c:232
11 0x000055cbb718521c in os_host_main_loop_wait (timeout=42251070366) at ../util/main-loop.c:255
12 0x000055cbb7185321 in main_loop_wait (nonblocking=0) at ../util/main-loop.c:531
13 0x000055cbb6e6ba27 in qemu_main_loop () at ../softmmu/runstate.c:726
14 0x000055cbb6ad6fd7 in main (argc=68, argv=0x7ffc0c578888, envp=0x7ffc0c578ab0) at ../softmmu/main.c:50

To make sure that the send threads could be terminated, IO channels should be
shut down to avoid waiting IO.

Signed-off-by: Li Zhang <lizhang@suse.de>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-12-15 10:31:42 +01:00
Juan Quintela
01102a2ef6 multifd: Fill offset and block for reception
We were using the iov directly, but we will need this info on the
following patch.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-12-15 10:31:42 +01:00
Juan Quintela
40a4bfe9d3 multifd: remove used parameter from send_recv_pages() method
It is already there as p->pages->num.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-12-15 10:31:42 +01:00
Juan Quintela
02fb81043e multifd: remove used parameter from send_prepare() method
It is already there as p->pages->num.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-12-15 10:31:42 +01:00
Juan Quintela
1943c11a62 multifd: The variable is only used inside the loop
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-12-15 10:31:42 +01:00
Juan Quintela
18ede636bc multifd: Add missing documention
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-12-15 10:31:42 +01:00
Juan Quintela
90a3d2f9d5 multifd: Rename used field to num
We will need to split it later in zero_num (number of zero pages) and
normal_num (number of normal pages).  This name is better.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-12-15 10:31:42 +01:00
Juan Quintela
144fa06b34 migration: Never call twice qemu_target_page_size()
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-12-15 10:31:42 +01:00
Juan Quintela
47a1782461 multifd: Delete useless operation
We are dividing by page_size to multiply again in the only use.
Once there, improve the comments.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-12-15 10:31:42 +01:00
Juan Quintela
bad452a77e migration: Remove is_zero_range()
It just calls buffer_is_zero().  Just change the callers.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2021-12-15 10:31:42 +01:00
Zhang Chen
751fe4c608 migration/colo: Optimize COLO primary node start code path
Optimize COLO primary start path from:
MIGRATION_STATUS_XXX --> MIGRATION_STATUS_ACTIVE --> MIGRATION_STATUS_COLO --> MIGRATION_STATUS_COMPLETED
To:
MIGRATION_STATUS_XXX --> MIGRATION_STATUS_COLO --> MIGRATION_STATUS_COMPLETED
No need to start primary COLO through "MIGRATION_STATUS_ACTIVE".

Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-12-15 10:31:42 +01:00
Rao, Lei
795969ab1f Fixed a QEMU hang when guest poweroff in COLO mode
When the PVM guest poweroff, the COLO thread may wait a semaphore
in colo_process_checkpoint().So, we should wake up the COLO thread
before migration shutdown.

Signed-off-by: Lei Rao <lei.rao@intel.com>
Reviewed-by: Zhang Chen <chen.zhang@intel.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-12-15 10:31:42 +01:00
Zhang Chen
0e0f0479e2 migration/colo: More accurate update checkpoint time
Previous operation(like vm_start and replication_start_all) will consume
extra time before update the timer, so reduce time in this patch.

Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-12-15 10:31:42 +01:00
Rao, Lei
672159a97c migration/ram.c: Remove the qemu_mutex_lock in colo_flush_ram_cache.
The code to acquire bitmap_mutex is added in the commit of
"63268c4970a5f126cc9af75f3ccb8057abef5ec0". There is no
need to acquire bitmap_mutex in colo_flush_ram_cache(). This
is because the colo_flush_ram_cache only be called on the COLO
secondary VM, which is the destination side.
On the COLO secondary VM, only the COLO thread will touch
the bitmap of ram cache.

Signed-off-by: Lei Rao <lei.rao@intel.com>
Reviewed-by: Zhang Chen <chen.zhang@intel.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-12-15 10:31:42 +01:00
Rao, Lei
91fe9a8dbd Reset the auto-converge counter at every checkpoint.
if we don't reset the auto-converge counter,
it will continue to run with COLO running,
and eventually the system will hang due to the
CPU throttle reaching DEFAULT_MIGRATE_MAX_CPU_THROTTLE.

Signed-off-by: Lei Rao <lei.rao@intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Lukas Straub <lukasstraub2@web.de>
Tested-by: Lukas Straub <lukasstraub2@web.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-11-09 08:48:36 +01:00
Rao, Lei
a6a83cef9c Reduce the PVM stop time during Checkpoint
When flushing memory from ram cache to ram during every checkpoint
on secondary VM, we can copy continuous chunks of memory instead of
4096 bytes per time to reduce the time of VM stop during checkpoint.

Signed-off-by: Lei Rao <lei.rao@intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Lukas Straub <lukasstraub2@web.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Tested-by: Lukas Straub <lukasstraub2@web.de>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-11-09 08:46:30 +01:00