Commit Graph

107532 Commits

Author SHA1 Message Date
David Hildenbrand
177f9b1ee4 virtio-mem: Expose device memory dynamically via multiple memslots if enabled
Having large virtio-mem devices that only expose little memory to a VM
is currently a problem: we map the whole sparse memory region into the
guest using a single memslot, resulting in one gigantic memslot in KVM.
KVM allocates metadata for the whole memslot, which can result in quite
some memory waste.

Assuming we have a 1 TiB virtio-mem device and only expose little (e.g.,
1 GiB) memory, we would create a single 1 TiB memslot and KVM has to
allocate metadata for that 1 TiB memslot: on x86, this implies allocating
a significant amount of memory for metadata:

(1) RMAP: 8 bytes per 4 KiB, 8 bytes per 2 MiB, 8 bytes per 1 GiB
    -> For 1 TiB: 2147483648 + 4194304 + 8192 = ~ 2 GiB (0.2 %)

    With the TDP MMU (cat /sys/module/kvm/parameters/tdp_mmu) this gets
    allocated lazily when required for nested VMs
(2) gfn_track: 2 bytes per 4 KiB
    -> For 1 TiB: 536870912 = ~512 MiB (0.05 %)
(3) lpage_info: 4 bytes per 2 MiB, 4 bytes per 1 GiB
    -> For 1 TiB: 2097152 + 4096 = ~2 MiB (0.0002 %)
(4) 2x dirty bitmaps for tracking: 2x 1 bit per 4 KiB page
    -> For 1 TiB: 536870912 = 64 MiB (0.006 %)

So we primarily care about (1) and (2). The bad thing is, that the
memory consumption *doubles* once SMM is enabled, because we create the
memslot once for !SMM and once for SMM.

Having a 1 TiB memslot without the TDP MMU consumes around:
* With SMM: 5 GiB
* Without SMM: 2.5 GiB
Having a 1 TiB memslot with the TDP MMU consumes around:
* With SMM: 1 GiB
* Without SMM: 512 MiB

... and that's really something we want to optimize, to be able to just
start a VM with small boot memory (e.g., 4 GiB) and a virtio-mem device
that can grow very large (e.g., 1 TiB).

Consequently, using multiple memslots and only mapping the memslots we
really need can significantly reduce memory waste and speed up
memslot-related operations. Let's expose the sparse RAM memory region using
multiple memslots, mapping only the memslots we currently need into our
device memory region container.

The feature can be enabled using "dynamic-memslots=on" and requires
"unplugged-inaccessible=on", which is nowadays the default.

Once enabled, we'll auto-detect the number of memslots to use based on the
memslot limit provided by the core. We'll use at most 1 memslot per
gigabyte. Note that our global limit of memslots accross all memory devices
is currently set to 256: even with multiple large virtio-mem devices,
we'd still have a sane limit on the number of memslots used.

The default is to not dynamically map memslot for now
("dynamic-memslots=off"). The optimization must be enabled manually,
because some vhost setups (e.g., hotplug of vhost-user devices) might be
problematic until we support more memslots especially in vhost-user backends.

Note that "dynamic-memslots=on" is just a hint that multiple memslots
*may* be used for internal optimizations, not that multiple memslots
*must* be used. The actual number of memslots that are used is an
internal detail: for example, once memslot metadata is no longer an
issue, we could simply stop optimizing for that. Migration source and
destination can differ on the setting of "dynamic-memslots".

Message-ID: <20230926185738.277351-17-david@redhat.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12 14:15:22 +02:00
David Hildenbrand
884a0c20e6 virtio-mem: Update state to match bitmap as soon as it's been migrated
It's cleaner and future-proof to just have other state that depends on the
bitmap state to be updated as soon as possible when restoring the bitmap.

So factor out informing RamDiscardListener into a functon and call it in
case of early migration right after we restored the bitmap.

Message-ID: <20230926185738.277351-16-david@redhat.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12 14:15:22 +02:00
David Hildenbrand
a45171dba7 virtio-mem: Pass non-const VirtIOMEM via virtio_mem_range_cb
Let's prepare for a user that has to modify the VirtIOMEM device state.

Message-ID: <20230926185738.277351-15-david@redhat.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12 14:15:22 +02:00
David Hildenbrand
aa5317ef7c memory: Clarify mapping requirements for RamDiscardManager
We really only care about the RAM memory region not being mapped into
an address space yet as long as we're still setting up the
RamDiscardManager. Once mapped into an address space, memory notifiers
would get notified about such a region and any attempts to modify the
RamDiscardManager would be wrong.

While "mapped into an address space" is easy to check for RAM regions that
are mapped directly (following the ->container links), it's harder to
check when such regions are mapped indirectly via aliases. For now, we can
only detect that a region is mapped through an alias (->mapped_via_alias),
but we don't have a handle on these aliases to follow all their ->container
links to test if they are eventually mapped into an address space.

So relax the assertion in memory_region_set_ram_discard_manager(),
remove the check in memory_region_get_ram_discard_manager() and clarify
the doc.

Message-ID: <20230926185738.277351-14-david@redhat.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12 14:15:22 +02:00
David Hildenbrand
a2335113ae memory-device,vhost: Support automatic decision on the number of memslots
We want to support memory devices that can automatically decide how many
memslots they will use. In the worst case, they have to use a single
memslot.

The target use cases are virtio-mem and the hyper-v balloon.

Let's calculate a reasonable limit such a memory device may use, and
instruct the device to make a decision based on that limit. Use a simple
heuristic that considers:
* A memslot soft-limit for all memory devices of 256; also, to not
  consume too many memslots -- which could harm performance.
* Actually still free and unreserved memslots
* The percentage of the remaining device memory region that memory device
  will occupy.

Further, while we properly check before plugging a memory device whether
there still is are free memslots, we have other memslot consumers (such as
boot memory, PCI BARs) that don't perform any checks and might dynamically
consume memslots without any prior reservation. So we might succeed in
plugging a memory device, but once we dynamically map a PCI BAR we would
be in trouble. Doing accounting / reservation / checks for all such
users is problematic (e.g., sometimes we might temporarily split boot
memory into two memslots, triggered by the BIOS).

We use the historic magic memslot number of 509 as orientation to when
supporting 256 memory devices -> memslots (leaving 253 for boot memory and
other devices) has been proven to work reliable. We'll fallback to
suggesting a single memslot if we don't have at least 509 total memslots.

Plugging vhost devices with less than 509 memslots available while we
have memory devices plugged that consume multiple memslots due to
automatic decisions can be problematic. Most configurations might just fail
due to "limit < used + reserved", however, it can also happen that these
memory devices would suddenly consume memslots that would actually be
required by other memslot consumers (boot, PCI BARs) later. Note that this
has always been sketchy with vhost devices that support only a small number
of memslots; but we don't want to make it any worse.So let's keep it simple
and simply reject plugging such vhost devices in such a configuration.

Eventually, all vhost devices that want to be fully compatible with such
memory devices should support a decent number of memslots (>= 509).

Message-ID: <20230926185738.277351-13-david@redhat.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12 14:15:22 +02:00
David Hildenbrand
cd89c065b0 vhost: Add vhost_get_max_memslots()
Let's add vhost_get_max_memslots().

Message-ID: <20230926185738.277351-12-david@redhat.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12 14:15:22 +02:00
David Hildenbrand
16ab2eda57 kvm: Add stub for kvm_get_max_memslots()
We'll need the stub soon from memory device context.

While at it, use "unsigned int" as return value and place the
declaration next to kvm_get_free_memslots().

Message-ID: <20230926185738.277351-11-david@redhat.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12 14:15:22 +02:00
David Hildenbrand
766aa0a654 memory-device,vhost: Support memory devices that dynamically consume memslots
We want to support memory devices that have a dynamically managed memory
region container as device memory region. This device memory region maps
multiple RAM memory subregions (e.g., aliases to the same RAM memory
region), whereby these subregions can be (un)mapped on demand.

Each RAM subregion will consume a memslot in KVM and vhost, resulting in
such a new device consuming memslots dynamically, and initially usually
0. We already track the number of used vs. required memslots for all
memslots. From that, we can derive the number of reserved memslots that
must not be used otherwise.

The target use case is virtio-mem and the hyper-v balloon, which will
dynamically map aliases to RAM memory region into their device memory
region container.

Properly document what's supported and what's not and extend the vhost
memslot check accordingly.

Message-ID: <20230926185738.277351-10-david@redhat.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12 14:15:22 +02:00
David Hildenbrand
f9716f4b0d memory-device: Track required and actually used memslots in DeviceMemoryState
Let's track how many memslots are required by plugged memory devices and
how many are currently actually getting used by plugged memory
devices.

"required - used" is the number of reserved memslots. For now, the number
of used and required memslots is always equal, and there are no
reservations. This is a preparation for memory devices that want to
dynamically consume memslots after initially specifying how many they
require -- where we'll end up with reserved memslots.

To track the number of used memslots, create a new address space for
our device memory and register a memory listener (add/remove) for that
address space.

Message-ID: <20230926185738.277351-9-david@redhat.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12 14:15:22 +02:00
David Hildenbrand
759bac673a stubs: Rename qmp_memory_device.c to memory_device.c
We want to place non-qmp stubs in there, so let's rename it. While at
it, put it into the MAINTAINERS file under "Memory devices".

Message-ID: <20230926185738.277351-8-david@redhat.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12 14:15:22 +02:00
David Hildenbrand
7975feece9 memory-device: Support memory devices with multiple memslots
We want to support memory devices that have a memory region container as
device memory region that maps multiple RAM memory regions. Let's start
by supporting memory devices that statically map multiple RAM memory
regions and, thereby, consume multiple memslots.

We already have one device that uses a container as device memory region:
NVDIMMs. However, a NVDIMM always ends up consuming exactly one memslot.

Let's add support for that by asking the memory device via a new
callback how many memslots it requires.

Message-ID: <20230926185738.277351-7-david@redhat.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12 14:15:22 +02:00
David Hildenbrand
8c49951c4a vhost: Return number of free memslots
Let's return the number of free slots instead of only checking if there
is a free slot. Required to support memory devices that consume multiple
memslots.

This is a preparation for memory devices that consume multiple memslots.

Message-ID: <20230926185738.277351-6-david@redhat.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12 14:15:22 +02:00
David Hildenbrand
5b23186a95 kvm: Return number of free memslots
Let's return the number of free slots instead of only checking if there
is a free slot. While at it, check all address spaces, which will also
consider SMM under x86 correctly.

This is a preparation for memory devices that consume multiple memslots.

Message-ID: <20230926185738.277351-5-david@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12 14:15:22 +02:00
David Hildenbrand
022f033bd7 softmmu/physmem: Fixup qemu_ram_block_from_host() documentation
Let's fixup the documentation (e.g., removing traces of the ram_addr
parameter that no longer exists) and move it to the header file while at
it.

Message-ID: <20230926185738.277351-4-david@redhat.com>
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12 14:15:21 +02:00
David Hildenbrand
309ebfa691 vhost: Remove vhost_backend_can_merge() callback
Checking whether the memory regions are equal is sufficient: if they are
equal, then most certainly the contained fd is equal.

The whole vhost-user memslot handling is suboptimal and overly
complicated. We shouldn't have to lookup a RAM memory regions we got
notified about in vhost_user_get_mr_data() using a host pointer. But that
requires a bigger rework -- especially an alternative vhost_set_mem_table()
backend call that simply consumes MemoryRegionSections.

For now, let's just drop vhost_backend_can_merge().

Message-ID: <20230926185738.277351-3-david@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12 14:15:21 +02:00
David Hildenbrand
552b25229c vhost: Rework memslot filtering and fix "used_memslot" tracking
Having multiple vhost devices, some filtering out fd-less memslots and
some not, can mess up the "used_memslot" accounting. Consequently our
"free memslot" checks become unreliable and we might run out of free
memslots at runtime later.

An example sequence which can trigger a potential issue that involves
different vhost backends (vhost-kernel and vhost-user) and hotplugged
memory devices can be found at [1].

Let's make the filtering mechanism less generic and distinguish between
backends that support private memslots (without a fd) and ones that only
support shared memslots (with a fd). Track the used_memslots for both
cases separately and use the corresponding value when required.

Note: Most probably we should filter out MAP_PRIVATE fd-based RAM regions
(for example, via memory-backend-memfd,...,shared=off or as default with
 memory-backend-file) as well. When not using MAP_SHARED, it might not work
as expected. Add a TODO for now.

[1] https://lkml.kernel.org/r/fad9136f-08d3-3fd9-71a1-502069c000cf@redhat.com

Message-ID: <20230926185738.277351-2-david@redhat.com>
Fixes: 988a27754b ("vhost: allow backends to filter memory sections")
Cc: Tiwei Bie <tiwei.bie@intel.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12 14:15:21 +02:00
Stefan Hajnoczi
a51e5124a6 testing, gdbstub and plugin updates
- enable more sbsa-ref tests in avocado
   - add swtpm to the package lists
   - reduce avocado noise in gitlab by limiting tests
   - make docker engine choice driven by configure and enable override
   - remove unneeded gcc suffix on some cross compilers
   - fix some NULL returns in gdbstub
   - improve locking in execlog plugin
   - introduce the GDBFeature structure
   - consistently set gdb_core_xml_file
   - use cleaner escaping for gdb xml
   - drop ancient gdb_has_xml() test
   - disable multi-instruction GUSA emulation when plugins enabled
   - fix some coverity issues in plugins
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAmUmVJkACgkQ+9DbCVqe
 KkTfNggAiS02FcL3faGjAN9+60xhvEQ3DJjI473hjvFWu0bSkQTjObcQqGc+V7Cw
 9yNtnxOOWB6KdAU8At7HlVqiUXeyTCJB7Att5/UgNUZj63j+cs7PXb4p7cVCcJOc
 17zni22tnmCBcC8wZaz0yj68jaftL3hz1QNUZOmv6CBt42q0+/4g1WKfaJ+w+SbK
 T7cJEiMDObm8qeNAAXpDLB+9v3bRDxMZ8hFJ3p3CatQC8jbDrkuH7RrVPHDWiWQx
 w0uXpUHlZEOVX23v6+iIoeb8YQW2bZI9UsfeyIHJlENaVgyL200LHgLvvAE4Qd63
 dCtfQUZzj4t9sfoL4XgxaB7G4qtXTg==
 =7PLI
 -----END PGP SIGNATURE-----

Merge tag 'pull-omnibus-111023-1' of https://gitlab.com/stsquad/qemu into staging

testing, gdbstub and plugin updates

  - enable more sbsa-ref tests in avocado
  - add swtpm to the package lists
  - reduce avocado noise in gitlab by limiting tests
  - make docker engine choice driven by configure and enable override
  - remove unneeded gcc suffix on some cross compilers
  - fix some NULL returns in gdbstub
  - improve locking in execlog plugin
  - introduce the GDBFeature structure
  - consistently set gdb_core_xml_file
  - use cleaner escaping for gdb xml
  - drop ancient gdb_has_xml() test
  - disable multi-instruction GUSA emulation when plugins enabled
  - fix some coverity issues in plugins

# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAmUmVJkACgkQ+9DbCVqe
# KkTfNggAiS02FcL3faGjAN9+60xhvEQ3DJjI473hjvFWu0bSkQTjObcQqGc+V7Cw
# 9yNtnxOOWB6KdAU8At7HlVqiUXeyTCJB7Att5/UgNUZj63j+cs7PXb4p7cVCcJOc
# 17zni22tnmCBcC8wZaz0yj68jaftL3hz1QNUZOmv6CBt42q0+/4g1WKfaJ+w+SbK
# T7cJEiMDObm8qeNAAXpDLB+9v3bRDxMZ8hFJ3p3CatQC8jbDrkuH7RrVPHDWiWQx
# w0uXpUHlZEOVX23v6+iIoeb8YQW2bZI9UsfeyIHJlENaVgyL200LHgLvvAE4Qd63
# dCtfQUZzj4t9sfoL4XgxaB7G4qtXTg==
# =7PLI
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 11 Oct 2023 03:54:01 EDT
# gpg:                using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8  DF35 FBD0 DB09 5A9E 2A44

* tag 'pull-omnibus-111023-1' of https://gitlab.com/stsquad/qemu: (25 commits)
  contrib/plugins: fix coverity warning in hotblocks
  contrib/plugins: fix coverity warning in lockstep
  contrib/plugins: fix coverity warning in cache
  plugins: Set final instruction count in plugin_gen_tb_end
  target/sh4: Disable decode_gusa when plugins enabled
  accel/tcg: Add plugin_enabled to DisasContextBase
  gdbstub: Replace gdb_regs with an array
  gdbstub: Remove gdb_has_xml variable
  target/ppc: Remove references to gdb_has_xml
  target/arm: Remove references to gdb_has_xml
  gdbstub: Use g_markup_printf_escaped()
  hw/core/cpu: Return static value with gdb_arch_name()
  target/arm: Move the reference to arm-core.xml
  gdbstub: Introduce GDBFeature structure
  contrib/plugins: Use GRWLock in execlog
  plugins: Check if vCPU is realized
  gdbstub: Fix target.xml response
  gdbstub: Fix target_xml initialization
  configure: remove gcc version suffixes
  configure: allow user to override docker engine
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-10-11 09:43:10 -04:00
Stefan Hajnoczi
48747938d1 Migration Pull request (20231011 edition)
Hi
 
 In this pull request:
 
 - Markus RDMA cleanup series
 - recover fixes from peter
 - migration capability from fabiano
 - negative migration test from fabiano.
 
 Please, pull.
 
 Thanks, Juan.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEGJn/jt6/WMzuA0uC9IfvGFhy1yMFAmUmaQ8ACgkQ9IfvGFhy
 1yN9fA//SBnea3Wl2158J673l5aaI8Vp/1PjfzvNdcr/6EQbXZBgug+haQ3n5Hhf
 USNRhemrCkpZAGCUf07g9pfF4R/Jsq1OkOrWF4e6gAaZPNU4V5F7VKBk8pmFMLtr
 Kk2XgnH2ZPaFEvts0qBrOfvDHH8gOzzjpF2HGrioM8Zr3p1JHz9OqJoSyawLF0U7
 YFTq2jJSgaOQ6ax1+L8hLLuXlmNccBaTWT8Cv0rbPEgcwrJOM/wMfmd6O39ps929
 yS5NnxqqkrprTDjmeGOgOQd0Cy/flinnzmu+BVMO6/ns9Hu6q1TGG6D+DOBdgmHH
 jq7Ej5VILtXWOoZtXLHqA1Xt73ciVlmditVupoC+5vtIJou2JseClutOp98qxxzV
 llMF7ldHbRTWnu7qIrwv2OINarowR0pIZfkJqBc6dNHHScwMCnX5L9YAvNePEo2V
 1oJpbqW7mmgwdlFAiKFD+AE6qUWxcnzOvPf+fzWrJMi507Kv5nmxQWTHw9dsFs7k
 neWnK21t0s2t77+vVBtLlr06JESG+WndzvQsXKZu8Pd0+ASnzpX8pRVzxEPk5EiH
 fT9bhXOCvxTTHulznjkOApODE5NF+KlHAFXU87cSIkdi/6JfzcvTe6KeeIPC248Q
 jk3nVlhds1xajTcPAK7HF5Ta6R8rNdTZ6q/kFNhLaTGqv9agxDU=
 =hekO
 -----END PGP SIGNATURE-----

Merge tag 'migration-20231011-pull-request' of https://gitlab.com/juan.quintela/qemu into staging

Migration Pull request (20231011 edition)

Hi

In this pull request:

- Markus RDMA cleanup series
- recover fixes from peter
- migration capability from fabiano
- negative migration test from fabiano.

Please, pull.

Thanks, Juan.

# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEGJn/jt6/WMzuA0uC9IfvGFhy1yMFAmUmaQ8ACgkQ9IfvGFhy
# 1yN9fA//SBnea3Wl2158J673l5aaI8Vp/1PjfzvNdcr/6EQbXZBgug+haQ3n5Hhf
# USNRhemrCkpZAGCUf07g9pfF4R/Jsq1OkOrWF4e6gAaZPNU4V5F7VKBk8pmFMLtr
# Kk2XgnH2ZPaFEvts0qBrOfvDHH8gOzzjpF2HGrioM8Zr3p1JHz9OqJoSyawLF0U7
# YFTq2jJSgaOQ6ax1+L8hLLuXlmNccBaTWT8Cv0rbPEgcwrJOM/wMfmd6O39ps929
# yS5NnxqqkrprTDjmeGOgOQd0Cy/flinnzmu+BVMO6/ns9Hu6q1TGG6D+DOBdgmHH
# jq7Ej5VILtXWOoZtXLHqA1Xt73ciVlmditVupoC+5vtIJou2JseClutOp98qxxzV
# llMF7ldHbRTWnu7qIrwv2OINarowR0pIZfkJqBc6dNHHScwMCnX5L9YAvNePEo2V
# 1oJpbqW7mmgwdlFAiKFD+AE6qUWxcnzOvPf+fzWrJMi507Kv5nmxQWTHw9dsFs7k
# neWnK21t0s2t77+vVBtLlr06JESG+WndzvQsXKZu8Pd0+ASnzpX8pRVzxEPk5EiH
# fT9bhXOCvxTTHulznjkOApODE5NF+KlHAFXU87cSIkdi/6JfzcvTe6KeeIPC248Q
# jk3nVlhds1xajTcPAK7HF5Ta6R8rNdTZ6q/kFNhLaTGqv9agxDU=
# =hekO
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 11 Oct 2023 05:21:19 EDT
# gpg:                using RSA key 1899FF8EDEBF58CCEE034B82F487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>" [full]
# gpg:                 aka "Juan Quintela <quintela@trasno.org>" [full]
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03  4B82 F487 EF18 5872 D723

* tag 'migration-20231011-pull-request' of https://gitlab.com/juan.quintela/qemu: (65 commits)
  migration: Add migration_rp_wait|kick()
  migration: Remember num of ramblocks to sync during recovery
  qemufile: Always return a verbose error
  migration: Introduce migrate_has_error()
  migration: Display error in query-migrate irrelevant of status
  migration/rdma: Replace flawed device detail dump by tracing
  migration/rdma: Use error_report() & friends instead of stderr
  migration/rdma: Downgrade qemu_rdma_cleanup() errors to warnings
  migration/rdma: Silence qemu_rdma_register_and_get_keys()
  migration/rdma: Silence qemu_rdma_block_for_wrid()
  migration/rdma: Don't report received completion events as error
  migration/rdma: Silence qemu_rdma_reg_control()
  migration/rdma: Silence qemu_rdma_connect()
  migration/rdma: Silence qemu_rdma_resolve_host()
  migration/rdma: Convert qemu_rdma_alloc_pd_cq() to Error
  migration/rdma: Convert qemu_rdma_post_recv_control() to Error
  migration/rdma: Convert qemu_rdma_post_send_control() to Error
  migration/rdma: Convert qemu_rdma_write() to Error
  migration/rdma: Convert qemu_rdma_write_one() to Error
  migration/rdma: Convert qemu_rdma_write_flush() to Error
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-10-11 09:42:40 -04:00
Stefan Hajnoczi
67d2486c0e Audio es1370 fix & cleanups
-----BEGIN PGP SIGNATURE-----
 
 iQJQBAABCAA6FiEEh6m9kz+HxgbSdvYt2ujhCXWWnOUFAmUmQIQcHG1hcmNhbmRy
 ZS5sdXJlYXVAcmVkaGF0LmNvbQAKCRDa6OEJdZac5UiED/0VjK4J6I+MYljyJePj
 ki62XC/dtojuxkE+nwsTrU54u86776TLmthRYyHOffSs8mDpk0ynY6sWD99T4r1b
 9Szmm9FJM658RSpN9i128JAiHKD9Vth7UUpekrLIBgRsW5lGuQFapHouTB1wa3hm
 8ZszUNAGEpAgGM2My9t+pvGy8lrOZan2ZAlQzKfMc+msYTnivK6huzjVw/xUc0kD
 vzfpVMgKntvl391IBf9hFBBrJCWRtDrydtPl1/PxpVj/96FGIxlx+o30X8z9hSzA
 yg96XfwmS/mgqFsw8QqtFyrkSE3iD2yzcrepjtK+erwcv6M0BVG38wn89F571Lcl
 Ga8PcqBd7pP0nMXrgjbX8qB6t58TPipEbFMeS2iabO4pwz0jy9aCVQJ2GZqHNqaF
 VoMN5hEDBhanxFcNEcjSWZEfJOvK7uVLAZ2Xdqcrsm50ESTu55H/BIQWFipTGlqu
 +BlvfpwVYXGziuDsj8h4cUGvyuMY02XWsBU9qvl5aCiyWoyED0ZlR2UJ0jVjn68K
 RUbTrxSMF9hLgI1j8FYkNgXMBuR4+Sopc7vp//AXpp92/yxTNCbe44Y06C9iaib1
 0zrYhd4rfTn0MX+YM9IUKzvoNSVY6VxHoTucyTM9HcT7XtCnmwnMzjY4yGxQ6k3V
 X0WBPpH4mCyXla7TJ0zwbnvH0A==
 =B/KL
 -----END PGP SIGNATURE-----

Merge tag 'audio-pull-request' of https://gitlab.com/marcandre.lureau/qemu into staging

Audio es1370 fix & cleanups

# -----BEGIN PGP SIGNATURE-----
#
# iQJQBAABCAA6FiEEh6m9kz+HxgbSdvYt2ujhCXWWnOUFAmUmQIQcHG1hcmNhbmRy
# ZS5sdXJlYXVAcmVkaGF0LmNvbQAKCRDa6OEJdZac5UiED/0VjK4J6I+MYljyJePj
# ki62XC/dtojuxkE+nwsTrU54u86776TLmthRYyHOffSs8mDpk0ynY6sWD99T4r1b
# 9Szmm9FJM658RSpN9i128JAiHKD9Vth7UUpekrLIBgRsW5lGuQFapHouTB1wa3hm
# 8ZszUNAGEpAgGM2My9t+pvGy8lrOZan2ZAlQzKfMc+msYTnivK6huzjVw/xUc0kD
# vzfpVMgKntvl391IBf9hFBBrJCWRtDrydtPl1/PxpVj/96FGIxlx+o30X8z9hSzA
# yg96XfwmS/mgqFsw8QqtFyrkSE3iD2yzcrepjtK+erwcv6M0BVG38wn89F571Lcl
# Ga8PcqBd7pP0nMXrgjbX8qB6t58TPipEbFMeS2iabO4pwz0jy9aCVQJ2GZqHNqaF
# VoMN5hEDBhanxFcNEcjSWZEfJOvK7uVLAZ2Xdqcrsm50ESTu55H/BIQWFipTGlqu
# +BlvfpwVYXGziuDsj8h4cUGvyuMY02XWsBU9qvl5aCiyWoyED0ZlR2UJ0jVjn68K
# RUbTrxSMF9hLgI1j8FYkNgXMBuR4+Sopc7vp//AXpp92/yxTNCbe44Y06C9iaib1
# 0zrYhd4rfTn0MX+YM9IUKzvoNSVY6VxHoTucyTM9HcT7XtCnmwnMzjY4yGxQ6k3V
# X0WBPpH4mCyXla7TJ0zwbnvH0A==
# =B/KL
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 11 Oct 2023 02:28:20 EDT
# gpg:                using RSA key 87A9BD933F87C606D276F62DDAE8E10975969CE5
# gpg:                issuer "marcandre.lureau@redhat.com"
# gpg: Good signature from "Marc-André Lureau <marcandre.lureau@redhat.com>" [full]
# gpg:                 aka "Marc-André Lureau <marcandre.lureau@gmail.com>" [full]
# Primary key fingerprint: 87A9 BD93 3F87 C606 D276  F62D DAE8 E109 7596 9CE5

* tag 'audio-pull-request' of https://gitlab.com/marcandre.lureau/qemu:
  hw/audio/es1370: trace lost interrupts
  hw/audio/es1370: change variable type and name
  hw/audio/es1370: block structure coding style fixes
  hw/audio/es1370: remove #ifdef ES1370_VERBOSE to avoid bit rot
  hw/audio/es1370: remove #ifdef ES1370_DEBUG to avoid bit rot
  hw/audio/es1370: remove unused dolog macro
  hw/audio/es1370: replace bit-rotted code with tracepoints
  hw/audio/es1370: reset current sample counter

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-10-11 09:42:14 -04:00
Peter Xu
5e79a4bf03 migration: Add migration_rp_wait|kick()
It's just a simple wrapper for rp_sem on either wait() or kick(), make it
even clearer on how it is used.  Prepared to be used even for other things.

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-ID: <20231004220240.167175-8-peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-10-11 11:17:05 +02:00
Peter Xu
1015ff5476 migration: Remember num of ramblocks to sync during recovery
Instead of only relying on the count of rp_sem, make the counter be part of
RAMState so it can be used in both threads to synchronize on the process.

rp_sem will be further reused in follow up patches, as a way to kick the
main thread, e.g., on recovery failures.

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231004220240.167175-7-peterx@redhat.com>
2023-10-11 11:17:05 +02:00
Peter Xu
f4b897f485 qemufile: Always return a verbose error
There're a lot of cases where we only have an errno set in last_error but
without a detailed error description.  When this happens, try to generate
an error contains the errno as a descriptive error.

This will be helpful in cases where one relies on the Error*.  E.g.,
migration state only caches Error* in MigrationState.error.  With this,
we'll display correct error messages in e.g. query-migrate when the error
was only set by qemu_file_set_error().

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231004220240.167175-6-peterx@redhat.com>
2023-10-11 11:17:05 +02:00
Peter Xu
2b2f6f411e migration: Introduce migrate_has_error()
Introduce a helper to detect whether MigrationState.error is set for
whatever reason.

This is preparation work for any thread (e.g. source return path thread) to
setup errors in an unified way to MigrationState, rather than relying on
its own way to set errors (mark_source_rp_bad()).

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231004220240.167175-3-peterx@redhat.com>
2023-10-11 11:17:05 +02:00
Peter Xu
c94143e587 migration: Display error in query-migrate irrelevant of status
Display it as long as being set, irrelevant of FAILED status.  E.g., it may
also be applicable to PAUSED stage of postcopy, to provide hint on what has
gone wrong.

The error_mutex seems to be overlooked when referencing the error, add it
to be very safe.

This will change QAPI behavior by showing up error message outside !FAILED
status, but it's intended and doesn't expect to break anyone.

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2018404
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231004220240.167175-2-peterx@redhat.com>
2023-10-11 11:17:04 +02:00
Markus Armbruster
2c88739cfd migration/rdma: Replace flawed device detail dump by tracing
qemu_rdma_dump_id() dumps RDMA device details to stdout.

rdma_start_outgoing_migration() calls it via qemu_rdma_source_init()
and qemu_rdma_resolve_host() to show source device details.
rdma_start_incoming_migration() arranges its call via
rdma_accept_incoming_migration() and qemu_rdma_accept() to show
destination device details.

Two issues:

1. rdma_start_outgoing_migration() can run in HMP context.  The
   information should arguably go the monitor, not stdout.

2. ibv_query_port() failure is reported as error.  Its callers remain
   unaware of this failure (qemu_rdma_dump_id() can't fail), so
   reporting this to the user as an error is problematic.

Fixable, but the device detail dump is noise, except when
troubleshooting.  Tracing is a better fit.  Similar function
qemu_rdma_dump_id() was converted to tracing in commit
733252deb8 (Tracify migration/rdma.c).

Convert qemu_rdma_dump_id(), too.

While there, touch up qemu_rdma_dump_gid()'s outdated comment.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-54-armbru@redhat.com>
2023-10-11 11:17:04 +02:00
Markus Armbruster
ff4c919459 migration/rdma: Use error_report() & friends instead of stderr
error_report() obeys -msg, reports the current error location if any,
and reports to the current monitor if any.  Reporting to stderr
directly with fprintf() or perror() is wrong, because it loses all
this.

Fix the offenders.  Bonus: resolves a FIXME about problematic use of
errno.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-53-armbru@redhat.com>
2023-10-11 11:17:04 +02:00
Markus Armbruster
5cec563d0c migration/rdma: Downgrade qemu_rdma_cleanup() errors to warnings
Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job.  When the caller does, the error is reported twice.  When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.

qemu_rdma_source_init(), qemu_rdma_connect(),
rdma_start_incoming_migration(), and rdma_start_outgoing_migration()
violate this principle: they call error_report() via
qemu_rdma_cleanup().

Moreover, qemu_rdma_cleanup() can't fail.  It is called on error
paths, and QIOChannel close and finalization.  Are the conditions it
reports really errors?  I doubt it.

Downgrade qemu_rdma_cleanup()'s errors to warnings.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-52-armbru@redhat.com>
2023-10-11 11:17:04 +02:00
Markus Armbruster
b765d21e4a migration/rdma: Silence qemu_rdma_register_and_get_keys()
Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job.  When the caller does, the error is reported twice.  When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.

qemu_rdma_write_one() violates this principle: it reports errors to
stderr via qemu_rdma_register_and_get_keys().  I elected not to
investigate how callers handle the error, i.e. precise impact is not
known.

Clean this up: silence qemu_rdma_register_and_get_keys().  I believe
the caller's error reports suffice.  If they don't, we need to convert
to Error instead.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-51-armbru@redhat.com>
2023-10-11 11:17:04 +02:00
Markus Armbruster
7555c7713d migration/rdma: Silence qemu_rdma_block_for_wrid()
Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job.  When the caller does, the error is reported twice.  When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.

qemu_rdma_post_send_control(), qemu_rdma_exchange_get_response(), and
qemu_rdma_write_one() violate this principle: they call
error_report(), fprintf(stderr, ...), and perror() via
qemu_rdma_block_for_wrid(), qemu_rdma_poll(), and
qemu_rdma_wait_comp_channel().  I elected not to investigate how
callers handle the error, i.e. precise impact is not known.

Clean this up by dropping the error reporting from qemu_rdma_poll(),
qemu_rdma_wait_comp_channel(), and qemu_rdma_block_for_wrid().  I
believe the callers' error reports suffice.  If they don't, we need to
convert to Error instead.

Bonus: resolves a FIXME about problematic use of errno.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-50-armbru@redhat.com>
2023-10-11 11:17:04 +02:00
Markus Armbruster
8dee156c1d migration/rdma: Don't report received completion events as error
When qemu_rdma_wait_comp_channel() receives an event from the
completion channel, it reports an error "receive cm event while wait
comp channel,cm event is T", where T is the numeric event type.
However, the function fails only when T is a disconnect or device
removal.  Events other than these two are not actually an error, and
reporting them as an error is wrong.  If we need to report them to the
user, we should use something else, and what to use depends on why we
need to report them to the user.

For now, report this error only when the function actually fails.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-49-armbru@redhat.com>
2023-10-11 11:17:04 +02:00
Markus Armbruster
01efb10637 migration/rdma: Silence qemu_rdma_reg_control()
Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job.  When the caller does, the error is reported twice.  When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.

qemu_rdma_source_init() and qemu_rdma_accept() violate this principle:
they call error_report() via qemu_rdma_reg_control().  I elected not
to investigate how callers handle the error, i.e. precise impact is
not known.

Clean this up by dropping the error reporting from
qemu_rdma_reg_control().  I believe the callers' error reports
suffice.  If they don't, we need to convert to Error instead.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-48-armbru@redhat.com>
2023-10-11 11:17:04 +02:00
Markus Armbruster
35b1561e3e migration/rdma: Silence qemu_rdma_connect()
Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job.  When the caller does, the error is reported twice.  When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.

qemu_rdma_connect() violates this principle: it calls error_report()
and perror().  I elected not to investigate how callers handle the
error, i.e. precise impact is not known.

Clean this up: replace perror() by changing error_setg() to
error_setg_errno(), and drop error_report().  I believe the callers'
error reports suffice then.  If they don't, we need to convert to
Error instead.

Bonus: resolves a FIXME about problematic use of errno.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-47-armbru@redhat.com>
2023-10-11 11:17:04 +02:00
Markus Armbruster
e6696d3ee9 migration/rdma: Silence qemu_rdma_resolve_host()
Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job.  When the caller does, the error is reported twice.  When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.

qemu_rdma_resolve_host() violates this principle: it calls
error_report().

Clean this up: drop error_report().

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-46-armbru@redhat.com>
2023-10-11 11:17:04 +02:00
Markus Armbruster
07d5b94653 migration/rdma: Convert qemu_rdma_alloc_pd_cq() to Error
Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job.  When the caller does, the error is reported twice.  When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.

qemu_rdma_source_init() violates this principle: it calls
error_report() via qemu_rdma_alloc_pd_cq().  I elected not to
investigate how callers handle the error, i.e. precise impact is not
known.

Clean this up by converting qemu_rdma_alloc_pd_cq() to Error.

The conversion loses a piece of advice on one of two failure paths:

    Your mlock() limits may be too low. Please check $ ulimit -a # and search for 'ulimit -l' in the output

Not worth retaining.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-45-armbru@redhat.com>
2023-10-11 11:17:04 +02:00
Markus Armbruster
3c0c3eba8d migration/rdma: Convert qemu_rdma_post_recv_control() to Error
Just for symmetry with qemu_rdma_post_send_control().  Error messages
lose detail I consider of no use to users.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-44-armbru@redhat.com>
2023-10-11 11:17:04 +02:00
Markus Armbruster
f3805964f8 migration/rdma: Convert qemu_rdma_post_send_control() to Error
Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job.  When the caller does, the error is reported twice.  When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.

qemu_rdma_exchange_send() violates this principle: it calls
error_report() via qemu_rdma_post_send_control().  I elected not to
investigate how callers handle the error, i.e. precise impact is not
known.

Clean this up by converting qemu_rdma_post_send_control() to Error.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-43-armbru@redhat.com>
2023-10-11 11:17:04 +02:00
Markus Armbruster
446e559c43 migration/rdma: Convert qemu_rdma_write() to Error
Just for consistency with qemu_rdma_write_one() and
qemu_rdma_write_flush(), and for slightly simpler code.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-42-armbru@redhat.com>
2023-10-11 11:17:04 +02:00
Markus Armbruster
557c34ca60 migration/rdma: Convert qemu_rdma_write_one() to Error
Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job.  When the caller does, the error is reported twice.  When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.

qemu_rdma_write_flush() violates this principle: it calls
error_report() via qemu_rdma_write_one().  I elected not to
investigate how callers handle the error, i.e. precise impact is not
known.

Clean this up by converting qemu_rdma_write_one() to Error.  Bonus:
resolves a FIXME about problematic use of errno.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-41-armbru@redhat.com>
2023-10-11 11:17:04 +02:00
Markus Armbruster
5609547781 migration/rdma: Convert qemu_rdma_write_flush() to Error
Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job.  When the caller does, the error is reported twice.  When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.

qio_channel_rdma_writev() violates this principle: it calls
error_report() via qemu_rdma_write_flush().  I elected not to
investigate how callers handle the error, i.e. precise impact is not
known.

Clean this up by converting qemu_rdma_write_flush() to Error.

Necessitates setting an error when qemu_rdma_write_one() failed.
Since this error will go away later in this series, simply use "FIXME
temporary error message" there.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-40-armbru@redhat.com>
2023-10-11 11:17:04 +02:00
Markus Armbruster
de1aa35f8d migration/rdma: Convert qemu_rdma_reg_whole_ram_blocks() to Error
Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job.  When the caller does, the error is reported twice.  When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.

qemu_rdma_exchange_send() violates this principle: it calls
error_report() via callback qemu_rdma_reg_whole_ram_blocks().  I
elected not to investigate how callers handle the error, i.e. precise
impact is not known.

Clean this up by converting the callback to Error.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-39-armbru@redhat.com>
2023-10-11 11:17:04 +02:00
Markus Armbruster
3765ec1f43 migration/rdma: Convert qemu_rdma_exchange_get_response() to Error
Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job.  When the caller does, the error is reported twice.  When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.

qemu_rdma_exchange_send() and qemu_rdma_exchange_recv() violate this
principle: they call error_report() via
qemu_rdma_exchange_get_response().  I elected not to investigate how
callers handle the error, i.e. precise impact is not known.

Clean this up by converting qemu_rdma_exchange_get_response() to
Error.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-38-armbru@redhat.com>
2023-10-11 11:17:04 +02:00
Markus Armbruster
c4c78dce0b migration/rdma: Convert qemu_rdma_exchange_send() to Error
Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job.  When the caller does, the error is reported twice.  When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.

qio_channel_rdma_writev() violates this principle: it calls
error_report() via qemu_rdma_exchange_send().  I elected not to
investigate how callers handle the error, i.e. precise impact is not
known.

Clean this up by converting qemu_rdma_exchange_send() to Error.

Necessitates setting an error when qemu_rdma_post_recv_control(),
callback(), or qemu_rdma_exchange_get_response() failed.  Since these
errors will go away later in this series, simply use "FIXME temporary
error message" there.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-37-armbru@redhat.com>
2023-10-11 11:17:04 +02:00
Markus Armbruster
96f363d839 migration/rdma: Convert qemu_rdma_exchange_recv() to Error
Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job.  When the caller does, the error is reported twice.  When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.

qio_channel_rdma_readv() violates this principle: it calls
error_report() via qemu_rdma_exchange_recv().  I elected not to
investigate how callers handle the error, i.e. precise impact is not
known.

Clean this up by converting qemu_rdma_exchange_recv() to Error.

Necessitates setting an error when qemu_rdma_exchange_get_response()
failed.  Since this error will go away later in this series, simply
use "FIXME temporary error message" there.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-36-armbru@redhat.com>
2023-10-11 11:17:04 +02:00
Markus Armbruster
dcf07e72a4 migration/rdma: Drop "@errp is clear" guards around error_setg()
These guards are all redundant now.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-35-armbru@redhat.com>
2023-10-11 11:17:04 +02:00
Markus Armbruster
071d5ae4f3 migration/rdma: Fix error handling around rdma_getaddrinfo()
qemu_rdma_resolve_host() and qemu_rdma_dest_init() iterate over
addresses to find one that works, holding onto the first Error from
qemu_rdma_broken_ipv6_kernel() for use when no address works.  Issues:

1. If @errp was &error_abort or &error_fatal, we'd terminate instead
   of trying the next address.  Can't actually happen, since no caller
   passes these arguments.

2. When @errp is a pointer to a variable containing NULL, and
   qemu_rdma_broken_ipv6_kernel() fails, the variable no longer
   contains NULL.  Subsequent iterations pass it again, violating
   Error usage rules.  Dangerous, as setting an error would then trip
   error_setv()'s assertion.  Works only because
   qemu_rdma_broken_ipv6_kernel() and the code following the loops
   carefully avoids setting a second error.

3. If qemu_rdma_broken_ipv6_kernel() fails, and then a later iteration
   finds a working address, @errp still holds the first error from
   qemu_rdma_broken_ipv6_kernel().  If we then run into another error,
   we report the qemu_rdma_broken_ipv6_kernel() failure instead.

4. If we don't run into another error, we leak the Error object.

Use a local error variable, and propagate to @errp.  This fixes 3. and
also cleans up 1 and partly 2.

Free this error when we have a working address.  This fixes 4.

Pass the local error variable to qemu_rdma_broken_ipv6_kernel() only
until it fails.  Pass null on any later iterations.  This cleans up
the remainder of 2.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-34-armbru@redhat.com>
2023-10-11 11:17:04 +02:00
Markus Armbruster
8fd471bd77 migration/rdma: Retire macro ERROR()
ERROR() has become "error_setg() unless an error has been set
already".  Hiding the conditional in the macro is in the way of
further work.  Replace the macro uses by their expansion, and delete
the macro.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-33-armbru@redhat.com>
2023-10-11 11:17:04 +02:00
Markus Armbruster
1718f238d1 migration/rdma: Delete inappropriate error_report() in macro ERROR()
Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job.  When the caller does, the error is reported twice.  When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.

Macro ERROR() violates this principle.  Delete the error_report()
there.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Tested-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-32-armbru@redhat.com>
2023-10-11 11:17:03 +02:00
Markus Armbruster
e518b0050d migration/rdma: Plug a memory leak and improve a message
When migration capability @rdma-pin-all is true, but the server cannot
honor it, qemu_rdma_connect() calls macro ERROR(), then returns
success.

ERROR() sets an error.  Since qemu_rdma_connect() returns success, its
caller rdma_start_outgoing_migration() duly assumes @errp is still
clear.  The Error object leaks.

ERROR() additionally reports the situation to the user as an error:

    RDMA ERROR: Server cannot support pinning all memory. Will register memory dynamically.

Is this an error or not?  It actually isn't; we disable @rdma-pin-all
and carry on.  "Correcting" the user's configuration decisions that
way feels problematic, but that's a topic for another day.

Replace ERROR() by warn_report().  This plugs the memory leak, and
emits a clearer message to the user.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-31-armbru@redhat.com>
2023-10-11 11:17:03 +02:00
Markus Armbruster
4a10217962 migration/rdma: Check negative error values the same way everywhere
When a function returns 0 on success, negative value on error,
checking for non-zero suffices, but checking for negative is clearer.
So do that.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-30-armbru@redhat.com>
2023-10-11 11:17:03 +02:00
Markus Armbruster
c0d77702d2 migration/rdma: Drop superfluous assignments to @ret
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-29-armbru@redhat.com>
2023-10-11 11:17:03 +02:00