Commit Graph

738 Commits

Author SHA1 Message Date
Avihai Horon
5e1f8905ca vfio/migration: Emit VFIO migration QAPI event
Emit VFIO migration QAPI event when a VFIO device changes its migration
state. This can be used by management applications to get updates on the
current state of the VFIO device for their own purposes.

A new per VFIO device capability, "migration-events", is added so events
can be enabled only for the required devices. It is disabled by default.

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-05-16 16:59:19 +02:00
Vinayak Kale
187716feeb vfio/pci: migration: Skip config space check for Vendor Specific Information in VSC during restore/load
In case of migration, during restore operation, qemu checks config space of the
pci device with the config space in the migration stream captured during save
operation. In case of config space data mismatch, restore operation is failed.

config space check is done in function get_pci_config_device(). By default VSC
(vendor-specific-capability) in config space is checked.

Due to qemu's config space check for VSC, live migration is broken across NVIDIA
vGPU devices in situation where source and destination host driver is different.
In this situation, Vendor Specific Information in VSC varies on the destination
to ensure vGPU feature capabilities exposed to the guest driver are compatible
with destination host.

If a vfio-pci device is migration capable and vfio-pci vendor driver is OK with
volatile Vendor Specific Info in VSC then qemu should exempt config space check
for Vendor Specific Info. It is vendor driver's responsibility to ensure that
VSC is consistent across migration. Here consistency could mean that VSC format
should be same on source and destination, however actual Vendor Specific Info
may not be byte-to-byte identical.

This patch skips the check for Vendor Specific Information in VSC for VFIO-PCI
device by clearing pdev->cmask[] offsets. Config space check is still enforced
for 3 byte VSC header. If cmask[] is not set for an offset, then qemu skips
config space check for that offset.

VSC check is skipped for machine types >= 9.1. The check would be enforced on
older machine types (<= 9.0).

Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Vinayak Kale <vkale@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-05-16 16:59:19 +02:00
Cédric Le Goater
8aaeff97ac vfio/ccw: Make vfio_ccw_register_irq_notifier() return a bool
Since vfio_ccw_register_irq_notifier() takes an 'Error **' argument,
best practices suggest to return a bool. See the qapi/error.h Rules
section.

Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-05-16 16:59:19 +02:00
Cédric Le Goater
04f8e4f29b vfio/ccw: Use g_autofree variable in vfio_ccw_register_irq_notifier()
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-05-16 16:59:19 +02:00
Cédric Le Goater
cbd470f0aa vfio/ap: Make vfio_ap_register_irq_notifier() return a bool
Since vfio_ap_register_irq_notifier() takes and 'Error **' argument,
best practices suggest to return a bool. See the qapi/error.h Rules
section.

Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-05-16 16:59:19 +02:00
Cédric Le Goater
ae7aca14bd vfio/ap: Use g_autofree variable in vfio_ap_register_irq_notifier()
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Anthony Krowiak <akrowiak@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-05-16 16:59:19 +02:00
Cédric Le Goater
33dc04b072 vfio: Also trace event failures in vfio_save_complete_precopy()
vfio_save_complete_precopy() currently returns before doing the trace
event. Change that.

Reviewed-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-05-16 16:59:19 +02:00
Cédric Le Goater
2da5f9e4d8 vfio: Add Error** argument to .get_dirty_bitmap() handler
Let the callers do the error reporting. Add documentation while at it.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-05-16 16:59:19 +02:00
Cédric Le Goater
ebb481c03c memory: Add Error** argument to memory_get_xlat_addr()
Let the callers do the reporting. This will be useful in
vfio_iommu_map_dirty_notify().

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-05-16 16:59:19 +02:00
Cédric Le Goater
94d1208840 vfio: Reverse test on vfio_get_xlat_addr()
It will simplify the changes coming after.

Reviewed-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-05-16 16:59:19 +02:00
Cédric Le Goater
3783f814e7 vfio/migration: Add Error** argument to .vfio_save_config() handler
Use vmstate_save_state_with_err() to improve error reporting in the
callers and store a reported error under the migration stream. Add
documentation while at it.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-05-16 16:59:19 +02:00
Cédric Le Goater
fbd2469a66 vfio/migration: Add an Error** argument to vfio_migration_set_state()
Add an Error** argument to vfio_migration_set_state() and adjust
callers, including vfio_save_setup(). The error will be propagated up
to qemu_savevm_state_setup() where the save_setup() handler is
executed.

Modify vfio_vmstate_change_prepare() and vfio_vmstate_change() to
store a reported error under the migration stream if a migration is in
progress.

Reviewed-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-05-16 16:59:19 +02:00
Cédric Le Goater
019d9e6cc4 migration: Extend migration_file_set_error() with Error* argument
Use it to update the current error of the migration stream if
available and if not, simply print out the error. Next changes will
update with an error to report.

Reviewed-by: Avihai Horon <avihaih@nvidia.com>
Acked-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-05-16 16:59:19 +02:00
Cédric Le Goater
0f21358f33 vfio: Add Error** argument to vfio_devices_dma_logging_start()
This allows to update the Error argument of the VFIO log_global_start()
handler. Errors for container based logging will also be propagated to
qemu_savevm_state_setup() when the ram save_setup() handler is executed.
Also, errors from vfio_container_set_dirty_page_tracking() are now
collected and reported.

The vfio_set_migration_error() call becomes redundant in
vfio_listener_log_global_start(). Remove it.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-05-16 16:59:19 +02:00
Cédric Le Goater
836bb30868 vfio: Add Error** argument to .set_dirty_page_tracking() handler
We will use the Error object to improve error reporting in the
.log_global*() handlers of VFIO. Add documentation while at it.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-05-16 16:59:19 +02:00
Dongwon Kim
c0fcd6334f ui/console: Use qemu_dmabuf_new() and free() helpers instead
This commit introduces utility functions for the creation and deallocation
of QemuDmaBuf instances. Additionally, it updates all relevant sections
of the codebase to utilize these new utility functions.

v7: remove prefix, "dpy_gl_" from all helpers
    qemu_dmabuf_free() returns without doing anything if input is null
    (Daniel P. Berrangé <berrange@redhat.com>)
    call G_DEFINE_AUTOPTR_CLEANUP_FUNC for qemu_dmabuf_free()
    (Daniel P. Berrangé <berrange@redhat.com>)

v8: Introduction of helpers was removed as those were already added
    by the previous commit

v9: set dmabuf->allow_fences to 'true' when dmabuf is created in
    virtio_gpu_create_dmabuf()/virtio-gpu-udmabuf.c

    removed unnecessary spaces were accidently added in the patch,
    'ui/console: Use qemu_dmabuf_new() a...'

v11: Calling qemu_dmabuf_close was removed as closing dmabuf->fd will be
     done in qemu_dmabuf_free anyway.
     (Daniel P. Berrangé <berrange@redhat.com>)

v12: --- Calling qemu_dmabuf_close separately as qemu_dmabuf_free doesn't
         do it.

     --- 'dmabuf' is now allocated space so it should be freed at the end of
         dbus_scanout_texture

v13: --- Immediately free dmabuf after it is released to prevent possible
         leaking of the ptr
         (Marc-André Lureau <marcandre.lureau@redhat.com>)

     --- Use g_autoptr macro to define *dmabuf for auto clean up instead of
         calling qemu_dmabuf_free
         (Marc-André Lureau <marcandre.lureau@redhat.com>)

v14: --- (vhost-user-gpu) Change qemu_dmabuf_free back to g_clear_pointer
         as it was done because of some misunderstanding (v13).

     --- (vhost-user-gpu) g->dmabuf[m->scanout_id] needs to be set to NULL
         to prevent freed dmabuf to be accessed again in case if(fd==-1)break;
         happens (before new dmabuf is allocated). Otherwise, it would cause
         invalid memory access when the same function is executed. Also NULL
         check should be done before qemu_dmabuf_close (it asserts dmabuf!=NULL.).
         (Marc-André Lureau <marcandre.lureau@redhat.com>)

Suggested-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: Philippe Mathieu-Daudé <philmd@linaro.org>
Cc: Daniel P. Berrangé <berrange@redhat.com>
Cc: Vivek Kasireddy <vivek.kasireddy@intel.com>
Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
Message-Id: <20240508175403.3399895-6-dongwon.kim@intel.com>
2024-05-14 17:14:12 +04:00
Dongwon Kim
6779a3076f ui/console: Use qemu_dmabuf_get_..() helpers instead
This commit updates all instances where fields within the QemuDmaBuf
struct are directly accessed, replacing them with calls to these new
helper functions.

v6: fix typos in helper names in ui/spice-display.c

v7: removed prefix, "dpy_gl_" from all helpers

v8: Introduction of helpers was removed as those were already added
    by the previous commit

v11: -- Use new qemu_dmabuf_close() instead of close(qemu_dmabuf_get_fd()).
        (Daniel P. Berrangé <berrange@redhat.com>)
     -- Use new qemu_dmabuf_dup_fd() instead of dup(qemu_dmabuf_get_fd()).
        (Daniel P. Berrangé <berrange@redhat.com>)

Suggested-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: Philippe Mathieu-Daudé <philmd@linaro.org>
Cc: Daniel P. Berrangé <berrange@redhat.com>
Cc: Vivek Kasireddy <vivek.kasireddy@intel.com>
Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
Message-Id: <20240508175403.3399895-4-dongwon.kim@intel.com>
2024-05-14 17:14:12 +04:00
Cédric Le Goater
3688fec892 memory: Add Error** argument to .log_global_start() handler
Modify all .log_global_start() handlers to take an Error** parameter
and return a bool. Adapt memory_global_dirty_log_start() to interrupt
on the first error the loop on handlers. In such case, a rollback is
performed to stop dirty logging on all listeners where it was
previously enabled.

Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Anthony Perard <anthony.perard@citrix.com>
Cc: Paul Durrant <paul@xen.org>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240320064911.545001-10-clg@redhat.com
[peterx: modify & enrich the comment for listener_add_address_space() ]
Signed-off-by: Peter Xu <peterx@redhat.com>
2024-04-23 18:36:01 -04:00
Cédric Le Goater
e4fa064d56 migration: Add Error** argument to .load_setup() handler
This will be useful to report errors at a higher level, mostly in VFIO
today.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/r/20240320064911.545001-9-clg@redhat.com
[peterx: drop comment for ERRP_GUARD, per Markus]
Signed-off-by: Peter Xu <peterx@redhat.com>
2024-04-23 18:36:01 -04:00
Cédric Le Goater
01c3ac681b migration: Add Error** argument to .save_setup() handler
The purpose is to record a potential error in the migration stream if
qemu_savevm_state_setup() fails. Most of the current .save_setup()
handlers can be modified to use the Error argument instead of managing
their own and calling locally error_report().

Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Harsh Prateek Bora <harshpb@linux.ibm.com>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Thomas Huth <thuth@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Cc: John Snow <jsnow@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/r/20240320064911.545001-8-clg@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2024-04-23 18:36:01 -04:00
Cédric Le Goater
31cf7c1413 vfio: Always report an error in vfio_save_setup()
This will prepare ground for future changes adding an Error** argument
to the save_setup() handler. We need to make sure that on failure,
vfio_save_setup() always sets a new error.

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/r/20240320064911.545001-3-clg@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2024-04-23 18:36:01 -04:00
Cédric Le Goater
5700114462 vfio/iommufd: Fix memory leak
Coverity reported a memory leak on variable 'contents' in routine
iommufd_cdev_getfd(). Use g_autofree variables to simplify the exit
path and get rid of g_free() calls.

Cc: Eric Auger <eric.auger@redhat.com>
Cc: Yi Liu <yi.l.liu@intel.com>
Fixes: CID 1540007
Fixes: 5ee3dc7af7 ("vfio/iommufd: Implement the iommufd backend")
Suggested-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-03-19 11:56:37 +01:00
Peter Maydell
e692f9c6a6 * Add missing ERRP_GUARD() statements in functions that need it
* Prefer fast cpu_env() over slower CPU QOM cast macro
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmXwPhYRHHRodXRoQHJl
 ZGhhdC5jb20ACgkQLtnXdP5wLbWHvBAAgKx5LHFjz3xREVA+LkDTQ49mz0lK3s32
 SGvNlIHjiaDGVttVYhVC4sinBWUruG4Lyv/2QN72OJBzn6WUsEUQE3KPH1d7Y3/s
 wS9X7mj70n4kugWJqeIJP5AXSRasHmWoQ4QJLVQRJd6+Eb9jqwep0x7bYkI1de6D
 bL1Q7bIfkFeNQBXaiPWAm2i+hqmT4C1r8HEAGZIjAsMFrjy/hzBEjNV+pnh6ZSq9
 Vp8BsPWRfLU2XHm4WX0o8d89WUMAfUGbVkddEl/XjIHDrUD+Zbd1HAhLyfhsmrnE
 jXIwSzm+ML1KX4MoF5ilGtg8Oo0gQDEBy9/xck6G0HCm9lIoLKlgTxK9glr2vdT8
 yxZmrM9Hder7F9hKKxmb127xgU6AmL7rYmVqsoQMNAq22D6Xr4UDpgFRXNk2/wO6
 zZZBkfZ4H4MpZXbd/KJpXvYH5mQA4IpkOy8LJdE+dbcHX7Szy9ksZdPA+Z10hqqf
 zqS13qTs3abxymy2Q/tO3hPKSJCk1+vCGUkN60Wm+9VoLWGoU43qMc7gnY/pCS7m
 0rFKtvfwFHhokX1orK0lP/ppVzPv/5oFIeK8YDY9if+N+dU2LCwVZHIuf2/VJPRq
 wmgH2vAn3JDoRKPxTGX9ly6AMxuZaeP92qBTOPap0gDhihYzIpaCq9ecEBoTakI7
 tdFhV0iRr08=
 =NiP4
 -----END PGP SIGNATURE-----

Merge tag 'pull-request-2024-03-12' of https://gitlab.com/thuth/qemu into staging

* Add missing ERRP_GUARD() statements in functions that need it
* Prefer fast cpu_env() over slower CPU QOM cast macro

# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmXwPhYRHHRodXRoQHJl
# ZGhhdC5jb20ACgkQLtnXdP5wLbWHvBAAgKx5LHFjz3xREVA+LkDTQ49mz0lK3s32
# SGvNlIHjiaDGVttVYhVC4sinBWUruG4Lyv/2QN72OJBzn6WUsEUQE3KPH1d7Y3/s
# wS9X7mj70n4kugWJqeIJP5AXSRasHmWoQ4QJLVQRJd6+Eb9jqwep0x7bYkI1de6D
# bL1Q7bIfkFeNQBXaiPWAm2i+hqmT4C1r8HEAGZIjAsMFrjy/hzBEjNV+pnh6ZSq9
# Vp8BsPWRfLU2XHm4WX0o8d89WUMAfUGbVkddEl/XjIHDrUD+Zbd1HAhLyfhsmrnE
# jXIwSzm+ML1KX4MoF5ilGtg8Oo0gQDEBy9/xck6G0HCm9lIoLKlgTxK9glr2vdT8
# yxZmrM9Hder7F9hKKxmb127xgU6AmL7rYmVqsoQMNAq22D6Xr4UDpgFRXNk2/wO6
# zZZBkfZ4H4MpZXbd/KJpXvYH5mQA4IpkOy8LJdE+dbcHX7Szy9ksZdPA+Z10hqqf
# zqS13qTs3abxymy2Q/tO3hPKSJCk1+vCGUkN60Wm+9VoLWGoU43qMc7gnY/pCS7m
# 0rFKtvfwFHhokX1orK0lP/ppVzPv/5oFIeK8YDY9if+N+dU2LCwVZHIuf2/VJPRq
# wmgH2vAn3JDoRKPxTGX9ly6AMxuZaeP92qBTOPap0gDhihYzIpaCq9ecEBoTakI7
# tdFhV0iRr08=
# =NiP4
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 12 Mar 2024 11:35:50 GMT
# gpg:                using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg:                issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg:                 aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg:                 aka "Thomas Huth <huth@tuxfamily.org>" [full]
# gpg:                 aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3  EAB9 2ED9 D774 FE70 2DB5

* tag 'pull-request-2024-03-12' of https://gitlab.com/thuth/qemu: (55 commits)
  user: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/xtensa: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/tricore: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/sparc: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/sh4: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/rx: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/ppc: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/openrisc: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/nios2: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/mips: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/microblaze: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/m68k: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/loongarch: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/i386/hvf: Use CPUState typedef
  target/hexagon: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/cris: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/avr: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/alpha: Prefer fast cpu_env() over slower CPU QOM cast macro
  target: Replace CPU_GET_CLASS(cpu -> obj) in cpu_reset_hold() handler
  bulk: Call in place single use cpu_env()
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-03-12 16:55:42 +00:00
Peter Maydell
8f3f329f5e Migration pull request
- Avihai's fix to allow vmstate iterators to not starve for VFIO
 - Maksim's fix on additional check on precopy load error
 - Fabiano's fix on fdatasync() hang in mapped-ram
 - Jonathan's fix on vring cached access over MMIO regions
 - Cedric's cleanup patches 1-4 out of his error report series
 - Yu's fix for RDMA migration (which used to be broken even for 8.2)
 - Anthony's small cleanup/fix on err message
 - Steve's patches on privatize migration.h
 - Xiang's patchset to enable zero page detections in multifd threads
 -----BEGIN PGP SIGNATURE-----
 
 iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCZe9+uBIccGV0ZXJ4QHJl
 ZGhhdC5jb20ACgkQO1/MzfOr1wamaQD/SvmpMEcuRndT9LPSxzXowAGDZTBpYUfv
 5XAbx80dS9IBAO8PJJgQJIBHBeacyLBjHP9CsdVtgw5/VW+wCsbfV4AB
 =xavb
 -----END PGP SIGNATURE-----

Merge tag 'migration-20240311-pull-request' of https://gitlab.com/peterx/qemu into staging

Migration pull request

- Avihai's fix to allow vmstate iterators to not starve for VFIO
- Maksim's fix on additional check on precopy load error
- Fabiano's fix on fdatasync() hang in mapped-ram
- Jonathan's fix on vring cached access over MMIO regions
- Cedric's cleanup patches 1-4 out of his error report series
- Yu's fix for RDMA migration (which used to be broken even for 8.2)
- Anthony's small cleanup/fix on err message
- Steve's patches on privatize migration.h
- Xiang's patchset to enable zero page detections in multifd threads

# -----BEGIN PGP SIGNATURE-----
#
# iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCZe9+uBIccGV0ZXJ4QHJl
# ZGhhdC5jb20ACgkQO1/MzfOr1wamaQD/SvmpMEcuRndT9LPSxzXowAGDZTBpYUfv
# 5XAbx80dS9IBAO8PJJgQJIBHBeacyLBjHP9CsdVtgw5/VW+wCsbfV4AB
# =xavb
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 11 Mar 2024 21:59:20 GMT
# gpg:                using EDDSA key B9184DC20CC457DACF7DD1A93B5FCCCDF3ABD706
# gpg:                issuer "peterx@redhat.com"
# gpg: Good signature from "Peter Xu <xzpeter@gmail.com>" [marginal]
# gpg:                 aka "Peter Xu <peterx@redhat.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: B918 4DC2 0CC4 57DA CF7D  D1A9 3B5F CCCD F3AB D706

* tag 'migration-20240311-pull-request' of https://gitlab.com/peterx/qemu: (34 commits)
  migration/multifd: Add new migration test cases for legacy zero page checking.
  migration/multifd: Enable multifd zero page checking by default.
  migration/multifd: Implement ram_save_target_page_multifd to handle multifd version of MigrationOps::ram_save_target_page.
  migration/multifd: Implement zero page transmission on the multifd thread.
  migration/multifd: Add new migration option zero-page-detection.
  migration/multifd: Allow clearing of the file_bmap from multifd
  migration/multifd: Allow zero pages in file migration
  migration: purge MigrationState from public interface
  migration: delete unused accessors
  migration: privatize colo interfaces
  migration: migration_file_set_error
  migration: migration_is_device
  migration: migration_thread_is_self
  migration: export vcpu_dirty_limit_period
  migration: export migration_is_running
  migration: export migration_is_active
  migration: export migration_is_setup_or_active
  migration: remove migration.h references
  migration: export fewer options
  migration: Fix format in error message
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-03-12 11:35:41 +00:00
Zhao Liu
498696ef76 hw/vfio/platform: Fix missing ERRP_GUARD() for error_prepend()
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():

* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
*   error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.

ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].

The vfio_platform_realize() passes @errp to error_prepend(), and as a
DeviceClass.realize method, there are too many possible callers to check
the impact of this defect; it may or may not be harmless. Thus it is
necessary to protect @errp with ERRP_GUARD().

To avoid the issue like [1] said, add missing ERRP_GUARD() at the
beginning of this function.

[1]: Issue description in the commit message of commit ae7c80a7bd
     ("error: New macro ERRP_GUARD()").

Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Message-ID: <20240311033822.3142585-25-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-03-12 11:45:45 +01:00
Zhao Liu
cf8afdfad4 hw/vfio/pci: Fix missing ERRP_GUARD() for error_prepend()
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():

* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
*   error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.

ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].

In hw/vfio/pci.c, there are 2 functions passing @errp to error_prepend()
without ERRP_GUARD():
- vfio_add_std_cap()
- vfio_realize()

The @errp of vfio_add_std_cap() is also from vfio_realize(). And
vfio_realize(), as a PCIDeviceClass.realize method, its @errp is from
DeviceClass.realize so that there is no guarantee that the @errp won't
point to @error_fatal.

To avoid the issue like [1] said, add missing ERRP_GUARD() at their
beginning.

[1]: Issue description in the commit message of commit ae7c80a7bd
     ("error: New macro ERRP_GUARD()").

Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Message-ID: <20240311033822.3142585-24-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-03-12 11:45:45 +01:00
Zhao Liu
447655089e hw/vfio/pci-quirks: Fix missing ERRP_GUARD() for error_prepend()
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():

* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
*   error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.

ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].

In hw/vfio/pci-quirks.c, there are 2 functions passing @errp to
error_prepend() without ERRP_GUARD():
- vfio_add_nv_gpudirect_cap()
- vfio_add_vmd_shadow_cap()

There are too many possible callers to check the impact of this defect;
it may or may not be harmless. Thus it is necessary to protect their
@errp with ERRP_GUARD().

To avoid the issue like [1] said, add missing ERRP_GUARD() at the
beginning of this function.

[1]: Issue description in the commit message of commit ae7c80a7bd
     ("error: New macro ERRP_GUARD()").

Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Message-ID: <20240311033822.3142585-23-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-03-12 11:45:45 +01:00
Zhao Liu
5cf8f51c03 hw/vfio/iommufd: Fix missing ERRP_GUARD() for error_prepend()
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():

* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
*   error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.

ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].

The iommufd_cdev_getfd() passes @errp to error_prepend(). Its @errp is
from vfio_attach_device(), and there are too many possible callers to
check the impact of this defect; it may or may not be harmless. Thus it
is necessary to protect @errp with ERRP_GUARD().

To avoid the issue like [1] said, add missing ERRP_GUARD() at the
beginning of this function.

[1]: Issue description in the commit message of commit ae7c80a7bd
     ("error: New macro ERRP_GUARD()").

Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Message-ID: <20240311033822.3142585-22-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-03-12 11:45:34 +01:00
Zhao Liu
89a8a2e9c4 hw/vfio/helpers: Fix missing ERRP_GUARD() for error_prepend()
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():

* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
*   error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.

ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].

In hw/vfio/helpers.c, there are 3 functions passing @errp to
error_prepend() without ERRP_GUARD():
 - vfio_set_irq_signaling()
 - vfio_device_get_name()
 - vfio_device_set_fd()

There are too many possible callers to check the impact of this defect;
it may or may not be harmless. Thus it is necessary to protect their
@errp with ERRP_GUARD().

To avoid the issue like [1] said, add missing ERRP_GUARD() at their
beginning.

[1]: Issue description in the commit message of commit ae7c80a7bd
     ("error: New macro ERRP_GUARD()").

Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Message-ID: <20240311033822.3142585-21-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-03-12 11:45:34 +01:00
Zhao Liu
93a518ba56 hw/vfio/container: Fix missing ERRP_GUARD() for error_prepend()
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():

* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
*   error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.

ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].

The vfio_get_group() passes @errp to error_prepend(). Its @errp is
from vfio_attach_device(), and there are too many possible callers to
check the impact of this defect; it may or may not be harmless. Thus it
is necessary to protect @errp with ERRP_GUARD().

To avoid the issue like [1] said, add missing ERRP_GUARD() at the
beginning of this function.

[1]: Issue description in the commit message of commit ae7c80a7bd
     ("error: New macro ERRP_GUARD()").

Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Message-ID: <20240311033822.3142585-20-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-03-12 11:45:34 +01:00
Zhao Liu
c9e21f9987 hw/vfio/ap: Fix missing ERRP_GUARD() for error_prepend()
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():

* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
*   error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.

ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].

The vfio_ap_realize() passes @errp to error_prepend(), and as a
DeviceClass.realize method, there are too many possible callers to check
the impact of this defect; it may or may not be harmless. Thus it is
necessary to protect @errp with ERRP_GUARD().

To avoid the issue like [1] said, add missing ERRP_GUARD() at the
beginning of this function.

[1]: Issue description in the commit message of commit ae7c80a7bd
     ("error: New macro ERRP_GUARD()").

Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Cédric Le Goater <clg@redhat.com>
Cc: Tony Krowiak <akrowiak@linux.ibm.com>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Jason Herne <jjherne@linux.ibm.com>
Cc: Thomas Huth <thuth@redhat.com>
Cc: qemu-s390x@nongnu.org
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Message-ID: <20240311033822.3142585-19-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-03-12 11:45:34 +01:00
Steve Sistare
20c64c8a51 migration: migration_file_set_error
Define and export migration_file_set_error to eliminate a dependency
on MigrationState.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Link: https://lore.kernel.org/r/1710179338-294359-9-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2024-03-11 16:28:59 -04:00
Steve Sistare
9bb630c6ee migration: migration_is_device
Define and export migration_is_device to eliminate a dependency
on MigrationState.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Link: https://lore.kernel.org/r/1710179338-294359-8-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2024-03-11 16:28:59 -04:00
Steve Sistare
3a6813b68c migration: export migration_is_active
Delete the MigrationState parameter from migration_is_active so it
can be exported and used without including migration.h.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Link: https://lore.kernel.org/r/1710179338-294359-4-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2024-03-11 16:28:59 -04:00
Steve Sistare
7dcb3c87d8 migration: export migration_is_setup_or_active
Delete the MigrationState parameter from migration_is_setup_or_active
and move it to the public API in misc.h.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Link: https://lore.kernel.org/r/1710179338-294359-3-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2024-03-11 16:28:59 -04:00
Steve Sistare
f853fa0714 migration: remove migration.h references
Remove migration.h from files that no longer need it due to
previous commits.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Link: https://lore.kernel.org/r/1710179338-294359-2-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2024-03-11 16:28:59 -04:00
Steve Sistare
f3bff6c443 migration: export fewer options
A small number of migration options are accessed by migration clients,
but to see them clients must include all of options.h, which is mostly
for migration core code.  migrate_mode() in particular will be needed by
multiple clients.

Refactor the option declarations so clients can see the necessary few via
misc.h, which already exports a portion of the client API.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Link: https://lore.kernel.org/r/1710179319-294320-1-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2024-03-11 16:28:59 -04:00
Peter Maydell
67ff703c97 vfio queue:
* Allow cpr-reboot for vfio
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmXtrhMACgkQUaNDx8/7
 7KFCgw/7BEdbP4e32/+D0Mk1+oeiulgGvo/kPtg7waWACrijWxG5K5/c0kcFnhte
 OnpDfObnF7l2GjKU61SAluTocWcVbb8A61Tt5+ta5xxPQGp5XCXF7aIkb/DQQDq/
 yU7pQCmqMMSgo+siR6yb+g/2t71dYlMyCbW3LU5/oGQkFtsSXjWHqnTut+wFeuRT
 Vd62vHcZqHWG2epoyEnW3HCVMrZ8Dl+PLGkORV55P6uEzZuKwgoCTmR2m5/NLkdU
 SF4ZnZzruqkc1dsRh+vYFglQ6GttzWz1VBJg8GJTwrXAJ7C7JSTS25fNRNNhRkJl
 2/DZbdMyyJWJmrv9AXwWEEJ+bSSbM3uM76hqgMFVUlyz9y8FXduwf4MIkZjl0Jg5
 phJb3Awxxxd61I3vWhi9lZKS/RHQ5anA+rBt/4RxdnzqZ3mDcoBhiEjNaGp1Yghv
 QbkOSUmY8aBFdeqaw4UqpL2l6mlN/idAOIyq7ADp0S2/eNftw/FGBEu+KP5C/IZF
 vReB36qWws9go9w8aDdyK/1bB9vMI3fo/AA8Y4Cnr22tEbgctfygl2jmmyHZUFDI
 R8/NESzv17G2/g7OowCqG0qrOWW+UTDcDCkOlVFlLABiij9aNUYV631cQ379K7tR
 mFo10BlTcXV83XkA9iM9qqIIVNoYm5uJRva+oEaBncRIAGiYTF4=
 =kiGc
 -----END PGP SIGNATURE-----

Merge tag 'pull-vfio-20240310' of https://github.com/legoater/qemu into staging

vfio queue:

* Allow cpr-reboot for vfio

# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmXtrhMACgkQUaNDx8/7
# 7KFCgw/7BEdbP4e32/+D0Mk1+oeiulgGvo/kPtg7waWACrijWxG5K5/c0kcFnhte
# OnpDfObnF7l2GjKU61SAluTocWcVbb8A61Tt5+ta5xxPQGp5XCXF7aIkb/DQQDq/
# yU7pQCmqMMSgo+siR6yb+g/2t71dYlMyCbW3LU5/oGQkFtsSXjWHqnTut+wFeuRT
# Vd62vHcZqHWG2epoyEnW3HCVMrZ8Dl+PLGkORV55P6uEzZuKwgoCTmR2m5/NLkdU
# SF4ZnZzruqkc1dsRh+vYFglQ6GttzWz1VBJg8GJTwrXAJ7C7JSTS25fNRNNhRkJl
# 2/DZbdMyyJWJmrv9AXwWEEJ+bSSbM3uM76hqgMFVUlyz9y8FXduwf4MIkZjl0Jg5
# phJb3Awxxxd61I3vWhi9lZKS/RHQ5anA+rBt/4RxdnzqZ3mDcoBhiEjNaGp1Yghv
# QbkOSUmY8aBFdeqaw4UqpL2l6mlN/idAOIyq7ADp0S2/eNftw/FGBEu+KP5C/IZF
# vReB36qWws9go9w8aDdyK/1bB9vMI3fo/AA8Y4Cnr22tEbgctfygl2jmmyHZUFDI
# R8/NESzv17G2/g7OowCqG0qrOWW+UTDcDCkOlVFlLABiij9aNUYV631cQ379K7tR
# mFo10BlTcXV83XkA9iM9qqIIVNoYm5uJRva+oEaBncRIAGiYTF4=
# =kiGc
# -----END PGP SIGNATURE-----
# gpg: Signature made Sun 10 Mar 2024 12:56:51 GMT
# gpg:                using RSA key A0F66548F04895EBFE6B0B6051A343C7CFFBECA1
# gpg: Good signature from "Cédric Le Goater <clg@kaod.org>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: A0F6 6548 F048 95EB FE6B  0B60 51A3 43C7 CFFB ECA1

* tag 'pull-vfio-20240310' of https://github.com/legoater/qemu:
  vfio: allow cpr-reboot migration if suspended
  vfio: register container for cpr

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-03-11 18:42:23 +00:00
Avihai Horon
ff64e0ba81 vfio/migration: Add a note about migration rate limiting
VFIO migration buffer size is currently limited to 1MB. Therefore, there
is no need to check if migration rate exceeded, as in the worst case it
will exceed by only 1MB.

However, if the buffer size is later changed to a bigger value,
vfio_save_iterate() should enforce migration rate (similar to migration
RAM code).

Add a note about this in vfio_save_iterate() to serve as a reminder.

Suggested-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240304105339.20713-4-avihaih@nvidia.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2024-03-11 14:41:40 -04:00
Avihai Horon
3f6ed59ec4 vfio/migration: Refactor vfio_save_state() return value
Currently, vfio_save_state() returns 1 regardless of whether there is
more data to send or not. This was done to prevent a fast changing VFIO
device from potentially blocking other devices from sending their data,
as qemu_savevm_state_iterate() serialized devices.

Now that qemu_savevm_state_iterate() no longer serializes devices, there
is no need for that.

Refactor vfio_save_state() to return 0 if more data is available and 1
if no more data is available.

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240304105339.20713-3-avihaih@nvidia.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2024-03-11 14:41:40 -04:00
Frediano Ziglio
75d5a5fe67 hw/vfio/pci.c: Make some structure static
Not used outside C module.

Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-03-09 18:56:36 +03:00
Steve Sistare
0cb51c183a vfio: allow cpr-reboot migration if suspended
Allow cpr-reboot for vfio if the guest is in the suspended runstate.  The
guest drivers' suspend methods flush outstanding requests and re-initialize
the devices, and thus there is no device state to save and restore.  The
user is responsible for suspending the guest before initiating cpr, such as
by issuing guest-suspend-ram to the qemu guest agent.

Relax the vfio blocker so it does not apply to cpr, and add a notifier that
verifies the guest is suspended.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
2024-03-08 22:10:13 +01:00
Steve Sistare
d9fa4223b3 vfio: register container for cpr
Define entry points to perform per-container cpr-specific initialization
and teardown.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
2024-03-08 22:10:13 +01:00
Steve Sistare
5663dd3f1a migration: MigrationNotifyFunc
Define MigrationNotifyFunc to improve type safety and simplify migration
notifiers.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/1708622920-68779-7-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2024-02-28 11:31:28 +08:00
Steve Sistare
9d9babf78d migration: MigrationEvent for notifiers
Passing MigrationState to notifiers is unsound because they could access
unstable migration state internals or even modify the state.  Instead, pass
the minimal info needed in a new MigrationEvent struct, which could be
extended in the future if needed.

Suggested-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/1708622920-68779-5-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2024-02-28 11:31:28 +08:00
Steve Sistare
3e7757301c migration: convert to NotifierWithReturn
Change all migration notifiers to type NotifierWithReturn, so notifiers
can return an error status in a future patch.  For now, pass NULL for the
notifier error parameter, and do not check the return value.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/1708622920-68779-4-git-send-email-steven.sistare@oracle.com
[peterx: dropped unexpected update to roms/seabios-hppa]
Signed-off-by: Peter Xu <peterx@redhat.com>
2024-02-28 11:31:28 +08:00
Cédric Le Goater
d2b668fca5 vfio/pci: Clear MSI-X IRQ index always
When doing device assignment of a physical device, MSI-X can be
enabled with no vectors enabled and this sets the IRQ index to
VFIO_PCI_MSIX_IRQ_INDEX. However, when MSI-X is disabled, the IRQ
index is left untouched if no vectors are in use. Then, when INTx
is enabled, the IRQ index value is considered incompatible (set to
MSI-X) and VFIO_DEVICE_SET_IRQS fails. QEMU complains with :

qemu-system-x86_64: vfio 0000:08:00.0: Failed to set up TRIGGER eventfd signaling for interrupt INTX-0: VFIO_DEVICE_SET_IRQS failure: Invalid argument

To avoid that, unconditionaly clear the IRQ index when MSI-X is
disabled.

Buglink: https://issues.redhat.com/browse/RHEL-21293
Fixes: 5ebffa4e87 ("vfio/pci: use an invalid fd to enable MSI-X")
Cc: Jing Liu <jing2.liu@intel.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-01-29 08:26:25 +01:00
Paolo Bonzini
850051b94b vfio: use matching sizeof type
Do not use uint64_t for the type of the declaration and __u64 when
computing the number of elements in the array.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-01-29 08:26:25 +01:00
Paolo Bonzini
592d0bc030 remove unnecessary casts from uintptr_t
uintptr_t, or unsigned long which is equivalent on Linux I32LP64 systems,
is an unsigned type and there is no need to further cast to __u64 which is
another unsigned integer type; widening casts from unsigned integers
zero-extend the value.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-01-18 10:43:51 +01:00
Avihai Horon
c817e5a377 vfio/migration: Add helper function to set state or reset device
There are several places where failure in setting the device state leads
to a device reset, which is done by setting ERROR as the recover state.

Add a helper function that sets the device state and resets the device
in case of failure. This will make the code cleaner and remove duplicate
comments.

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-01-05 21:25:20 +01:00
Zhenzhong Duan
9f734a117c vfio/container: Rename vfio_init_container to vfio_set_iommu
vfio_container_init() and vfio_init_container() names are confusing
especially when we see vfio_init_container() calls vfio_container_init().

vfio_container_init() operates on base container which is consistent
with all routines handling 'VFIOContainerBase *' ops.

vfio_init_container() operates on legacy container and setup IOMMU
context with ioctl(VFIO_SET_IOMMU).

So choose to rename vfio_init_container to vfio_set_iommu to avoid
the confusion.

No functional change intended.

Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
2024-01-05 21:25:20 +01:00
Cédric Le Goater
6ba254801f vfio/iommufd: Remove the use of stat() to check file existence
Using stat() before opening a file or a directory can lead to a
time-of-check to time-of-use (TOCTOU) filesystem race, which is
reported by coverity as a Security best practices violations. The
sequence could be replaced by open and fdopendir but it doesn't add
much in this case. Simply use opendir to avoid the race.

Fixes: CID 1531551
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Zhenzhong Duan <Zhenzhong.duan@intel.com>
2024-01-05 21:25:20 +01:00
Volker Rümelin
9353b6da43 hw/vfio: fix iteration over global VFIODevice list
Commit 3d779abafe ("vfio/common: Introduce a global VFIODevice list")
introduced a global VFIODevice list, but forgot to update the list
element field name when iterating over the new list. Change the code
to use the correct list element field.

Fixes: 3d779abafe ("vfio/common: Introduce a global VFIODevice list")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2061
Signed-off-by: Volker Rümelin <vr_qemu@t-online.de>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
2024-01-05 21:25:20 +01:00
Cédric Le Goater
213ae3ffda vfio/container: Replace basename with g_path_get_basename
g_path_get_basename() is a portable utility function that has the
advantage of not modifing the string argument. It also fixes a compile
breakage with the Musl C library reported in [1].

[1] https://lore.kernel.org/all/20231212010228.2701544-1-raj.khem@gmail.com/

Reported-by: Khem Raj <raj.khem@gmail.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-01-05 21:25:20 +01:00
Cédric Le Goater
c1139fa4fe vfio/iommufd: Remove CONFIG_IOMMUFD usage
Availability of the IOMMUFD backend can now be fully determined at
runtime and the ifdef check was a build time protection (for PPC not
supporting it mostly).

Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Tested-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-01-05 21:25:20 +01:00
Cédric Le Goater
10164df6ed vfio/spapr: Only compile sPAPR IOMMU support when needed
sPAPR IOMMU support is only needed for pseries machines. Compile out
support when CONFIG_PSERIES is not set. This saves ~7K of text.

Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Tested-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-01-05 21:25:20 +01:00
Cédric Le Goater
ce5f6d49f5 vfio/iommufd: Introduce a VFIOIOMMU iommufd QOM interface
As previously done for the sPAPR and legacy IOMMU backends, convert
the VFIOIOMMUOps struct to a QOM interface. The set of of operations
for this backend can be referenced with a literal typename instead of
a C struct.

Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Tested-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-01-05 21:25:20 +01:00
Cédric Le Goater
f221f641a2 vfio/spapr: Introduce a sPAPR VFIOIOMMU QOM interface
Move vfio_spapr_container_setup() to a VFIOIOMMUClass::setup handler
and convert the sPAPR VFIOIOMMUOps struct to a QOM interface. The
sPAPR QOM interface inherits from the legacy QOM interface because
because both have the same basic needs. The sPAPR interface is then
extended with the handlers specific to the sPAPR IOMMU.

This allows reuse and provides better abstraction of the backends. It
will be useful to avoid compiling the sPAPR IOMMU backend on targets
not supporting it.

Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Tested-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-01-05 21:25:20 +01:00
Cédric Le Goater
61d893f2cd vfio/container: Intoduce a new VFIOIOMMUClass::setup handler
This will help in converting the sPAPR IOMMU backend to a QOM interface.

Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Tested-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-01-05 21:25:20 +01:00
Cédric Le Goater
9812feefab vfio/container: Introduce a VFIOIOMMU legacy QOM interface
Convert the legacy VFIOIOMMUOps struct to the new VFIOIOMMU QOM
interface. The set of of operations for this backend can be referenced
with a literal typename instead of a C struct. This will simplify
support of multiple backends.

Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Tested-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-01-05 21:25:20 +01:00
Cédric Le Goater
fdaa774e67 vfio/container: Introduce a VFIOIOMMU QOM interface
VFIOContainerBase was not introduced as an abstract QOM object because
it felt unnecessary to expose all the IOMMU backends to the QEMU
machine and human interface. However, we can still abstract the IOMMU
backend handlers using a QOM interface class. This provides more
flexibility when referencing the various implementations.

Simply transform the VFIOIOMMUOps struct in an InterfaceClass and do
some initial name replacements. Next changes will start converting
VFIOIOMMUOps.

Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Tested-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-01-05 21:25:20 +01:00
Cédric Le Goater
bffe92af0e vfio/container: Initialize VFIOIOMMUOps under vfio_init_container()
vfio_init_container() already defines the IOMMU type of the container.
Do the same for the VFIOIOMMUOps struct. This prepares ground for the
following patches that will deduce the associated VFIOIOMMUOps struct
from the IOMMU type.

Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Tested-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-01-05 21:25:20 +01:00
Cédric Le Goater
d3764db875 vfio/container: Introduce vfio_legacy_setup() for further cleanups
This will help subsequent patches to unify the initialization of type1
and sPAPR IOMMU backends.

Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Tested-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-01-05 21:25:20 +01:00
Cédric Le Goater
001a013ea3 vfio/spapr: Extend VFIOIOMMUOps with a release handler
This allows to abstract a bit more the sPAPR IOMMU support in the
legacy IOMMU backend.

Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Tested-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-01-05 21:25:20 +01:00
Richard Henderson
65bd53e8d5 hw/vfio: Constify VMState
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Message-Id: <20231221031652.119827-60-richard.henderson@linaro.org>
2023-12-30 07:38:06 +11:00
Zhenzhong Duan
6106a32914 vfio: Introduce a helper function to initialize VFIODevice
Introduce a helper function to replace the common code to initialize
VFIODevice in pci, platform, ap and ccw VFIO device.

No functional change intended.

Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19 19:03:38 +01:00
Zhenzhong Duan
c12b55ad6f vfio/ccw: Move VFIODevice initializations in vfio_ccw_instance_init
Some of the VFIODevice initializations is in vfio_ccw_realize,
move all of them in vfio_ccw_instance_init.

No functional change intended.

Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19 19:03:38 +01:00
Zhenzhong Duan
cbbcc2f170 vfio/ap: Move VFIODevice initializations in vfio_ap_instance_init
Some of the VFIODevice initializations is in vfio_ap_realize,
move all of them in vfio_ap_instance_init.

No functional change intended.

Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19 19:03:38 +01:00
Zhenzhong Duan
a0cf44c8d6 vfio/platform: Move VFIODevice initializations in vfio_platform_instance_init
Some of the VFIODevice initializations is in vfio_platform_realize,
move all of them in vfio_platform_instance_init.

No functional change intended.

Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19 19:03:38 +01:00
Zhenzhong Duan
dd2fcb1716 vfio/pci: Move VFIODevice initializations in vfio_instance_init
Some of the VFIODevice initializations is in vfio_realize,
move all of them in vfio_instance_init.

No functional change intended.

Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19 19:03:38 +01:00
Zhenzhong Duan
4517c33c31 vfio: Make VFIOContainerBase poiner parameter const in VFIOIOMMUOps callbacks
Some of the callbacks in VFIOIOMMUOps pass VFIOContainerBase poiner,
those callbacks only need read access to the sub object of VFIOContainerBase.
So make VFIOContainerBase, VFIOContainer and VFIOIOMMUFDContainer as const
in these callbacks.

Local functions called by those callbacks also need same changes to avoid
build error.

Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19 19:03:38 +01:00
Zhenzhong Duan
909a6254ed vfio/ccw: Make vfio cdev pre-openable by passing a file handle
This gives management tools like libvirt a chance to open the vfio
cdev with privilege and pass FD to qemu. This way qemu never needs
to have privilege to open a VFIO or iommu cdev node.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19 19:03:38 +01:00
Zhenzhong Duan
e70f971a6c vfio/ccw: Allow the selection of a given iommu backend
Now we support two types of iommu backends, let's add the capability
to select one of them. This depends on whether an iommufd object has
been linked with the vfio-ccw device:

If the user wants to use the legacy backend, it shall not
link the vfio-ccw device with any iommufd object:

 -device vfio-ccw,sysfsdev=/sys/bus/mdev/devices/XXX

This is called the legacy mode/backend.

If the user wants to use the iommufd backend (/dev/iommu) it
shall pass an iommufd object id in the vfio-ccw device options:

 -object iommufd,id=iommufd0
 -device vfio-ccw,sysfsdev=/sys/bus/mdev/devices/XXX,iommufd=iommufd0

Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19 19:03:38 +01:00
Zhenzhong Duan
5e7ba401b7 vfio/ap: Make vfio cdev pre-openable by passing a file handle
This gives management tools like libvirt a chance to open the vfio
cdev with privilege and pass FD to qemu. This way qemu never needs
to have privilege to open a VFIO or iommu cdev node.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19 19:03:38 +01:00
Zhenzhong Duan
336f308958 vfio/ap: Allow the selection of a given iommu backend
Now we support two types of iommu backends, let's add the capability
to select one of them. This depends on whether an iommufd object has
been linked with the vfio-ap device:

if the user wants to use the legacy backend, it shall not
link the vfio-ap device with any iommufd object:

 -device vfio-ap,sysfsdev=/sys/bus/mdev/devices/XXX

This is called the legacy mode/backend.

If the user wants to use the iommufd backend (/dev/iommu) it
shall pass an iommufd object id in the vfio-ap device options:

 -object iommufd,id=iommufd0
 -device vfio-ap,sysfsdev=/sys/bus/mdev/devices/XXX,iommufd=iommufd0

Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19 19:03:38 +01:00
Zhenzhong Duan
3016e60f8f vfio/platform: Make vfio cdev pre-openable by passing a file handle
This gives management tools like libvirt a chance to open the vfio
cdev with privilege and pass FD to qemu. This way qemu never needs
to have privilege to open a VFIO or iommu cdev node.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19 19:03:38 +01:00
Zhenzhong Duan
a6c50e1c3f vfio/platform: Allow the selection of a given iommu backend
Now we support two types of iommu backends, let's add the capability
to select one of them. This depends on whether an iommufd object has
been linked with the vfio-platform device:

If the user wants to use the legacy backend, it shall not
link the vfio-platform device with any iommufd object:

 -device vfio-platform,host=XXX

This is called the legacy mode/backend.

If the user wants to use the iommufd backend (/dev/iommu) it
shall pass an iommufd object id in the vfio-platform device options:

 -object iommufd,id=iommufd0
 -device vfio-platform,host=XXX,iommufd=iommufd0

Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19 19:03:38 +01:00
Zhenzhong Duan
da3e04b26f vfio/pci: Make vfio cdev pre-openable by passing a file handle
This gives management tools like libvirt a chance to open the vfio
cdev with privilege and pass FD to qemu. This way qemu never needs
to have privilege to open a VFIO or iommu cdev node.

Together with the earlier support of pre-opening /dev/iommu device,
now we have full support of passing a vfio device to unprivileged
qemu by management tool. This mode is no more considered for the
legacy backend. So let's remove the "TODO" comment.

Add helper functions vfio_device_set_fd() and vfio_device_get_name()
to set fd and get device name, they will also be used by other vfio
devices.

There is no easy way to check if a device is mdev with FD passing,
so fail the x-balloon-allowed check unconditionally in this case.

There is also no easy way to get BDF as name with FD passing, so
we fake a name by VFIO_FD[fd].

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19 19:03:38 +01:00
Eric Auger
ee42b261b0 vfio/pci: Allow the selection of a given iommu backend
Now we support two types of iommu backends, let's add the capability
to select one of them. This depends on whether an iommufd object has
been linked with the vfio-pci device:

If the user wants to use the legacy backend, it shall not
link the vfio-pci device with any iommufd object:

 -device vfio-pci,host=0000:02:00.0

This is called the legacy mode/backend.

If the user wants to use the iommufd backend (/dev/iommu) it
shall pass an iommufd object id in the vfio-pci device options:

 -object iommufd,id=iommufd0
 -device vfio-pci,host=0000:02:00.0,iommufd=iommufd0

Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19 19:03:38 +01:00
Zhenzhong Duan
96d6f85ff0 vfio/iommufd: Enable pci hot reset through iommufd cdev interface
Implement the newly introduced pci_hot_reset callback named
iommufd_cdev_pci_hot_reset to do iommufd specific check and
reset operation.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19 19:03:38 +01:00
Zhenzhong Duan
c328e7e8ad vfio/pci: Introduce a vfio pci hot reset interface
Legacy vfio pci and iommufd cdev have different process to hot reset
vfio device, expand current code to abstract out pci_hot_reset callback
for legacy vfio, this same interface will also be used by iommufd
cdev vfio device.

Rename vfio_pci_hot_reset to vfio_legacy_pci_hot_reset and move it
into container.c.

vfio_pci_[pre/post]_reset and vfio_pci_host_match are exported so
they could be called in legacy and iommufd pci_hot_reset callback.

Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19 19:03:38 +01:00
Zhenzhong Duan
4d36ec23a7 vfio/pci: Extract out a helper vfio_pci_get_pci_hot_reset_info
This helper will be used by both legacy and iommufd backends.

No functional changes intended.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19 19:03:38 +01:00
Zhenzhong Duan
714e9affa8 vfio/iommufd: Add support for iova_ranges and pgsizes
Some vIOMMU such as virtio-iommu use IOVA ranges from host side to
setup reserved ranges for passthrough device, so that guest will not
use an IOVA range beyond host support.

Use an uAPI of IOMMUFD to get IOVA ranges of host side and pass to
vIOMMU just like the legacy backend, if this fails, fallback to
64bit IOVA range.

Also use out_iova_alignment returned from uAPI as pgsizes instead of
qemu_real_host_page_size() as a fallback.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19 19:03:38 +01:00
Zhenzhong Duan
36e84d0c17 vfio/iommufd: Relax assert check for iommufd backend
Currently iommufd doesn't support dirty page sync yet,
but it will not block us doing live migration if VFIO
migration is force enabled.

So in this case we allow set_dirty_page_tracking to be NULL.
Note we don't need same change for query_dirty_bitmap because
when dirty page sync isn't supported, query_dirty_bitmap will
never be called.

Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19 19:03:38 +01:00
Yi Liu
5ee3dc7af7 vfio/iommufd: Implement the iommufd backend
The iommufd backend is implemented based on the new /dev/iommu user API.
This backend obviously depends on CONFIG_IOMMUFD.

So far, the iommufd backend doesn't support dirty page sync yet.

Co-authored-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19 19:03:38 +01:00
Zhenzhong Duan
1eae5b7bd3 vfio/common: return early if space isn't empty
This is a trivial optimization. If there is active container in space,
vfio_reset_handler will never be unregistered. So revert the check of
space->containers and return early.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19 19:03:38 +01:00
Zhenzhong Duan
dbb9d0c969 vfio/spapr: Move hostwin_list into spapr container
No functional changes intended.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19 19:03:38 +01:00
Zhenzhong Duan
6ad359ec29 vfio/spapr: Move prereg_listener into spapr container
No functional changes intended.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19 19:03:38 +01:00
Zhenzhong Duan
233309e8e4 vfio/spapr: switch to spapr IOMMU BE add/del_section_window
No functional change intended.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19 19:03:38 +01:00
Zhenzhong Duan
9b7d38bf5a vfio/spapr: Introduce spapr backend and target interface
Introduce an empty spapr backend which will hold spapr specific
content, currently only prereg_listener and hostwin_list.

Also introduce two spapr specific callbacks add/del_window into
VFIOIOMMUOps. Instantiate a spapr ops with a helper setup_spapr_ops
and assign it to bcontainer->ops.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19 19:03:38 +01:00
Eric Auger
1eb31f13b2 vfio/container: Implement attach/detach_device
No functional change intended.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19 19:03:38 +01:00
Zhenzhong Duan
f79baf8c95 vfio/container: Move iova_ranges to base container
Meanwhile remove the helper function vfio_free_container as it
only calls g_free now.

No functional change intended.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19 19:03:38 +01:00
Eric Auger
4d6b95010c vfio/container: Move dirty_pgsizes and max_dirty_bitmap_size to base container
No functional change intended.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19 19:03:38 +01:00
Eric Auger
c7b313d300 vfio/container: Move listener to base container
Move listener to base container. Also error and initialized fields
are moved at the same time.

No functional change intended.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19 19:03:38 +01:00
Zhenzhong Duan
dc74a4b005 vfio/container: Move vrdl_list to base container
No functional change intended.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19 19:03:38 +01:00
Eric Auger
7ab1cb74ff vfio/container: Move pgsizes and dma_max_mappings to base container
No functional change intended.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19 19:03:38 +01:00
Eric Auger
e1cac6b203 vfio/container: Convert functions to base container
In the prospect to get rid of VFIOContainer refs
in common.c lets convert misc functions to use the base
container object instead:

vfio_devices_all_dirty_tracking
vfio_devices_all_device_dirty_tracking
vfio_devices_all_running_and_mig_active
vfio_devices_query_dirty_bitmap
vfio_get_dirty_bitmap

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19 19:03:38 +01:00
Zhenzhong Duan
3e6015d111 vfio/container: Move per container device list in base container
VFIO Device is also changed to point to base container instead of
legacy container.

No functional change intended.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19 19:03:38 +01:00
Eric Auger
bb424490ed vfio/container: Switch to IOMMU BE set_dirty_page_tracking/query_dirty_bitmap API
dirty_pages_supported field is also moved to the base container

No functional change intended.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19 19:03:38 +01:00
Eric Auger
e559706338 vfio/container: Move space field to base container
Move the space field to the base object. Also the VFIOAddressSpace
now contains a list of base containers.

No functional change intended.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19 19:03:38 +01:00