mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Cédric Le Goater	dd03167725	migration: Add Error** argument to add_bitmaps_to_list() This allows to report more precise errors in the migration handler dirty_bitmap_save_setup(). Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Signed-off-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Link: https://lore.kernel.org/r/20240329105627.311227-1-clg@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-04-23 18:36:01 -04:00
Cédric Le Goater	030b56b280	migration: Modify ram_init_bitmaps() to report dirty tracking errors The .save_setup() handler has now an Error** argument that we can use to propagate errors reported by the .log_global_start() handler. Do that for the RAM. The caller qemu_savevm_state_setup() will store the error under the migration stream for later detection in the migration sequence. Signed-off-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240320064911.545001-15-clg@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-04-23 18:36:01 -04:00
Cédric Le Goater	7bee8ba8bb	migration: Add Error** argument to xbzrle_init() Since the return value (-ENOMEM) is not exploited, follow the recommendations of qapi/error.h and change it to a bool Signed-off-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240320064911.545001-14-clg@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-04-23 18:36:01 -04:00
Cédric Le Goater	16ecd25a4f	migration: Add Error** argument to ram_state_init() Since the return value is not exploited, follow the recommendations of qapi/error.h and change it to a bool Signed-off-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240320064911.545001-13-clg@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-04-23 18:36:01 -04:00
Cédric Le Goater	639ec3fbf9	memory: Add Error** argument to the global_dirty_log routines Now that the log_global() handlers take an Error* parameter and return a bool, do the same for memory_global_dirty_log_start() and memory_global_dirty_log_stop(). The error is reported in the callers for now and it will be propagated in the call stack in the next changes. To be noted a functional change in ram_init_bitmaps(), if the dirty pages logger fails to start, there is no need to synchronize the dirty pages bitmaps. colo_incoming_start_dirty_log() could be modified in a similar way. Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Anthony Perard <anthony.perard@citrix.com> Cc: Paul Durrant <paul@xen.org> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: David Hildenbrand <david@redhat.com> Cc: Hyman Huang <yong.huang@smartx.com> Signed-off-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Acked-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/20240320064911.545001-12-clg@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-04-23 18:36:01 -04:00
Cédric Le Goater	92c20b2fc5	migration: Introduce ram_bitmaps_destroy() We will use it in ram_init_bitmaps() to clear the allocated bitmaps when support for error reporting is added to memory_global_dirty_log_start(). Signed-off-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240320064911.545001-11-clg@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-04-23 18:36:01 -04:00
Cédric Le Goater	3688fec892	memory: Add Error argument to .log_global_start() handler Modify all .log_global_start() handlers to take an Error parameter and return a bool. Adapt memory_global_dirty_log_start() to interrupt on the first error the loop on handlers. In such case, a rollback is performed to stop dirty logging on all listeners where it was previously enabled. Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Anthony Perard <anthony.perard@citrix.com> Cc: Paul Durrant <paul@xen.org> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: David Hildenbrand <david@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/20240320064911.545001-10-clg@redhat.com [peterx: modify & enrich the comment for listener_add_address_space() ] Signed-off-by: Peter Xu <peterx@redhat.com>	2024-04-23 18:36:01 -04:00
Cédric Le Goater	e4fa064d56	migration: Add Error** argument to .load_setup() handler This will be useful to report errors at a higher level, mostly in VFIO today. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/r/20240320064911.545001-9-clg@redhat.com [peterx: drop comment for ERRP_GUARD, per Markus] Signed-off-by: Peter Xu <peterx@redhat.com>	2024-04-23 18:36:01 -04:00
Cédric Le Goater	01c3ac681b	migration: Add Error** argument to .save_setup() handler The purpose is to record a potential error in the migration stream if qemu_savevm_state_setup() fails. Most of the current .save_setup() handlers can be modified to use the Error argument instead of managing their own and calling locally error_report(). Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Harsh Prateek Bora <harshpb@linux.ibm.com> Cc: Halil Pasic <pasic@linux.ibm.com> Cc: Thomas Huth <thuth@redhat.com> Cc: Eric Blake <eblake@redhat.com> Cc: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Cc: John Snow <jsnow@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Signed-off-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/r/20240320064911.545001-8-clg@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-04-23 18:36:01 -04:00
Cédric Le Goater	057a20099b	migration: Add Error argument to qemu_savevm_state_setup() This prepares ground for the changes coming next which add an Error argument to the .save_setup() handler. Callers of qemu_savevm_state_setup() now handle the error and fail earlier setting the migration state from MIGRATION_STATUS_SETUP to MIGRATION_STATUS_FAILED. In qemu_savevm_state(), move the cleanup to preserve the error reported by .save_setup() handlers. Since the previous behavior was to ignore errors at this step of migration, this change should be examined closely to check that cleanups are still correctly done. Signed-off-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/20240320064911.545001-7-clg@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-04-23 18:36:01 -04:00
Cédric Le Goater	6138d43ab2	migration: Add Error argument to vmstate_save() This will prepare ground for future changes adding an Error argument to qemu_savevm_state_setup(). Reviewed-by: Prasad Pandit <pjp@fedoraproject.org> Signed-off-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/r/20240320064911.545001-6-clg@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-04-23 18:36:01 -04:00
Cédric Le Goater	76936bbc31	migration: Always report an error in ram_save_setup() This will prepare ground for future changes adding an Error** argument to the save_setup() handler. We need to make sure that on failure, ram_save_setup() sets a new error. Reviewed-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/r/20240320064911.545001-5-clg@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-04-23 18:36:01 -04:00
Cédric Le Goater	150da48cb2	migration: Always report an error in block_save_setup() This will prepare ground for future changes adding an Error** argument to the save_setup() handler. We need to make sure that on failure, block_save_setup() always sets a new error. Cc: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/r/20240320064911.545001-4-clg@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-04-23 18:36:01 -04:00
Cédric Le Goater	31cf7c1413	vfio: Always report an error in vfio_save_setup() This will prepare ground for future changes adding an Error** argument to the save_setup() handler. We need to make sure that on failure, vfio_save_setup() always sets a new error. Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/r/20240320064911.545001-3-clg@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-04-23 18:36:01 -04:00
Cédric Le Goater	e86f243487	s390/stattrib: Add Error argument to set_migrationmode() handler This will prepare ground for future changes adding an Error argument to the save_setup() handler. We need to make sure that on failure, set_migrationmode() always sets a new error. See the Rules section in qapi/error.h. Cc: Halil Pasic <pasic@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Thomas Huth <thuth@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/r/20240320064911.545001-2-clg@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-04-23 18:36:01 -04:00
Het Gala	fe3ba17b33	tests/qtest/migration: Fix typo for vsock in SocketAddress_to_str Signed-off-by: Het Gala <het.gala@nutanix.com> Link: https://lore.kernel.org/r/20240319204840.211632-2-het.gala@nutanix.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-04-23 18:36:01 -04:00
Het Gala	bc6307a5ee	tests/qtest/migration: Add negative tests to validate migration QAPIs Migration QAPI arguments - uri and channels are mutually exclusive. Add negative validation tests, one with both arguments present and one with none present. Signed-off-by: Het Gala <het.gala@nutanix.com> Suggested-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240312202634.63349-9-het.gala@nutanix.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-04-23 18:36:01 -04:00
Het Gala	9d36d62c00	tests/qtest/migration: Add multifd_tcp_plain test using list of channels instead of uri Add a positive test to check multifd live migration but this time using list of channels (restricted to 1) as the starting point instead of simple uri string. Signed-off-by: Het Gala <het.gala@nutanix.com> Suggested-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240312202634.63349-8-het.gala@nutanix.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-04-23 18:36:01 -04:00
Het Gala	d5ee387de9	tests/qtest/migration: Add channels parameter in migrate_qmp Alter migrate_qmp() to allow use of channels parameter, but only fill the uri with correct port number if there are no channels. Here we don't want to allow the wrong cases of having both or none (ex: migrate_qmp_fail). Signed-off-by: Het Gala <het.gala@nutanix.com> Suggested-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240312202634.63349-7-het.gala@nutanix.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-04-23 18:36:01 -04:00
Het Gala	2a49e3c618	tests/qtest/migration: Add migrate_set_ports into migrate_qmp to update migration port value migrate_get_connect_qdict gets qdict with the dst QEMU parameters. migrate_set_ports() from list of channels reads each QDict for port, and fills the port with correct value in case it was 0 in the test. Signed-off-by: Het Gala <het.gala@nutanix.com> Suggested-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240312202634.63349-6-het.gala@nutanix.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-04-23 18:36:01 -04:00
Het Gala	387dc407db	tests/qtest/migration: Add channels parameter in migrate_qmp_fail Alter migrate_qmp_fail() to allow both uri and channels independently. For channels, convert string to a Dict. No dealing with migrate_get_socket_address() here because we will fail before starting the migration anyway. Signed-off-by: Het Gala <het.gala@nutanix.com> Suggested-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240312202634.63349-5-het.gala@nutanix.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-04-23 18:36:01 -04:00
Het Gala	4f2f5b694d	tests/qtest/migration: Replace migrate_get_connect_uri inplace of migrate_get_socket_address Refactor migrate_get_socket_address to internally utilize 'socket-address' parameter, reducing redundancy in the function definition. migrate_get_socket_address implicitly converts SocketAddress into str. Move migrate_get_socket_address inside migrate_get_connect_uri which should return the uri string instead. Signed-off-by: Het Gala <het.gala@nutanix.com> Suggested-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240312202634.63349-4-het.gala@nutanix.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-04-23 18:36:01 -04:00
Het Gala	d1155fd485	tests/qtest/migration: Replace connect_uri and move migrate_get_socket_address inside migrate_qmp Move the calls to migrate_get_socket_address() into migrate_qmp(). Get rid of connect_uri and replace it with args->connect_uri only because 'to' object will help to generate connect_uri with the correct port number. Signed-off-by: Het Gala <het.gala@nutanix.com> Suggested-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240312202634.63349-3-het.gala@nutanix.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-04-23 18:36:01 -04:00
Het Gala	8c47168cca	tests/qtest/migration: Add 'to' object into migrate_qmp() Add the 'to' object into migrate_qmp(), so we can use migrate_get_socket_address() inside migrate_qmp() to get the port value. This is not applied to other migrate_qmp* because they don't need the port. Signed-off-by: Het Gala <het.gala@nutanix.com> Suggested-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240312202634.63349-2-het.gala@nutanix.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-04-23 18:36:01 -04:00
Mark Cave-Ayland	7653b44534	target/i386/translate.c: always write 32-bits for SGDT and SIDT The various Intel CPU manuals claim that SGDT and SIDT can write either 24-bits or 32-bits depending upon the operand size, but this is incorrect. Not only do the Intel CPU manuals give contradictory information between processor revisions, but this information doesn't even match real-life behaviour. In fact, tests on real hardware show that the CPU always writes 32-bits for SGDT and SIDT, and this behaviour is required for at least OS/2 Warp and WFW 3.11 with Win32s to function correctly. Remove the masking applied due to the operand size for SGDT and SIDT so that the TCG behaviour matches the behaviour on real hardware. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2198 -- MCA: Whilst I don't have a copy of OS/2 Warp handy, I've confirmed that this patch fixes the issue in WFW 3.11 with Win32s. For more technical information I highly recommend the excellent write-up at https://www.os2museum.com/wp/sgdtsidt-fiction-and-reality/. Message-ID: <20240419195147.434894-1-mark.cave-ayland@ilande.co.uk> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-04-23 17:35:26 +02:00
Paolo Bonzini	9c05071719	pythondeps.toml: warn about updates needed to docs/requirements.txt docs/requirements.txt is expected by readthedocs and should be in sync with pythondeps.toml. Add a comment to both. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-04-23 17:35:26 +02:00
Zhao Liu	94da7b6e9a	accel/tcg/icount-common: Consolidate the use of warn_report_once() Use warn_report_once() to get rid of the static local variable "notified". Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Message-ID: <20240418100716.1085491-1-zhao1.liu@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-04-23 17:35:26 +02:00
Zhao Liu	aec202cb0e	target/i386/cpu: Merge the warning and error messages for AMD HT check Currently, the difference between warn_report_once() and error_report_once() is the former has the "warning:" prefix, while the latter does not have a similar level prefix. In the meantime, considering that there is no error handling logic here, and the purpose of error_report_once() is only to prompt the user with an abnormal message, there is no need to use an error-level message here, and instead we can just use a warning. Therefore, downgrade the message in error_report_once() to warning, and merge it into the previous warn_report_once(). Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Message-ID: <20240327103951.3853425-4-zhao1.liu@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-04-23 17:35:26 +02:00
Zhao Liu	8e3991ebc8	target/i386/cpu: Consolidate the use of warn_report_once() The difference between error_printf() and error_report() is the latter may contain more information, such as the name of the program ("qemu-system-x86_64"). Thus its variant error_report_once() and warn_report()'s variant warn_report_once() can be used here to print the information only once without a static local variable "ht_warned". Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Message-ID: <20240327103951.3853425-3-zhao1.liu@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-04-23 17:35:26 +02:00
Zhao Liu	7502ffb2f3	target/i386/host-cpu: Consolidate the use of warn_report_once() Use warn_report_once() to get rid of the static local variable "warned". Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Message-ID: <20240327103951.3853425-2-zhao1.liu@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-04-23 17:35:26 +02:00
Isaku Yamahata	565f4768bb	kvm/tdx: Ignore memory conversion to shared of unassigned region TDX requires vMMIO region to be shared. For KVM, MMIO region is the region which kvm memslot isn't assigned to (except in-kernel emulation). qemu has the memory region for vMMIO at each device level. While OVMF issues MapGPA(to-shared) conservatively on 32bit PCI MMIO region, qemu doesn't find corresponding vMMIO region because it's before PCI device allocation and memory_region_find() finds the device region, not PCI bus region. It's safe to ignore MapGPA(to-shared) because when guest accesses those region they use GPA with shared bit set for vMMIO. Ignore memory conversion request of non-assigned region to shared and return success. Otherwise OVMF is confused and panics there. Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Message-ID: <20240229063726.610065-35-xiaoyao.li@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-04-23 17:35:26 +02:00
Isaku Yamahata	c5d9425ef4	kvm/tdx: Don't complain when converting vMMIO region to shared Because a vMMIO region needs to be a shared region, the guest TD may explicitly convert such a region from private to shared. Don't complain about such a conversion. Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Message-ID: <20240229063726.610065-34-xiaoyao.li@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-04-23 17:35:26 +02:00
Chao Peng	c15e568407	kvm: handle KVM_EXIT_MEMORY_FAULT Upon a KVM_EXIT_MEMORY_FAULT exit, userspace needs to do the memory conversion on the RAMBlock to turn the memory into the desired attribute, switching between private and shared. Currently only KVM_MEMORY_EXIT_FLAG_PRIVATE in flags is valid when KVM_EXIT_MEMORY_FAULT happens. Note, KVM_EXIT_MEMORY_FAULT makes sense only when the RAMBlock has guest_memfd memory backend. Note, KVM_EXIT_MEMORY_FAULT returns with -EFAULT, so special handling is added. When a page is converted from shared to private, the original shared memory can be discarded via ram_block_discard_range(). Note, shared memory can be discarded only when it's not backed by hugetlb because hugetlb is supposed to be pre-allocated and there is no need to discard it. Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com> Co-developed-by: Xiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Message-ID: <20240320083945.991426-13-michael.roth@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-04-23 17:35:26 +02:00
Xiaoyao Li	b2e9426c04	physmem: Introduce ram_block_discard_guest_memfd_range() When a memory page is converted from private to shared, the original private memory is backed by guest_memfd. Introduce ram_block_discard_guest_memfd_range() for discarding memory in guest_memfd. Based on a patch by Isaku Yamahata <isaku.yamahata@intel.com>. Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Michael Roth <michael.roth@amd.com> Message-ID: <20240320083945.991426-12-michael.roth@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-04-23 17:35:26 +02:00
Paolo Bonzini	852f0048f3	RAMBlock: make guest_memfd require uncoordinated discard Some subsystems like VFIO might disable ram block discard, but guest_memfd uses discard operations to implement conversions between private and shared memory. Because of this, sequences like the following can result in stale IOMMU mappings: 1. allocate shared page 2. convert page shared->private 3. discard shared page 4. convert page private->shared 5. allocate shared page 6. issue DMA operations against that shared page This is not a use-after-free, because after step 3 VFIO is still pinning the page. However, DMA operations in step 6 will hit the old mapping that was allocated in step 1. Address this by taking ram_block_discard_is_enabled() into account when deciding whether or not to discard pages. Since kvm_convert_memory()/guest_memfd doesn't implement a RamDiscardManager handler to convey and replay discard operations, this is a case of uncoordinated discard, which is blocked/released by ram_block_discard_require(). Interestingly, this function had no use so far. Alternative approaches would be to block discard of shared pages, but this would cause guests to consume twice the memory if they use VFIO; or to implement a RamDiscardManager and only block uncoordinated discard, i.e. use ram_block_coordinated_discard_require(). [Commit message mostly by Michael Roth <michael.roth@amd.com>] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-04-23 17:35:26 +02:00
Xiaoyao Li	37662d85b0	HostMem: Add mechanism to opt in kvm guest memfd via MachineState Add a new member "guest_memfd" to memory backends. When it's set to true, it enables RAM_GUEST_MEMFD in ram_flags, thus private kvm guest_memfd will be allocated during RAMBlock allocation. Memory backend's @guest_memfd is wired with @require_guest_memfd field of MachineState. It avoids looking up the machine in physmem.c. MachineState::require_guest_memfd is supposed to be set by any VMs that require KVM guest memfd as private memory, e.g., TDX VM. Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Reviewed-by: David Hildenbrand <david@redhat.com> Message-ID: <20240320083945.991426-8-michael.roth@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-04-23 17:35:25 +02:00
Xiaoyao Li	bd3bcf6962	kvm/memory: Make memory type private by default if it has guest memfd backend KVM side leaves the memory as shared by default, which may incur the overhead of page conversion on the first visit of each page. Because the expectation is that a page is likely to be private for the VMs that require private memory (i.e. that have guest memfd), explicitly set the memory to private when the memory region has a valid guest memfd backend. Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: Michael Roth <michael.roth@amd.com> Message-ID: <20240320083945.991426-16-michael.roth@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-04-23 17:35:25 +02:00
Chao Peng	ce5a983233	kvm: Enable KVM_SET_USER_MEMORY_REGION2 for memslot Switch to KVM_SET_USER_MEMORY_REGION2 when supported by KVM. With KVM_SET_USER_MEMORY_REGION2, QEMU can set up a memory region that is backed by both hva-based shared memory and guest memfd based private memory. Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com> Co-developed-by: Xiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Message-ID: <20240320083945.991426-10-michael.roth@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-04-23 17:35:25 +02:00
Xiaoyao Li	15f7a80c49	RAMBlock: Add support of KVM private guest memfd Add KVM guest_memfd support to RAMBlock so both normal hva based memory and kvm guest memfd based private memory can be associated in one RAMBlock. Introduce new flag RAM_GUEST_MEMFD. When it's set, it calls KVM ioctl to create private guest_memfd during RAMBlock setup. Allocating a new RAM_GUEST_MEMFD flag to instruct the setup of guest memfd is more flexible and extensible than simply relying on the VM type because in the future we may have the case that not all the memory of a VM needs guest memfd. As a benefit, it also avoids getting MachineState in memory subsystem. Note, RAM_GUEST_MEMFD is supposed to be set for memory backends of confidential guests, such as TDX VM. How and when to set it for memory backends will be implemented in the following patches. Introduce memory_region_has_guest_memfd() to query if the MemoryRegion has KVM guest_memfd allocated. Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Reviewed-by: David Hildenbrand <david@redhat.com> Message-ID: <20240320083945.991426-7-michael.roth@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-04-23 17:35:25 +02:00
Xiaoyao Li	0811baed49	kvm: Introduce support for memory_attributes Introduce the helper functions to set the attributes of a range of memory to private or shared. This is necessary to notify KVM the private/shared attribute of each gpa range. KVM needs the information to decide the GPA needs to be mapped at hva-based shared memory or guest_memfd based private memory. Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Message-ID: <20240320083945.991426-11-michael.roth@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-04-23 17:35:25 +02:00
Xiaoyao Li	72853afc63	trace/kvm: Split address space and slot id in trace_kvm_set_user_memory() The upper 16 bits of kvm_userspace_memory_region::slot are address space id. Parse it separately in trace_kvm_set_user_memory(). Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Message-ID: <20240229063726.610065-5-xiaoyao.li@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-04-23 17:35:25 +02:00
Michael Roth	ea7fbd3753	hw/i386/sev: Use legacy SEV VM types for older machine types Newer 9.1 machine types will default to using the KVM_SEV_INIT2 API for creating SEV/SEV-ES going forward. However, this API results in guest measurement changes which are generally not expected for users of these older guest types and can cause disruption if they switch to a newer QEMU/kernel version. Avoid this by continuing to use the older KVM_SEV_INIT/KVM_SEV_ES_INIT APIs for older machine types. Signed-off-by: Michael Roth <michael.roth@amd.com> Message-ID: <20240409230743.962513-4-michael.roth@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-04-23 17:35:25 +02:00
Michael Roth	023267334d	i386/sev: Add 'legacy-vm-type' parameter for SEV guest objects QEMU will currently automatically make use of the KVM_SEV_INIT2 API for initializing SEV and SEV-ES guests versus the older KVM_SEV_INIT/KVM_SEV_ES_INIT interfaces. However, the older interfaces will silently avoid sync'ing FPU/XSAVE state to the VMSA prior to encryption, thus relying on behavior and measurements that assume the related fields to be all zero. With KVM_SEV_INIT2, this state is now synced into the VMSA, resulting in measurement changes and, theoretically, behavioral changes, though the latter are unlikely to be seen in practice. To allow a smooth transition to the newer interface, while still providing a mechanism to maintain backward compatibility with VMs created using the older interfaces, provide a new command-line parameter: -object sev-guest,legacy-vm-type=true,... and have it default to false. Signed-off-by: Michael Roth <michael.roth@amd.com> Message-ID: <20240409230743.962513-2-michael.roth@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-04-23 17:35:25 +02:00
Paolo Bonzini	663e2f443e	target/i386: SEV: use KVM_SEV_INIT2 if possible Implement support for the KVM_X86_SEV_VM and KVM_X86_SEV_ES_VM virtual machine types, and the KVM_SEV_INIT2 function of KVM_MEMORY_ENCRYPT_OP. These replace the KVM_SEV_INIT and KVM_SEV_ES_INIT functions, and have several advantages: - sharing the initialization sequence with SEV-SNP and TDX - allowing arguments including the set of desired VMSA features - protection against invalid use of KVM_GET/SET_* ioctls for guests with encrypted state If the KVM_X86_SEV_VM and KVM_X86_SEV_ES_VM types are not supported, fall back to KVM_SEV_INIT and KVM_SEV_ES_INIT (which use the default x86 VM type). Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-04-23 17:35:25 +02:00
Paolo Bonzini	ee88612df1	target/i386: Implement mc->kvm_type() to get VM type KVM is introducing a new API to create confidential guests, which will be used by TDX and SEV-SNP but is also available for SEV and SEV-ES. The API uses the VM type argument to KVM_CREATE_VM to identify which confidential computing technology to use. Since there are no other expected uses of VM types, delegate mc->kvm_type() for x86 boards to the confidential-guest-support object pointed to by ms->cgs. For example, if a sev-guest object is specified to confidential-guest-support, like, qemu -machine ...,confidential-guest-support=sev0 \ -object sev-guest,id=sev0,... it will check if a VM type KVM_X86_SEV_VM or KVM_X86_SEV_ES_VM is supported, and if so use them together with the KVM_SEV_INIT2 function of the KVM_MEMORY_ENCRYPT_OP ioctl. If not, it will fall back to KVM_SEV_INIT and KVM_SEV_ES_INIT. This is a preparatory work towards TDX and SEV-SNP support, but it will also enable support for VMSA features such as DebugSwap, which are only available via KVM_SEV_INIT2. Co-developed-by: Xiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-04-23 17:35:25 +02:00
Paolo Bonzini	d82e9c843d	target/i386: introduce x86-confidential-guest Introduce a common superclass for x86 confidential guest implementations. It will extend ConfidentialGuestSupportClass with a method that provides the VM type to be passed to KVM_CREATE_VM. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-04-23 17:35:25 +02:00
Paolo Bonzini	a99c0c66eb	KVM: remove kvm_arch_cpu_check_are_resettable Board reset requires writing a fresh CPU state. As far as KVM is concerned, the only thing that blocks reset is that CPU state is encrypted; therefore, kvm_cpus_are_resettable() can simply check if that is the case. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-04-23 17:35:25 +02:00
Paolo Bonzini	5c3131c392	KVM: track whether guest state is encrypted So far, KVM has allowed KVM_GET/SET_* ioctls to execute even if the guest state is encrypted, in which case they do nothing. For the new API using VM types, instead, the ioctls will fail which is a safer and more robust approach. The new API will be the only one available for SEV-SNP and TDX, but it is also usable for SEV and SEV-ES. In preparation for that, require architecture-specific KVM code to communicate the point at which guest state is protected (which must be after kvm_cpu_synchronize_post_init(), though that might change in the future in order to support migration). From that point, skip reading registers so that cpu->vcpu_dirty is never true: if it ever becomes true, kvm_arch_put_registers() will fail miserably. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-04-23 17:35:25 +02:00
Paolo Bonzini	08b2d15cdd	runstate: skip initial CPU reset if reset is not actually possible Right now, the system reset is concluded by a call to cpu_synchronize_all_post_reset() in order to sync any changes that the machine reset callback applied to the CPU state. However, for VMs with encrypted state such as SEV-ES guests (currently the only case of guests with non-resettable CPUs) this cannot be done, because guest state has already been finalized by machine-init-done notifiers. cpu_synchronize_all_post_reset() does nothing on these guests, and actually we would like to make it fail if called once guest has been encrypted. So, assume that boards that support non-resettable CPUs do not touch CPU state and that all such setup is done before, at the time of cpu_synchronize_all_post_init(). Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-04-23 17:35:25 +02:00
Paolo Bonzini	ab0c7fb22b	linux-headers: update to current kvm/next Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-04-23 17:35:25 +02:00
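
Note on commit 3688fec892: the message describes stopping the loop over .log_global_start() handlers on the first error and rolling back dirty logging on every listener where it had already been enabled. A minimal standalone sketch of that pattern follows. It is not QEMU's code: Error, Listener, listener_log_start(), listener_log_stop() and global_dirty_log_start() are hypothetical stand-ins, and only the "return bool, fill Error** on failure" convention is taken from the commit messages above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for QEMU's Error object; not the real type. */
typedef struct Error {
    char msg[128];
} Error;

/* Hypothetical stand-in for a listener that has dirty-logging hooks. */
typedef struct Listener {
    const char *name;
    bool fail_on_start;   /* illustration knob: force a start failure */
    bool logging;         /* true while dirty logging is enabled */
} Listener;

/* Returns true on success; on failure returns false and sets *errp. */
static bool listener_log_start(Listener *l, Error **errp)
{
    static Error err;     /* static storage keeps the sketch allocation-free */

    /* Callers must not pass in an errp that already holds an error. */
    assert(errp == NULL || *errp == NULL);

    if (l->fail_on_start) {
        if (errp) {
            snprintf(err.msg, sizeof(err.msg),
                     "%s: cannot start dirty logging", l->name);
            *errp = &err;
        }
        return false;
    }
    l->logging = true;
    return true;
}

static void listener_log_stop(Listener *l)
{
    l->logging = false;
}

/*
 * Start logging on all listeners. On the first failure, stop logging on
 * the listeners that were already started, then report the failure.
 */
static bool global_dirty_log_start(Listener *ls, size_t n, Error **errp)
{
    for (size_t i = 0; i < n; i++) {
        if (!listener_log_start(&ls[i], errp)) {
            while (i-- > 0) {
                listener_log_stop(&ls[i]);   /* roll back in reverse order */
            }
            return false;
        }
    }
    return true;
}
```

A caller would pass the address of a NULL-initialised Error pointer and check the bool return value rather than inspecting the pointer, which is the style the series attributes to qapi/error.h.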

112566 Commits