mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Alexey Kardashevskiy	5bb55f3e3b	spapr: Use address from elf parser for kernel address tl;dr: This allows Big Endian zImage booting via -kernel + x-vof=on. QEMU loads the kernel at 0x400000 by default which works most of the time as Linux kernels are relocatable, 64bit and compiled with "-pie" (position independent code). This works for a little endian zImage too. However a big endian zImage is compiled without -pie, is 32bit, linked to 0x4000000 so current QEMU ends up loading it at 0x4400000 but keeps spapr->kernel_addr unchanged so booting fails. This uses the kernel address returned from load_elf(). If the default kernel_addr is used, there is no change in behavior (as translate_kernel_address() takes care of this), which is: LE/BE vmlinux and LE zImage boot, BE zImage does not. If the VM created with "-machine kernel-addr=0,x-vof=on", then QEMU prints a warning and BE zImage boots. Note #1: SLOF (x-vof=off) still cannot boot a big endian zImage as SLOF enables MSR_SF for everything loaded by QEMU and this leads to early crash of 32bit zImage. Note #2: BE/LE vmlinux images set MSR_SF in early boot so these just work; a LE zImage restores MSR_SF after every CI call and we are lucky enough not to crash before the first CI call. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Tested-by: Joel Stanley <joel@jms.id.au> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Message-Id: <20220504065536.3534488-1-aik@ozlabs.ru> [danielhb: use PRIx64 instead of lx in warn_report] Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>	2022-05-26 17:11:32 -03:00
Paolo Bonzini	f73eb9484b	pseries: allow setting stdout-path even on machines with a VGA -machine graphics=off is the usual way to tell the firmware or the OS that the user wants a serial console. The pseries machine however does not support this, and never adds the stdout-path node to the device tree if a VGA device is provided. This is in addition to the other magic behavior of VGA devices, which is to add a keyboard and mouse to the default USB bus. Split spapr->has_graphics in two variables so that the two behaviors can be separated: the USB devices remains the same, but the stdout-path is added even with "-device VGA -machine graphics=off". Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20220507054826.124936-1-pbonzini@redhat.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>	2022-05-26 17:11:32 -03:00
Paolo Bonzini	97ec4d21e0	machine: use QAPI struct for boot configuration As part of converting -boot to a property with a QAPI type, define the struct and use it throughout QEMU to access boot configuration. machine_boot_parse takes care of doing the QemuOpts->QAPI conversion by hand, for now. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20220414165300.555321-2-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-12 12:29:43 +02:00
Gautam Agrawal	f9bcb2d684	Warn user if the vga flag is passed but no vga device is created A global boolean variable "vga_interface_created"(declared in softmmu/globals.c) has been used to track the creation of vga interface. If the vga flag is passed in the command line "default_vga"(declared in softmmu/vl.c) variable is set to 0. To warn user, the condition checks if vga_interface_created is false and default_vga is equal to 0. If "-vga none" is passed, this patch will not warn the user regarding the creation of VGA device. The warning "A -vga option was passed but this machine type does not use that option; no VGA device has been created" is logged if vga flag is passed but no vga device is created. This patch has been tested for x86_64, i386, sparc, sparc64 and arm boards. Signed-off-by: Gautam Agrawal <gautamnagrawal@gmail.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/581 Message-Id: <20220501122505.29202-1-gautamnagrawal@gmail.com> [thuth: Fix wrong warning with "-device" in some cases as reported by Paolo] Signed-off-by: Thomas Huth <thuth@redhat.com>	2022-05-09 08:21:14 +02:00
Víctor Colombo	d41ccf6eea	target/ppc: Remove msr_pr macro msr_pr macro hides the usage of env->msr, which is a bad behavior Substitute it with FIELD_EX64 calls that explicitly use env->msr as a parameter. Suggested-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220504210541.115256-4-victor.colombo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>	2022-05-05 15:36:17 -03:00
Cornelia Huck	0ca703662e	hw: Add compat machines for 7.1 Add 7.1 machine types for arm/i440fx/m68k/q35/s390x/spapr. Signed-off-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <20220316145521.1224083-1-cohuck@redhat.com> Reviewed-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Acked-by: Laurent Vivier <laurent@vivier.eu> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Yanan Wang <wangyanan55@huawei.com> Signed-off-by: Thomas Huth <thuth@redhat.com>	2022-04-20 09:36:24 +02:00
Marc-André Lureau	0f9668e0c1	Remove qemu-common.h include from most units Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20220323155743.1585078-33-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-04-06 14:31:55 +02:00
Markus Armbruster	b21e238037	Use g_new() & friends where that makes obvious sense g_new(T, n) is neater than g_malloc(sizeof(T) * n). It's also safer, for two reasons. One, it catches multiplication overflowing size_t. Two, it returns T * rather than void *, which lets the compiler catch more type errors. This commit only touches allocations with size arguments of the form sizeof(T). Patch created mechanically with: $ spatch --in-place --sp-file scripts/coccinelle/use-g_new-etc.cocci \ --macro-file scripts/cocci-macro-file.h FILES... Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20220315144156.1595462-4-armbru@redhat.com> Reviewed-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>	2022-03-21 15:44:44 +01:00
Peter Maydell	5df022cf2e	osdep: Move memalign-related functions to their own header Move the various memalign-related functions out of osdep.h and into their own header, which we include only where they are used. While we're doing this, add some brief documentation comments. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 20220226180723.1706285-10-peter.maydell@linaro.org	2022-03-07 13:16:49 +00:00
Daniel Henrique Barboza	5f2b96b38e	hw/ppc/spapr.c: fail early if no firmware found in machine_init() The firmware check consists on a file search (qemu_find_file) and load it via load_imag_targphys(). This validation is not dependent on any other machine state but it currently being done at the end of spapr_machine_init(). This means that we can do a lot of stuff and end up failing at the end for something that we can verify right out of the gate. Move this validation to the start of spapr_machine_init() to fail earlier. While we're at it, use g_autofree in the 'filename' pointer. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20220228175004.8862-3-danielhb413@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>	2022-03-02 06:51:39 +01:00
Daniel Henrique Barboza	aebb9b9cb2	hw/ppc/spapr.c: use g_autofree in spapr_dt_chosen() Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20220228175004.8862-2-danielhb413@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>	2022-03-02 06:51:39 +01:00
Nicholas Piggin	120f738a46	spapr: implement nested-hv capability for the virtual hypervisor This implements the Nested KVM HV hcall API for spapr under TCG. The L2 is switched in when the H_ENTER_NESTED hcall is made, and the L1 is switched back in returned from the hcall when a HV exception is sent to the vhyp. Register state is copied in and out according to the nested KVM HV hcall API specification. The hdecr timer is started when the L2 is switched in, and it provides the HDEC / 0x980 return to L1. The MMU re-uses the bare metal radix 2-level page table walker by using the get_pate method to point the MMU to the nested partition table entry. MMU faults due to partition scope errors raise HV exceptions and accordingly are routed back to the L1. The MMU does not tag translations for the L1 (direct) vs L2 (nested) guests, so the TLB is flushed on any L1<->L2 transition (hcall entry and exit). Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> [ clg: checkpatch fixes ] Message-Id: <20220216102545.1808018-10-npiggin@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>	2022-02-18 08:34:14 +01:00
Nicholas Piggin	7cebc5db2e	target/ppc: Introduce a vhyp framework for nested HV support Introduce virtual hypervisor methods that can support a "Nested KVM HV" implementation using the bare metal 2-level radix MMU, and using HV exceptions to return from H_ENTER_NESTED (rather than cause interrupts). HV exceptions can now be raised in the TCG spapr machine when running a nested KVM HV guest. The main ones are the lev==1 syscall, the hdecr, hdsi and hisi, hv fu, and hv emu, and h_virt external interrupts. HV exceptions are intercepted in the exception handler code and instead of causing interrupts in the guest and switching the machine to HV mode, they go to the vhyp where it may exit the H_ENTER_NESTED hcall with the interrupt vector numer as return value as required by the hcall API. Address translation is provided by the 2-level page table walker that is implemented for the bare metal radix MMU. The partition scope page table is pointed to the L1's partition scope by the get_pate vhc method. Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20220216102545.1808018-9-npiggin@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>	2022-02-18 08:34:14 +01:00
Nicholas Piggin	f32d4ab41c	target/ppc: make vhyp get_pate method take lpid and return success In prepartion for implementing a full partition table option for vhyp, update the get_pate method to take an lpid and return a success/fail indicator. The spapr implementation currently just asserts lpid is always 0 and always return success. Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [ clg: checkpatch fixes ] Message-Id: <20220216102545.1808018-6-npiggin@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>	2022-02-18 08:34:14 +01:00
Shivaprasad G Bhat	b5513584a0	spapr: nvdimm: Implement H_SCM_FLUSH hcall The patch adds support for the SCM flush hcall for the nvdimm devices. To be available for exploitation by guest through the next patch. The hcall is applicable only for new SPAPR specific device class which is also introduced in this patch. The hcall expects the semantics such that the flush to return with H_LONG_BUSY_ORDER_10_MSEC when the operation is expected to take longer time along with a continue_token. The hcall to be called again by providing the continue_token to get the status. So, all fresh requests are put into a 'pending' list and flush worker is submitted to the thread pool. The thread pool completion callbacks move the requests to 'completed' list, which are cleaned up after collecting the return status for the guest in subsequent hcall from the guest. The semantics makes it necessary to preserve the continue_tokens and their return status across migrations. So, the completed flush states are forwarded to the destination and the pending ones are restarted at the destination in post_load. The necessary nvdimm flush specific vmstate structures are also introduced in this patch which are to be saved in the new SPAPR specific nvdimm device to be introduced in the following patch. Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <164396254862.109112.16675611182159105748.stgit@ltczzess4.aus.stglabs.ibm.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>	2022-02-18 08:34:14 +01:00
Daniel Henrique Barboza	1977434bbf	spapr.c: check bus != NULL in spapr_get_fw_dev_path() spapr_get_fw_dev_path() is an impl of FWPathProviderClass::get_dev_path(). This interface is used by hw/core/qdev-fw.c via fw_path_provider_try_get_dev_path() in two functions: - static char qdev_get_fw_dev_path_from_handler(), which is used only in qdev_get_fw_dev_path_helper() and it's guarded by "if (dev && dev->parent_bus)"; - char qdev_get_own_fw_dev_path_from_handler(), which is used in softmmu/bootdevice.c in get_boot_device_path() like this: if (dev) { d = qdev_get_own_fw_dev_path_from_handler(dev->parent_bus, dev); This means that, when called via softmmu/bootdevice.c, there's no check of 'dev->parent_bus' being not NULL. The result is that the "BusState bus" arg of spapr_get_fw_dev_path() can potentially be NULL and if, at the same time, "SCSIDevice d" is not NULL, we'll hit this line: void *spapr = CAST(void, bus->parent, "spapr-vscsi"); And we'll SIGINT because 'bus' is NULL and we're accessing bus->parent. Adding a simple 'bus != NULL' check to guard the instances where we access 'bus->parent' can avoid this altogether. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20220121213852.30243-1-danielhb413@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>	2022-01-28 13:15:02 +01:00
Cédric Le Goater	2460e1d75b	spapr: Fix support of POWER5+ processors POWER5+ (ISA v2.03) processors are supported by the pseries machine but they do not have Altivec instructions. Do not advertise support for it in the DT. To be noted that this test is in contradiction with the assert in cap_vsx_apply(). Signed-off-by: Cédric Le Goater <clg@kaod.org> Tested-by: Fabiano Rosas <farosas@linux.ibm.com> Message-Id: <20220105095142.3990430-3-clg@kaod.org> Signed-off-by: Cédric Le Goater <clg@kaod.org>	2022-01-12 11:28:26 +01:00
Cornelia Huck	01854af2cf	hw: Add compat machines for 7.0 Add 7.0 machine types for arm/i440fx/q35/s390x/spapr. Signed-off-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20211217143948.289995-1-cohuck@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>	2022-01-05 09:06:36 +01:00
Leandro Lupori	7bf00dfb51	target/ppc: fix Hash64 MMU update of PTE bit R When updating the R bit of a PTE, the Hash64 MMU was using a wrong byte offset, causing the first byte of the adjacent PTE to be corrupted. This caused a panic when booting FreeBSD, using the Hash MMU. Fixes: `a2dd4e83e7` ("ppc/hash64: Rework R and C bit updates") Signed-off-by: Leandro Lupori <leandro.lupori@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>	2021-11-29 21:00:08 +01:00
Yanan Wang	2b52619994	machine: Move smp_prefer_sockets to struct SMPCompatProps Now we have a common structure SMPCompatProps used to store information about SMP compatibility stuff, so we can also move smp_prefer_sockets there for cleaner code. No functional change intended. Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20210929025816.21076-15-wangyanan55@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-10-01 15:29:15 +02:00
Yanan Wang	4a0af2930a	machine: Prefer cores over sockets in smp parsing since 6.2 In the real SMP hardware topology world, it's much more likely that we have high cores-per-socket counts and few sockets totally. While the current preference of sockets over cores in smp parsing results in a virtual cpu topology with low cores-per-sockets counts and a large number of sockets, which is just contrary to the real world. Given that it is better to make the virtual cpu topology be more reflective of the real world and also for the sake of compatibility, we start to prefer cores over sockets over threads in smp parsing since machine type 6.2 for different arches. In this patch, a boolean "smp_prefer_sockets" is added, and we only enable the old preference on older machines and enable the new one since type 6.2 for all arches by using the machine compat mechanism. Suggested-by: Daniel P. Berrange <berrange@redhat.com> Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Acked-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20210929025816.21076-10-wangyanan55@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-10-01 15:28:16 +02:00
Daniel Henrique Barboza	e0eb84d4f5	spapr_numa.c: FORM2 NUMA affinity support The main feature of FORM2 affinity support is the separation of NUMA distances from ibm,associativity information. This allows for a more flexible and straightforward NUMA distance assignment without relying on complex associations between several levels of NUMA via ibm,associativity matches. Another feature is its extensibility. This base support contains the facilities for NUMA distance assignment, but in the future more facilities will be added for latency, performance, bandwidth and so on. This patch implements the base FORM2 affinity support as follows: - the use of FORM2 associativity is indicated by using bit 2 of byte 5 of ibm,architecture-vec-5. A FORM2 aware guest can choose to use FORM1 or FORM2 affinity. Setting both forms will default to FORM2. We're not advertising FORM2 for pseries-6.1 and older machine versions to prevent guest visible changes in those; - ibm,associativity-reference-points has a new semantic. Instead of being used to calculate distances via NUMA levels, it's now used to indicate the primary domain index in the ibm,associativity domain of each resource. In our case it's set to {0x4}, matching the position where we already place logical_domain_id; - two new RTAS DT artifacts are introduced: ibm,numa-lookup-index-table and ibm,numa-distance-table. The index table is used to list all the NUMA logical domains of the platform, in ascending order, and allows for spartial NUMA configurations (although QEMU ATM doesn't support that). ibm,numa-distance-table is an array that contains all the distances from the first NUMA node to all other nodes, then the second NUMA node distances to all other nodes and so on; - get_max_dist_ref_points(), get_numa_assoc_size() and get_associativity() now checks for OV5_FORM2_AFFINITY and returns FORM2 values if the guest selected FORM2 affinity during CAS. Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20210920174947.556324-7-danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-09-30 12:26:06 +10:00
Daniel Henrique Barboza	5dab5abe62	spapr: move FORM1 verifications to post CAS FORM2 NUMA affinity is prepared to deal with empty (memory/cpu less) NUMA nodes. This is used by the DAX KMEM driver to locate a PAPR SCM device that has a different latency than the original NUMA node from the regular memory. FORM2 is also able to deal with asymmetric NUMA distances gracefully, something that our FORM1 implementation doesn't do. Move these FORM1 verifications to a new function and wait until after CAS, when we're sure that we're sticking with FORM1, to enforce them. Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20210920174947.556324-6-danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-09-30 12:26:06 +10:00
Daniel Henrique Barboza	4b08cd567b	spapr: use DEVICE_UNPLUG_GUEST_ERROR to report unplug errors Linux Kernel 5.12 is now unisolating CPU DRCs in the device_removal error path, signalling that the hotunplug process wasn't successful. This allow us to send a DEVICE_UNPLUG_GUEST_ERROR in drc_unisolate_logical() to signal this error to the management layer. We also have another error path in spapr_memory_unplug_rollback() for configured LMB DRCs. Kernels older than 5.13 will not unisolate the LMBs in the hotunplug error path, but it will reconfigure them. Let's send the DEVICE_UNPLUG_GUEST_ERROR event in that code path as well to cover the case of older kernels. Acked-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20210907004755.424931-7-danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-09-30 12:26:06 +10:00
Daniel Henrique Barboza	44d886abab	spapr.c: handle dev->id in spapr_memory_unplug_rollback() As done in hw/acpi/memory_hotplug.c, pass an empty string if dev->id is NULL to qapi_event_send_mem_unplug_error() to avoid relying on a behavior that can be changed in the future. Suggested-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20210907004755.424931-3-danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-09-29 19:37:39 +10:00
Yanan Wang	52e64f5b1f	hw: Add compat machines for 6.2 Add 6.2 machine types for arm/i440fx/q35/s390x/spapr. Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2021-09-01 11:08:16 +01:00
Peter Maydell	d1987c8114	* More SVM fixes (Lara) * Module annotation database (Gerd) * Memory leak fixes (myself) * Build fixes (myself) * --with-devices-* support (Alex) -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmDoeBgUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroMtFAgAippmxRt3lt+tcdSrCOZlKmxW6veK nUidtzfH5uE8vQsh5Q98WCEq871C/C+St1gK+q2H/MLrJeAqZD39DV+SKTuZ6Tcp 3jL0iYC+oO0OjkHppDQTUDweF9KrsAW1WEeNz2th1OUDSjBXuXbZ+N497taouX18 p2UN0gKNsOO2/QFrKL5KO7vSC56eBGoZz6gKtw/7dDtJBtizf1xKBRHW43b+CnQJ mHLs7Tj6oMC+vnMHkUKLH/6za3WJF1XHs5fp2isRgqoOSP8m0r6CMg8JnFIvmQf/ tbLospKSWqcgD5C5PlFm2wSOjdU7zuPKM7wchhKrrEIvdDPhXaKrlpwi5Q== =GFX1 -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/bonzini-gitlab/tags/for-upstream' into staging * More SVM fixes (Lara) * Module annotation database (Gerd) * Memory leak fixes (myself) * Build fixes (myself) * --with-devices-* support (Alex) # gpg: Signature made Fri 09 Jul 2021 17:23:52 BST # gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83 # gpg: issuer "pbonzini@redhat.com" # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full] # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full] # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini-gitlab/tags/for-upstream: (48 commits) meson: Use input/output for entitlements target configure: allow the selection of alternate config in the build configs: rename default-configs to configs and reorganise hw/arm: move CONFIG_V7M out of default-devices hw/arm: add dependency on OR_IRQ for XLNX_VERSAL meson: Introduce target-specific Kconfig meson: switch function tests from compilation to linking vl: fix leak of qdict_crumple return value target/i386: fix exceptions for MOV to DR target/i386: Added DR6 and DR7 consistency checks target/i386: Added MSRPM and IOPM size check monitor/tcg: move tcg hmp commands to accel/tcg, register them dynamically usb: build usb-host as module monitor/usb: register 'info usbhost' dynamically usb: drop usb_host_dev_is_scsi_storage hook monitor: allow register hmp commands accel: build tcg modular accel: add tcg module annotations accel: build qtest modular accel: add qtest module annotations ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2021-07-11 22:20:51 +01:00
Gerd Hoffmann	b7b2a60b01	usb: drop usb_host_dev_is_scsi_storage hook Introduce an usb device flag instead, set it when usb-host looks at the device descriptors anyway. Also set it for emulated storage devices, for consistency. Add an inline helper function to check the flag. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Jose R. Ziviani <jziviani@suse.de> Message-Id: <20210624103836.2382472-32-kraxel@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-07-09 18:21:33 +02:00
Bharata B Rao	82123b756a	target/ppc: Support for H_RPT_INVALIDATE hcall If KVM_CAP_RPT_INVALIDATE KVM capability is enabled, then - indicate the availability of H_RPT_INVALIDATE hcall to the guest via ibm,hypertas-functions property. - Enable the hcall Both the above are done only if the new sPAPR machine capability cap-rpt-invalidate is set. Signed-off-by: Bharata B Rao <bharata@linux.ibm.com> Message-Id: <20210706112440.1449562-3-bharata@linux.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-07-09 11:01:06 +10:00
Alexey Kardashevskiy	21bde1ecb6	spapr: Fix implementation of Open Firmware client interface This addresses the comments from v22. The functional changes are (the VOF ones need retesting with Pegasos2): (VOF) setprop will start failing if the machine class callback did not handle it; (VOF) unit addresses are lowered in path_offset(); (SPAPR) /chosen/bootargs is initialized from kernel_cmdline if the client did not change it. Fixes: 5c991e5d4378 ("spapr: Implement Open Firmware client interface") Cc: BALATON Zoltan <balaton@eik.bme.hu> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20210708065625.548396-1-aik@ozlabs.ru> Tested-by: BALATON Zoltan <balaton@eik.bme.hu> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-07-09 10:55:11 +10:00
Alexey Kardashevskiy	fc8c745d50	spapr: Implement Open Firmware client interface The PAPR platform describes an OS environment that's presented by a combination of a hypervisor and firmware. The features it specifies require collaboration between the firmware and the hypervisor. Since the beginning, the runtime component of the firmware (RTAS) has been implemented as a 20 byte shim which simply forwards it to a hypercall implemented in qemu. The boot time firmware component is SLOF - but a build that's specific to qemu, and has always needed to be updated in sync with it. Even though we've managed to limit the amount of runtime communication we need between qemu and SLOF, there's some, and it has become increasingly awkward to handle as we've implemented new features. This implements a boot time OF client interface (CI) which is enabled by a new "x-vof" pseries machine option (stands for "Virtual Open Firmware). When enabled, QEMU implements the custom H_OF_CLIENT hcall which implements Open Firmware Client Interface (OF CI). This allows using a smaller stateless firmware which does not have to manage the device tree. The new "vof.bin" firmware image is included with source code under pc-bios/. It also includes RTAS blob. This implements a handful of CI methods just to get -kernel/-initrd working. In particular, this implements the device tree fetching and simple memory allocator - "claim" (an OF CI memory allocator) and updates "/memory@0/available" to report the client about available memory. This implements changing some device tree properties which we know how to deal with, the rest is ignored. To allow changes, this skips fdt_pack() when x-vof=on as not packing the blob leaves some room for appending. In absence of SLOF, this assigns phandles to device tree nodes to make device tree traversing work. When x-vof=on, this adds "/chosen" every time QEMU (re)builds a tree. This adds basic instances support which are managed by a hash map ihandle -> [phandle]. Before the guest started, the used memory is: 0..e60 - the initial firmware 8000..10000 - stack 400000.. - kernel 3ea0000.. - initramdisk This OF CI does not implement "interpret". Unlike SLOF, this does not format uninitialized nvram. Instead, this includes a disk image with pre-formatted nvram. With this basic support, this can only boot into kernel directly. However this is just enough for the petitboot kernel and initradmdisk to boot from any possible source. Note this requires reasonably recent guest kernel with: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=df5be5be8735 The immediate benefit is much faster booting time which especially crucial with fully emulated early CPU bring up environments. Also this may come handy when/if GRUB-in-the-userspace sees light of the day. This separates VOF and sPAPR in a hope that VOF bits may be reused by other POWERPC boards which do not support pSeries. This assumes potential support for booting from QEMU backends such as blockdev or netdev without devices/drivers used. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20210625055155.2252896-1-aik@ozlabs.ru> Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu> [dwg: Adjusted some includes which broke compile in some more obscure compilation setups] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-07-09 10:38:19 +10:00
Alexey Kardashevskiy	7381c5d11f	spapr: tune rtas-size QEMU reserves space for RTAS via /rtas/rtas-size which tells the client how much space the RTAS requires to work which includes the RTAS binary blob implementing RTAS runtime. Because pseries supports FWNMI which requires plenty of space, QEMU reserves more than 2KB which is enough for the RTAS blob as it is just 20 bytes (under QEMU). Since FWNMI reset delivery was added, RTAS_SIZE macro is not used anymore. This replaces RTAS_SIZE with RTAS_MIN_SIZE and uses it in the /rtas/rtas-size calculation to account for the RTAS blob. Fixes: `0e236d3477` ("ppc/spapr: Implement FWNMI System Reset delivery") Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20210622070336.1463250-1-aik@ozlabs.ru> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-07-09 10:38:18 +10:00
Greg Kurz	3bf0844f3b	spapr: Don't hijack current_machine->boot_order QEMU 6.0 moved all the -boot variables to the machine. Especially, the removal of the boot_order static changed the handling of '-boot once' from: if (boot_once) { qemu_boot_set(boot_once, &error_fatal); qemu_register_reset(restore_boot_order, g_strdup(boot_order)); } to if (current_machine->boot_once) { qemu_boot_set(current_machine->boot_once, &error_fatal); qemu_register_reset(restore_boot_order, g_strdup(current_machine->boot_order)); } This means that we now register as subsequent boot order a copy of current_machine->boot_once that was just set with the previous call to qemu_boot_set(), i.e. we never transition away from the once boot order. It is certainly fragile^Wwrong for the spapr code to hijack a field of the base machine type object like that. The boot order rework simply turned this software boundary violation into an actual bug. Have the spapr code to handle that with its own field in SpaprMachineState. Also kfree() the initial boot device string when "once" was used. Fixes: `4b7acd2ac8` ("vl: clean up -boot variables") Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1960119 Cc: pbonzini@redhat.com Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20210521160735.1901914-1-groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-06-03 13:22:06 +10:00
Lucas Mateus Castro (alqotel)	03282a3ab8	hw/ppc: moved has_spr to cpu.h Moved has_spr to cpu.h as ppc_has_spr and turned it into an inline function. Change spr verification in pnv.c and spapr.c to a version that can compile in a !TCG environment. Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br> Message-Id: <20210507164146.67086-1-lucas.araujo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-05-19 10:30:28 +10:00
Fabiano Rosas	ab5add4c7b	hw/ppc/spapr.c: Make sure the host supports the selected MMU mode Starting with Linux kernel v5.12 we dropped support[1] in KVM for hosts that can't have their threads running in different MMU modes (POWER9 < DD2.2). In these hosts, KVM will no longer report the KVM_CAP_PPC_MMU_HASH_V3 capability[2] when the host is running Radix. For guests that support both MMU modes, the negotiation during CAS will make sure it selects the correct one. For guests that only support Hash, such as P8 compat mode guests, the following error is currently thrown: $ ~/qemu-system-ppc64 -machine pseries,accel=kvm,max-cpu-compat=power8 ... error: kvm run failed Invalid argument NIP 0000000000000100 LR 0000000000000000 CTR 0000000000000000 XER 0000000000000000 CPU#0 MSR 8000000000001000 HID0 0000000000000000 HF 8000000000000000 iidx 3 didx 3 TB 00000000 00000000 DECR 0 GPR00 0000000000000000 0000000000000000 0000000000000000 000000007ff00000 GPR04 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR08 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR12 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR16 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR24 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR28 0000000000000000 0000000000000000 0000000000000000 0000000000000000 CR 00000000 [ - - - - - - - - ] RES ffffffffffffffff SRR0 0000000000000000 SRR1 0000000000000000 PVR 00000000004e1201 VRSAVE 0000000000000000 SPRG0 0000000000000000 SPRG1 0000000000000000 SPRG2 0000000000000000 SPRG3 0000000000000000 SPRG4 0000000000000000 SPRG5 0000000000000000 SPRG6 0000000000000000 SPRG7 0000000000000000 HSRR0 0000000000000000 HSRR1 0000000000000000 CFAR 0000000000000000 LPCR 000000000004f01f PTCR 0000000000000000 DAR 0000000000000000 DSISR 0000000000000000 This patch adds a verification during the writing of the platform support vector so that we error out as soon as we determine this guest only supports Hash and the host doesn't. ~/qemu-system-ppc64 -machine pseries,accel=kvm,max-cpu-compat=power8 ... qemu-system-ppc64: Guest requested unavailable MMU mode (hash). 1- https://git.kernel.org/torvalds/p/b1b1697ae0cc8 2- https://git.kernel.org/torvalds/p/a722076e94702 Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com> Message-Id: <20210505001130.3999968-3-farosas@linux.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-05-19 10:30:28 +10:00
Fabiano Rosas	068479e1e1	hw/ppc/spapr.c: Extract MMU mode error reporting into a function A following patch will make use of it. Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com> Message-Id: <20210505001130.3999968-2-farosas@linux.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-05-19 10:30:28 +10:00
Peter Maydell	d90f154867	ppc patch queue 2021-05-04 Here's the first ppc pull request for qemu-6.1. It has a wide variety of stuff accumulated during the 6.0 freeze. Highlights are: * Multi-phase reset cleanups for PAPR * Preliminary cleanups towards allowing !CONFIG_TCG for the ppc target * Cleanup of AIL logic and extension to POWER10 * Further improvements to handling of hot unplug failures on PAPR * Allow much larger numbers of CPU on pseries * Support for the H_SCM_HEALTH hypercall * Add support for the Pegasos II board * Substantial cleanup to hflag handling * Assorted minor fixes and cleanups -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEdfRlhq5hpmzETofcbDjKyiDZs5IFAmCQ4ScACgkQbDjKyiDZ s5KmNhAAsICdDqeu/jm1uhRCr0DDT/Wa6KE1xlglQ53ybWb5Hm2ae0Uwzti5ZWkt T9yryObX++wiugbU5Dlx9eXTiJIPgTbDoBV1wfOa3a1BAxSEES1t70jwuwAXXBpX mgU++SurQB70IB7vVvyXDi2Z592qGvMiKXqT0sdkfoexPHzAL0+KkQPyJZLeFchM Ap/zRHAodXf9SuWAl+LwLXeb350jivXYXBWNcFRrBbOGpbVT0AJMYrk/TEa2ZIpi SvbzAWuW+9mX0EOmk7JK5JfkT41cGNdcBcwd0bt4xyvUpmkXLaTMFDLVHj3HWSUn PFA4RB3uKXyTfISVtWdxJBbFOzMpchI6lEiRJHCS+KuY7UsACqV1T/y54ATOUauC ycLc9APgRaStdNPxfDl+xeFfoVb/f0mQsNwcmY1tv7z+3qE/trY9bMyrbgaebBFn /TAkmPvXfwtAREnx8xF/57poarWUkvupGTQkANNosdFokpExmrLj8T0sKv90hh5Y vkGf5zP4pYGN1Rs8qhOdHu+IjhVJvUl/L3LZYWcoMI6E61D8rGRc0Dkacx7gcja+ sluFi5Yh2fQn55y6LTi3049cB1wMd6wly0214F11RKoBswguiGuaqJmL4sNDO/s4 IcMCy5mg6C0jNZA5kHcdWmqsVzD2+XwP5J29n/LedlmgXoHYF+M= =N0qr -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/dg-gitlab/tags/ppc-for-6.1-20210504' into staging ppc patch queue 2021-05-04 Here's the first ppc pull request for qemu-6.1. It has a wide variety of stuff accumulated during the 6.0 freeze. Highlights are: * Multi-phase reset cleanups for PAPR * Preliminary cleanups towards allowing !CONFIG_TCG for the ppc target * Cleanup of AIL logic and extension to POWER10 * Further improvements to handling of hot unplug failures on PAPR * Allow much larger numbers of CPU on pseries * Support for the H_SCM_HEALTH hypercall * Add support for the Pegasos II board * Substantial cleanup to hflag handling * Assorted minor fixes and cleanups # gpg: Signature made Tue 04 May 2021 06:52:39 BST # gpg: using RSA key 75F46586AE61A66CC44E87DC6C38CACA20D9B392 # gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" [full] # gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" [full] # gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" [full] # gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>" [unknown] # Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392 * remotes/dg-gitlab/tags/ppc-for-6.1-20210504: (46 commits) hw/ppc/pnv_psi: Use device_cold_reset() instead of device_legacy_reset() hw/ppc/spapr_vio: Reset TCE table object with device_cold_reset() hw/intc/spapr_xive: Use device_cold_reset() instead of device_legacy_reset() target/ppc: removed VSCR from SPR registration target/ppc: Reduce the size of ppc_spr_t target/ppc: Clean up _spr_register et al target/ppc: Add POWER10 exception model target/ppc: rework AIL logic in interrupt delivery target/ppc: move opcode table logic to translate.c target/ppc: code motion from translate_init.c.inc to gdbstub.c spapr_drc.c: handle hotunplug errors in drc_unisolate_logical() spapr.h: increase FDT_MAX_SIZE spapr.c: do not use MachineClass::max_cpus to limit CPUs ppc: Rename current DAWR macros and variables target/ppc: POWER10 supports scv target/ppc: Fix POWER9 radix guest HV interrupt AIL behaviour docs/system: ppc: Add documentation for ppce500 machine roms/u-boot: Bump ppce500 u-boot to v2021.04 to fix broken pci support roms/Makefile: Update ppce500 u-boot build directory name ppc/spapr: Add support for implement support for H_SCM_HEALTH ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2021-05-05 20:29:14 +01:00
Daniel Henrique Barboza	5642e4513e	spapr.c: do not use MachineClass::max_cpus to limit CPUs Up to this patch, 'max_cpus' value is hardcoded to 1024 (commit `6244bb7e58`). In theory this patch would simply bump it to 2048, since it's the default NR_CPUS kernel setting for ppc64 servers nowadays, but the whole mechanic of MachineClass:max_cpus is flawed for the pSeries machine. The two supported accelerators, KVM and TCG, can live without it. TCG guests don't have a theoretical limit. The user must be free to emulate as many CPUs as the hardware is capable of. And even if there were a limit, max_cpus is not the proper way to report it since it's a common value checked by SMP code in machine_smp_parse() for KVM as well. For KVM guests, the proper way to limit KVM CPUs is by host configuration via NR_CPUS, not a QEMU hardcoded value. There is no technical reason for a pSeries QEMU guest to forcefully stay below NR_CPUS. This hardcoded value also disregard hosts that might have a lower NR_CPUS limit, say 512. In this case, machine.c:machine_smp_parse() will allow a 1024 value to pass, but then kvm_init() will complain about it because it will exceed NR_CPUS: Number of SMP cpus requested (1024) exceeds the maximum cpus supported by KVM (512) A better 'max_cpus' value would consider host settings, but MachineClass::max_cpus is defined well before machine_init() and kvm_init(). We can't check for KVM limits because it's too soon, so we end up making a guess. This patch makes MachineClass:max_cpus settings innocuous by setting it to INT32_MAX. machine.c:machine_smp_parse() will not fail the verification based on max_cpus, letting kvm_init() do the checking with actual host settings. And TCG guests get to do whatever the hardware is capable of emulating. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20210408204049.221802-2-danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-05-04 11:41:25 +10:00
Alexey Kardashevskiy	4b98e72d97	spapr: Rename RTAS_MAX_ADDR to FDT_MAX_ADDR SLOF instantiates RTAS since `744a928cce` ("spapr: Stop providing RTAS blob") so the max address applies to the FDT only. This renames the macro and fixes up the comment. This should not cause any behavioral change. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20210331025123.29310-1-aik@ozlabs.ru> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-05-04 11:41:25 +10:00
Thomas Huth	ee86213aa3	Do not include exec/address-spaces.h if it's not really necessary Stop including exec/address-spaces.h in files that don't need it. Signed-off-by: Thomas Huth <thuth@redhat.com> Message-Id: <20210416171314.2074665-5-thuth@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2021-05-02 17:24:51 +02:00
Thomas Huth	ead62c75f6	Do not include hw/boards.h if it's not really necessary Stop including hw/boards.h in files that don't need it. Signed-off-by: Thomas Huth <thuth@redhat.com> Message-Id: <20210416171314.2074665-3-thuth@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2021-05-02 17:24:51 +02:00
Cornelia Huck	da7e13c00b	hw: add compat machines for 6.1 Add 6.1 machine types for arm/i440fx/q35/s390x/spapr. Signed-off-by: Cornelia Huck <cohuck@redhat.com> Acked-by: Greg Kurz <groug@kaod.org> Message-id: 20210331111900.118274-1-cohuck@redhat.com Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2021-04-30 11:16:51 +01:00
Daniel Henrique Barboza	2b18fc794f	spapr.c: always pulse guest IRQ in spapr_core_unplug_request() Commit `47c8c915b1` fixed a problem where multiple spapr_drc_detach() requests were breaking QEMU. The solution was to just spapr_drc_detach() once, and use spapr_drc_unplug_requested() to filter whether we already detached it or not. The commit also tied the hotplug request to the guest in the same condition. Turns out that there is a reliable way for a CPU hotunplug to fail. If a guest with one CPU hotplugs a CPU1, then offline CPU0s via 'echo 0 > /sys/devices/system/cpu/cpu0/online', then attempts to hotunplug CPU1, the kernel will refuse it because it's the last online CPU of the system. Given that we're pulsing the IRQ only in the first try, in a failed attempt, all other CPU1 hotunplug attempts will fail, regardless of the online state of CPU1 in the kernel, because we're simply not letting the guest know that we want to hotunplug the device. Let's move spapr_hotplug_req_remove_by_index() back out of the "if (!spapr_drc_unplug_requested(drc))" conditional, allowing for multiple 'device_del' requests to the same CPU core to reach the guest, in case the CPU core didn't fully hotunplugged previously. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20210401000437.131140-3-danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-04-12 12:27:14 +10:00
Daniel Henrique Barboza	d522cb52e6	spapr: rollback 'unplug timeout' for CPU hotunplugs The pseries machines introduced the concept of 'unplug timeout' for CPU hotunplugs. The idea was to circunvent a deficiency in the pSeries specification (PAPR), that currently does not define a proper way for the hotunplug to fail. If the guest refuses to release the CPU (see [1] for an example) there is no way for QEMU to detect the failure. Further discussions about how to send a QAPI event to inform about the hotunplug timeout [2] exposed problems that weren't predicted back when the idea was developed. Other QEMU machines don't have any type of hotunplug timeout mechanism for any device, e.g. ACPI based machines have a way to make hotunplug errors visible to the hypervisor. This would make this timeout mechanism exclusive to pSeries, which is not ideal. The real problem is that a QAPI event that reports hotunplug timeouts puts the management layer (namely Libvirt) in a weird spot. We're not telling that the hotunplug failed, because we can't be 100% sure of that, and yet we're resetting the unplug state back, preventing any DEVICE_DEL events to reach out in case the guest decides to release the device. Libvirt would need to inspect the guest itself to see if the device was released or not, otherwise the internal domain states will be inconsistent. Moreover, Libvirt already has an 'unplug timeout' concept, and a QEMU side timeout would need to be juggled together with the existing Libvirt timeout. All this considered, this solution ended up creating more trouble than it solved. This patch reverts the 3 commits that introduced the timeout mechanism for CPU hotplugs in pSeries machines. This reverts commit `4515a5f786` "qemu_timer.c: add timer_deadline_ms() helper" This reverts commit `d1c2e3ce3d` "spapr_drc.c: add hotunplug timeout for CPUs" This reverts commit `51254ffb32` "spapr_drc.c: introduce unplug_timeout_timer" [1] https://bugzilla.redhat.com/show_bug.cgi?id=1911414 [2] https://lists.gnu.org/archive/html/qemu-devel/2021-03/msg04682.html CC: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20210401000437.131140-2-danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-04-12 12:27:14 +10:00
Greg Kurz	df2d7ca774	spapr: Assert DIMM unplug state in spapr_memory_unplug() spapr_memory_unplug() is the last step of the hot unplug sequence. It is indirectly called by: spapr_lmb_release() hotplug_handler_unplug() and spapr_lmb_release() already buys us that DIMM unplug state is present : it gets restored with spapr_recover_pending_dimm_state() if missing. g_assert() that spapr_pending_dimm_unplugs_find() cannot return NULL in spapr_memory_unplug() to make this clear and silence Coverity. Fixes: Coverity CID 1450767 Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <161562021166.948373.15092876234470478331.stgit@bahia.lan> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-03-31 11:10:50 +11:00
Daniel Henrique Barboza	eb7f80fd26	spapr.c: send QAPI event when memory hotunplug fails Recent changes allowed the pSeries machine to rollback the hotunplug process for the DIMM when the guest kernel signals, via a reconfiguration of the DR connector, that it's not going to release the LMBs. Let's also warn QAPI listerners about it. One place to do it would be right after the unplug state is cleaned up, spapr_clear_pending_dimm_unplug_state(). This would mean that the function is now doing more than cleaning up the pending dimm state though. This patch does the following changes in spapr.c: - send a QAPI event to inform that we experienced a failure in the hotunplug of the DIMM; - rename spapr_clear_pending_dimm_unplug_state() to spapr_memory_unplug_rollback(). This is a better fit for what the function is now doing, and it makes callers care more about what the function goal is and less about spapr.c internals such as clearing the pending dimm unplug state. Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20210302141019.153729-3-danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-03-10 09:07:09 +11:00
Daniel Henrique Barboza	41c8ad3d92	spapr.c: remove duplicated assert in spapr_memory_unplug_request() We are asserting the existence of the first DRC LMB after sending unplug requests to all LMBs of the DIMM, where every DRC is being asserted inside the loop. This means that the first DRC is being asserted twice. Remove the duplicated assert. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20210302141019.153729-2-danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-03-10 09:07:09 +11:00
Daniel Henrique Barboza	7420033ec4	spapr.c: add 'unplug already in progress' message for PHB unplug Both CPU hotunplug and PC_DIMM unplug reports an user warning, mentioning that the hotunplug is in progress, if consecutive 'device_del' are issued in quick succession. Do the same for PHBs in spapr_phb_unplug_request(). Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20210226163301.419727-4-danielhb413@gmail.com> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-03-10 09:07:09 +11:00
Daniel Henrique Barboza	fe1831eff8	spapr_drc.c: use DRC reconfiguration to cleanup DIMM unplug state Handling errors in memory hotunplug in the pSeries machine is more complex than any other device type, because there are all the complications that other devices has, and more. For instance, determining a timeout for a DIMM hotunplug must consider if it's a Hash-MMU or a Radix-MMU guest, because Hash guests takes longer to hotunplug DIMMs. The size of the DIMM is also a factor, given that longer DIMMs naturally takes longer to be hotunplugged from the kernel. And there's also the guest memory usage to be considered: if there's a process that is consuming memory that would be lost by the DIMM unplug, the kernel will postpone the unplug process until the process finishes, and then initiate the regular hotunplug process. The first two considerations are manageable, but the last one is a deal breaker. There is no sane way for the pSeries machine to determine the memory load in the guest when attempting a DIMM hotunplug - and even if there was a way, the guest can start using all the RAM in the middle of the unplug process and invalidate our previous assumptions - and in result we can't even begin to calculate a timeout for the operation. This means that we can't implement a viable timeout mechanism for memory unplug in pSeries. Going back to why we would consider an unplug timeout, the reason is that we can't know if the kernel is giving up the unplug. Turns out that, sometimes, we can. Consider a failed memory hotunplug attempt where the kernel will error out with the following message: 'pseries-hotplug-mem: Memory indexed-count-remove failed, adding any removed LMBs' This happens when there is a LMB that the kernel gave up in removing, and the LMBs previously marked for removal are now being added back. This happens in the pseries kernel in [1], dlpar_memory_remove_by_ic() into dlpar_add_lmb(), and after that update_lmb_associativity_index(). In this function, the kernel is configuring the LMB DRC connector again. Note that this is a valid usage in LOPAR, as stated in section "ibm,configure-connector RTAS Call": 'A subsequent sequence of calls to ibm,configure-connector with the same entry from the “ibm,drc-indexes” or “ibm,drc-info” property will restart the configuration of devices which were not completely configured.' We can use this kernel behavior in our favor. If a DRC connector reconfiguration for a LMB that we marked as unplug pending happens, this indicates that the kernel changed its mind about the unplug and is reasserting that it will keep using all the LMBs of the DIMM. In this case, it's safe to assume that the whole DIMM device unplug was cancelled. This patch hops into rtas_ibm_configure_connector() and, in the scenario described above, clear the unplug state for the DIMM device. This will not solve all the problems we still have with memory unplug, but it will cover this case where the kernel reconfigures LMBs after a failed unplug. We are a bit more resilient, without using an unreliable timeout, and we didn't make the remaining error cases any worse. [1] arch/powerpc/platforms/pseries/hotplug-memory.c Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20210222194531.62717-6-danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-03-10 09:07:09 +11:00
Daniel Henrique Barboza	d1c2e3ce3d	spapr_drc.c: add hotunplug timeout for CPUs There is a reliable way to make a CPU hotunplug fail in the pseries machine. Hotplug a CPU A, then offline all other CPUs inside the guest but A. When trying to hotunplug A the guest kernel will refuse to do it, because A is now the last online CPU of the guest. PAPR has no 'error callback' in this situation to report back to the platform, so the guest kernel will deny the unplug in silent and QEMU will never know what happened. The unplug pending state of A will remain until the guest is shutdown or rebooted. Previous attempts of fixing it (see [1] and [2]) were aimed at trying to mitigate the effects of the problem. In [1] we were trying to guess which guest CPUs were online to forbid hotunplug of the last online CPU in the QEMU layer, avoiding the scenario described above because QEMU is now failing in behalf of the guest. This is not robust because the last online CPU of the guest can change while we're in the middle of the unplug process, and our initial assumptions are now invalid. In [2] we were accepting that our unplug process is uncertain and the user should be allowed to spam the IRQ hotunplug queue of the guest in case the CPU hotunplug fails. This patch presents another alternative, using the timeout infrastructure introduced in the previous patch. CPU hotunplugs in the pSeries machine will now timeout after 15 seconds. This is a long time for a single CPU unplug to occur, regardless of guest load - although the user is strongly encouraged to not hotunplug devices from a guest under high load - and we can be sure that something went wrong if it takes longer than that for the guest to release the CPU (the same can't be said about memory hotunplug - more on that in the next patch). Timing out the unplug operation will reset the unplug state of the CPU and allow the user to try it again, regardless of the error situation that prevented the hotunplug to occur. Of all the not so pretty fixes/mitigations for CPU hotunplug errors in pSeries, timing out the operation is an admission that we have no control in the process, and must assume the worst case if the operation doesn't succeed in a sensible time frame. [1] https://lists.gnu.org/archive/html/qemu-devel/2021-01/msg03353.html [2] https://lists.gnu.org/archive/html/qemu-devel/2021-01/msg04400.html Reported-by: Xujun Ma <xuma@redhat.com> Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1911414 Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20210222194531.62717-5-danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-03-10 09:07:09 +11:00
Daniel Henrique Barboza	a03509cd2b	spapr: rename spapr_drc_detach() to spapr_drc_unplug_request() spapr_drc_detach() is not the best name for what the function does. The function does not detach the DRC, it makes an uncommited attempt to do it. It'll mark the DRC as pending unplug, via the 'unplug_request' flag, and only if the DRC state is drck->empty_state it will detach the DRC, via spapr_drc_release(). This is a contrast with its pair spapr_drc_attach(), where the function is indeed creating the DRC QOM object. If you know what spapr_drc_attach() does, you can be misled into thinking that spapr_drc_detach() is removing the DRC from QEMU internal state, which isn't true. The current role of this function is better described as a request for detach, since there's no guarantee that we're going to detach the DRC in the end. Rename the function to spapr_drc_unplug_request to reflect what is is doing. The initial idea was to change the name to spapr_drc_detach_request(), and later on change the unplug_request flag to detach_request. However, unplug_request is a migratable boolean for a long time now and renaming it is not worth the trouble. spapr_drc_unplug_request() setting drc->unplug_request is more natural than spapr_drc_detach_request setting drc->unplug_request. Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20210222194531.62717-3-danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-03-10 09:07:08 +11:00
Daniel Henrique Barboza	6640706972	spapr_numa.c: create spapr_numa_initial_nvgpu_numa_id() helper We'll need to check the initial value given to spapr->gpu_numa_id when building the rtas DT, so put it in a helper for easier access and to avoid repetition. Tested-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20210128174213.1349181-3-danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-02-10 10:43:50 +11:00
Daniel Henrique Barboza	3b880445e6	spapr: move spapr_machine_using_legacy_numa() to spapr_numa.c This function is used only in spapr_numa.c. Tested-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20210128174213.1349181-2-danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-02-10 10:43:50 +11:00
Greg Kurz	040bdafce1	spapr: Adjust firmware path of PCI devices It is currently not possible to perform a strict boot from USB storage: $ qemu-system-ppc64 -accel kvm -nodefaults -nographic -serial stdio \ -boot strict=on \ -device qemu-xhci \ -device usb-storage,drive=disk,bootindex=0 \ -blockdev driver=file,node-name=disk,filename=fedora-ppc64le.qcow2 SLOF ********************************************************************** QEMU Starting Build Date = Jul 17 2020 11:15:24 FW Version = git-e18ddad8516ff2cf Press "s" to enter Open Firmware. Populating /vdevice methods Populating /vdevice/vty@71000000 Populating /vdevice/nvram@71000001 Populating /pci@800000020000000 00 0000 (D) : 1b36 000d serial bus [ usb-xhci ] No NVRAM common partition, re-initializing... Scanning USB XHCI: Initializing USB Storage SCSI: Looking for devices 101000000000000 DISK : "QEMU QEMU HARDDISK 2.5+" Using default console: /vdevice/vty@71000000 Welcome to Open Firmware Copyright (c) 2004, 2017 IBM Corporation All rights reserved. This program and the accompanying materials are made available under the terms of the BSD License available at http://www.opensource.org/licenses/bsd-license.php Trying to load: from: /pci@800000020000000/usb@0/storage@1/disk@101000000000000 ... E3405: No such device E3407: Load failed Type 'boot' and press return to continue booting the system. Type 'reset-all' and press return to reboot the system. Ready! 0 > The device tree handed over by QEMU to SLOF indeed contains: qemu,boot-list = "/pci@800000020000000/usb@0/storage@1/disk@101000000000000 HALT"; but the device node is named usb-xhci@0, not usb@0. This happens because the firmware names of PCI devices returned by get_boot_devices_list() come from pcibus_get_fw_dev_path(), while the sPAPR PHB code uses a different naming scheme for device nodes. This inconsistency has always been there but it was hidden for a long time because SLOF used to rename USB device nodes, until this commit, merged in QEMU 4.2.0 : commit `85164ad4ed` Author: Alexey Kardashevskiy <aik@ozlabs.ru> Date: Wed Sep 11 16:24:32 2019 +1000 pseries: Update SLOF firmware image This fixes USB host bus adapter name in the device tree to match QEMU's one. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Fortunately, sPAPR implements the firmware path provider interface. This provides a way to override the default firmware paths. Just factor out the sPAPR PHB naming logic from spapr_dt_pci_device() to a helper, and use it in the sPAPR firmware path provider hook. Fixes: `85164ad4ed` ("pseries: Update SLOF firmware image") Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20210122170157.246374-1-groug@kaod.org> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-02-10 10:43:50 +11:00
Daniel Henrique Barboza	a85bb34e1c	spapr.c: add 'name' property for hotplugged CPUs nodes In the CPU hotunplug bug [1] the guest kernel throws a scary message in dmesg: pseries-hotplug-cpu: Failed to offline CPU <NULL>, rc: -16 The reason isn't related to the bug though. This happens because the kernel file arch/powerpc/platform/pseries/hotplug-cpu.c, function dlpar_cpu_remove(), is not finding the device_node.name of the offending CPU. We're not populating the 'name' property for hotplugged CPUs. Since the kernel relies on device_node.name for identifying CPU nodes, and the CPUs that are coldplugged has the 'name' property filled by SLOF, this is creating an unneeded inconsistency between hotplug and coldplug CPUs in the kernel. Let's fill the 'name' property for hotplugged CPUs as well. This will make the guest dmesg throws a less intimidating message when we try to unplug the last online CPU: pseries-hotplug-cpu: Failed to offline CPU PowerPC,POWER9@1, rc: -16 [1] https://bugzilla.redhat.com/1911414 Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20210120232305.241521-3-danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-02-10 10:43:49 +11:00
Daniel Henrique Barboza	7265bc3e54	spapr.c: use g_auto* with 'nodename' in CPU DT functions Next patch will use the 'nodename' string in spapr_core_dt_populate() after the point it's being freed today. Instead of moving 'g_free(nodename)' around, let's do a QoL change in both CPU DT functions where 'nodename' is being freed, and use g_autofree to avoid the 'g_free()' call altogether. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20210120232305.241521-2-danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-02-10 10:43:49 +11:00
David Gibson	6c8ebe30ea	spapr: Add PEF based confidential guest support Some upcoming POWER machines have a system called PEF (Protected Execution Facility) which uses a small ultravisor to allow guests to run in a way that they can't be eavesdropped by the hypervisor. The effect is roughly similar to AMD SEV, although the mechanisms are quite different. Most of the work of this is done between the guest, KVM and the ultravisor, with little need for involvement by qemu. However qemu does need to tell KVM to allow secure VMs. Because the availability of secure mode is a guest visible difference which depends on having the right hardware and firmware, we don't enable this by default. In order to run a secure guest you need to create a "pef-guest" object and set the confidential-guest-support property to point to it. Note that this just allows secure guests, the architecture of PEF is such that the guest still needs to talk to the ultravisor to enter secure mode. Qemu has no direct way of knowing if the guest is in secure mode, and certainly can't know until well after machine creation time. To start a PEF-capable guest, use the command line options: -object pef-guest,id=pef0 -machine confidential-guest-support=pef0 Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org>	2021-02-08 16:57:38 +11:00
Greg Kurz	73598c75df	spapr: Improve handling of memory unplug with old guests Since commit `1e8b5b1aa1` ("spapr: Allow memory unplug to always succeed") trying to unplug memory from a guest that doesn't support it (eg. rhel6) no longer generates an error like it used to. Instead, it leaves the memory around : only a subsequent reboot or manual use of drmgr within the guest can complete the hot-unplug sequence. A flag was added to SpaprMachineClass so that this new behavior only applies to the default machine type. We can do better. CAS processes all pending hot-unplug requests. This means that we don't really care about what the guest supports if the hot-unplug request happens before CAS. All guests that we care for, even old ones, set enough bits in OV5 that lead to a non-empty bitmap in spapr->ov5_cas. Use that as a heuristic to decide if CAS has already occured or not. Always accept unplug requests that happen before CAS since CAS will process them. Restore the previous behavior of rejecting them after CAS when we know that the guest doesn't support memory hot-unplug. This behavior is suitable for all machine types : this allows to drop the pre_6_0_memory_unplug flag. Fixes: `1e8b5b1aa1` ("spapr: Allow memory unplug to always succeed") Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <161012708715.801107.11418801796987916516.stgit@bahia.lan> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-01-19 10:20:29 +11:00
Greg Kurz	1105504100	spapr: Use spapr_drc_reset_all() at machine reset Documentation of object_child_foreach_recursive() clearly stipulates that "it is forbidden to add or remove children from @obj from the @fn callback". But this is exactly what we do during machine reset. The call to spapr_drc_reset() can finalize the hot-unplug sequence of a PHB or a PCI bridge, both of which will then in turn destroy their PCI DRCs. This could potentially invalidate the iterator used by do_object_child_foreach(). It is pure luck that this haven't caused any issues so far. Use spapr_drc_reset_all() since it can cope with DRC removal. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20201218103400.689660-5-groug@kaod.org> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Tested-by: Daniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-01-06 11:09:59 +11:00
Greg Kurz	1e8b5b1aa1	spapr: Allow memory unplug to always succeed It is currently impossible to hot-unplug a memory device between machine reset and CAS. (qemu) device_del dimm1 Error: Memory hot unplug not supported for this guest This limitation was introduced in order to provide an explicit error path for older guests that didn't support hot-plug event sources (and thus memory hot-unplug). The linux kernel has been supporting these since 4.11. All recent enough guests are thus capable of handling the removal of a memory device at all time, including during early boot. Lift the limitation for the latest machine type. This means that trying to unplug memory from a guest that doesn't support it will likely just do nothing and the memory will only get removed at next reboot. Such older guests can still get the existing behavior by using an older machine type. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <160794035064.23292.17560963281911312439.stgit@bahia.lan> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-01-06 11:09:59 +11:00
Greg Kurz	776e887f08	spapr: Fix DR properties of the root node Section 13.5.2 of LoPAPR mandates various DR related indentifiers for all hot-pluggable entities to be exposed in the "ibm,drc-indexes", "ibm,drc-power-domains", "ibm,drc-names" and "ibm,drc-types" properties of their parent node. These properties are created with spapr_dt_drc(). PHBs and LMBs are both children of the machine. Their DR identifiers are thus supposed to be exposed in the afore mentioned properties of the root node. When PHB hot-plug support was added, an extra call to spapr_dt_drc() was introduced: this overwrites the existing properties, previously populated with the LMB identifiers, and they end up containing only PHB identifiers. This went unseen so far because linux doesn't care, but this is still not conformant with LoPAPR. Fortunately spapr_dt_drc() is able to handle multiple DR entity types at the same time. Use that to handle DR indentifiers for PHBs and LMBs with a single call to spapr_dt_drc(). While here also account for PMEM DR identifiers, which were forgotten when NVDIMM hot-plug support was added. Also add an assert to prevent further misuse of spapr_dt_drc(). With -m 1G,maxmem=2G,slots=8 passed on the QEMU command line we get: Without this patch: /proc/device-tree/ibm,drc-indexes 0000001f 20000001 20000002 20000003 20000000 20000005 20000006 20000007 20000004 20000009 20000008 20000010 20000011 20000012 20000013 20000014 20000015 20000016 20000017 20000018 20000019 2000000a 2000000b 2000000c 2000000d 2000000e 2000000f 2000001a 2000001b 2000001c 2000001d 2000001e These are the DRC indexes for the 31 possible PHBs. With this patch: /proc/device-tree/ibm,drc-indexes 0000002b 90000000 90000001 90000002 90000003 90000004 90000005 90000006 90000007 20000001 20000002 20000003 20000000 20000005 20000006 20000007 20000004 20000009 20000008 20000010 20000011 20000012 20000013 20000014 20000015 20000016 20000017 20000018 20000019 2000000a 2000000b 2000000c 2000000d 2000000e 2000000f 2000001a 2000001b 2000001c 2000001d 2000001e 80000004 80000005 80000006 80000007 And now we also have the 4 ((2G - 1G) / 256M) LMBs and the 8 (slots) PMEMs. Fixes: `3998ccd092` ("spapr: populate PHB DRC entries for root DT node") Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <160794479566.35245.17809158217760761558.stgit@bahia.lan> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-01-06 11:09:59 +11:00
Greg Kurz	73231f7c5f	spapr: DRC lookup cannot fail All memory DRC objects are created during machine init. It is thus safe to assume spapr_drc_by_id() cannot return NULL when hot-plug/unplugging memory. Make this clear with an assertion, like the code already does a few lines above when looping over memory DRCs. This fixes Coverity reports 1437757 and 1437758. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <160805381160.228955.5388294067094240175.stgit@bahia.lan> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-01-06 11:09:59 +11:00
Igor Mammedov	55810e90cc	ppc/spapr: cleanup -machine pseries,nvdimm=X handling Since NVDIMM support was introduced on pseries machine, it ignored machine's nvdimm=on\|off option and effectively was always enabled on machines that support NVDIMM. Later on commit (`28f5a71621` ppc/spapr_nvdimm: do not enable support with 'nvdimm=off') makes QEMU error out in case user explicitly set 'nvdimm=off' on CLI by peeking at machine_opts. However that's a workaround and leaves 'nvdimms_state->is_enabled' in inconsistent state (false) when it should be set true by default. Instead of using on machine_opts, set default to true for pseries machine in initfn time. If user sets manually 'nvdimm=off' it will overwrite default value to false and QEMU will error as expected without need to peek into machine_opts. That way pseries will have, nvdimm enabled by default and will honor user provided 'nvdimm=on\|off'. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20201208164606.4109134-1-imammedo@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-12-15 12:51:53 -05:00
Daniel Henrique Barboza	07b10bc42c	spapr.c: set a 'kvm-type' default value instead of relying on NULL spapr_kvm_type() is considering 'vm_type=NULL' as a valid input, where the function returns 0. This is relying on the current QEMU machine options handling logic, where the absence of the 'kvm-type' option will be reflected as 'vm_type=NULL' in this function. This is not robust, and will break if QEMU options code decides to propagate something else in the case mentioned above (e.g. an empty string instead of NULL). Let's avoid this entirely by setting a non-NULL default value in case of no user input for 'kvm-type'. spapr_kvm_type() was changed to handle 3 fixed values of kvm-type: "auto", "hv", and "pr", with "auto" being the default if no kvm-type was set by the user. This allows us to always be predictable regardless of any enhancements/changes made in QEMU options mechanics. While we're at it, let's also document in 'kvm-type' description the already existing default mode, now named 'auto'. The information provided about it is based on how the pseries kernel handles the KVM_CREATE_VM ioctl(), where the default value '0' makes the kernel choose an available KVM module to use, giving precedence to kvm_hv. This logic is described in the kernel source file arch/powerpc/kvm/powerpc.c, function kvm_arch_init_vm(). Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20201210145517.1532269-2-danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org>	2020-12-14 15:54:12 +11:00
Greg Kurz	bc370a659a	spapr: spapr_drc_attach() cannot fail All users are passing &error_abort already. Document the fact that spapr_drc_attach() should only be passed a free DRC, which is supposedly the case if appropriate checking is done earlier. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20201201113728.885700-5-groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-12-14 15:54:12 +11:00
Greg Kurz	f9b43958b9	spapr: Simplify error path of spapr_core_plug() spapr_core_pre_plug() already guarantees that the slot for the given core ID is available. It is thus safe to assume that spapr_find_cpu_slot() returns a slot during plug. Turn the error path into an assertion. It is also safe to assume that no device is attached to the corresponding DRC and that spapr_drc_attach() shouldn't fail. Pass &error_abort to spapr_drc_attach() and simplify error handling. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20201201113728.885700-4-groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-12-14 15:54:12 +11:00
Greg Kurz	376412135d	spapr: Abort if ppc_set_compat() fails for hot-plugged CPUs When a CPU is hot-plugged, we set its compat mode to match the boot CPU, which was either set by machine reset or by CAS. This is currently handled in the plug handler after the core got realized. Potential errors of ppc_set_compat() are propagated to the hot-plug logic. Handling errors this late in the hot-plug sequence is generally frown upon. Ideally, we should do sanity checks in a pre-plug handler and pass &error_abort to ppc_set_compat() in the plug handler. We can filter out some error cases of ppc_set_compat() by calling ppc_check_compat() at pre-plug. But ppc_set_compat() also sets the compat register in KVM, and KVM doesn't provide any API that would allow to check valid compat mode settings beforehand. However, at this point we know that the compat mode was already successfully set for the boot CPU. Since this all boils down to setting a register with the very same value that was valid for the boot CPU, it should definitely not fail for hot-plugged CPUS. Pass &error_abort to ppc_set_compat(). Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20201201113728.885700-3-groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-12-14 15:54:12 +11:00
Greg Kurz	1b4ab51493	spapr: Fix pre-2.10 dummy ICP hack This hack registers dummy VMState entries of ICPs in order to support migration of old pseries machine types that used to create all smp.max_cpus possible ICPs at machine init. Part of the work is to unregister the dummy entries when plugging an actual vCPU core, and to register them back when unplugging the core. The code that unregisters the dummy ICPs in spapr_core_plug() is misplaced: if ppc_set_compat() fails afterwards, the hotplug operation will be cancelled and the dummy ICPs won't be registered back since the unplug handler isn't called. Unregister the dummy ICPs at the end of spapr_core_plug(). Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20201201113728.885700-2-groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-12-14 15:54:12 +11:00
Greg Kurz	ac96807b02	spapr: Do TPM proxy hotplug sanity checks at pre-plug There can be only one TPM proxy at a time. This is currently checked at plug time. But this can be detected at pre-plug in order to error out earlier. This allows to get rid of error handling in the plug handler. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20201120234208.683521-9-groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-12-14 15:50:55 +11:00
Greg Kurz	9a07069958	spapr: Do PHB hoplug sanity check at pre-plug We currently detect that a PHB index is already in use at plug time. But this can be decteted at pre-plug in order to error out earlier. This allows to pass &error_abort to spapr_drc_attach() and to end up with a plug handler that doesn't need to report errors anymore. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20201120234208.683521-8-groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-12-14 15:50:55 +11:00
Greg Kurz	f5598c92b8	spapr: Make PHB placement functions and spapr_pre_plug_phb() return status Read documentation in "qapi/error.h" and changelog of commit `e3fe3988d7` ("error: Document Error API usage rules") for rationale. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20201120234208.683521-7-groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-12-14 15:50:55 +11:00
Greg Kurz	ea042c53f4	spapr: Do NVDIMM/PC-DIMM device hotplug sanity checks at pre-plug only Pre-plug of a memory device, be it an NVDIMM or a PC-DIMM, ensures that the memory slot is available and that addresses don't overlap with existing memory regions. The corresponding DRCs in the LMB and PMEM namespaces are thus necessarily attachable at plug time. Pass &error_abort to spapr_drc_attach() in spapr_add_lmbs() and spapr_add_nvdimm(). This allows to greatly simplify error handling on the plug path. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20201120234208.683521-3-groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-12-14 15:50:55 +11:00
Paolo Bonzini	46ee119fb6	vl: remove serial_max_hds serial_hd(i) is NULL if and only if i >= serial_max_hds(). Test serial_hd(i) instead of bounding the loop at serial_max_hds(), thus removing one more function that vl.c is expected to export. Reviewed-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-12-10 12:15:19 -05:00
Paolo Bonzini	2c65db5e58	vl: extract softmmu/datadir.c Reviewed-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-12-10 12:15:18 -05:00
Paolo Bonzini	cd7b94989a	ppc: remove bios_name Cc: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20201026143028.3034018-11-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-12-10 12:15:06 -05:00
Cornelia Huck	576a00bdeb	hw: add compat machines for 6.0 Add 6.0 machine types for arm/i440fx/q35/s390x/spapr. Signed-off-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <20201109173928.1001764-1-cohuck@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2020-12-08 13:48:58 -05:00
Greg Kurz	184b813e7b	spapr: Drop dead code in spapr_reallocate_hpt() Sometimes QEMU needs to allocate the HPT in userspace, namely with TCG or PR KVM. This is performed with qemu_memalign() because of alignment requirements. Like glib's allocators, its behaviour is to abort on OOM instead of returning NULL. This could be changed to qemu_try_memalign(), but in the specific case of spapr_reallocate_hpt(), the outcome would be to terminate QEMU anyway since no HPT means no MMU for the guest. Drop the dead code instead. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <160398562892.32380.15006707861753544263.stgit@bahia.lan> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-11-05 12:18:48 +11:00
Greg Kurz	a4e3a7c02b	spapr: Improve spapr_reallocate_hpt() error reporting spapr_reallocate_hpt() has three users, two of which pass &error_fatal and the third one, htab_load(), passes &local_err, uses it to detect failures and simply propagates -EINVAL up to vmstate_load(), which will cause QEMU to exit. It is thus confusing that spapr_reallocate_hpt() doesn't return right away when an error is detected in some cases. Also, the comment suggesting that the caller is welcome to try to carry on seems like a remnant in this respect. This can be improved: - change spapr_reallocate_hpt() to always report a negative errno on failure, either as reported by KVM or -ENOSPC if the HPT is smaller than what was asked, - use that to detect failures in htab_load() which is preferred over checking &local_err, - propagate this negative errno to vmstate_load() because it is more accurate than propagating -EINVAL for all possible errors. [dwg: Fix compile error due to omitted prelim patch] Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <160371605460.305923.5890143959901241157.stgit@bahia.lan> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-10-28 01:08:53 +11:00
Greg Kurz	0a06e4d626	target/ppc: Fix kvmppc_load_htab_chunk() error reporting If kvmppc_load_htab_chunk() fails, its return value is propagated up to vmstate_load(). It should thus be a negative errno, not -1 (which maps to EPERM and would lure the user into thinking that the problem is necessarily related to a lack of privilege). Return the error reported by KVM or ENOSPC in case of short write. While here, propagate the error message through an @errp argument and have the caller to print it with error_report_err() instead of relying on fprintf(). Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <160371604713.305923.5264900354159029580.stgit@bahia.lan> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-10-28 01:08:53 +11:00
Greg Kurz	c3e051ed6d	spapr: Use error_append_hint() in spapr_reallocate_hpt() Hints should be added with the dedicated error_append_hint() API because we don't want to print them when using QMP. This requires to insert ERRP_GUARD as explained in "qapi/error.h". Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <160371604030.305923.17464161378167312662.stgit@bahia.lan> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-10-28 01:08:53 +11:00
Greg Kurz	6e837f98ba	spapr: Simplify error handling in spapr_memory_plug() As recommended in "qapi/error.h", add a bool return value to spapr_add_lmbs() and spapr_add_nvdimm(), and use them instead of local_err in spapr_memory_plug(). This allows to get rid of the error propagation overhead. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <160309734178.2739814.3488437759887793902.stgit@bahia.lan> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-10-28 01:08:53 +11:00
Greg Kurz	271ced1d62	spapr: Pass &error_abort when getting some PC DIMM properties Both PC_DIMM_SLOT_PROP and PC_DIMM_ADDR_PROP are defined in the default property list of the PC DIMM device class: DEFINE_PROP_UINT64(PC_DIMM_ADDR_PROP, PCDIMMDevice, addr, 0), DEFINE_PROP_INT32(PC_DIMM_SLOT_PROP, PCDIMMDevice, slot, PC_DIMM_UNASSIGNED_SLOT), They should thus be always gettable for both PC DIMMs and NVDIMMs. An error in getting them can only be the result of a programming error. It doesn't make much sense to propagate the error in this case. Abort instead. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <160309732180.2739814.7243774674998010907.stgit@bahia.lan> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-10-28 01:08:53 +11:00
Greg Kurz	581778dd47	spapr: Use appropriate getter for PC_DIMM_SLOT_PROP The PC_DIMM_SLOT_PROP property is defined as: DEFINE_PROP_INT32(PC_DIMM_SLOT_PROP, PCDIMMDevice, slot, PC_DIMM_UNASSIGNED_SLOT), Use object_property_get_int() instead of object_property_get_uint(). Since spapr_memory_plug() only gets called if pc_dimm_pre_plug() succeeded, we expect to have a valid >= 0 slot number, either because the user passed a valid slot number or because pc_dimm_get_free_slot() picked one up for us. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <160309730758.2739814.15821922745424652642.stgit@bahia.lan> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-10-28 01:08:53 +11:00
Greg Kurz	65226afd90	spapr: Use appropriate getter for PC_DIMM_ADDR_PROP The PC_DIMM_ADDR_PROP property is defined as: DEFINE_PROP_UINT64(PC_DIMM_ADDR_PROP, PCDIMMDevice, addr, 0), Use object_property_get_uint() instead of object_property_get_int(). Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <160309729609.2739814.4996614957953215591.stgit@bahia.lan> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-10-28 01:08:53 +11:00
Greg Kurz	84fd549619	pc-dimm: Drop @errp argument of pc_dimm_plug() pc_dimm_plug() doesn't use it. It only aborts on error. Drop @errp and adapt the callers accordingly. [dwg: Removed unused label to fix compile] Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <160309728447.2739814.12831204841251148202.stgit@bahia.lan> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-10-28 01:08:53 +11:00
Greg Kurz	ce316b5118	spapr: Move spapr_create_nvdimm_dr_connectors() to core machine code The spapr_create_nvdimm_dr_connectors() function doesn't need to access any internal details of the sPAPR NVDIMM implementation. Also, pretty much like for the LMBs, only spapr_machine_init() is responsible for the creation of DR connectors for NVDIMMs. Make this clear by making this function static in hw/ppc/spapr.c. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <160249772183.757627.7396780936543977766.stgit@bahia.lan> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-10-28 01:08:53 +11:00
Daniel Henrique Barboza	29bfe52a52	spapr: add spapr_machine_using_legacy_numa() helper The changes to come to NUMA support are all guest visible. In theory we could just create a new 5_1 class option flag to avoid the changes to cascade to 5.1 and under. The reality is that these changes are only relevant if the machine has more than one NUMA node. There is no need to change guest behavior that has been around for years needlesly. This new helper will be used by the next patches to determine whether we should retain the (soon to be) legacy NUMA behavior in the pSeries machine. The new behavior will only be exposed if: - machine is pseries-5.2 and newer; - more than one NUMA node is declared in NUMA state. Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20201007172849.302240-2-danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-10-09 10:52:09 +11:00
Greg Kurz	35dce34fbc	spapr: Add a return value to spapr_check_pagesize() As recommended in "qapi/error.h", return true on success and false on failure. This allows to reduce error propagation overhead in the callers. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20200914123505.612812-14-groug@kaod.org> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-10-09 10:15:06 +11:00
Greg Kurz	451c690589	spapr: Add a return value to spapr_nvdimm_validate() As recommended in "qapi/error.h", return true on success and false on failure. This allows to reduce error propagation overhead in the callers. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20200914123505.612812-13-groug@kaod.org> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-10-09 10:15:06 +11:00
Greg Kurz	cfdc527473	spapr: Add a return value to spapr_set_vcpu_id() As recommended in "qapi/error.h", return true on success and false on failure. This allows to reduce error propagation overhead in the callers. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20200914123505.612812-11-groug@kaod.org> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-10-09 10:15:06 +11:00
Greg Kurz	17548fe64a	spapr: Add a return value to spapr_drc_attach() As recommended in "qapi/error.h", return true on success and false on failure. This allows to reduce error propagation overhead in the callers. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20200914123505.612812-9-groug@kaod.org> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-10-09 10:15:06 +11:00
Greg Kurz	a3114923d4	spapr: Simplify error handling in callers of ppc_set_compat() Now that ppc_set_compat() indicates success/failure with a return value, use it and reduce error propagation overhead. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20200914123505.612812-5-groug@kaod.org> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-10-09 10:15:06 +11:00
Fabiano Rosas	f0638a0b6b	spapr: Handle HPT allocation failure in nested guest The nested KVM code does not yet support HPT guests. Calling the KVM_CAP_PPC_ALLOC_HTAB ioctl currently leads to KVM setting the guest as HPT and erroneously executing code in L1 that should only run in hypervisor mode, leading to an exception in the L1 vcpu thread when it enters the nested guest. This can be reproduced with -machine max-cpu-compat=power8 in the L2 guest command line. The KVM code has since been modified to fail the ioctl when running in a nested environment so QEMU needs to be able to handle that. This patch provides an error message informing the user about the lack of support for HPT in nested guests. Reported-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com> Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com> Message-Id: <20200911043123.204162-1-farosas@linux.ibm.com> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-10-09 10:15:06 +11:00
Igor Mammedov	b21aa7e01e	numa: drop support for '-numa node' (without memory specified) it was deprecated since 4.1 commit `4bb4a2732e` (numa: deprecate implict memory distribution between nodes) Users of existing VMs, wishing to preserve the same RAM distribution, should configure it explicitly using ``-numa node,memdev`` options. Current RAM distribution can be retrieved using HMP command `info numa` and if separate memory devices (pc\|nv-dimm) are present use `info memory-device` and subtract device memory from output of `info numa`. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20200911084410.788171-2-imammedo@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-09-30 19:09:20 +02:00
BALATON Zoltan	617160c9e1	load_elf: Remove unused address variables from callers Several callers of load_elf() pass pointers for lowaddr and highaddr parameters which are then not used for anything. This may stem from a misunderstanding that load_elf need a value here but in fact it can take NULL to ignore these values. Remove such unused variables and pass NULL instead from callers that don't need these. Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Acked-by: Max Filippov <jcmvbkbc@gmail.com> Message-Id: <20200705174020.BDD0174633F@zero.eik.bme.hu> Signed-off-by: Alistair Francis <alistair.francis@wdc.com>	2020-09-25 16:52:08 -07:00
Daniel Henrique Barboza	0ee520126a	spapr, spapr_numa: move lookup-arrays handling to spapr_numa.c In a similar fashion as the previous patch, let's move the handling of ibm,associativity-lookup-arrays from spapr.c to spapr_numa.c. A spapr_numa_write_assoc_lookup_arrays() helper was created, and spapr_dt_dynamic_reconfiguration_memory() can now use it to advertise the lookup-arrays. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20200903220639.563090-4-danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-09-08 10:08:43 +10:00
Daniel Henrique Barboza	8f86a40824	spapr, spapr_numa: handle vcpu ibm,associativity Vcpus have an additional paramenter to be appended, vcpu_id. This also changes the size of the of property itself, which is being represented in index 0 of numa_assoc_array[cpu->node_id], and defaults to MAX_DISTANCE_REF_POINTS for all cases but vcpus. All this logic makes more sense in spapr_numa.c, where we handle everything NUMA and associativity. A new helper spapr_numa_fixup_cpu_dt() was added, and spapr.c uses it the same way as it was using the former spapr_fixup_cpu_numa_dt(). Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20200903220639.563090-3-danielhb413@gmail.com> [dwg: Correct uint to int type, which can break windows builds] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-09-08 10:08:43 +10:00
Daniel Henrique Barboza	f1aa45fffe	spapr: introduce SpaprMachineState::numa_assoc_array The next step to centralize all NUMA/associativity handling in the spapr machine is to create a 'one stop place' for all things ibm,associativity. This patch introduces numa_assoc_array, a 2 dimensional array that will store all ibm,associativity arrays of all NUMA nodes. This array is initialized in a new spapr_numa_associativity_init() function, called in spapr_machine_init(). It is being initialized with the same values used in other ibm,associativity properties around spapr files (i.e. all zeros, last value is node_id). The idea is to remove all hardcoded definitions and FDT writes of ibm,associativity arrays, doing instead a call to the new helper spapr_numa_write_associativity_dt() helper, that will be able to write the DT with the correct values. We'll start small, handling the trivial cases first. The remaining instances of ibm,associativity will be handled next. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20200903220639.563090-2-danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-09-08 10:08:43 +10:00
Daniel Henrique Barboza	1eee995026	ppc: introducing spapr_numa.c NUMA code helper We're going to make changes in how spapr handles all ibm,associativity* related properties to enhance our current NUMA support. At this moment we have associativity code scattered all around spapr_* files, with hardcoded values and array sizes. This makes it harder to change any NUMA specific parameters in the future. Having everything in the same place allows not only for easier tuning, but also easier understanding since all NUMA related code is on the same file. This patch introduces a new file to gather all NUMA/associativity handling code in spapr, spapr_numa.c. To get things started, let's remove associativity-reference-points and max-associativity-domains code from spapr_dt_rtas() to a new helper called spapr_numa_write_rtas_dt(). This will decouple spapr_dt_rtas() from the NUMA changes that are going to happen in those two properties. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20200901125645.118026-2-danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-09-08 10:08:43 +10:00
Daniel Henrique Barboza	beb6073fe7	spapr, spapr_nvdimm: fold NVDIMM validation in the same place NVDIMM has different contraints and conditions than the regular DIMM and we'll need to add at least one more. Instead of relying on 'if (nvdimm)' conditionals in the body of spapr_memory_pre_plug(), use the existing spapr_nvdimm_validate_opts() and put all NVDIMM handling code there. Rename it to spapr_nvdimm_validate() to reflect that the function is now checking more than the nvdimm device options. This makes spapr_memory_pre_plug() a bit easier to follow, and we can tune in NVDIMM parameters and validation in the same place. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20200825215749.213536-3-danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-09-08 10:08:42 +10:00

1 2 3 4 5 ...

926 Commits