mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Philippe Mathieu-Daudé	a580fdcd60	hw/ppc/pnv: Avoid dynamic stack allocation Use autofree heap allocation instead of variable-length array on the stack. Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-id: 20220819153931.3147384-7-peter.maydell@linaro.org	2022-09-22 16:38:28 +01:00
Xuzhou Cheng	cb5b5ab9a5	hw/ppc: spapr: Use qemu_vfree() to free spapr->htab spapr->htab is allocated by qemu_memalign(), hence we should use qemu_vfree() to free it. Fixes: `c5f54f3e31` ("pseries: Move hash page table allocation to reset time") Fixes: `b4db54132f` ("target/ppc: Implement H_REGISTER_PROCESS_TABLE H_CALL"") Signed-off-by: Xuzhou Cheng <xuzhou.cheng@windriver.com> Signed-off-by: Bin Meng <bin.meng@windriver.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20220920103159.1865256-28-bmeng.cn@gmail.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>	2022-09-20 12:31:53 -03:00
Cornelia Huck	f514e1477f	hw: Add compat machines for 7.2 Add 7.2 machine types for arm/i440fx/m68k/q35/s390x/spapr. Signed-off-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20220727121755.395894-1-cohuck@redhat.com> [thuth: fixed conflict with pcmc->legacy_no_rng_seed] Signed-off-by: Thomas Huth <thuth@redhat.com>	2022-08-25 21:59:04 +02:00
Leandro Lupori	3c2e80ad2f	ppc: Check partition and process table alignment Check if partition and process tables are properly aligned, in their size, according to PowerISA 3.1B, Book III 6.7.6 programming note. Hardware and KVM also raise an exception in these cases. Signed-off-by: Leandro Lupori <leandro.lupori@eldorado.org.br> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Message-Id: <20220628133959.15131-2-leandro.lupori@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>	2022-07-18 13:59:43 -03:00
Jason A. Donenfeld	c4b075318e	hw/ppc: pass random seed to fdt If the FDT contains /chosen/rng-seed, then the Linux RNG will use it to initialize early. Set this using the usual guest random number generation function. This is confirmed to successfully initialize the RNG on Linux 5.19-rc6. The rng-seed node is part of the DT spec. Set this on the paravirt platforms, spapr and e500, just as is done on other architectures with paravirt hardware. Cc: Daniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20220712135114.289855-1-Jason@zx2c4.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>	2022-07-18 13:59:43 -03:00
Alexey Kardashevskiy	81b205cecf	ppc/spapr: Implement H_WATCHDOG The new PAPR 2.12 defines a watchdog facility managed via the new H_WATCHDOG hypercall. This adds H_WATCHDOG support which a proposed driver for pseries uses: https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=303120 This was tested by running QEMU with a debug kernel and command line: -append \ "pseries-wdt.timeout=60 pseries-wdt.nowayout=1 pseries-wdt.action=2" and running "echo V > /dev/watchdog0" inside the VM. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20220622051008.1067464-1-aik@ozlabs.ru> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>	2022-07-06 10:22:38 -03:00
Alexey Kardashevskiy	5bb55f3e3b	spapr: Use address from elf parser for kernel address tl;dr: This allows Big Endian zImage booting via -kernel + x-vof=on. QEMU loads the kernel at 0x400000 by default which works most of the time as Linux kernels are relocatable, 64bit and compiled with "-pie" (position independent code). This works for a little endian zImage too. However a big endian zImage is compiled without -pie, is 32bit, linked to 0x4000000 so current QEMU ends up loading it at 0x4400000 but keeps spapr->kernel_addr unchanged so booting fails. This uses the kernel address returned from load_elf(). If the default kernel_addr is used, there is no change in behavior (as translate_kernel_address() takes care of this), which is: LE/BE vmlinux and LE zImage boot, BE zImage does not. If the VM created with "-machine kernel-addr=0,x-vof=on", then QEMU prints a warning and BE zImage boots. Note #1: SLOF (x-vof=off) still cannot boot a big endian zImage as SLOF enables MSR_SF for everything loaded by QEMU and this leads to early crash of 32bit zImage. Note #2: BE/LE vmlinux images set MSR_SF in early boot so these just work; a LE zImage restores MSR_SF after every CI call and we are lucky enough not to crash before the first CI call. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Tested-by: Joel Stanley <joel@jms.id.au> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Message-Id: <20220504065536.3534488-1-aik@ozlabs.ru> [danielhb: use PRIx64 instead of lx in warn_report] Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>	2022-05-26 17:11:32 -03:00
Paolo Bonzini	f73eb9484b	pseries: allow setting stdout-path even on machines with a VGA -machine graphics=off is the usual way to tell the firmware or the OS that the user wants a serial console. The pseries machine however does not support this, and never adds the stdout-path node to the device tree if a VGA device is provided. This is in addition to the other magic behavior of VGA devices, which is to add a keyboard and mouse to the default USB bus. Split spapr->has_graphics in two variables so that the two behaviors can be separated: the USB devices remains the same, but the stdout-path is added even with "-device VGA -machine graphics=off". Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20220507054826.124936-1-pbonzini@redhat.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>	2022-05-26 17:11:32 -03:00
Paolo Bonzini	97ec4d21e0	machine: use QAPI struct for boot configuration As part of converting -boot to a property with a QAPI type, define the struct and use it throughout QEMU to access boot configuration. machine_boot_parse takes care of doing the QemuOpts->QAPI conversion by hand, for now. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20220414165300.555321-2-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-12 12:29:43 +02:00
Gautam Agrawal	f9bcb2d684	Warn user if the vga flag is passed but no vga device is created A global boolean variable "vga_interface_created"(declared in softmmu/globals.c) has been used to track the creation of vga interface. If the vga flag is passed in the command line "default_vga"(declared in softmmu/vl.c) variable is set to 0. To warn user, the condition checks if vga_interface_created is false and default_vga is equal to 0. If "-vga none" is passed, this patch will not warn the user regarding the creation of VGA device. The warning "A -vga option was passed but this machine type does not use that option; no VGA device has been created" is logged if vga flag is passed but no vga device is created. This patch has been tested for x86_64, i386, sparc, sparc64 and arm boards. Signed-off-by: Gautam Agrawal <gautamnagrawal@gmail.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/581 Message-Id: <20220501122505.29202-1-gautamnagrawal@gmail.com> [thuth: Fix wrong warning with "-device" in some cases as reported by Paolo] Signed-off-by: Thomas Huth <thuth@redhat.com>	2022-05-09 08:21:14 +02:00
Víctor Colombo	d41ccf6eea	target/ppc: Remove msr_pr macro msr_pr macro hides the usage of env->msr, which is a bad behavior Substitute it with FIELD_EX64 calls that explicitly use env->msr as a parameter. Suggested-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220504210541.115256-4-victor.colombo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>	2022-05-05 15:36:17 -03:00
Cornelia Huck	0ca703662e	hw: Add compat machines for 7.1 Add 7.1 machine types for arm/i440fx/m68k/q35/s390x/spapr. Signed-off-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <20220316145521.1224083-1-cohuck@redhat.com> Reviewed-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Acked-by: Laurent Vivier <laurent@vivier.eu> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Yanan Wang <wangyanan55@huawei.com> Signed-off-by: Thomas Huth <thuth@redhat.com>	2022-04-20 09:36:24 +02:00
Marc-André Lureau	0f9668e0c1	Remove qemu-common.h include from most units Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20220323155743.1585078-33-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-04-06 14:31:55 +02:00
Markus Armbruster	b21e238037	Use g_new() & friends where that makes obvious sense g_new(T, n) is neater than g_malloc(sizeof(T) * n). It's also safer, for two reasons. One, it catches multiplication overflowing size_t. Two, it returns T * rather than void *, which lets the compiler catch more type errors. This commit only touches allocations with size arguments of the form sizeof(T). Patch created mechanically with: $ spatch --in-place --sp-file scripts/coccinelle/use-g_new-etc.cocci \ --macro-file scripts/cocci-macro-file.h FILES... Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20220315144156.1595462-4-armbru@redhat.com> Reviewed-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>	2022-03-21 15:44:44 +01:00
Peter Maydell	5df022cf2e	osdep: Move memalign-related functions to their own header Move the various memalign-related functions out of osdep.h and into their own header, which we include only where they are used. While we're doing this, add some brief documentation comments. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 20220226180723.1706285-10-peter.maydell@linaro.org	2022-03-07 13:16:49 +00:00
Daniel Henrique Barboza	5f2b96b38e	hw/ppc/spapr.c: fail early if no firmware found in machine_init() The firmware check consists on a file search (qemu_find_file) and load it via load_imag_targphys(). This validation is not dependent on any other machine state but it currently being done at the end of spapr_machine_init(). This means that we can do a lot of stuff and end up failing at the end for something that we can verify right out of the gate. Move this validation to the start of spapr_machine_init() to fail earlier. While we're at it, use g_autofree in the 'filename' pointer. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20220228175004.8862-3-danielhb413@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>	2022-03-02 06:51:39 +01:00
Daniel Henrique Barboza	aebb9b9cb2	hw/ppc/spapr.c: use g_autofree in spapr_dt_chosen() Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20220228175004.8862-2-danielhb413@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>	2022-03-02 06:51:39 +01:00
Nicholas Piggin	120f738a46	spapr: implement nested-hv capability for the virtual hypervisor This implements the Nested KVM HV hcall API for spapr under TCG. The L2 is switched in when the H_ENTER_NESTED hcall is made, and the L1 is switched back in returned from the hcall when a HV exception is sent to the vhyp. Register state is copied in and out according to the nested KVM HV hcall API specification. The hdecr timer is started when the L2 is switched in, and it provides the HDEC / 0x980 return to L1. The MMU re-uses the bare metal radix 2-level page table walker by using the get_pate method to point the MMU to the nested partition table entry. MMU faults due to partition scope errors raise HV exceptions and accordingly are routed back to the L1. The MMU does not tag translations for the L1 (direct) vs L2 (nested) guests, so the TLB is flushed on any L1<->L2 transition (hcall entry and exit). Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> [ clg: checkpatch fixes ] Message-Id: <20220216102545.1808018-10-npiggin@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>	2022-02-18 08:34:14 +01:00
Nicholas Piggin	7cebc5db2e	target/ppc: Introduce a vhyp framework for nested HV support Introduce virtual hypervisor methods that can support a "Nested KVM HV" implementation using the bare metal 2-level radix MMU, and using HV exceptions to return from H_ENTER_NESTED (rather than cause interrupts). HV exceptions can now be raised in the TCG spapr machine when running a nested KVM HV guest. The main ones are the lev==1 syscall, the hdecr, hdsi and hisi, hv fu, and hv emu, and h_virt external interrupts. HV exceptions are intercepted in the exception handler code and instead of causing interrupts in the guest and switching the machine to HV mode, they go to the vhyp where it may exit the H_ENTER_NESTED hcall with the interrupt vector numer as return value as required by the hcall API. Address translation is provided by the 2-level page table walker that is implemented for the bare metal radix MMU. The partition scope page table is pointed to the L1's partition scope by the get_pate vhc method. Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20220216102545.1808018-9-npiggin@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>	2022-02-18 08:34:14 +01:00
Nicholas Piggin	f32d4ab41c	target/ppc: make vhyp get_pate method take lpid and return success In prepartion for implementing a full partition table option for vhyp, update the get_pate method to take an lpid and return a success/fail indicator. The spapr implementation currently just asserts lpid is always 0 and always return success. Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [ clg: checkpatch fixes ] Message-Id: <20220216102545.1808018-6-npiggin@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>	2022-02-18 08:34:14 +01:00
Shivaprasad G Bhat	b5513584a0	spapr: nvdimm: Implement H_SCM_FLUSH hcall The patch adds support for the SCM flush hcall for the nvdimm devices. To be available for exploitation by guest through the next patch. The hcall is applicable only for new SPAPR specific device class which is also introduced in this patch. The hcall expects the semantics such that the flush to return with H_LONG_BUSY_ORDER_10_MSEC when the operation is expected to take longer time along with a continue_token. The hcall to be called again by providing the continue_token to get the status. So, all fresh requests are put into a 'pending' list and flush worker is submitted to the thread pool. The thread pool completion callbacks move the requests to 'completed' list, which are cleaned up after collecting the return status for the guest in subsequent hcall from the guest. The semantics makes it necessary to preserve the continue_tokens and their return status across migrations. So, the completed flush states are forwarded to the destination and the pending ones are restarted at the destination in post_load. The necessary nvdimm flush specific vmstate structures are also introduced in this patch which are to be saved in the new SPAPR specific nvdimm device to be introduced in the following patch. Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <164396254862.109112.16675611182159105748.stgit@ltczzess4.aus.stglabs.ibm.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>	2022-02-18 08:34:14 +01:00
Daniel Henrique Barboza	1977434bbf	spapr.c: check bus != NULL in spapr_get_fw_dev_path() spapr_get_fw_dev_path() is an impl of FWPathProviderClass::get_dev_path(). This interface is used by hw/core/qdev-fw.c via fw_path_provider_try_get_dev_path() in two functions: - static char qdev_get_fw_dev_path_from_handler(), which is used only in qdev_get_fw_dev_path_helper() and it's guarded by "if (dev && dev->parent_bus)"; - char qdev_get_own_fw_dev_path_from_handler(), which is used in softmmu/bootdevice.c in get_boot_device_path() like this: if (dev) { d = qdev_get_own_fw_dev_path_from_handler(dev->parent_bus, dev); This means that, when called via softmmu/bootdevice.c, there's no check of 'dev->parent_bus' being not NULL. The result is that the "BusState bus" arg of spapr_get_fw_dev_path() can potentially be NULL and if, at the same time, "SCSIDevice d" is not NULL, we'll hit this line: void *spapr = CAST(void, bus->parent, "spapr-vscsi"); And we'll SIGINT because 'bus' is NULL and we're accessing bus->parent. Adding a simple 'bus != NULL' check to guard the instances where we access 'bus->parent' can avoid this altogether. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20220121213852.30243-1-danielhb413@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>	2022-01-28 13:15:02 +01:00
Cédric Le Goater	2460e1d75b	spapr: Fix support of POWER5+ processors POWER5+ (ISA v2.03) processors are supported by the pseries machine but they do not have Altivec instructions. Do not advertise support for it in the DT. To be noted that this test is in contradiction with the assert in cap_vsx_apply(). Signed-off-by: Cédric Le Goater <clg@kaod.org> Tested-by: Fabiano Rosas <farosas@linux.ibm.com> Message-Id: <20220105095142.3990430-3-clg@kaod.org> Signed-off-by: Cédric Le Goater <clg@kaod.org>	2022-01-12 11:28:26 +01:00
Cornelia Huck	01854af2cf	hw: Add compat machines for 7.0 Add 7.0 machine types for arm/i440fx/q35/s390x/spapr. Signed-off-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20211217143948.289995-1-cohuck@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>	2022-01-05 09:06:36 +01:00
Leandro Lupori	7bf00dfb51	target/ppc: fix Hash64 MMU update of PTE bit R When updating the R bit of a PTE, the Hash64 MMU was using a wrong byte offset, causing the first byte of the adjacent PTE to be corrupted. This caused a panic when booting FreeBSD, using the Hash MMU. Fixes: `a2dd4e83e7` ("ppc/hash64: Rework R and C bit updates") Signed-off-by: Leandro Lupori <leandro.lupori@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>	2021-11-29 21:00:08 +01:00
Yanan Wang	2b52619994	machine: Move smp_prefer_sockets to struct SMPCompatProps Now we have a common structure SMPCompatProps used to store information about SMP compatibility stuff, so we can also move smp_prefer_sockets there for cleaner code. No functional change intended. Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20210929025816.21076-15-wangyanan55@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-10-01 15:29:15 +02:00
Yanan Wang	4a0af2930a	machine: Prefer cores over sockets in smp parsing since 6.2 In the real SMP hardware topology world, it's much more likely that we have high cores-per-socket counts and few sockets totally. While the current preference of sockets over cores in smp parsing results in a virtual cpu topology with low cores-per-sockets counts and a large number of sockets, which is just contrary to the real world. Given that it is better to make the virtual cpu topology be more reflective of the real world and also for the sake of compatibility, we start to prefer cores over sockets over threads in smp parsing since machine type 6.2 for different arches. In this patch, a boolean "smp_prefer_sockets" is added, and we only enable the old preference on older machines and enable the new one since type 6.2 for all arches by using the machine compat mechanism. Suggested-by: Daniel P. Berrange <berrange@redhat.com> Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Acked-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20210929025816.21076-10-wangyanan55@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-10-01 15:28:16 +02:00
Daniel Henrique Barboza	e0eb84d4f5	spapr_numa.c: FORM2 NUMA affinity support The main feature of FORM2 affinity support is the separation of NUMA distances from ibm,associativity information. This allows for a more flexible and straightforward NUMA distance assignment without relying on complex associations between several levels of NUMA via ibm,associativity matches. Another feature is its extensibility. This base support contains the facilities for NUMA distance assignment, but in the future more facilities will be added for latency, performance, bandwidth and so on. This patch implements the base FORM2 affinity support as follows: - the use of FORM2 associativity is indicated by using bit 2 of byte 5 of ibm,architecture-vec-5. A FORM2 aware guest can choose to use FORM1 or FORM2 affinity. Setting both forms will default to FORM2. We're not advertising FORM2 for pseries-6.1 and older machine versions to prevent guest visible changes in those; - ibm,associativity-reference-points has a new semantic. Instead of being used to calculate distances via NUMA levels, it's now used to indicate the primary domain index in the ibm,associativity domain of each resource. In our case it's set to {0x4}, matching the position where we already place logical_domain_id; - two new RTAS DT artifacts are introduced: ibm,numa-lookup-index-table and ibm,numa-distance-table. The index table is used to list all the NUMA logical domains of the platform, in ascending order, and allows for spartial NUMA configurations (although QEMU ATM doesn't support that). ibm,numa-distance-table is an array that contains all the distances from the first NUMA node to all other nodes, then the second NUMA node distances to all other nodes and so on; - get_max_dist_ref_points(), get_numa_assoc_size() and get_associativity() now checks for OV5_FORM2_AFFINITY and returns FORM2 values if the guest selected FORM2 affinity during CAS. Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20210920174947.556324-7-danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-09-30 12:26:06 +10:00
Daniel Henrique Barboza	5dab5abe62	spapr: move FORM1 verifications to post CAS FORM2 NUMA affinity is prepared to deal with empty (memory/cpu less) NUMA nodes. This is used by the DAX KMEM driver to locate a PAPR SCM device that has a different latency than the original NUMA node from the regular memory. FORM2 is also able to deal with asymmetric NUMA distances gracefully, something that our FORM1 implementation doesn't do. Move these FORM1 verifications to a new function and wait until after CAS, when we're sure that we're sticking with FORM1, to enforce them. Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20210920174947.556324-6-danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-09-30 12:26:06 +10:00
Daniel Henrique Barboza	4b08cd567b	spapr: use DEVICE_UNPLUG_GUEST_ERROR to report unplug errors Linux Kernel 5.12 is now unisolating CPU DRCs in the device_removal error path, signalling that the hotunplug process wasn't successful. This allow us to send a DEVICE_UNPLUG_GUEST_ERROR in drc_unisolate_logical() to signal this error to the management layer. We also have another error path in spapr_memory_unplug_rollback() for configured LMB DRCs. Kernels older than 5.13 will not unisolate the LMBs in the hotunplug error path, but it will reconfigure them. Let's send the DEVICE_UNPLUG_GUEST_ERROR event in that code path as well to cover the case of older kernels. Acked-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20210907004755.424931-7-danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-09-30 12:26:06 +10:00
Daniel Henrique Barboza	44d886abab	spapr.c: handle dev->id in spapr_memory_unplug_rollback() As done in hw/acpi/memory_hotplug.c, pass an empty string if dev->id is NULL to qapi_event_send_mem_unplug_error() to avoid relying on a behavior that can be changed in the future. Suggested-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20210907004755.424931-3-danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-09-29 19:37:39 +10:00
Yanan Wang	52e64f5b1f	hw: Add compat machines for 6.2 Add 6.2 machine types for arm/i440fx/q35/s390x/spapr. Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2021-09-01 11:08:16 +01:00
Peter Maydell	d1987c8114	* More SVM fixes (Lara) * Module annotation database (Gerd) * Memory leak fixes (myself) * Build fixes (myself) * --with-devices-* support (Alex) -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmDoeBgUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroMtFAgAippmxRt3lt+tcdSrCOZlKmxW6veK nUidtzfH5uE8vQsh5Q98WCEq871C/C+St1gK+q2H/MLrJeAqZD39DV+SKTuZ6Tcp 3jL0iYC+oO0OjkHppDQTUDweF9KrsAW1WEeNz2th1OUDSjBXuXbZ+N497taouX18 p2UN0gKNsOO2/QFrKL5KO7vSC56eBGoZz6gKtw/7dDtJBtizf1xKBRHW43b+CnQJ mHLs7Tj6oMC+vnMHkUKLH/6za3WJF1XHs5fp2isRgqoOSP8m0r6CMg8JnFIvmQf/ tbLospKSWqcgD5C5PlFm2wSOjdU7zuPKM7wchhKrrEIvdDPhXaKrlpwi5Q== =GFX1 -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/bonzini-gitlab/tags/for-upstream' into staging * More SVM fixes (Lara) * Module annotation database (Gerd) * Memory leak fixes (myself) * Build fixes (myself) * --with-devices-* support (Alex) # gpg: Signature made Fri 09 Jul 2021 17:23:52 BST # gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83 # gpg: issuer "pbonzini@redhat.com" # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full] # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full] # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini-gitlab/tags/for-upstream: (48 commits) meson: Use input/output for entitlements target configure: allow the selection of alternate config in the build configs: rename default-configs to configs and reorganise hw/arm: move CONFIG_V7M out of default-devices hw/arm: add dependency on OR_IRQ for XLNX_VERSAL meson: Introduce target-specific Kconfig meson: switch function tests from compilation to linking vl: fix leak of qdict_crumple return value target/i386: fix exceptions for MOV to DR target/i386: Added DR6 and DR7 consistency checks target/i386: Added MSRPM and IOPM size check monitor/tcg: move tcg hmp commands to accel/tcg, register them dynamically usb: build usb-host as module monitor/usb: register 'info usbhost' dynamically usb: drop usb_host_dev_is_scsi_storage hook monitor: allow register hmp commands accel: build tcg modular accel: add tcg module annotations accel: build qtest modular accel: add qtest module annotations ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2021-07-11 22:20:51 +01:00
Gerd Hoffmann	b7b2a60b01	usb: drop usb_host_dev_is_scsi_storage hook Introduce an usb device flag instead, set it when usb-host looks at the device descriptors anyway. Also set it for emulated storage devices, for consistency. Add an inline helper function to check the flag. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Jose R. Ziviani <jziviani@suse.de> Message-Id: <20210624103836.2382472-32-kraxel@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-07-09 18:21:33 +02:00
Bharata B Rao	82123b756a	target/ppc: Support for H_RPT_INVALIDATE hcall If KVM_CAP_RPT_INVALIDATE KVM capability is enabled, then - indicate the availability of H_RPT_INVALIDATE hcall to the guest via ibm,hypertas-functions property. - Enable the hcall Both the above are done only if the new sPAPR machine capability cap-rpt-invalidate is set. Signed-off-by: Bharata B Rao <bharata@linux.ibm.com> Message-Id: <20210706112440.1449562-3-bharata@linux.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-07-09 11:01:06 +10:00
Alexey Kardashevskiy	21bde1ecb6	spapr: Fix implementation of Open Firmware client interface This addresses the comments from v22. The functional changes are (the VOF ones need retesting with Pegasos2): (VOF) setprop will start failing if the machine class callback did not handle it; (VOF) unit addresses are lowered in path_offset(); (SPAPR) /chosen/bootargs is initialized from kernel_cmdline if the client did not change it. Fixes: 5c991e5d4378 ("spapr: Implement Open Firmware client interface") Cc: BALATON Zoltan <balaton@eik.bme.hu> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20210708065625.548396-1-aik@ozlabs.ru> Tested-by: BALATON Zoltan <balaton@eik.bme.hu> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-07-09 10:55:11 +10:00
Alexey Kardashevskiy	fc8c745d50	spapr: Implement Open Firmware client interface The PAPR platform describes an OS environment that's presented by a combination of a hypervisor and firmware. The features it specifies require collaboration between the firmware and the hypervisor. Since the beginning, the runtime component of the firmware (RTAS) has been implemented as a 20 byte shim which simply forwards it to a hypercall implemented in qemu. The boot time firmware component is SLOF - but a build that's specific to qemu, and has always needed to be updated in sync with it. Even though we've managed to limit the amount of runtime communication we need between qemu and SLOF, there's some, and it has become increasingly awkward to handle as we've implemented new features. This implements a boot time OF client interface (CI) which is enabled by a new "x-vof" pseries machine option (stands for "Virtual Open Firmware). When enabled, QEMU implements the custom H_OF_CLIENT hcall which implements Open Firmware Client Interface (OF CI). This allows using a smaller stateless firmware which does not have to manage the device tree. The new "vof.bin" firmware image is included with source code under pc-bios/. It also includes RTAS blob. This implements a handful of CI methods just to get -kernel/-initrd working. In particular, this implements the device tree fetching and simple memory allocator - "claim" (an OF CI memory allocator) and updates "/memory@0/available" to report the client about available memory. This implements changing some device tree properties which we know how to deal with, the rest is ignored. To allow changes, this skips fdt_pack() when x-vof=on as not packing the blob leaves some room for appending. In absence of SLOF, this assigns phandles to device tree nodes to make device tree traversing work. When x-vof=on, this adds "/chosen" every time QEMU (re)builds a tree. This adds basic instances support which are managed by a hash map ihandle -> [phandle]. Before the guest started, the used memory is: 0..e60 - the initial firmware 8000..10000 - stack 400000.. - kernel 3ea0000.. - initramdisk This OF CI does not implement "interpret". Unlike SLOF, this does not format uninitialized nvram. Instead, this includes a disk image with pre-formatted nvram. With this basic support, this can only boot into kernel directly. However this is just enough for the petitboot kernel and initradmdisk to boot from any possible source. Note this requires reasonably recent guest kernel with: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=df5be5be8735 The immediate benefit is much faster booting time which especially crucial with fully emulated early CPU bring up environments. Also this may come handy when/if GRUB-in-the-userspace sees light of the day. This separates VOF and sPAPR in a hope that VOF bits may be reused by other POWERPC boards which do not support pSeries. This assumes potential support for booting from QEMU backends such as blockdev or netdev without devices/drivers used. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20210625055155.2252896-1-aik@ozlabs.ru> Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu> [dwg: Adjusted some includes which broke compile in some more obscure compilation setups] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-07-09 10:38:19 +10:00
Alexey Kardashevskiy	7381c5d11f	spapr: tune rtas-size QEMU reserves space for RTAS via /rtas/rtas-size which tells the client how much space the RTAS requires to work which includes the RTAS binary blob implementing RTAS runtime. Because pseries supports FWNMI which requires plenty of space, QEMU reserves more than 2KB which is enough for the RTAS blob as it is just 20 bytes (under QEMU). Since FWNMI reset delivery was added, RTAS_SIZE macro is not used anymore. This replaces RTAS_SIZE with RTAS_MIN_SIZE and uses it in the /rtas/rtas-size calculation to account for the RTAS blob. Fixes: `0e236d3477` ("ppc/spapr: Implement FWNMI System Reset delivery") Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20210622070336.1463250-1-aik@ozlabs.ru> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-07-09 10:38:18 +10:00
Greg Kurz	3bf0844f3b	spapr: Don't hijack current_machine->boot_order QEMU 6.0 moved all the -boot variables to the machine. Especially, the removal of the boot_order static changed the handling of '-boot once' from: if (boot_once) { qemu_boot_set(boot_once, &error_fatal); qemu_register_reset(restore_boot_order, g_strdup(boot_order)); } to if (current_machine->boot_once) { qemu_boot_set(current_machine->boot_once, &error_fatal); qemu_register_reset(restore_boot_order, g_strdup(current_machine->boot_order)); } This means that we now register as subsequent boot order a copy of current_machine->boot_once that was just set with the previous call to qemu_boot_set(), i.e. we never transition away from the once boot order. It is certainly fragile^Wwrong for the spapr code to hijack a field of the base machine type object like that. The boot order rework simply turned this software boundary violation into an actual bug. Have the spapr code to handle that with its own field in SpaprMachineState. Also kfree() the initial boot device string when "once" was used. Fixes: `4b7acd2ac8` ("vl: clean up -boot variables") Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1960119 Cc: pbonzini@redhat.com Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20210521160735.1901914-1-groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-06-03 13:22:06 +10:00
Lucas Mateus Castro (alqotel)	03282a3ab8	hw/ppc: moved has_spr to cpu.h Moved has_spr to cpu.h as ppc_has_spr and turned it into an inline function. Change spr verification in pnv.c and spapr.c to a version that can compile in a !TCG environment. Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br> Message-Id: <20210507164146.67086-1-lucas.araujo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-05-19 10:30:28 +10:00
Fabiano Rosas	ab5add4c7b	hw/ppc/spapr.c: Make sure the host supports the selected MMU mode Starting with Linux kernel v5.12 we dropped support[1] in KVM for hosts that can't have their threads running in different MMU modes (POWER9 < DD2.2). In these hosts, KVM will no longer report the KVM_CAP_PPC_MMU_HASH_V3 capability[2] when the host is running Radix. For guests that support both MMU modes, the negotiation during CAS will make sure it selects the correct one. For guests that only support Hash, such as P8 compat mode guests, the following error is currently thrown: $ ~/qemu-system-ppc64 -machine pseries,accel=kvm,max-cpu-compat=power8 ... error: kvm run failed Invalid argument NIP 0000000000000100 LR 0000000000000000 CTR 0000000000000000 XER 0000000000000000 CPU#0 MSR 8000000000001000 HID0 0000000000000000 HF 8000000000000000 iidx 3 didx 3 TB 00000000 00000000 DECR 0 GPR00 0000000000000000 0000000000000000 0000000000000000 000000007ff00000 GPR04 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR08 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR12 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR16 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR24 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR28 0000000000000000 0000000000000000 0000000000000000 0000000000000000 CR 00000000 [ - - - - - - - - ] RES ffffffffffffffff SRR0 0000000000000000 SRR1 0000000000000000 PVR 00000000004e1201 VRSAVE 0000000000000000 SPRG0 0000000000000000 SPRG1 0000000000000000 SPRG2 0000000000000000 SPRG3 0000000000000000 SPRG4 0000000000000000 SPRG5 0000000000000000 SPRG6 0000000000000000 SPRG7 0000000000000000 HSRR0 0000000000000000 HSRR1 0000000000000000 CFAR 0000000000000000 LPCR 000000000004f01f PTCR 0000000000000000 DAR 0000000000000000 DSISR 0000000000000000 This patch adds a verification during the writing of the platform support vector so that we error out as soon as we determine this guest only supports Hash and the host doesn't. ~/qemu-system-ppc64 -machine pseries,accel=kvm,max-cpu-compat=power8 ... qemu-system-ppc64: Guest requested unavailable MMU mode (hash). 1- https://git.kernel.org/torvalds/p/b1b1697ae0cc8 2- https://git.kernel.org/torvalds/p/a722076e94702 Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com> Message-Id: <20210505001130.3999968-3-farosas@linux.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-05-19 10:30:28 +10:00
Fabiano Rosas	068479e1e1	hw/ppc/spapr.c: Extract MMU mode error reporting into a function A following patch will make use of it. Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com> Message-Id: <20210505001130.3999968-2-farosas@linux.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-05-19 10:30:28 +10:00
Peter Maydell	d90f154867	ppc patch queue 2021-05-04 Here's the first ppc pull request for qemu-6.1. It has a wide variety of stuff accumulated during the 6.0 freeze. Highlights are: * Multi-phase reset cleanups for PAPR * Preliminary cleanups towards allowing !CONFIG_TCG for the ppc target * Cleanup of AIL logic and extension to POWER10 * Further improvements to handling of hot unplug failures on PAPR * Allow much larger numbers of CPU on pseries * Support for the H_SCM_HEALTH hypercall * Add support for the Pegasos II board * Substantial cleanup to hflag handling * Assorted minor fixes and cleanups -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEdfRlhq5hpmzETofcbDjKyiDZs5IFAmCQ4ScACgkQbDjKyiDZ s5KmNhAAsICdDqeu/jm1uhRCr0DDT/Wa6KE1xlglQ53ybWb5Hm2ae0Uwzti5ZWkt T9yryObX++wiugbU5Dlx9eXTiJIPgTbDoBV1wfOa3a1BAxSEES1t70jwuwAXXBpX mgU++SurQB70IB7vVvyXDi2Z592qGvMiKXqT0sdkfoexPHzAL0+KkQPyJZLeFchM Ap/zRHAodXf9SuWAl+LwLXeb350jivXYXBWNcFRrBbOGpbVT0AJMYrk/TEa2ZIpi SvbzAWuW+9mX0EOmk7JK5JfkT41cGNdcBcwd0bt4xyvUpmkXLaTMFDLVHj3HWSUn PFA4RB3uKXyTfISVtWdxJBbFOzMpchI6lEiRJHCS+KuY7UsACqV1T/y54ATOUauC ycLc9APgRaStdNPxfDl+xeFfoVb/f0mQsNwcmY1tv7z+3qE/trY9bMyrbgaebBFn /TAkmPvXfwtAREnx8xF/57poarWUkvupGTQkANNosdFokpExmrLj8T0sKv90hh5Y vkGf5zP4pYGN1Rs8qhOdHu+IjhVJvUl/L3LZYWcoMI6E61D8rGRc0Dkacx7gcja+ sluFi5Yh2fQn55y6LTi3049cB1wMd6wly0214F11RKoBswguiGuaqJmL4sNDO/s4 IcMCy5mg6C0jNZA5kHcdWmqsVzD2+XwP5J29n/LedlmgXoHYF+M= =N0qr -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/dg-gitlab/tags/ppc-for-6.1-20210504' into staging ppc patch queue 2021-05-04 Here's the first ppc pull request for qemu-6.1. It has a wide variety of stuff accumulated during the 6.0 freeze. Highlights are: * Multi-phase reset cleanups for PAPR * Preliminary cleanups towards allowing !CONFIG_TCG for the ppc target * Cleanup of AIL logic and extension to POWER10 * Further improvements to handling of hot unplug failures on PAPR * Allow much larger numbers of CPU on pseries * Support for the H_SCM_HEALTH hypercall * Add support for the Pegasos II board * Substantial cleanup to hflag handling * Assorted minor fixes and cleanups # gpg: Signature made Tue 04 May 2021 06:52:39 BST # gpg: using RSA key 75F46586AE61A66CC44E87DC6C38CACA20D9B392 # gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" [full] # gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" [full] # gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" [full] # gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>" [unknown] # Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392 * remotes/dg-gitlab/tags/ppc-for-6.1-20210504: (46 commits) hw/ppc/pnv_psi: Use device_cold_reset() instead of device_legacy_reset() hw/ppc/spapr_vio: Reset TCE table object with device_cold_reset() hw/intc/spapr_xive: Use device_cold_reset() instead of device_legacy_reset() target/ppc: removed VSCR from SPR registration target/ppc: Reduce the size of ppc_spr_t target/ppc: Clean up _spr_register et al target/ppc: Add POWER10 exception model target/ppc: rework AIL logic in interrupt delivery target/ppc: move opcode table logic to translate.c target/ppc: code motion from translate_init.c.inc to gdbstub.c spapr_drc.c: handle hotunplug errors in drc_unisolate_logical() spapr.h: increase FDT_MAX_SIZE spapr.c: do not use MachineClass::max_cpus to limit CPUs ppc: Rename current DAWR macros and variables target/ppc: POWER10 supports scv target/ppc: Fix POWER9 radix guest HV interrupt AIL behaviour docs/system: ppc: Add documentation for ppce500 machine roms/u-boot: Bump ppce500 u-boot to v2021.04 to fix broken pci support roms/Makefile: Update ppce500 u-boot build directory name ppc/spapr: Add support for implement support for H_SCM_HEALTH ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2021-05-05 20:29:14 +01:00
Daniel Henrique Barboza	5642e4513e	spapr.c: do not use MachineClass::max_cpus to limit CPUs Up to this patch, 'max_cpus' value is hardcoded to 1024 (commit `6244bb7e58`). In theory this patch would simply bump it to 2048, since it's the default NR_CPUS kernel setting for ppc64 servers nowadays, but the whole mechanic of MachineClass:max_cpus is flawed for the pSeries machine. The two supported accelerators, KVM and TCG, can live without it. TCG guests don't have a theoretical limit. The user must be free to emulate as many CPUs as the hardware is capable of. And even if there were a limit, max_cpus is not the proper way to report it since it's a common value checked by SMP code in machine_smp_parse() for KVM as well. For KVM guests, the proper way to limit KVM CPUs is by host configuration via NR_CPUS, not a QEMU hardcoded value. There is no technical reason for a pSeries QEMU guest to forcefully stay below NR_CPUS. This hardcoded value also disregard hosts that might have a lower NR_CPUS limit, say 512. In this case, machine.c:machine_smp_parse() will allow a 1024 value to pass, but then kvm_init() will complain about it because it will exceed NR_CPUS: Number of SMP cpus requested (1024) exceeds the maximum cpus supported by KVM (512) A better 'max_cpus' value would consider host settings, but MachineClass::max_cpus is defined well before machine_init() and kvm_init(). We can't check for KVM limits because it's too soon, so we end up making a guess. This patch makes MachineClass:max_cpus settings innocuous by setting it to INT32_MAX. machine.c:machine_smp_parse() will not fail the verification based on max_cpus, letting kvm_init() do the checking with actual host settings. And TCG guests get to do whatever the hardware is capable of emulating. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20210408204049.221802-2-danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-05-04 11:41:25 +10:00
Alexey Kardashevskiy	4b98e72d97	spapr: Rename RTAS_MAX_ADDR to FDT_MAX_ADDR SLOF instantiates RTAS since `744a928cce` ("spapr: Stop providing RTAS blob") so the max address applies to the FDT only. This renames the macro and fixes up the comment. This should not cause any behavioral change. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20210331025123.29310-1-aik@ozlabs.ru> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-05-04 11:41:25 +10:00
Thomas Huth	ee86213aa3	Do not include exec/address-spaces.h if it's not really necessary Stop including exec/address-spaces.h in files that don't need it. Signed-off-by: Thomas Huth <thuth@redhat.com> Message-Id: <20210416171314.2074665-5-thuth@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2021-05-02 17:24:51 +02:00
Thomas Huth	ead62c75f6	Do not include hw/boards.h if it's not really necessary Stop including hw/boards.h in files that don't need it. Signed-off-by: Thomas Huth <thuth@redhat.com> Message-Id: <20210416171314.2074665-3-thuth@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2021-05-02 17:24:51 +02:00
Cornelia Huck	da7e13c00b	hw: add compat machines for 6.1 Add 6.1 machine types for arm/i440fx/q35/s390x/spapr. Signed-off-by: Cornelia Huck <cohuck@redhat.com> Acked-by: Greg Kurz <groug@kaod.org> Message-id: 20210331111900.118274-1-cohuck@redhat.com Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2021-04-30 11:16:51 +01:00
Daniel Henrique Barboza	2b18fc794f	spapr.c: always pulse guest IRQ in spapr_core_unplug_request() Commit `47c8c915b1` fixed a problem where multiple spapr_drc_detach() requests were breaking QEMU. The solution was to just spapr_drc_detach() once, and use spapr_drc_unplug_requested() to filter whether we already detached it or not. The commit also tied the hotplug request to the guest in the same condition. Turns out that there is a reliable way for a CPU hotunplug to fail. If a guest with one CPU hotplugs a CPU1, then offline CPU0s via 'echo 0 > /sys/devices/system/cpu/cpu0/online', then attempts to hotunplug CPU1, the kernel will refuse it because it's the last online CPU of the system. Given that we're pulsing the IRQ only in the first try, in a failed attempt, all other CPU1 hotunplug attempts will fail, regardless of the online state of CPU1 in the kernel, because we're simply not letting the guest know that we want to hotunplug the device. Let's move spapr_hotplug_req_remove_by_index() back out of the "if (!spapr_drc_unplug_requested(drc))" conditional, allowing for multiple 'device_del' requests to the same CPU core to reach the guest, in case the CPU core didn't fully hotunplugged previously. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20210401000437.131140-3-danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-04-12 12:27:14 +10:00
Daniel Henrique Barboza	d522cb52e6	spapr: rollback 'unplug timeout' for CPU hotunplugs The pseries machines introduced the concept of 'unplug timeout' for CPU hotunplugs. The idea was to circunvent a deficiency in the pSeries specification (PAPR), that currently does not define a proper way for the hotunplug to fail. If the guest refuses to release the CPU (see [1] for an example) there is no way for QEMU to detect the failure. Further discussions about how to send a QAPI event to inform about the hotunplug timeout [2] exposed problems that weren't predicted back when the idea was developed. Other QEMU machines don't have any type of hotunplug timeout mechanism for any device, e.g. ACPI based machines have a way to make hotunplug errors visible to the hypervisor. This would make this timeout mechanism exclusive to pSeries, which is not ideal. The real problem is that a QAPI event that reports hotunplug timeouts puts the management layer (namely Libvirt) in a weird spot. We're not telling that the hotunplug failed, because we can't be 100% sure of that, and yet we're resetting the unplug state back, preventing any DEVICE_DEL events to reach out in case the guest decides to release the device. Libvirt would need to inspect the guest itself to see if the device was released or not, otherwise the internal domain states will be inconsistent. Moreover, Libvirt already has an 'unplug timeout' concept, and a QEMU side timeout would need to be juggled together with the existing Libvirt timeout. All this considered, this solution ended up creating more trouble than it solved. This patch reverts the 3 commits that introduced the timeout mechanism for CPU hotplugs in pSeries machines. This reverts commit `4515a5f786` "qemu_timer.c: add timer_deadline_ms() helper" This reverts commit `d1c2e3ce3d` "spapr_drc.c: add hotunplug timeout for CPUs" This reverts commit `51254ffb32` "spapr_drc.c: introduce unplug_timeout_timer" [1] https://bugzilla.redhat.com/show_bug.cgi?id=1911414 [2] https://lists.gnu.org/archive/html/qemu-devel/2021-03/msg04682.html CC: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20210401000437.131140-2-danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-04-12 12:27:14 +10:00

1 2 3 4 5 ...

882 Commits