mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Igor Mammedov	b28f01880e	ppc/{ppc440_bamboo, sam460ex}: use memdev for RAM memory_region_allocate_system_memory() API is going away, so replace it with memdev allocated MemoryRegion. The later is initialized by generic code, so board only needs to opt in to memdev scheme by providing MachineClass::default_ram_id and using MachineState::ram instead of manually initializing RAM memory region. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu> Acked-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20200219160953.13771-67-imammedo@redhat.com>	2020-02-19 16:50:00 +00:00
Igor Mammedov	a0258e4afa	ppc/{ppc440_bamboo, sam460ex}: drop RAM size fixup If user provided non-sense RAM size, board will complain and continue running with max RAM size supported or sometimes crash like this: %QEMU -M bamboo -m 1 exec.c:1926: find_ram_offset: Assertion `size != 0' failed. Aborted (core dumped) Also RAM is going to be allocated by generic code, so it won't be possible for board to fix things up for user. Make it error message and exit to force user fix CLI, instead of accepting non-sense CLI values. That also fixes crash issue, since wrongly calculated size isn't used to allocate RAM Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu> Acked-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20200219160953.13771-66-imammedo@redhat.com>	2020-02-19 16:50:00 +00:00
Aravinda Prasad	2500fb423a	migration: Include migration support for machine check handling This patch includes migration support for machine check handling. Especially this patch blocks VM migration requests until the machine check error handling is complete as these errors are specific to the source hardware and is irrelevant on the target hardware. Signed-off-by: Aravinda Prasad <arawinda.p@gmail.com> [Do not set FWNMI cap in post_load, now its done in .apply hook] Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com> Message-Id: <20200130184423.20519-7-ganeshgr@linux.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-02-03 11:33:11 +11:00
Aravinda Prasad	f03496bc12	ppc: spapr: Handle "ibm,nmi-register" and "ibm,nmi-interlock" RTAS calls This patch adds support in QEMU to handle "ibm,nmi-register" and "ibm,nmi-interlock" RTAS calls. The machine check notification address is saved when the OS issues "ibm,nmi-register" RTAS call. This patch also handles the case when multiple processors experience machine check at or about the same time by handling "ibm,nmi-interlock" call. In such cases, as per PAPR, subsequent processors serialize waiting for the first processor to issue the "ibm,nmi-interlock" call. The second processor that also received a machine check error waits till the first processor is done reading the error log. The first processor issues "ibm,nmi-interlock" call when the error log is consumed. Signed-off-by: Aravinda Prasad <arawinda.p@gmail.com> [Register fwnmi RTAS calls in core_rtas_register_types() where other RTAS calls are registered] Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com> Message-Id: <20200130184423.20519-6-ganeshgr@linux.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-02-03 11:33:11 +11:00
Aravinda Prasad	81fe70e443	target/ppc: Build rtas error log upon an MCE Upon a machine check exception (MCE) in a guest address space, KVM causes a guest exit to enable QEMU to build and pass the error to the guest in the PAPR defined rtas error log format. This patch builds the rtas error log, copies it to the rtas_addr and then invokes the guest registered machine check handler. The handler in the guest takes suitable action(s) depending on the type and criticality of the error. For example, if an error is unrecoverable memory corruption in an application inside the guest, then the guest kernel sends a SIGBUS to the application. For recoverable errors, the guest performs recovery actions and logs the error. Signed-off-by: Aravinda Prasad <arawinda.p@gmail.com> [Assume SLOF has allocated enough room for rtas error log] Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20200130184423.20519-5-ganeshgr@linux.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-02-03 11:33:10 +11:00
Aravinda Prasad	9ac703ac5f	target/ppc: Handle NMI guest exit Memory error such as bit flips that cannot be corrected by hardware are passed on to the kernel for handling. If the memory address in error belongs to guest then the guest kernel is responsible for taking suitable action. Patch [1] enhances KVM to exit guest with exit reason set to KVM_EXIT_NMI in such cases. This patch handles KVM_EXIT_NMI exit. [1] https://www.spinics.net/lists/kvm-ppc/msg12637.html (e20bbd3d and related commits) Signed-off-by: Aravinda Prasad <arawinda.p@gmail.com> Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Message-Id: <20200130184423.20519-4-ganeshgr@linux.ibm.com> [dwg: #ifdefs to fix compile for 32-bit target] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-02-03 11:33:10 +11:00
Aravinda Prasad	9d953ce447	ppc: spapr: Introduce FWNMI capability Introduce fwnmi an spapr capability and add a helper function which tries to enable it, which would be used by following patch of the series. This patch by itself does not change the existing behavior. Signed-off-by: Aravinda Prasad <arawinda.p@gmail.com> [eliminate cap_ppc_fwnmi, add fwnmi cap to migration state and reprhase the commit message] Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20200130184423.20519-3-ganeshgr@linux.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-02-03 11:33:10 +11:00
Cédric Le Goater	9ae1329ee2	ppc/pnv: Add models for POWER8 PHB3 PCIe Host bridge This is a model of the PCIe Host Bridge (PHB3) found on a POWER8 processor. It includes the PowerBus logic interface (PBCQ), IOMMU support, a single PCIe Gen.3 Root Complex, and support for MSI and LSI interrupt sources as found on a POWER8 system using the XICS interrupt controller. The POWER8 processor comes in different flavors: Venice, Murano, Naple, each having a different number of PHBs. To make things simpler, the models provides 3 PHB3 per chip. Some platforms, like the Firestone, can also couple PHBs on the first chip to provide more bandwidth but this is too specific to model in QEMU. XICS requires some adjustment to support the PHB3 MSI. The changes are provided here but they could be decoupled in prereq patches. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20200127144506.11132-3-clg@kaod.org> [dwg: Use device_class_set_props()] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-02-02 14:07:57 +11:00
Benjamin Herrenschmidt	4f9924c4d4	ppc/pnv: Add models for POWER9 PHB4 PCIe Host bridge These changes introduces models for the PCIe Host Bridge (PHB4) of the POWER9 processor. It includes the PowerBus logic interface (PBCQ), IOMMU support, a single PCIe Gen.4 Root Complex, and support for MSI and LSI interrupt sources as found on a POWER9 system using the XIVE interrupt controller. POWER9 processor comes with 3 PHB4 PEC (PCI Express Controller) and each PEC can have several PHBs. By default, * PEC0 provides 1 PHB (PHB0) * PEC1 provides 2 PHBs (PHB1 and PHB2) * PEC2 provides 3 PHBs (PHB3, PHB4 and PHB5) Each PEC has a set "global" registers and some "per-stack" (per-PHB) registers. Those are organized in two XSCOM ranges, the "Nest" range and the "PCI" range, each range contains both some "PEC" registers and some "per-stack" registers. No default device layout is provided and PCI devices can be added on any of the available PCIe Root Port (pcie.0 .. 2 of a Power9 chip) with address 0x0 as the firwware (skiboot) only accepts a single device per root port. To run a simple system with a network and a storage adapters, use a command line options such as : -device e1000e,netdev=net0,mac=C0:FF:EE:00:00:02,bus=pcie.0,addr=0x0 -netdev bridge,id=net0,helper=/usr/libexec/qemu-bridge-helper,br=virbr0,id=hostnet0 -device megasas,id=scsi0,bus=pcie.1,addr=0x0 -drive file=$disk,if=none,id=drive-scsi0-0-0-0,format=qcow2,cache=none -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=2 If more are needed, include a bridge. Multi chip is supported, each chip adding its set of PHB4 controllers and its PCI busses. The model doesn't emulate the EEH error handling. This model is not ready for hotplug yet. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [ clg: - numerous cleanups - commit log - fix for broken LSI support - PHB pic printinfo - large QOM rework ] Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20200127144506.11132-2-clg@kaod.org> [dwg: Use device_class_set_props()] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-02-02 14:07:57 +11:00
Stefan Berger	864674fa29	spapr: Implement get_dt_compatible() callback For devices that cannot be statically initialized, implement a get_dt_compatible() callback that allows us to ask the device for the 'compatible' value. Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20200121152935.649898-3-stefanb@linux.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-02-02 14:07:57 +11:00
Cédric Le Goater	08c3f3a734	ppc/pnv: Add support for "hostboot" mode When the "hb-mode" option is activated on the powernv machine, the firmware is mapped at 0x8000000 and the HRMOR of the HW threads are set to the same address. The PNOR mapping on the FW address space of the LPC bus is left enabled to let the firmware load any other images required to boot the host. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20200127144154.10170-4-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-02-02 14:07:57 +11:00
Thomas Huth	b2ce76a073	hw/ppc/prep: Remove the deprecated "prep" machine and the OpenHackware BIOS It's been deprecated since QEMU v3.1. The 40p machine should be used nowadays instead. Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Acked-by: Hervé Poussineau <hpoussin@reactos.org> Signed-off-by: Thomas Huth <thuth@redhat.com> Message-Id: <20200114114617.28854-1-thuth@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-02-02 14:07:57 +11:00
Cédric Le Goater	fc2527fb02	ppc/pnv: fix check on return value of blk_getlength() blk_getlength() returns an int64_t but the result is stored in a uint32_t. Errors (negative values) won't be caught by the check in pnv_pnor_realize() and blk_blockalign() will allocate a very large buffer in such cases. Fixes Coverity issue CID 1412226. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20200107171809.15556-3-clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-01-08 12:01:14 +11:00
Greg Kurz	806fed593d	pnv/xive: Deduce the PnvXive pointer from XiveTCTX::xptr And use it instead of reaching out to the machine. This allows to get rid of pnv_get_chip(). Signed-off-by: Greg Kurz <groug@kaod.org> Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20200106145645.4539-11-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-01-08 11:01:59 +11:00
Cédric Le Goater	479509463b	xive: Add a "presenter" link property to the TCTX object This will be used in subsequent patches to access the XIVE associated to a TCTX without reaching out to the machine through qdev_get_machine(). Signed-off-by: Cédric Le Goater <clg@kaod.org> [ groug: - split patch - write subject and changelog ] Signed-off-by: Greg Kurz <groug@kaod.org> Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20200106145645.4539-9-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-01-08 11:01:59 +11:00
Greg Kurz	d8137bb729	ppc/pnv: Add a "pnor" const link property to the BMC internal simulator This allows to get rid of a call to qdev_get_machine(). Signed-off-by: Greg Kurz <groug@kaod.org> Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20200106145645.4539-8-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-01-08 11:01:59 +11:00
Greg Kurz	764f9b2559	ppc/pnv: Add an "nr-threads" property to the base chip class Set it at chip creation and forward it to the cores. This allows to drop a call to qdev_get_machine(). Signed-off-by: Greg Kurz <groug@kaod.org> Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20200106145645.4539-7-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-01-08 11:01:59 +11:00
Greg Kurz	d1214b819f	spapr, pnv, xive: Add a "xive-fabric" link to the XIVE router In order to get rid of qdev_get_machine(), first add a pointer to the XIVE fabric under the XIVE router and make it configurable through a QOM link property. Configure it in the spapr and pnv machine. In the case of pnv, the XIVE routers are under the chip, so this is done with a QOM alias property of the POWER9 pnv chip. Signed-off-by: Greg Kurz <groug@kaod.org> Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20200106145645.4539-5-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-01-08 11:01:59 +11:00
Greg Kurz	0da41d3c5a	pnv/xive: Use device_class_set_parent_realize() The XIVE router base class currently inherits an empty realize hook from the sysbus device base class, but it will soon implement one of its own to perform some sanity checks. Do the preliminary plumbing to have it called. Signed-off-by: Greg Kurz <groug@kaod.org> Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20200106145645.4539-4-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-01-08 11:01:59 +11:00
Cédric Le Goater	245cdb7f54	ppc/pnv: Introduce a "xics" property under the POWER8 chip POWER8 is the only chip using the XICS interface. Add a "xics" link and a XICSFabric attribute under this chip to remove the use of qdev_get_machine() Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20200106145645.4539-3-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-01-08 11:01:59 +11:00
Greg Kurz	6cc64796f2	spapr/xive: Use device_class_set_parent_realize() The XIVE router base class currently inherits an empty realize hook from the sysbus device base class, but it will soon implement one of its own to perform some sanity checks. Do the preliminary plumbing to have it called. Signed-off-by: Greg Kurz <groug@kaod.org> Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20191219181155.32530-6-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-01-08 11:01:59 +11:00
Greg Kurz	5084c8b763	ppc/pnv: Drop PnvChipClass::type It isn't used anymore. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <157623844102.360005.12070225703151669294.stgit@bahia.lan> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-12-17 10:59:11 +11:00
Greg Kurz	70c059e926	ppc/pnv: Introduce PnvChipClass::xscom_pcba() method The XSCOM bus is implemented with a QOM interface, which is mostly generic from a CPU type standpoint, except for the computation of addresses on the Pervasive Connect Bus (PCB) network. This is handled by the pnv_xscom_pcba() function with a switch statement based on the chip_type class level attribute of the CPU chip. This can be achieved using QOM. Also the address argument is masked with PNV_XSCOM_SIZE - 1, which is for POWER8 only. Addresses may have different sizes with other CPU types. Have each CPU chip type handle the appropriate computation with a QOM xscom_pcba() method. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <157623843543.360005.13996472463887521794.stgit@bahia.lan> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-12-17 10:59:11 +11:00
Greg Kurz	3caf7bd0a2	ppc/pnv: Drop pnv_chip_is_power9() and pnv_chip_is_power10() helpers They aren't used anymore. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <157623842986.360005.1787401623906380181.stgit@bahia.lan> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-12-17 10:59:11 +11:00
Greg Kurz	c396c58a02	ppc/pnv: Pass content of the "compatible" property to pnv_dt_xscom() Since pnv_dt_xscom() is called from chip specific dt_populate() hooks, it shouldn't have to guess the chip type in order to populate the "compatible" property. Just pass the compat string and its size as arguments. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <157623842430.360005.9513965612524265862.stgit@bahia.lan> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-12-17 10:59:11 +11:00
Greg Kurz	3f5b45ca4f	ppc/pnv: Pass XSCOM base address and address size to pnv_dt_xscom() Since pnv_dt_xscom() is called from chip specific dt_populate() hooks, it shouldn't have to guess the chip type in order to populate the "reg" property. Just pass the base address and address size as arguments. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <157623841868.360005.17577624823547136435.stgit@bahia.lan> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-12-17 10:59:11 +11:00
Greg Kurz	c4b2c40c0e	ppc/pnv: Introduce PnvChipClass::xscom_core_base() method The pnv_chip_core_realize() function configures the XSCOM MMIO subregion for each core of a single chip. The base address of the subregion depends on the CPU type. Its computation is currently open-code using the pnv_chip_is_powerXX() helpers. This can be achieved with QOM. Introduce a method for this in the base chip class and implement it in child classes. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <157623841311.360005.4705705734873339545.stgit@bahia.lan> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-12-17 10:59:11 +11:00
Greg Kurz	85913070a6	ppc/pnv: Introduce PnvChipClass::intc_print_info() method The pnv_pic_print_info() callback checks the type of the chip in order to forward to the request appropriate interrupt controller. This can be achieved with QOM. Introduce a method for this in the base chip class and implement it in child classes. This also prepares ground for the upcoming interrupt controller of POWER10 chips. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <157623840755.360005.5002022339473369934.stgit@bahia.lan> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-12-17 10:59:10 +11:00
Greg Kurz	acc39abb31	ppc/pnv: Drop pnv_is_power9() and pnv_is_power10() helpers They aren't used anymore. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <157623840200.360005.1300941274565357363.stgit@bahia.lan> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-12-17 10:59:10 +11:00
Greg Kurz	7a90c6a1b6	ppc/pnv: Introduce PnvMachineClass::dt_power_mgt() We add an extra node to advertise power management on some machines, namely powernv9 and powernv10. This is achieved by using the pnv_is_power9() and pnv_is_power10() helpers. This can be achieved with QOM. Add a method to the base class for powernv machines and have it implemented by machine types that support power management instead. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <157623839642.360005.9243510140436689941.stgit@bahia.lan> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-12-17 10:59:10 +11:00
Greg Kurz	d76f2da7a5	ppc/pnv: Introduce PnvMachineClass and PnvMachineClass::compat The pnv_dt_create() function generates different contents for the "compatible" property of the root node in the DT, depending on the CPU type. This is open coded with multiple ifs using pnv_is_powerXX() helpers. It seems cleaner to achieve with QOM. Introduce a base class for the powernv machine and a compat attribute that each child class can use to provide the value for the "compatible" property. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <157623839085.360005.4046508784077843216.stgit@bahia.lan> Reviewed-by: Cédric Le Goater <clg@kaod.org> [dwg: Folded in small fix Greg spotted after posting] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-12-17 10:58:49 +11:00
Greg Kurz	248e4e924e	ppc/pnv: Drop PnvPsiClass::chip_type It isn't used anymore. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <157623838530.360005.15470128760871845396.stgit@bahia.lan> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-12-17 10:39:48 +11:00
Greg Kurz	41c4ef7009	ppc/pnv: Introduce PnvPsiClass::compat The Processor Service Interface (PSI) model has a chip_type class level attribute, which is used to generate the content of the "compatible" DT property according to the CPU type. Since the PSI model already has specialized classes for each supported CPU type, it seems cleaner to achieve this with QOM. Provide the content of the "compatible" property with a new class level attribute. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <157623837974.360005.14706607446188964477.stgit@bahia.lan> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-12-17 10:39:48 +11:00
Greg Kurz	aeb7a330f4	ppc: Drop useless extern annotation for functions Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <157623837421.360005.412120366652768311.stgit@bahia.lan> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-12-17 10:39:48 +11:00
Cédric Le Goater	3a1b70b66b	ppc/pnv: Fix OCC common area region mapping The OCC common area is mapped at a unique address on the system and each OCC is assigned a segment to expose its sensor data : ------------------------------------------------------------------------- \| Start (Offset from \| End \| Size \|Description \| \| BAR2 base address) \| \| \| \| ------------------------------------------------------------------------- \| 0x00580000 \| 0x005A57FF \|150kB \|OCC 0 Sensor Data Block\| \| 0x005A5800 \| 0x005CAFFF \|150kB \|OCC 1 Sensor Data Block\| \| : \| : \| : \| : \| \| 0x00686800 \| 0x006ABFFF \|150kB \|OCC 7 Sensor Data Block\| \| 0x006AC000 \| 0x006FFFFF \|336kB \|Reserved \| ------------------------------------------------------------------------- Maximum size is 1.5MB. We could define a "OCC common area" memory region at the machine level and sub regions for each OCC. But it adds some extra complexity to the models. Fix the current layout with a simpler model. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20191211082912.2625-3-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-12-17 10:39:48 +11:00
Cédric Le Goater	8f09231631	ppc/pnv: Introduce PBA registers The PBA bridge unit (Power Bus Access) connects the OCC (On Chip Controller) to the Power bus and System Memory. The PBA is used to gather sensor data, for power management, for sleep states, for initial boot, among other things. The PBA logic provides a set of four registers PowerBus Access Base Address Registers (PBABAR0..3) which map the OCC address space to the PowerBus space. These registers are setup by the initial FW and define the PowerBus Range of system memory that can be accessed by PBA. The current modeling of the PBABAR registers is done under the common XSCOM handlers. We introduce a specific XSCOM regions for these registers and fix : - BAR sizes and BAR masks - The mapping of the OCC common area. It is common to all chips and should be mapped once. We will address per-OCC area in the next change. - OCC common area is in BAR 3 on P8 Inspired by previous work of Balamuruhan S <bala24@linux.ibm.com> Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20191211082912.2625-2-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-12-17 10:39:48 +11:00
Greg Kurz	90cce00c7b	ppc/pnv: Make PnvXScomInterface an incomplete type PnvXScomInterface is an interface instance. It should never be dereferenced. Drop the dummy type definition for extra safety, which is the common practice with QOM interfaces. While here also convert the bogus OBJECT_CHECK() to INTERFACE_CHECK(). Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <157608025541.186670.1577861507610404326.stgit@bahia.lan> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-12-17 10:39:48 +11:00
Suraj Jitindar Singh	5cc7e69f6d	target/ppc: Work [S]PURR implementation and add HV support The Processor Utilisation of Resources Register (PURR) and Scaled Processor Utilisation of Resources Register (SPURR) provide an estimate of the resources used by the thread, present on POWER7 and later processors. Currently the [S]PURR registers simply count at the rate of the timebase. Preserve this behaviour but rework the implementation to store an offset like the timebase rather than doing the calculation manually. Also allow hypervisor write access to the register along with the currently available read access. Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> [ clg: rebased on current ppc tree ] Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20191128134700.16091-3-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-12-17 10:39:48 +11:00
Suraj Jitindar Singh	5d62725b2f	target/ppc: Implement the VTB for HV access The virtual timebase register (VTB) is a 64-bit register which increments at the same rate as the timebase register, present on POWER8 and later processors. The register is able to be read/written by the hypervisor and read by the supervisor. All other accesses are illegal. Currently the VTB is just an alias for the timebase (TB) register. Implement the VTB so that is can be read/written independent of the TB. Make use of the existing method for accessing timebase facilities where by the compensation is stored and used to compute the value on reads/is updated on writes. Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> [ clg: rebased on current ppc tree ] Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20191128134700.16091-2-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-12-17 10:39:48 +11:00
Cédric Le Goater	2661f6ab2b	ppc/pnv: add a LPC Controller model for POWER10 Same a POWER9, only the MMIO window changes. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20191205184454.10722-6-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-12-17 10:39:48 +11:00
Cédric Le Goater	8b50ce8505	ppc/pnv: add a PSI bridge model for POWER10 The POWER10 PSIHB controller is very similar to the one on POWER9. We should probably introduce a common PnvPsiXive object. The ESB page size should be changed to 64k when P10 support is ready. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20191205184454.10722-5-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-12-17 10:39:48 +11:00
Cédric Le Goater	2b548a4255	ppc/pnv: Introduce a POWER10 PnvChip and a powernv10 machine This is an empty shell with the XSCOM bus and cores. The chip controllers will come later. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20191205184454.10722-3-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-12-17 10:39:48 +11:00
Greg Kurz	401774387a	ppc: Deassert the external interrupt pin in KVM on reset When a CPU is reset, QEMU makes sure no interrupt is pending by clearing CPUPPCstate::pending_interrupts in ppc_cpu_reset(). In the case of a complete machine emulation, eg. a sPAPR machine, an external interrupt request could still be pending in KVM though, eg. an IPI. It will be eventually presented to the guest, which is supposed to acknowledge it at the interrupt controller. If the interrupt controller is emulated in QEMU, either XICS or XIVE, ppc_set_irq() won't deassert the external interrupt pin in KVM since it isn't pending anymore for QEMU. When the vCPU re-enters the guest, the interrupt request is still pending and the vCPU will try again to acknowledge it. This causes an infinite loop and eventually hangs the guest. The code has been broken since the beginning. The issue wasn't hit before because accel=kvm,kernel-irqchip=off is an awkward setup that never got used until recently with the LC92x IBM systems (aka, Boston). Add a ppc_irq_reset() function to do the necessary cleanup, ie. deassert the IRQ pins of the CPU in QEMU and most importantly the external interrupt pin for this vCPU in KVM. Reported-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com> Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <157548861740.3650476.16879693165328764758.stgit@bahia.lan> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-12-17 10:39:48 +11:00
David Gibson	d1d32d6255	spapr: Simplify ovec diff spapr_ovec_diff(ov, old, new) has somewhat complex semantics. ov is set to those bits which are in new but not old, and it returns as a boolean whether or not there are any bits in old but not new. It turns out that both callers only care about the second, not the first. This is basically equivalent to a bitmap subset operation, which is easier to understand and implement. So replace spapr_ovec_diff() with spapr_ovec_subset(). Cc: Mike Roth <mdroth@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Cedric Le Goater <clg@fr.ibm.com>	2019-12-17 10:39:48 +11:00
David Gibson	0c21e07354	spapr: Fold h_cas_compose_response() into h_client_architecture_support() spapr_h_cas_compose_response() handles the last piece of the PAPR feature negotiation process invoked via the ibm,client-architecture-support OF call. Its only caller is h_client_architecture_support() which handles most of the rest of that process. I believe it was placed in a separate file originally to handle some fiddly dependencies between functions, but mostly it's just confusing to have the CAS process split into two pieces like this. Now that compose response is simplified (by just generating the whole device tree anew), it's cleaner to just fold it into h_client_architecture_support(). Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Cedric Le Goater <clg@fr.ibm.com> Reviewed-by: Greg Kurz <groug@kaod.org>	2019-12-17 10:39:48 +11:00
Cédric Le Goater	d302e00080	ppc/pnv: Dump the XIVE NVT table This is useful to dump the saved contexts of the vCPUs : configuration of the base END index of the vCPU and the Interrupt Pending Buffer register, which is updated when an interrupt can not be presented. When dumping the NVT table, we skip empty indirect pages which are not necessarily allocated. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20191125065820.927-21-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-12-17 10:39:48 +11:00
Cédric Le Goater	f22f56dd48	ppc/pnv: Extend XiveRouter with a get_block_id() handler When doing CAM line compares, fetch the block id from the interrupt controller which can have set the PC_TCTXT_CHIPID field. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20191125065820.927-20-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-12-17 10:39:48 +11:00
Cédric Le Goater	dc2526e45a	ppc/pnv: Introduce a pnv_xive_block_id() helper When PC_TCTXT_CHIPID_OVERRIDE is configured, the PC_TCTXT_CHIPID field overrides the hardwired chip ID in the Powerbus operations and for CAM compares. This is typically used in the one block-per-chip configuration to associate a unique block id number to each IC of the system. Simplify the model with a pnv_xive_block_id() helper and remove 'tctx_chipid' which becomes useless. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20191125065820.927-19-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-12-17 10:39:48 +11:00
Cédric Le Goater	a5b841f18c	ppc/xive: Introduce a xive_tctx_ipb_update() helper We will use it to resend missed interrupts when a vCPU context is pushed on a HW thread. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20191125065820.927-17-clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-12-17 10:39:48 +11:00
Cédric Le Goater	8b3aaaa1a9	ppc/xive: Remove the get_tctx() XiveRouter handler It is now unused. Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20191125065820.927-16-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-12-17 10:39:48 +11:00
Cédric Le Goater	d024a2c111	ppc/xive: Move the TIMA operations to the controller model On the P9 Processor, the thread interrupt context registers of a CPU can be accessed "directly" when by load/store from the CPU or "indirectly" by the IC through an indirect TIMA page. This requires to configure first the PC_TCTXT_INDIRx registers. Today, we rely on the get_tctx() handler to deduce from the CPU PIR the chip from which the TIMA access is being done. By handling the TIMA memory ops under the interrupt controller model of each machine, we can uniformize the TIMA direct and indirect ops under PowerNV. We can also check that the CPUs have been enabled in the XIVE controller. This prepares ground for the future versions of XIVE. Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20191125065820.927-15-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-12-17 10:39:48 +11:00
Cédric Le Goater	5373c61d6a	ppc/pnv: Clarify how the TIMA is accessed on a multichip system The TIMA region gives access to the thread interrupt context registers of a CPU. It is mapped at the same address on all chips and can be accessed by any CPU of the system. To identify the chip from which the access is being done, the PowerBUS uses a 'chip' field in the load/store messages. QEMU does not model these messages, instead, we extract the chip id from the CPU PIR and do a lookup at the machine level to fetch the targeted interrupt controller. Introduce pnv_get_chip() and pnv_xive_tm_get_xive() helpers to clarify this process in pnv_xive_get_tctx(). The latter will be removed in the subsequent patches but the same principle will be kept. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20191125065820.927-14-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-12-17 10:39:48 +11:00
Greg Kurz	4ffb749688	spapr: Pass the maximum number of vCPUs to the KVM interrupt controller The XIVE and XICS-on-XIVE KVM devices on POWER9 hosts can greatly reduce their consumption of some scarce HW resources, namely Virtual Presenter identifiers, if they know the maximum number of vCPUs that may run in the VM. Prepare ground for this by passing the value down to xics_kvm_connect() and kvmppc_xive_connect(). This is purely mechanical, no functional change. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <157478678301.67101.2717368060417156338.stgit@bahia.tlslab.ibm.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-12-17 10:39:48 +11:00
Cédric Le Goater	4fb42350dc	ppc/xive: Extend the TIMA operation with a XivePresenter parameter The TIMA operations are performed on behalf of the XIVE IVPE sub-engine (Presenter) on the thread interrupt context registers. The current operations supported by the model are simple and do not require access to the controller but more complex operations will need access to the controller NVT table and to its configuration. Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20191125065820.927-13-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-12-17 10:39:48 +11:00
Cédric Le Goater	d3eb47a2a1	ppc/xive: Introduce a XiveFabric interface The XiveFabric QOM interface acts as the PowerBUS interface between the interrupt controller and the system and should be implemented by the QEMU machine. On HW, the XIVE sub-engine is responsible for the communication with the other chip is the Common Queue (CQ) bridge unit. This interface offers a 'match_nvt' handler to perform the CAM line matching when looking for a XIVE Presenter with a dispatched NVT. Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20191125065820.927-9-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-12-17 10:39:48 +11:00
Cédric Le Goater	119eaa9d11	ppc/pnv: Fix TIMA indirect access When the TIMA of a CPU needs to be accessed from the indirect page, the thread id of the target CPU is first stored in the PC_TCTXT_INDIR0 register. This thread id is relative to the chip and not to the system. Introduce a helper routine to look for a CPU of a given PIR and fix pnv_xive_get_indirect_tctx() to scan only the threads of the local chip and not the whole machine. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20191125065820.927-8-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-12-17 10:39:48 +11:00
Cédric Le Goater	5014c60261	ppc/pnv: Introduce a pnv_xive_is_cpu_enabled() helper and use this helper to exclude CPUs which are not enabled in the XIVE controller. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20191125065820.927-7-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-12-17 10:39:47 +11:00
Cédric Le Goater	4a89e20458	ppc: Introduce a ppc_cpu_pir() helper Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20191125065820.927-6-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-12-17 10:39:47 +11:00
Greg Kurz	4fa28f2390	ppc/pnv: Instantiate cores separately Allocating a big void * array to store multiple objects isn't a recommended practice for various reasons: - no compile time type checking - potential dangling pointers if a reference on an individual is taken and the array is freed later on - duplicate boiler plate everywhere the array is browsed through Allocate an array of pointers and populate it instead. Signed-off-by: Greg Kurz <groug@kaod.org> Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20191125065820.927-4-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-12-17 10:39:47 +11:00
Cédric Le Goater	13bee8521c	ppc/xive: Introduce a XivePresenter interface When the XIVE IVRE sub-engine (XiveRouter) looks for a Notification Virtual Target (NVT) to notify, it broadcasts a message on the PowerBUS to find an XIVE IVPE sub-engine (Presenter) with the NVT dispatched on one of its HW threads, and then forwards the notification if any response was received. The current XIVE presenter model is sufficient for the pseries machine because it has a single interrupt controller device, but the PowerNV machine can have multiple chips each having its own interrupt controller. In this case, the XIVE presenter model is too simple and the CAM line matching should scan all chips of the system. To start fixing this issue, we first extend the XIVE Router model with a new XivePresenter QOM interface representing the XIVE IVPE sub-engine. This interface exposes a 'match_nvt' handler which the sPAPR and PowerNV XIVE Router models will need to implement to perform the CAM line matching. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20191125065820.927-2-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-12-17 10:39:47 +11:00
Cédric Le Goater	e2392d4395	ppc/pnv: Create BMC devices at machine init The BMC of the OpenPOWER systems monitors the machine state using sensors, controls the power and controls the access to the PNOR flash device containing the firmware image required to boot the host. QEMU models the power cycle process, access to the sensors and access to the PNOR device. But, for these features to be available, the QEMU PowerNV machine needs two extras devices on the command line, an IPMI BT device for communication and a BMC backend device: -device ipmi-bmc-sim,id=bmc0 -device isa-ipmi-bt,bmc=bmc0,irq=10 The BMC properties are then defined accordingly in the device tree and OPAL self adapts. If a BMC device and an IPMI BT device are not available, OPAL does not try to communicate with the BMC in any manner. This is not how real systems behave. To be closer to the default behavior, create an IPMI BMC simulator device and an IPMI BT device at machine initialization time. We loose the ability to define an external BMC device but there are benefits: - a better match with real systems, - a better test coverage of the OPAL code, - system powerdown and reset commands that work, - a QEMU device tree compliant with the specifications (). () Still needs a MBOX device. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20191121162340.11049-1-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-12-17 10:39:47 +11:00
Cédric Le Goater	ca661fae81	ppc/pnv: Add HIOMAP commands This activates HIOMAP support on the QEMU PowerNV machine. The PnvPnor model is used to access the flash contents. The model simply maps the contents at a fix offset and enables or disables the mapping. HIOMAP Protocol description : https://github.com/openbmc/hiomapd/blob/master/Documentation/protocol.md Reviewed-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20191028070027.22752-3-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-12-17 10:39:47 +11:00
Cédric Le Goater	e6488eeba8	ppc/xive: Introduce helpers for the NVT id Each vCPU in the system is identified with an NVT identifier which is pushed in the OS CAM line (QW1W2) of the HW thread interrupt context register when the vCPU is dispatched on a HW thread. This identifier is used by the presenter subengine to find a matching target to notify of an event. It is also used to fetch the associate NVT structure which may contain pending interrupts that need a resend. Add a couple of helpers for the NVT ids. The NVT space is 19 bits wide, giving a maximum of 512K per chip. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20191115162436.30548-3-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-12-17 10:39:47 +11:00
Cédric Le Goater	516883c2f1	ppc/xive: Record the IPB in the associated NVT When an interrupt can not be presented to a vCPU, because it is not running on any of the HW treads, the XIVE presenter updates the Interrupt Pending Buffer register of the associated XIVE NVT structure. This is only done if backlog is activated in the END but this is generally the case. The current code assumes that the fields of the NVT structure is architected with the same layout of the thread interrupt context registers. Fix this assumption and define an offset for the IPB register backup value in the NVT. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20191115162436.30548-2-clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-12-17 10:39:47 +11:00
Cédric Le Goater	35dde57662	ppc/pnv: Add a PNOR model On a POWERPC PowerNV system, the host firmware is stored in a PNOR flash chip which contents is mapped on the LPC bus. This model adds a simple dummy device to map the contents of a block device in the host address space. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20191021131215.3693-2-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-12-17 10:39:47 +11:00
Greg Kurz	0990ce6a2e	ppc: Add intc_destroy() handlers to SpaprInterruptController/PnvChip SpaprInterruptControllerClass and PnvChipClass have an intc_create() method that calls the appropriate routine, ie. icp_create() or xive_tctx_create(), to establish the link between the VCPU and the presenter component of the interrupt controller during realize. There aren't any symmetrical call to be called when the VCPU gets unrealized though. It is assumed that object_unparent() is the only thing to do. This is questionable because the parenting logic around the CPU and presenter objects is really an implementation detail of the interrupt controller. It shouldn't be open-coded in the machine code. Fix this by adding an intc_destroy() method that undoes what was done in intc_create(). Also NULLify the presenter pointers to avoid having stale pointers around. This will allow to reliably check if a vCPU has a valid presenter. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <157192724208.3146912.7254684777515287626.stgit@bahia.lan> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Laurent Vivier <lvivier@redhat.com>	2019-11-18 11:49:11 +01:00
Cédric Le Goater	97c00c5444	spapr/xive: Set the OS CAM line at reset When a Virtual Processor is scheduled to run on a HW thread, the hypervisor pushes its identifier in the OS CAM line. When running with kernel_irqchip=off, QEMU needs to emulate the same behavior. Set the OS CAM line when the interrupt presenter of the sPAPR core is reset. This will also cover the case of hot-plugged CPUs. This change also has the benefit to remove the use of CPU_FOREACH() which can be unsafe. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Message-Id: <20191022163812.330-8-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-10-24 13:34:15 +11:00
Cédric Le Goater	d49e8a9b46	ppc: Reset the interrupt presenter from the CPU reset handler On the sPAPR machine and PowerNV machine, the interrupt presenters are created by a machine handler at the core level and are reset independently. This is not consistent and it raises issues when it comes to handle hot-plugged CPUs. In that case, the presenters are not reset. This is less of an issue in XICS, although a zero MFFR could be a concern, but in XIVE, the OS CAM line is not set and this breaks the presenting algorithm. The current code has workarounds which need a global cleanup. Extend the sPAPR IRQ backend and the PowerNV Chip class with a new cpu_intc_reset() handler called by the CPU reset handler and remove the XiveTCTX reset handler which is now redundant. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20191022163812.330-6-clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-10-24 13:33:45 +11:00
Cédric Le Goater	aa5ac64b23	ppc/pnv: Add a PnvChip pointer to PnvCore We will use it to reset the interrupt presenter from the CPU reset handler. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Message-Id: <20191022163812.330-5-clg@kaod.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-10-24 13:33:33 +11:00
David Gibson	54255c1f65	spapr: Move SpaprIrq::nr_xirqs to SpaprMachineClass For the benefit of peripheral device allocation, the number of available irqs really wants to be the same on a given machine type version, regardless of what irq backends we are using. That's the case now, but only because we make sure the different SpaprIrq instances have the same value except for the special legacy one. Since this really only depends on machine type version, move the value to SpaprMachineClass instead of SpaprIrq. This also puts the code to set it to the lower value on old machine types right next to setting legacy_irq_allocation, which needs to go hand in hand. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org>	2019-10-24 09:36:55 +11:00
David Gibson	8cbe71ecb8	spapr: Remove SpaprIrq::nr_msis The nr_msis value we use here has to line up with whether we're using legacy or modern irq allocation. Therefore it's safer to derive it based on legacy_irq_allocation rather than having SpaprIrq contain a canned value. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org>	2019-10-24 09:36:55 +11:00
David Gibson	605994e5b7	spapr, xics, xive: Move SpaprIrq::post_load hook to backends The remaining logic in the post_load hook really belongs to the interrupt controller backends, and just needs to be called on the active controller (after the active controller is set to the right thing based on the incoming migration in the generic spapr_irq_post_load() logic). Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org>	2019-10-24 09:36:55 +11:00
David Gibson	567192d486	spapr, xics, xive: Move SpaprIrq::reset hook logic into activate/deactivate It turns out that all the logic in the SpaprIrq::reset hooks (and some in the SpaprIrq::post_load hooks) isn't really related to resetting the irq backend (that's handled by the backends' own reset routines). Rather its about getting the backend ready to be the active interrupt controller or stopping being the active interrupt controller - reset (and post_load) is just the only time that changes at present. To make this flow clearer, move the logic into the explicit backend activate and deactivate hooks. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org>	2019-10-24 09:36:55 +11:00
David Gibson	0a17e0c39f	spapr: Remove SpaprIrq::init_kvm hook This hook is a bit odd. The only caller is spapr_irq_init_kvm(), but it explicitly takes an SpaprIrq *, so it's never really called through the current SpaprIrq. Essentially this is just a way of passing through a function pointer so that spapr_irq_init_kvm() can handle some configuration and error handling logic without duplicating it between the xics and xive reset paths. So, make it just take that function pointer. Because of earlier reworks to the KVM connect/disconnect code in the xics and xive backends we can also eliminate some wrapper functions and streamline error handling a bit. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org>	2019-10-24 09:36:55 +11:00
David Gibson	98a39a7927	spapr, xics, xive: Match signatures for XICS and XIVE KVM connect routines Both XICS and XIVE have routines to connect and disconnect KVM with similar but not identical signatures. This adjusts them to match exactly, which will be useful for further cleanups later. While we're there, we add an explicit return value to the connect path to streamline error reporting in the callers. We remove error reporting the disconnect path. In the XICS case this wasn't used at all. In the XIVE case the only error case was if the KVM device was set up, but KVM didn't have the capability to do so which is pretty obviously impossible. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org>	2019-10-24 09:36:55 +11:00
David Gibson	05289273c0	spapr, xics, xive: Move dt_populate from SpaprIrq to SpaprInterruptController This method depends only on the active irq controller. Now that we've formalized the notion of active controller we can dispatch directly through that, rather than dispatching via SpaprIrq with the dual version having to do a second conditional dispatch. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org>	2019-10-24 09:36:55 +11:00
David Gibson	328d8eb24d	spapr, xics, xive: Move print_info from SpaprIrq to SpaprInterruptController This method depends only on the active irq controller. Now that we've formalized the notion of active controller we can dispatch directly through that, rather than dispatching via SpaprIrq with the dual version having to do a second conditional dispatch. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org>	2019-10-24 09:36:55 +11:00
David Gibson	7bcdbcca2f	spapr, xics, xive: Move set_irq from SpaprIrq to SpaprInterruptController This method depends only on the active irq controller. Now that we've formalized the notion of active controller we can dispatch directly through that, rather than dispatching via SpaprIrq with the dual version having to do a second conditional dispatch. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org>	2019-10-24 09:36:55 +11:00
David Gibson	81106ddd1a	spapr: Formalize notion of active interrupt controller spapr now has the mechanism of constructing both XICS and XIVE instances of the SpaprInterruptController interface. However, only one of the interrupt controllers will actually be active at any given time, depending on feature negotiation with the guest. This is handled in the current code via spapr_irq_current() which checks the OV5 vector from feature negotiation to determine the current backend. Determining the active controller at the point we need it like this can be pretty confusing, because it makes it very non obvious at what points the active controller can change. This can make it difficult to reason about the code and where a change of active controller could appear in sequence with other events. Make this mechanism more explicit by adding an 'active_intc' pointer and an explicit spapr_irq_update_active_intc() function to update it from the CAS state. We also add hooks on the intc backend which will get called when it is activated or deactivated. For now we just introduce the switch and hooks, later patches will actually start using them. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org>	2019-10-24 09:36:55 +11:00
David Gibson	0b0e52b131	spapr, xics, xive: Move irq claim and free from SpaprIrq to SpaprInterruptController These methods, like cpu_intc_create, really belong to the interrupt controller, but need to be called on all possible intcs. Like cpu_intc_create, therefore, make them methods on the intc and always call it for all existing intcs. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org>	2019-10-24 09:36:55 +11:00
David Gibson	ebd6be089b	spapr, xics, xive: Move cpu_intc_create from SpaprIrq to SpaprInterruptController This method essentially represents code which belongs to the interrupt controller, but needs to be called on all possible intcs, rather than just the currently active one. The "dual" version therefore calls into the xics and xive versions confusingly. Handle this more directly, by making it instead a method on the intc backend, and always calling it on every backend that exists. While we're there, streamline the error reporting a bit. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org>	2019-10-24 09:36:55 +11:00
David Gibson	150e25f85b	spapr, xics, xive: Introduce SpaprInterruptController QOM interface The SpaprIrq structure is used to represent ths spapr machine's irq backend. Except that it kind of conflates two concepts: one is the backend proper - a specific interrupt controller that we might or might not be using, the other is the irq configuration which covers the layout of irq space and which interrupt controllers are allowed. This leads to some pretty confusing code paths for the "dual" configuration where its hooks redirect to other SpaprIrq structures depending on the currently active irq controller. To clean this up, we start by introducing a new SpaprInterruptController QOM interface to represent strictly an interrupt controller backend, not counting anything configuration related. We implement this interface in the XICs and XIVE interrupt controllers, and in future we'll move relevant methods from SpaprIrq into it. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org>	2019-10-24 09:36:55 +11:00
Greg Kurz	29cb418749	spapr: Set VSMT to smp_threads by default Support for setting VSMT is available in KVM since linux-4.13. Most distros that support KVM on POWER already have it. It thus seem reasonable enough to have the default machine to set VSMT to smp_threads. This brings contiguous VCPU ids and thus brings their upper bound down to the machine's max_cpus. This is especially useful for XIVE KVM devices, which may thus allocate only one VP descriptor per VCPU. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <157010411885.246126.12610015369068227139.stgit@bahia.lan> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-10-24 09:36:55 +11:00
Cédric Le Goater	106695ab12	ppc/pnv: Improve trigger data definition The trigger data is used for both triggers of a HW source interrupts, PHB, PSI, and triggers for rerouting interrupts between interrupt controllers. When an interrupt is rerouted, the trigger data follows an "END trigger" format. In that case, the remote IC needs EAS containing an END index to perform a lookup of an END. An END trigger, bit0 of word0 set to '1', is defined as : \|0123\|4567\|0123\|4567\|0123\|4567\|0123\|4567\| W0 E=1 \|1P--\|BLOC\| END IDX \| W1 E=1 \|M \| END DATA \| An EAS is defined as : \|0123\|4567\|0123\|4567\|0123\|4567\|0123\|4567\| W0 \|V---\|BLOC\| END IDX \| W1 \|M \| END DATA \| The END trigger adds an extra 'PQ' bit, bit1 of word0 set to '1', signaling that the PQ bits have been checked. That bit is unused in the initial EAS definition. When a HW device performs the trigger, the trigger data follows an "EAS trigger" format because the trigger data in that case contains an EAS index which the IC needs to look for. An EAS trigger, bit0 of word0 set to '0', is defined as : \|0123\|4567\|0123\|4567\|0123\|4567\|0123\|4567\| W0 E=0 \|0P--\|---- ---- ---- ---- ---- ---- ----\| W1 E=0 \|BLOC\| EAS INDEX \| There is also a 'PQ' bit, bit1 of word0 to '1', signaling that the PQ bits have been checked. Introduce these new trigger bits and rename the XIVE_SRCNO macros in XIVE_EAS to reflect better the nature of the data. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20191007084102.29776-2-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-10-24 09:36:55 +11:00
David Gibson	f478d9af21	spapr: Eliminate SpaprIrq::init hook This method is used to set up the interrupt backends for the current configuration. However, this means some confusing redirection between the "dual" mode init and the init hooks for xics only and xive only modes. Since we now have simple flags indicating whether XICS and/or XIVE are supported, it's easier to just open code each initialization directly in spapr_irq_init(). This will also make some future cleanups simpler. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org>	2019-10-04 19:08:23 +10:00
David Gibson	ca62823b79	spapr: Use less cryptic representation of which irq backends are supported SpaprIrq::ov5 stores the value for a particular byte in PAPR option vector 5 which indicates whether XICS, XIVE or both interrupt controllers are available. As usual for PAPR, the encoding is kind of overly complicated and confusing (though to be fair there are some backwards compat things it has to handle). But to make our internal code clearer, have SpaprIrq encode more directly which backends are available as two booleans, and derive the OV5 value from that at the point we need it. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org>	2019-10-04 19:08:23 +10:00
David Gibson	e594c2ad1c	xive: Improve irq claim/free path spapr_xive_irq_claim() returns a bool to indicate if it succeeded. But most of the callers and one callee use int return values and/or an Error * with more information instead. In any case, ints are a more common idiom for success/failure states than bools (one never knows what sense they'll be in). So instead change to an int return value to indicate presence of error + an Error * to describe the details through that call chain. It also didn't actually check if the irq was already claimed, which is one of the primary purposes of the claim path, so do that. spapr_xive_irq_free() also returned a bool... which no callers checked and was always true, so just drop it. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org>	2019-10-04 19:08:23 +10:00
David Gibson	f233cee97b	spapr: Handle freeing of multiple irqs in frontend only spapr_irq_free() can be used to free multiple irqs at once. That's useful for its callers, but there's no need to make the individual backend hooks handle this. We can loop across the irqs in spapr_irq_free() itself and have the hooks just do one at time. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org>	2019-10-04 19:08:23 +10:00
David Gibson	14789694cd	spapr: Eliminate SpaprIrq:get_nodename method This method is used to determine the name of the irq backend's node in the device tree, so that we can find its phandle (after SLOF may have modified it from the phandle we initially gave it). But, in the two cases the only difference between the node name is the presence of a unit address. Searching for a node name without considering unit address is standard practice for the device tree, and fdt_subnode_offset() will do exactly that, making this method unecessary. While we're there, remove the XICS_NODENAME define. The name "interrupt-controller" is required by PAPR (and IEEE1275), and a bunch of places assume it already. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Greg Kurz <groug@kaod.org>	2019-10-04 19:08:22 +10:00
David Gibson	af1861511d	spapr: Simplify spapr_qirq() handling Currently spapr_qirq(), whic is used to find the qemu_irq for an spapr global irq number, redirects through the SpaprIrq::qirq method. But the array of qemu_irqs is allocated in the PAPR layer, not the backends, and so the method implementations all return the same thing, just differing in the preliminary checks they make. So, we can remove the method, and just implement spapr_qirq() directly, including all the relevant checks in one place. We change all those checks into assert()s as well, since a failure here indicates an error in the calling code. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>	2019-10-04 19:08:22 +10:00
David Gibson	fe9b61b246	spapr: Eliminate nr_irqs parameter to SpaprIrq::init The only reason this parameter was needed was to work around the inconsistent meaning of nr_irqs between xics and xive. Now that we've fixed that, we can consistently use the number directly in the SpaprIrq configuration. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org>	2019-10-04 19:08:22 +10:00
David Gibson	ad8de98636	spapr: Clarify and fix handling of nr_irqs Both the XICS and XIVE interrupt backends have a "nr-irqs" property, but it means slightly different things. For XICS (or, strictly, the ICS) it indicates the number of "real" external IRQs. Those start at XICS_IRQ_BASE (0x1000) and don't include the special IPI vector. For XIVE, however, it includes the whole IRQ space, including XIVE's many IPI vectors. The spapr code currently doesn't handle this sensibly, with the nr_irqs value in SpaprIrq having different meanings depending on the backend. We fix this by renaming nr_irqs to nr_xirqs and making it always indicate just the number of external irqs, adjusting the value we pass to XIVE accordingly. We also move to using common constants in most of the irq configurations, to make it clearer that the IRQ space looks the same to the guest (and emulated devices), even if the backend is different. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org>	2019-10-04 19:08:22 +10:00
David Gibson	7678b74a94	spapr: Replace spapr_vio_qirq() helper with spapr_vio_irq_pulse() helper Every caller of spapr_vio_qirq() immediately calls qemu_irq_pulse() with the result, so we might as well just fold that into the helper. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>	2019-10-04 19:08:22 +10:00
David Gibson	9db8c551c9	xics: Create sPAPR specific ICS subtype We create a subtype of TYPE_ICS specifically for sPAPR. For now all this does is move the setup of the PAPR specific hcalls and RTAS calls to the realize() function for this, rather than requiring the PAPR code to explicitly call xics_spapr_init(). In future it will have some more function. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org>	2019-10-04 19:08:22 +10:00
David Gibson	642e92719e	xics: Merge TYPE_ICS_BASE and TYPE_ICS_SIMPLE classes TYPE_ICS_SIMPLE is the only subtype of TYPE_ICS_BASE that's ever instantiated. The existence of different classes is mostly a hang over from when we (misguidedly) had separate subtypes for the KVM and non-KVM version of the device. There could be some call for an abstract base type for ICS variants that use a different representation of their state (PowerNV PHB3 might want this). The current split isn't really in the right place for that though. If we need this in future, we can re-implement it more in line with what we actually need. So, collapse the two classes together into just TYPE_ICS. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org>	2019-10-04 19:08:22 +10:00
David Gibson	da2ef5b2f2	xics: Eliminate reset hook Currently TYPE_XICS_BASE and TYPE_XICS_SIMPLE have their own reset methods, using the standard technique for having the subtype call the supertype's methods before doing its own thing. But TYPE_XICS_SIMPLE is the only subtype of TYPE_XICS_BASE ever instantiated, so there's no point having the split here. Merge them together into just an ics_reset() function. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org>	2019-10-04 19:08:22 +10:00
David Gibson	28976c99cf	xics: Rename misleading ics_simple_() functions There are a number of ics_simple_() functions that aren't actually specific to TYPE_XICS_SIMPLE at all, and are equally valid on TYPE_XICS_BASE. Rename them to ics_*() accordingly. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org>	2019-10-04 19:08:22 +10:00
David Gibson	d5803c7319	xics: Eliminate 'reject', 'resend' and 'eoi' class hooks Currently ics_reject(), ics_resend() and ics_eoi() indirect through class methods. But there's only one implementation of each method, the one in TYPE_ICS_SIMPLE. TYPE_ICS_BASE has no implementation, but it's never instantiated, and has no other subtypes. So clean up by eliminating the method and just having ics_reject(), ics_resend() and ics_eoi() contain the logic directly. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org>	2019-10-04 19:08:21 +10:00
David Gibson	00ed3da9b5	xics: Minor fixes for XICSFabric interface Interface instances should never be directly dereferenced. So, the common practice is to make them incomplete types to make sure no-one does that. XICSFrabric, however, had a dummy type which is less safe. We were also using OBJECT_CHECK() where we should have been using INTERFACE_CHECK(). Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org>	2019-10-04 19:08:21 +10:00
Alexey Kardashevskiy	744a928cce	spapr: Stop providing RTAS blob SLOF implements one itself so let's remove it from QEMU. It is one less image and simpler setup as the RTAS blob never stays in its initial place anyway as the guest OS always decides where to put it. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-10-04 10:25:23 +10:00

1 2 3 4 5 ...

660 Commits