Allocating a big void * array to store multiple objects isn't a
recommended practice for various reasons:
- no compile time type checking
- potential dangling pointers if a reference on an individual is
taken and the array is freed later on
- duplicate boiler plate everywhere the array is browsed through
Allocate an array of pointers and populate it instead.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191125065820.927-4-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The BMC of the OpenPOWER systems monitors the machine state using
sensors, controls the power and controls the access to the PNOR flash
device containing the firmware image required to boot the host.
QEMU models the power cycle process, access to the sensors and access
to the PNOR device. But, for these features to be available, the QEMU
PowerNV machine needs two extras devices on the command line, an IPMI
BT device for communication and a BMC backend device:
-device ipmi-bmc-sim,id=bmc0 -device isa-ipmi-bt,bmc=bmc0,irq=10
The BMC properties are then defined accordingly in the device tree and
OPAL self adapts. If a BMC device and an IPMI BT device are not
available, OPAL does not try to communicate with the BMC in any
manner. This is not how real systems behave.
To be closer to the default behavior, create an IPMI BMC simulator
device and an IPMI BT device at machine initialization time. We loose
the ability to define an external BMC device but there are benefits:
- a better match with real systems,
- a better test coverage of the OPAL code,
- system powerdown and reset commands that work,
- a QEMU device tree compliant with the specifications (*).
(*) Still needs a MBOX device.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191121162340.11049-1-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This activates HIOMAP support on the QEMU PowerNV machine. The PnvPnor
model is used to access the flash contents. The model simply maps the
contents at a fix offset and enables or disables the mapping.
HIOMAP Protocol description :
https://github.com/openbmc/hiomapd/blob/master/Documentation/protocol.md
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191028070027.22752-3-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
And fix a typo in the MEM address space definition.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191118091908.15044-1-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Failing to set any of the ICS property should really never happen:
- object_property_add_child() always succeed unless the child object
already has a parent, which isn't the case here obviously since the
ICS has just been created with object_new()
- the ICS has an "nr-irqs" property than can be set as long as the ICS
isn't realized
In both cases, an error indicates there is a bug in QEMU. Propagating the
error, ie. exiting QEMU since spapr_irq_init() is called with &error_fatal
doesn't make much sense. Abort instead. This is consistent with what is
done with XIVE : both qdev_create() and qdev_prop_set_uint32() abort QEMU
on error.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157403285265.409804.8683093665795248192.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The ICS object has both a pointer and an ICS_PROP_XICS property pointing
to the XICS fabric. Confusing bugs could arise if these ever go out of
sync.
Change the property definition so that it explicitely sets the pointer.
The property isn't optional : not being able to set the link is a bug
and QEMU should rather abort than exit in this case.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157403283596.409804.17347207690271971987.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The XIVE object has both a pointer and a "chip" property pointing to the
chip object. Confusing bugs could arise if these ever go out of sync.
Change the property definition so that it explicitely sets the pointer.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157383336564.165747.10250365296928442882.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The core object has both a pointer and a "chip" property pointing to the
chip object. Confusing bugs could arise if these ever go out of sync.
Change the property definition so that it explicitely sets the pointer.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157383336007.165747.1524120147081367440.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The homer object has both a pointer and a "chip" property pointing to the
chip object. Confusing bugs could arise if these ever go out of sync.
Change the property definition so that it explicitely sets the pointer.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157383335451.165747.32301068645427993.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The OCC object has both a pointer and a "psi" property pointing to the
PSI object. Confusing bugs could arise if these ever go out of sync.
Change the property definition so that it explicitely sets the pointer.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157383334894.165747.7617090757862105199.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The LPC object has both a pointer and a "psi" property pointing to the
PSI object. Confusing bugs could arise if these ever go out of sync.
Change the property definition so that it explicitely sets the pointer.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157383334342.165747.3159314903077305653.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The source object has both a pointer and a "xive" property pointing to the
notifier object. Confusing bugs could arise if these ever go out of sync.
Change the property definition so that it explicitely sets the pointer.
The property isn't optional : not being able to set the link is a bug
and QEMU should rather abort than exit in this case.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157383333227.165747.12901571295951957951.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
It has no apparent user.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157383383118.166856.2588933416368211047.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
It helps skiboot identifying that is running on a QEMU platform. The
compatible string will define the POWERPC processor version.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191106142129.4908-1-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
On a POWERPC PowerNV system, the host firmware is stored in a PNOR
flash chip which contents is mapped on the LPC bus. This model adds a
simple dummy device to map the contents of a block device in the host
address space.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191021131215.3693-2-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Add 5.0 machine types for arm/i440fx/q35/s390x/spapr.
For i440fx and q35, unversioned cpu models are still translated
to -v1; I'll leave changing this (if desired) to the respective
maintainers.
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20191112104811.30323-1-cohuck@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Virtio spec 1.1 (and earlier), 5.2.5.2 Driver Requirements: Device
Initialization:
"Devices SHOULD always offer VIRTIO_BLK_F_FLUSH, and MUST offer it if
they offer VIRTIO_BLK_F_CONFIG_WCE"
Currently F_CONFIG_WCE and F_WCE are not connected to each other.
Qemu will advertise F_CONFIG_WCE if config-wce argument is
set for virtio-blk device. And F_WCE is advertised only if
underlying block backend actually has it's caching enabled.
Fix this by advertising F_WCE if F_CONFIG_WCE is also advertised.
To preserve backwards compatibility with newer machine types make this
behaviour governed by "x-enable-wce-if-config-wce" virtio-blk-device
property and introduce hw_compat_4_2 with new property being off by
default for all machine types <= 4.2 (but don't introduce 4.3
machine type itself yet).
Signed-off-by: Evgeny Yakovlev <wrfsh@yandex-team.ru>
Message-Id: <1572978137-189218-1-git-send-email-wrfsh@yandex-team.ru>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Traditional PCI INTx for vfio devices can only perform well if using
an in-kernel irqchip. Therefore, vfio_intx_update() issues a warning
if an in kernel irqchip is not available.
We usually do have an in-kernel irqchip available for pseries machines
on POWER hosts. However, because the platform allows feature
negotiation of what interrupt controller model to use, we don't
currently initialize it until machine reset. vfio_intx_update() is
called (first) from vfio_realize() before that, so it can issue a
spurious warning, even if we will have an in kernel irqchip by the
time we need it.
To workaround this, make a call to spapr_irq_update_active_intc() from
spapr_irq_init() which is called at machine realize time, before the
vfio realize. This call will be pretty much obsoleted by the later
call at reset time, but it serves to suppress the spurious warning
from VFIO.
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Tested-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
pseries machine type can have one of two different interrupt controllers in
use depending on feature negotiation with the guest. Usually this is
invisible to devices, because they route to a common set of qemu_irqs which
in turn dispatch to the correct back end.
VFIO passthrough devices, however, wire themselves up directly to the KVM
irqchip for performance, which means they are affected by this change in
interrupt controller. To get them to adjust correctly for the change in
irqchip, we need to fire the kvm irqchip change notifier.
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Tested-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Since "spapr: Render full FDT on ibm,client-architecture-support" we build
the entire flatten device tree (FDT) twice - at the reset time and
when "ibm,client-architecture-support" (CAS) is called. The full FDT from
CAS is then applied on top of the SLOF internal device tree.
This is mostly ok, however there is a case when the QEMU is started with
-initrd and for some reason the guest decided to move/unpack the init RAM
disk image - the guest correctly notifies SLOF about the change but
at CAS it is overridden with the QEMU initial location addresses and
the guest may fail to boot if the original initrd memory was changed.
This fixes the problem by only adding the /chosen node at the reset time
to prevent the original QEMU's linux,initrd-start/linux,initrd-end to
override the updated addresses.
This only treats /chosen differently as we know there is a special case
already and it is unlikely anything else will need to change /chosen at CAS
we are better off not touching /chosen after we handed it over to SLOF.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20191024041308.5673-1-aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
SpaprInterruptControllerClass and PnvChipClass have an intc_create() method
that calls the appropriate routine, ie. icp_create() or xive_tctx_create(),
to establish the link between the VCPU and the presenter component of the
interrupt controller during realize.
There aren't any symmetrical call to be called when the VCPU gets unrealized
though. It is assumed that object_unparent() is the only thing to do.
This is questionable because the parenting logic around the CPU and
presenter objects is really an implementation detail of the interrupt
controller. It shouldn't be open-coded in the machine code.
Fix this by adding an intc_destroy() method that undoes what was done in
intc_create(). Also NULLify the presenter pointers to avoid having
stale pointers around. This will allow to reliably check if a vCPU has
a valid presenter.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157192724208.3146912.7254684777515287626.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
There are three page size in qemu:
real host page size
host page size
target page size
All of them have dedicate variable to represent. For the last two, we
use the same form in the whole qemu project, while for the first one we
use two forms: qemu_real_host_page_size and getpagesize().
qemu_real_host_page_size is defined to be a replacement of
getpagesize(), so let it serve the role.
[Note] Not fully tested for some arch or device.
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <20191013021145.16011-3-richardw.yang@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The M48T59 is a Real Time Clock, not a timer.
Move it under the hw/rtc/ subdirectory.
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20191003230404.19384-5-philmd@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
The MC146818 is a Real Time Clock, not a timer.
Move it under the hw/rtc/ subdirectory.
Use copyright statement from 80cabfad16 for "hw/rtc/mc146818rtc.h".
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20191003230404.19384-4-philmd@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Last pull request before soft freeze.
* Lots of fixes and cleanups for spapr interrupt controllers
* More SLOF updates to fix problems with full FDT rendering at CAS
time (alas, more yet are to come)
* A few other assorted changes
This isn't quite as well tested as I usually try to do before a pull
request. But I've been sick and running into some other difficulties,
and wanted to get this sent out before heading towards KVM forum.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEdfRlhq5hpmzETofcbDjKyiDZs5IFAl2xXWcACgkQbDjKyiDZ
s5Jy/BAAsSo514vGCjdszXcRH3nFeODKJadlSsUX+32JFP1yJS9ooxkcmIN7o9Wp
3VCkMHQPVV9jjIvvShWOSGfDDO3o8fTEucOIX/Nn9wfq+NiY+EJst0v+8OT48CSX
LEXiy9Wghs9pZMLCUZ3rlLPBiU/Lhzf+KTCoUdc40tfoZMMz1lp/Uy8IdIYTYwLl
/z++r4X8FOsXsDDsFopWffVdVBLJz6Var6NgBa8ISk2gGnUOAKsrTE3bD9L6n4PR
YYbMSkv+SbvXg4gm53jUb9cQgpBqQpWHUYBIbKia/16EzbIkkZjFE2jGQMP5c72h
ZOml7ZQtQVWIEEZwKPN67S8bKiVbEfayxHYViejn/uUqv3AwW0wi7FlBVv37YNJ4
TxPvLBu+0DaFbk5y6/XHyL6XomG1/oH6qXOM2JhIWON7HI3rRWoMQbZ6QVJ1Gwk2
uwrvOOL5kVZySotOw5bDkTXYp/Nm1JE4QwOXFPkXzaekcZhRlEqqrkBddhKtF80p
1e5hGp5RgoILIe8uHJQ7decUMk889J7Qdtakv6BWvOci4dbIiZEp/smFlzgTcPnW
DQJONP/awnoAOS3v0bItf59DROvkZ5xyv8yQZFP3qThSOfZl4e95WNRbtR3vtjU4
Bl4Pdte15URKy5nM0XnnLg9mzl2xufdwEsu76lQMuYpe6nCI2h0=
=FRHF
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-4.2-20191024' into staging
ppc patch queue 2019-10-24
Last pull request before soft freeze.
* Lots of fixes and cleanups for spapr interrupt controllers
* More SLOF updates to fix problems with full FDT rendering at CAS
time (alas, more yet are to come)
* A few other assorted changes
This isn't quite as well tested as I usually try to do before a pull
request. But I've been sick and running into some other difficulties,
and wanted to get this sent out before heading towards KVM forum.
# gpg: Signature made Thu 24 Oct 2019 09:14:31 BST
# gpg: using RSA key 75F46586AE61A66CC44E87DC6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" [full]
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" [full]
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" [full]
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>" [unknown]
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-4.2-20191024: (28 commits)
spapr/xive: Set the OS CAM line at reset
ppc/pnv: Fix naming of routines realizing the CPUs
ppc: Reset the interrupt presenter from the CPU reset handler
ppc/pnv: Add a PnvChip pointer to PnvCore
ppc/pnv: Introduce a PnvCore reset handler
spapr_cpu_core: Implement DeviceClass::reset
spapr: move CPU reset after presenter creation
spapr: Don't request to unplug the same core twice
pseries: Update SLOF firmware image
spapr: Move SpaprIrq::nr_xirqs to SpaprMachineClass
spapr: Remove SpaprIrq::nr_msis
spapr, xics, xive: Move SpaprIrq::post_load hook to backends
spapr, xics, xive: Move SpaprIrq::reset hook logic into activate/deactivate
spapr: Remove SpaprIrq::init_kvm hook
spapr, xics, xive: Match signatures for XICS and XIVE KVM connect routines
spapr, xics, xive: Move dt_populate from SpaprIrq to SpaprInterruptController
spapr, xics, xive: Move print_info from SpaprIrq to SpaprInterruptController
spapr, xics, xive: Move set_irq from SpaprIrq to SpaprInterruptController
spapr: Formalize notion of active interrupt controller
spapr, xics, xive: Move irq claim and free from SpaprIrq to SpaprInterruptController
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
rs6000mc_realize() violates memory_region_allocate_system_memory() contract
by calling it multiple times which could break -mem-path. Replace it with
plain memory_region_init_ram() instead.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20191008113318.7012-3-imammedo@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The 'vcpu' suffix is inherited from the sPAPR machine. Use better
names for PowerNV.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <20191022163812.330-7-clg@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
On the sPAPR machine and PowerNV machine, the interrupt presenters are
created by a machine handler at the core level and are reset
independently. This is not consistent and it raises issues when it
comes to handle hot-plugged CPUs. In that case, the presenters are not
reset. This is less of an issue in XICS, although a zero MFFR could
be a concern, but in XIVE, the OS CAM line is not set and this breaks
the presenting algorithm. The current code has workarounds which need
a global cleanup.
Extend the sPAPR IRQ backend and the PowerNV Chip class with a new
cpu_intc_reset() handler called by the CPU reset handler and remove
the XiveTCTX reset handler which is now redundant.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191022163812.330-6-clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
We will use it to reset the interrupt presenter from the CPU reset
handler.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <20191022163812.330-5-clg@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
in which individual CPUs are reset. It will ease the introduction of
future change reseting the interrupt presenter from the CPU reset
handler.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <20191022163812.330-4-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Since vCPUs aren't plugged into a bus, we manually register a reset
handler for each vCPU. We also call this handler at realize time
to ensure hot plugged vCPUs are reset before being exposed to the
guest. This results in vCPUs being reset twice at machine reset.
It doesn't break anything but it is slightly suboptimal and above
all confusing.
The hotplug path in device_set_realized() already knows how to reset
a hotplugged device if the device reset handler is present. Implement
one for sPAPR CPU cores that resets all vCPUs under a core.
While here rename spapr_cpu_reset() to spapr_reset_vcpu() for
consistency with spapr_realize_vcpu() and spapr_unrealize_vcpu().
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
[clg: add documentation on the reset helper usage ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191022163812.330-3-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This change prepares ground for future changes which will reset the
interrupt presenter in the reset handler of the sPAPR and PowerNV
cores.
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191022163812.330-2-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
We must not call spapr_drc_detach() on a detached DRC otherwise bad things
can happen, ie. QEMU hangs or crashes. This is easily demonstrated with
a CPU hotplug/unplug loop using QMP.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157185826035.3073024.1664101000438499392.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
For the benefit of peripheral device allocation, the number of available
irqs really wants to be the same on a given machine type version,
regardless of what irq backends we are using. That's the case now, but
only because we make sure the different SpaprIrq instances have the same
value except for the special legacy one.
Since this really only depends on machine type version, move the value to
SpaprMachineClass instead of SpaprIrq. This also puts the code to set it
to the lower value on old machine types right next to setting
legacy_irq_allocation, which needs to go hand in hand.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
The nr_msis value we use here has to line up with whether we're using
legacy or modern irq allocation. Therefore it's safer to derive it based
on legacy_irq_allocation rather than having SpaprIrq contain a canned
value.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
The remaining logic in the post_load hook really belongs to the interrupt
controller backends, and just needs to be called on the active controller
(after the active controller is set to the right thing based on the
incoming migration in the generic spapr_irq_post_load() logic).
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
It turns out that all the logic in the SpaprIrq::reset hooks (and some in
the SpaprIrq::post_load hooks) isn't really related to resetting the irq
backend (that's handled by the backends' own reset routines). Rather its
about getting the backend ready to be the active interrupt controller or
stopping being the active interrupt controller - reset (and post_load) is
just the only time that changes at present.
To make this flow clearer, move the logic into the explicit backend
activate and deactivate hooks.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
This hook is a bit odd. The only caller is spapr_irq_init_kvm(), but
it explicitly takes an SpaprIrq *, so it's never really called through the
current SpaprIrq. Essentially this is just a way of passing through a
function pointer so that spapr_irq_init_kvm() can handle some
configuration and error handling logic without duplicating it between the
xics and xive reset paths.
So, make it just take that function pointer. Because of earlier reworks
to the KVM connect/disconnect code in the xics and xive backends we can
also eliminate some wrapper functions and streamline error handling a bit.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Both XICS and XIVE have routines to connect and disconnect KVM with
similar but not identical signatures. This adjusts them to match
exactly, which will be useful for further cleanups later.
While we're there, we add an explicit return value to the connect path
to streamline error reporting in the callers. We remove error
reporting the disconnect path. In the XICS case this wasn't used at
all. In the XIVE case the only error case was if the KVM device was
set up, but KVM didn't have the capability to do so which is pretty
obviously impossible.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
This method depends only on the active irq controller. Now that we've
formalized the notion of active controller we can dispatch directly
through that, rather than dispatching via SpaprIrq with the dual
version having to do a second conditional dispatch.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
This method depends only on the active irq controller. Now that we've
formalized the notion of active controller we can dispatch directly
through that, rather than dispatching via SpaprIrq with the dual
version having to do a second conditional dispatch.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
This method depends only on the active irq controller. Now that we've
formalized the notion of active controller we can dispatch directly through
that, rather than dispatching via SpaprIrq with the dual version having
to do a second conditional dispatch.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
spapr now has the mechanism of constructing both XICS and XIVE instances of
the SpaprInterruptController interface. However, only one of the interrupt
controllers will actually be active at any given time, depending on feature
negotiation with the guest. This is handled in the current code via
spapr_irq_current() which checks the OV5 vector from feature negotiation to
determine the current backend.
Determining the active controller at the point we need it like this
can be pretty confusing, because it makes it very non obvious at what
points the active controller can change. This can make it difficult
to reason about the code and where a change of active controller could
appear in sequence with other events.
Make this mechanism more explicit by adding an 'active_intc' pointer
and an explicit spapr_irq_update_active_intc() function to update it
from the CAS state. We also add hooks on the intc backend which will
get called when it is activated or deactivated.
For now we just introduce the switch and hooks, later patches will
actually start using them.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
These methods, like cpu_intc_create, really belong to the interrupt
controller, but need to be called on all possible intcs.
Like cpu_intc_create, therefore, make them methods on the intc and
always call it for all existing intcs.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
This method essentially represents code which belongs to the interrupt
controller, but needs to be called on all possible intcs, rather than
just the currently active one. The "dual" version therefore calls
into the xics and xive versions confusingly.
Handle this more directly, by making it instead a method on the intc
backend, and always calling it on every backend that exists.
While we're there, streamline the error reporting a bit.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
The SpaprIrq structure is used to represent ths spapr machine's irq
backend. Except that it kind of conflates two concepts: one is the
backend proper - a specific interrupt controller that we might or
might not be using, the other is the irq configuration which covers
the layout of irq space and which interrupt controllers are allowed.
This leads to some pretty confusing code paths for the "dual"
configuration where its hooks redirect to other SpaprIrq structures
depending on the currently active irq controller.
To clean this up, we start by introducing a new
SpaprInterruptController QOM interface to represent strictly an
interrupt controller backend, not counting anything configuration
related. We implement this interface in the XICs and XIVE interrupt
controllers, and in future we'll move relevant methods from SpaprIrq
into it.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Support for setting VSMT is available in KVM since linux-4.13. Most distros
that support KVM on POWER already have it. It thus seem reasonable enough
to have the default machine to set VSMT to smp_threads.
This brings contiguous VCPU ids and thus brings their upper bound down to
the machine's max_cpus. This is especially useful for XIVE KVM devices,
which may thus allocate only one VP descriptor per VCPU.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157010411885.246126.12610015369068227139.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Include the XIVE_TRIGGER_PQ bit in the trigger data which is how
hardware signals to the IC that the PQ bits of the interrupt source
have been checked.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191007084102.29776-3-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Add MachineClass::auto_enable_numa field. When it is true, a NUMA node
is expected to be created implicitly.
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20190905083238.1799-1-tao3.xu@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>