Commit Graph

1982 Commits

Author SHA1 Message Date
Greg Kurz
e44acde2f8 ppc/pnv: Drop "num-chips" machine property
The number of CPU chips of the powernv machine is configurable through a
"num-chips" property. This doesn't fit well with the CPU topology, eg.
some configurations can come up with more CPUs than the maximum of CPUs
set in the toplogy. This causes assertion to be hit with mttcg:

   -machine powernv,num-chips=2 -smp cores=2 -accel tcg,thread=multi

ERROR:
tcg/tcg.c:789:tcg_register_thread: assertion failed: (n < ms->smp.max_cpus)
Aborted (core dumped)

Mttcg mandates the CPU topology to be dimensioned to the actual number
of CPUs, depending on the number of chips the user asked for. That is,
'-machine num-chips=N' should always have a '-smp' companion with a
topology that meats the resulting number of CPUs, typically
'-smp sockets=N'.

It thus seems that "num-chips" doesn't bring anything but forcing the user
to specify the requested number of chips on the command line twice. Simplify
the command line by computing the number of chips based on the CPU topology
exclusively. The powernv machine isn't a production thing ; it is mostly
used by developpers to prepare the bringup of real HW. Because of this and
for simplicity, this deliberately ignores the official deprecation process
and dumps "num-chips" right away : '-smp sockets=N' is now the only way to
control the number of CPU chips.

This is done at machine init because smp_parse() is called after instance
init.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157830658266.533764.2214183961444213947.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-01-08 11:01:59 +11:00
Daniel Henrique Barboza
400431ef48 ppc440_bamboo.c: remove label from bamboo_load_device_tree()
'out' label can be replaced by 'return -1' in all cases.

CC: David Gibson <david@gibson.dropbear.id.au>
CC: qemu-ppc@nongnu.org
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20200106182425.20312-3-danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-01-08 11:01:59 +11:00
Daniel Henrique Barboza
9b6c1da5e9 spapr.c: remove 'out' label in spapr_dt_cas_updates()
'out' can be replaced by 'return' with the appropriate
return value.

CC: David Gibson <david@gibson.dropbear.id.au>
CC: qemu-ppc@nongnu.org
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20200106182425.20312-2-danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-01-08 11:01:59 +11:00
Cédric Le Goater
8f06e3705e ppc/pnv: Modify the powerdown notifier to get the PowerNV machine
Use container_of() instead of qdev_get_machine()

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191219181155.32530-2-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-01-08 11:01:59 +11:00
Bharata B Rao
905db91697 ppc/spapr: Support reboot of secure pseries guest
A pseries guest can be run as a secure guest on Ultravisor-enabled
POWER platforms. When such a secure guest is reset, we need to
release/reset a few resources both on ultravisor and hypervisor side.
This is achieved by invoking this new ioctl KVM_PPC_SVM_OFF from the
machine reset path.

As part of this ioctl, the secure guest is essentially transitioned
back to normal mode so that it can reboot like a regular guest and
become secure again.

This ioctl has no effect when invoked for a normal guest. If this ioctl
fails for a secure guest, the guest is terminated.

Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
Message-Id: <20191219031445.8949-3-bharata@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-01-08 11:01:59 +11:00
Marc-André Lureau
3cad405bab vmstate: replace DeviceState with VMStateIf
Replace DeviceState dependency with VMStateIf on vmstate API.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
2020-01-06 18:41:32 +04:00
Peter Maydell
4800819827 * More uses of RCU_READ_LOCK_GUARD (Dave, myself)
* QOM doc improvments (Greg)
 * Cleanups from the Meson conversion (Marc-André)
 * Support for multiple -accel options (myself)
 * Many x86 machine cleanup (Philippe, myself)
 * tests/migration-test cleanup (Juan)
 * PC machine removal and next round of deprecation (Thomas)
 * kernel-doc integration (Peter, myself)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJd+YJGAAoJEL/70l94x66D0YYIAIZpS6i6NYJC8KHCl49fjI7U
 qHDN7MiKYTU+l3i0+iGmQL6XN5ClAY0pXkY5LBFIDpsohHR5f4jdrIKjyvcHzuIM
 gx/NLsiA45/niHYrn/hEo0P7CwGTrrdWL+SVmScnKcwYiBzMO/uYblxlbUBKLPNn
 eGaKQmEkvlUBR9GS6S1+jYg8234ZRZ4+12t5dqqADBQ7Kc0wn6KC5yebIoQxCgVc
 9F5Ezdkl7befrTI7El3EC6aT18bKhIBZIs1PT/hzqzlGFhBuKM7uKDb43Yx8c7XQ
 bk5vzHmblPAgQyK4OETQ+DM745AOk6vBiJZbR9nrDUXWvUkrEXTQZMJKU0FXdlE=
 =hyYX
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* More uses of RCU_READ_LOCK_GUARD (Dave, myself)
* QOM doc improvments (Greg)
* Cleanups from the Meson conversion (Marc-André)
* Support for multiple -accel options (myself)
* Many x86 machine cleanup (Philippe, myself)
* tests/migration-test cleanup (Juan)
* PC machine removal and next round of deprecation (Thomas)
* kernel-doc integration (Peter, myself)

# gpg: Signature made Wed 18 Dec 2019 01:35:02 GMT
# gpg:                using RSA key BFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (87 commits)
  vga: cleanup mapping of VRAM for non-PCI VGA
  hw/display: Remove "rombar" hack from vga-pci and vmware_vga
  hw/pci: Remove the "command_serr_enable" property
  hw/audio: Remove the "use_broken_id" hack from the AC97 device
  hw/i386: Remove the deprecated machines 0.12 up to 0.15
  hw/pci-host: Add Kconfig entry to select the IGD Passthrough Host Bridge
  hw/pci-host/i440fx: Extract the IGD passthrough host bridge device
  hw/pci-host/i440fx: Use definitions instead of magic values
  hw/pci-host/i440fx: Use size_t to iterate over ARRAY_SIZE()
  hw/pci-host/i440fx: Extract PCII440FXState to "hw/pci-host/i440fx.h"
  hw/pci-host/i440fx: Correct the header description
  Fix some comment spelling errors.
  target/i386: remove unused pci-assign codes
  WHPX: refactor load library
  migration: check length directly to make sure the range is aligned
  memory: include MemoryListener documentation and some missing function parameters
  docs: add memory API reference
  memory.h: Silence kernel-doc complaints
  docs: Create bitops.rst as example of kernel-docs
  bitops.h: Silence kernel-doc complaints
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-12-20 11:20:25 +00:00
Vladimir Sementsov-Ogievskiy
0c115681a5 ppc: make Error **errp const where it is appropriate
Mostly, Error ** is for returning error from the function, so the
callee sets it. However kvmppc_hint_smt_possible gets already filled
errp parameter. It doesn't change the pointer itself, only change the
internal state of referenced Error object. So we can make it Error
*const * errp, to stress the behavior. It will also help coccinelle
script (in future) to distinguish such cases from common errp usage.

While there, rename the function to
kvmppc_error_append_smt_possible_hint().

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <20191205174635.18758-8-vsementsov@virtuozzo.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Commit message replaced]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2019-12-18 08:43:19 +01:00
Markus Armbruster
1a639fdf96 Revert "ppc: well form kvmppc_hint_smt_possible error hint helper"
This reverts commit cdcca22aab.

Commit cdcca22aab is a superseded version of the next commit that
crept in by accident.  Revert it, so the final version applies.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
2019-12-18 08:40:09 +01:00
Markus Armbruster
8ca63ba8c2 error: Clean up unusual names of Error * variables
Local Error * variables are conventionally named @err or @local_err,
and Error ** parameters @errp.  Naming local variables like parameters
is confusing.  Clean that up.

Naming parameters like local variables is also confusing.  Left for
another day.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20191204093625.14836-17-armbru@redhat.com>
2019-12-18 08:36:15 +01:00
Paolo Bonzini
4376c40ded kvm: introduce kvm_kernel_irqchip_* functions
The KVMState struct is opaque, so provide accessors for the fields
that will be moved from current_machine to the accelerator.  For now
they just forward to the machine object, but this will change.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-12-17 19:32:45 +01:00
Greg Kurz
5084c8b763 ppc/pnv: Drop PnvChipClass::type
It isn't used anymore.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157623844102.360005.12070225703151669294.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-12-17 10:59:11 +11:00
Greg Kurz
70c059e926 ppc/pnv: Introduce PnvChipClass::xscom_pcba() method
The XSCOM bus is implemented with a QOM interface, which is mostly
generic from a CPU type standpoint, except for the computation of
addresses on the Pervasive Connect Bus (PCB) network. This is handled
by the pnv_xscom_pcba() function with a switch statement based on
the chip_type class level attribute of the CPU chip.

This can be achieved using QOM. Also the address argument is masked with
PNV_XSCOM_SIZE - 1, which is for POWER8 only. Addresses may have different
sizes with other CPU types. Have each CPU chip type handle the appropriate
computation with a QOM xscom_pcba() method.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157623843543.360005.13996472463887521794.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-12-17 10:59:11 +11:00
Greg Kurz
c396c58a02 ppc/pnv: Pass content of the "compatible" property to pnv_dt_xscom()
Since pnv_dt_xscom() is called from chip specific dt_populate() hooks,
it shouldn't have to guess the chip type in order to populate the
"compatible" property. Just pass the compat string and its size as
arguments.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157623842430.360005.9513965612524265862.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-12-17 10:59:11 +11:00
Greg Kurz
3f5b45ca4f ppc/pnv: Pass XSCOM base address and address size to pnv_dt_xscom()
Since pnv_dt_xscom() is called from chip specific dt_populate() hooks,
it shouldn't have to guess the chip type in order to populate the "reg"
property. Just pass the base address and address size as arguments.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157623841868.360005.17577624823547136435.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-12-17 10:59:11 +11:00
Greg Kurz
c4b2c40c0e ppc/pnv: Introduce PnvChipClass::xscom_core_base() method
The pnv_chip_core_realize() function configures the XSCOM MMIO subregion
for each core of a single chip. The base address of the subregion depends
on the CPU type. Its computation is currently open-code using the
pnv_chip_is_powerXX() helpers. This can be achieved with QOM. Introduce
a method for this in the base chip class and implement it in child classes.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157623841311.360005.4705705734873339545.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-12-17 10:59:11 +11:00
Greg Kurz
85913070a6 ppc/pnv: Introduce PnvChipClass::intc_print_info() method
The pnv_pic_print_info() callback checks the type of the chip in order
to forward to the request appropriate interrupt controller. This can
be achieved with QOM. Introduce a method for this in the base chip class
and implement it in child classes.

This also prepares ground for the upcoming interrupt controller of POWER10
chips.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157623840755.360005.5002022339473369934.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-12-17 10:59:10 +11:00
Greg Kurz
7a90c6a1b6 ppc/pnv: Introduce PnvMachineClass::dt_power_mgt()
We add an extra node to advertise power management on some machines,
namely powernv9 and powernv10. This is achieved by using the
pnv_is_power9() and pnv_is_power10() helpers.

This can be achieved with QOM. Add a method to the base class for
powernv machines and have it implemented by machine types that
support power management instead.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157623839642.360005.9243510140436689941.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-12-17 10:59:10 +11:00
Greg Kurz
d76f2da7a5 ppc/pnv: Introduce PnvMachineClass and PnvMachineClass::compat
The pnv_dt_create() function generates different contents for the
"compatible" property of the root node in the DT, depending on the
CPU type. This is open coded with multiple ifs using pnv_is_powerXX()
helpers.

It seems cleaner to achieve with QOM. Introduce a base class for the
powernv machine and a compat attribute that each child class can use
to provide the value for the "compatible" property.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157623839085.360005.4046508784077843216.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
[dwg: Folded in small fix Greg spotted after posting]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-12-17 10:58:49 +11:00
Greg Kurz
248e4e924e ppc/pnv: Drop PnvPsiClass::chip_type
It isn't used anymore.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157623838530.360005.15470128760871845396.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-12-17 10:39:48 +11:00
Greg Kurz
41c4ef7009 ppc/pnv: Introduce PnvPsiClass::compat
The Processor Service Interface (PSI) model has a chip_type class level
attribute, which is used to generate the content of the "compatible" DT
property according to the CPU type.

Since the PSI model already has specialized classes for each supported
CPU type, it seems cleaner to achieve this with QOM. Provide the content
of the "compatible" property with a new class level attribute.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157623837974.360005.14706607446188964477.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-12-17 10:39:48 +11:00
Cédric Le Goater
3a1b70b66b ppc/pnv: Fix OCC common area region mapping
The OCC common area is mapped at a unique address on the system and
each OCC is assigned a segment to expose its sensor data :

  -------------------------------------------------------------------------
  | Start (Offset from | End           | Size     |Description            |
  | BAR2 base address) |               |          |                       |
  -------------------------------------------------------------------------
  |    0x00580000      |  0x005A57FF   |150kB     |OCC 0 Sensor Data Block|
  |    0x005A5800      |  0x005CAFFF   |150kB     |OCC 1 Sensor Data Block|
  |        :           |       :       |  :       |            :          |
  |    0x00686800      |  0x006ABFFF   |150kB     |OCC 7 Sensor Data Block|
  |    0x006AC000      |  0x006FFFFF   |336kB     |Reserved               |
  -------------------------------------------------------------------------

Maximum size is 1.5MB.

We could define a "OCC common area" memory region at the machine level
and sub regions for each OCC. But it adds some extra complexity to the
models. Fix the current layout with a simpler model.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191211082912.2625-3-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-12-17 10:39:48 +11:00
Cédric Le Goater
8f09231631 ppc/pnv: Introduce PBA registers
The PBA bridge unit (Power Bus Access) connects the OCC (On Chip
Controller) to the Power bus and System Memory. The PBA is used to
gather sensor data, for power management, for sleep states, for
initial boot, among other things.

The PBA logic provides a set of four registers PowerBus Access Base
Address Registers (PBABAR0..3) which map the OCC address space to the
PowerBus space. These registers are setup by the initial FW and define
the PowerBus Range of system memory that can be accessed by PBA.

The current modeling of the PBABAR registers is done under the common
XSCOM handlers. We introduce a specific XSCOM regions for these
registers and fix :

 - BAR sizes and BAR masks
 - The mapping of the OCC common area. It is common to all chips and
   should be mapped once.  We will address per-OCC area in the next
   change.
 - OCC common area is in BAR 3 on P8

Inspired by previous work of Balamuruhan S <bala24@linux.ibm.com>

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191211082912.2625-2-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-12-17 10:39:48 +11:00
Cédric Le Goater
9e028fffaa ppc/pnv: populate the DT with realized XSCOM devices
Some devices could be initialized in the instance_init handler but not
realized for configuration reasons. Nodes should not be added in the DT
for such devices.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191210135845.19773-3-clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-12-17 10:39:48 +11:00
Cédric Le Goater
109dce3786 ppc/pnv: Loop on the whole hierarchy to populate the DT with the XSCOM nodes
Some PnvXScomInterface objects lie a bit deeper (PnvPBCQState) than
the first layer, so we need to loop on the whole object hierarchy to
catch them.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191210135845.19773-2-clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
[dwg: Corrected error in comment]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-12-17 10:39:48 +11:00
Suraj Jitindar Singh
f0ec31b1e2 target/ppc: Add SPR TBU40
The spr TBU40 is used to set the upper 40 bits of the timebase
register, present on POWER5+ and later processors.

This register can only be written by the hypervisor, and cannot be read.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191128134700.16091-5-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-12-17 10:39:48 +11:00
Suraj Jitindar Singh
5cc7e69f6d target/ppc: Work [S]PURR implementation and add HV support
The Processor Utilisation of Resources Register (PURR) and Scaled
Processor Utilisation of Resources Register (SPURR) provide an estimate
of the resources used by the thread, present on POWER7 and later
processors.

Currently the [S]PURR registers simply count at the rate of the
timebase.

Preserve this behaviour but rework the implementation to store an offset
like the timebase rather than doing the calculation manually. Also allow
hypervisor write access to the register along with the currently
available read access.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
[ clg: rebased on current ppc tree ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191128134700.16091-3-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-12-17 10:39:48 +11:00
Suraj Jitindar Singh
5d62725b2f target/ppc: Implement the VTB for HV access
The virtual timebase register (VTB) is a 64-bit register which
increments at the same rate as the timebase register, present on POWER8
and later processors.

The register is able to be read/written by the hypervisor and read by
the supervisor. All other accesses are illegal.

Currently the VTB is just an alias for the timebase (TB) register.

Implement the VTB so that is can be read/written independent of the TB.
Make use of the existing method for accessing timebase facilities where
by the compensation is stored and used to compute the value on reads/is
updated on writes.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
[ clg: rebased on current ppc tree ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191128134700.16091-2-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-12-17 10:39:48 +11:00
Cédric Le Goater
2661f6ab2b ppc/pnv: add a LPC Controller model for POWER10
Same a POWER9, only the MMIO window changes.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191205184454.10722-6-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-12-17 10:39:48 +11:00
Cédric Le Goater
8b50ce8505 ppc/pnv: add a PSI bridge model for POWER10
The POWER10 PSIHB controller is very similar to the one on POWER9. We
should probably introduce a common PnvPsiXive object.

The ESB page size should be changed to 64k when P10 support is ready.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191205184454.10722-5-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-12-17 10:39:48 +11:00
Cédric Le Goater
c5412b1d28 ppc/psi: cleanup definitions
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191205184454.10722-4-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-12-17 10:39:48 +11:00
Cédric Le Goater
2b548a4255 ppc/pnv: Introduce a POWER10 PnvChip and a powernv10 machine
This is an empty shell with the XSCOM bus and cores. The chip controllers
will come later.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191205184454.10722-3-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-12-17 10:39:48 +11:00
Greg Kurz
c1ad0b892c ppc: Don't use CPUPPCState::irq_input_state with modern Book3s CPU models
The power7_set_irq() and power9_set_irq() functions set this but it is
never used actually. Modern Book3s compatible CPUs are only supported
by the pnv and spapr machines. They have an interrupt controller, XICS
for POWER7/8 and XIVE for POWER9, whose models don't require to track
IRQ input states at the CPU level.

Drop these lines to avoid confusion.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157548862861.3650476.16622818876928044450.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-12-17 10:39:48 +11:00
Greg Kurz
401774387a ppc: Deassert the external interrupt pin in KVM on reset
When a CPU is reset, QEMU makes sure no interrupt is pending by clearing
CPUPPCstate::pending_interrupts in ppc_cpu_reset(). In the case of a
complete machine emulation, eg. a sPAPR machine, an external interrupt
request could still be pending in KVM though, eg. an IPI. It will be
eventually presented to the guest, which is supposed to acknowledge it at
the interrupt controller. If the interrupt controller is emulated in QEMU,
either XICS or XIVE, ppc_set_irq() won't deassert the external interrupt
pin in KVM since it isn't pending anymore for QEMU. When the vCPU re-enters
the guest, the interrupt request is still pending and the vCPU will try
again to acknowledge it. This causes an infinite loop and eventually hangs
the guest.

The code has been broken since the beginning. The issue wasn't hit before
because accel=kvm,kernel-irqchip=off is an awkward setup that never got
used until recently with the LC92x IBM systems (aka, Boston).

Add a ppc_irq_reset() function to do the necessary cleanup, ie. deassert
the IRQ pins of the CPU in QEMU and most importantly the external interrupt
pin for this vCPU in KVM.

Reported-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157548861740.3650476.16879693165328764758.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-12-17 10:39:48 +11:00
David Gibson
d1d32d6255 spapr: Simplify ovec diff
spapr_ovec_diff(ov, old, new) has somewhat complex semantics.  ov is set
to those bits which are in new but not old, and it returns as a boolean
whether or not there are any bits in old but not new.

It turns out that both callers only care about the second, not the first.
This is basically equivalent to a bitmap subset operation, which is easier
to understand and implement.  So replace spapr_ovec_diff() with
spapr_ovec_subset().

Cc: Mike Roth <mdroth@linux.vnet.ibm.com>

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cedric Le Goater <clg@fr.ibm.com>
2019-12-17 10:39:48 +11:00
David Gibson
0c21e07354 spapr: Fold h_cas_compose_response() into h_client_architecture_support()
spapr_h_cas_compose_response() handles the last piece of the PAPR feature
negotiation process invoked via the ibm,client-architecture-support OF
call.  Its only caller is h_client_architecture_support() which handles
most of the rest of that process.

I believe it was placed in a separate file originally to handle some
fiddly dependencies between functions, but mostly it's just confusing
to have the CAS process split into two pieces like this.  Now that
compose response is simplified (by just generating the whole device
tree anew), it's cleaner to just fold it into
h_client_architecture_support().

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cedric Le Goater <clg@fr.ibm.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
2019-12-17 10:39:48 +11:00
David Gibson
97b32a6afa spapr: Improve handling of fdt buffer size
Previously, spapr_build_fdt() constructed the device tree in a fixed
buffer of size FDT_MAX_SIZE.  This is a bit inflexible, but more
importantly it's awkward for the case where we use it during CAS.  In
that case the guest firmware supplies a buffer and we have to
awkwardly check that what we generated fits into it afterwards, after
doing a lot of size checks during spapr_build_fdt().

Simplify this by having spapr_build_fdt() take a 'space' parameter.
For the CAS case, we pass in the buffer size provided by SLOF, for the
machine init case, we continue to pass FDT_MAX_SIZE.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cedric Le Goater <clg@fr.ibm.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
2019-12-17 10:39:48 +11:00
David Gibson
8deb8019d6 spapr: Don't trigger a CAS reboot for XICS/XIVE mode changeover
PAPR allows the interrupt controller used on a POWER9 machine (XICS or
XIVE) to be selected by the guest operating system, by using the
ibm,client-architecture-support (CAS) feature negotiation call.

Currently, if the guest selects an interrupt controller different from the
one selected at initial boot, this causes the system to be reset with the
new model and the boot starts again.  This means we run through the SLOF
boot process twice, as well as any other bootloader (e.g. grub) in use
before the OS calls CAS.  This can be confusing and/or inconvenient for
users.

Thanks to two fairly recent changes, we no longer need this reboot.  1) we
now completely regenerate the device tree when CAS is called (meaning we
don't need special case updates for all the device tree changes caused by
the interrupt controller mode change),  2) we now have explicit code paths
to activate and deactivate the different interrupt controllers, rather than
just implicitly calling those at machine reset time.

We can therefore eliminate the reboot for changing irq mode, simply by
putting a call to spapr_irq_update_active_intc() before we call
spapr_h_cas_compose_response() (which gives the updated device tree to
the guest firmware and OS).

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cedric Le Goater <clg@fr.ibm.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
2019-12-17 10:39:48 +11:00
Vladimir Sementsov-Ogievskiy
cdcca22aab ppc: well form kvmppc_hint_smt_possible error hint helper
Make kvmppc_hint_smt_possible hint append helper well formed:
rename errp to errp_in, as it is IN-parameter here (which is unusual
for errp), rename function to be kvmppc_error_append_*_hint.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20191127191434.20945-1-vsementsov@virtuozzo.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-12-17 10:39:48 +11:00
Cédric Le Goater
5373c61d6a ppc/pnv: Clarify how the TIMA is accessed on a multichip system
The TIMA region gives access to the thread interrupt context registers
of a CPU. It is mapped at the same address on all chips and can be
accessed by any CPU of the system. To identify the chip from which the
access is being done, the PowerBUS uses a 'chip' field in the
load/store messages. QEMU does not model these messages, instead, we
extract the chip id from the CPU PIR and do a lookup at the machine
level to fetch the targeted interrupt controller.

Introduce pnv_get_chip() and pnv_xive_tm_get_xive() helpers to clarify
this process in pnv_xive_get_tctx(). The latter will be removed in the
subsequent patches but the same principle will be kept.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191125065820.927-14-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-12-17 10:39:48 +11:00
Greg Kurz
4ffb749688 spapr: Pass the maximum number of vCPUs to the KVM interrupt controller
The XIVE and XICS-on-XIVE KVM devices on POWER9 hosts can greatly reduce
their consumption of some scarce HW resources, namely Virtual Presenter
identifiers, if they know the maximum number of vCPUs that may run in the
VM.

Prepare ground for this by passing the value down to xics_kvm_connect()
and kvmppc_xive_connect(). This is purely mechanical, no functional
change.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157478678301.67101.2717368060417156338.stgit@bahia.tlslab.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-12-17 10:39:48 +11:00
Cédric Le Goater
932de7aef8 ppc/spapr: Implement the XiveFabric interface
The CAM line matching sequence in the pseries machine does not change
much apart from the use of the new QOM interfaces. There is an extra
indirection because of the sPAPR IRQ backend of the machine. Only the
XIVE backend implements the new 'match_nvt' handler.

Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191125065820.927-11-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-12-17 10:39:48 +11:00
Cédric Le Goater
c722579e8c ppc/pnv: Implement the XiveFabric interface
The CAM line matching on the PowerNV machine now scans all chips of
the system and all CPUs of a chip to find a dispatched NVT in the
thread contexts.

Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191125065820.927-10-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-12-17 10:39:48 +11:00
Cédric Le Goater
119eaa9d11 ppc/pnv: Fix TIMA indirect access
When the TIMA of a CPU needs to be accessed from the indirect page,
the thread id of the target CPU is first stored in the PC_TCTXT_INDIR0
register. This thread id is relative to the chip and not to the system.

Introduce a helper routine to look for a CPU of a given PIR and fix
pnv_xive_get_indirect_tctx() to scan only the threads of the local
chip and not the whole machine.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191125065820.927-8-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-12-17 10:39:48 +11:00
Cédric Le Goater
4a89e20458 ppc: Introduce a ppc_cpu_pir() helper
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191125065820.927-6-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-12-17 10:39:47 +11:00
Greg Kurz
4fa28f2390 ppc/pnv: Instantiate cores separately
Allocating a big void * array to store multiple objects isn't a
recommended practice for various reasons:
 - no compile time type checking
 - potential dangling pointers if a reference on an individual is
  taken and the array is freed later on
 - duplicate boiler plate everywhere the array is browsed through

Allocate an array of pointers and populate it instead.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191125065820.927-4-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-12-17 10:39:47 +11:00
Cédric Le Goater
e2392d4395 ppc/pnv: Create BMC devices at machine init
The BMC of the OpenPOWER systems monitors the machine state using
sensors, controls the power and controls the access to the PNOR flash
device containing the firmware image required to boot the host.

QEMU models the power cycle process, access to the sensors and access
to the PNOR device. But, for these features to be available, the QEMU
PowerNV machine needs two extras devices on the command line, an IPMI
BT device for communication and a BMC backend device:

  -device ipmi-bmc-sim,id=bmc0 -device isa-ipmi-bt,bmc=bmc0,irq=10

The BMC properties are then defined accordingly in the device tree and
OPAL self adapts. If a BMC device and an IPMI BT device are not
available, OPAL does not try to communicate with the BMC in any
manner. This is not how real systems behave.

To be closer to the default behavior, create an IPMI BMC simulator
device and an IPMI BT device at machine initialization time. We loose
the ability to define an external BMC device but there are benefits:

  - a better match with real systems,
  - a better test coverage of the OPAL code,
  - system powerdown and reset commands that work,
  - a QEMU device tree compliant with the specifications (*).

(*) Still needs a MBOX device.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191121162340.11049-1-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-12-17 10:39:47 +11:00
Cédric Le Goater
ca661fae81 ppc/pnv: Add HIOMAP commands
This activates HIOMAP support on the QEMU PowerNV machine. The PnvPnor
model is used to access the flash contents. The model simply maps the
contents at a fix offset and enables or disables the mapping.

HIOMAP Protocol description :

  https://github.com/openbmc/hiomapd/blob/master/Documentation/protocol.md

Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191028070027.22752-3-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-12-17 10:39:47 +11:00
Cédric Le Goater
95bd61c4df ppc/pnv: Add a LPC "ranges" property
And fix a typo in the MEM address space definition.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191118091908.15044-1-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-12-17 10:39:47 +11:00
Greg Kurz
818a6d30e0 spapr: Abort if XICS interrupt controller cannot be initialized
Failing to set any of the ICS property should really never happen:
- object_property_add_child() always succeed unless the child object
  already has a parent, which isn't the case here obviously since the
  ICS has just been created with object_new()
- the ICS has an "nr-irqs" property than can be set as long as the ICS
  isn't realized

In both cases, an error indicates there is a bug in QEMU. Propagating the
error, ie. exiting QEMU since spapr_irq_init() is called with &error_fatal
doesn't make much sense. Abort instead. This is consistent with what is
done with XIVE : both qdev_create() and qdev_prop_set_uint32() abort QEMU
on error.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157403285265.409804.8683093665795248192.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-12-17 10:39:47 +11:00