Commit Graph

504 Commits

Author SHA1 Message Date
Peter Maydell
f3b8f18ebf Monitor patches for 2019-08-21
-----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEENUvIs9frKmtoZ05fOHC0AOuRhlMFAl1dZKsSHGFybWJydUBy
 ZWRoYXQuY29tAAoJEDhwtADrkYZTJ4QP/10izA+dSofQ9404GRq3TNzwRCKugU44
 nES9CqDh6x5emx+ADQWYkugblgfH9GOvUaAUNtY+uFaEr55yC/F+VWeVXvyjt5U6
 ZpPZqIRDOHo2+PZrddr/KcKmiomS6plz03m9bzb3pYN1yIl2ZzgClAhAqWQLk0WB
 wwiY+YsJ83YR4sdiRMZkuF+UL7N8fSqYvIIj0yzM8+8ONDor9n16PoPeFg3JSsyG
 aMxXIUnSBZAVtClaNkUPtS0Wf9XEuqoG1rvMRV4Vv+eeb7fwA414DqanRJdLlGMA
 yNRtFcVyztCfjgVEXnY9JJlFe6pDkoe8ycoimQ4YA60C9c1DIMHqyjFWXRHfDwk8
 bYMSX6CTpfoEvbTfmwqYR6KSkb/KuXiFDmcYlTYFvIt3grhhdHQbru9vy+E5sm/b
 j3CPV2DTCkeGY+oZFfKIaQT9yoWZOhmMY5doMTYyinXygPTGQROUrHtzUeRXKmJZ
 arqDRmh+mlEiGETNeYQCI45eYCSDYxO+UNrhszxhmv6B1+ixhIrV2oXhi61vVBeY
 yngY4EILbuA2Z/E4BevJk91ESWJTr3UP13c6p7yf21iN4BD1KkHy5HoXCgYfQDeV
 4kar49g6WQ/VQEiwhi65Xd0OwstynkcV69F+kMagVMgaLeRsdU5ikGJQzxTeWJRl
 SPpc7oDwuAS+
 =2F3E
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/armbru/tags/pull-monitor-2019-08-21' into staging

Monitor patches for 2019-08-21

# gpg: Signature made Wed 21 Aug 2019 16:35:07 BST
# gpg:                using RSA key 354BC8B3D7EB2A6B68674E5F3870B400EB918653
# gpg:                issuer "armbru@redhat.com"
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [full]
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>" [full]
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* remotes/armbru/tags/pull-monitor-2019-08-21:
  monitor/qmp: Update comment for commit 4eaca8de26
  qdev: Collect HMP handlers command handlers in qdev-monitor.c
  qapi: Move query-target from misc.json to machine.json
  hw/core: Move cpu.c, cpu.h from qom/ to hw/core/

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-08-22 10:31:21 +01:00
Markus Armbruster
2e5b09fd0e hw/core: Move cpu.c, cpu.h from qom/ to hw/core/
Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190709152053.16670-2-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
[Rebased onto merge commit 95a9457fd44; missed instances of qom/cpu.h
in comments replaced]
2019-08-21 13:24:01 +02:00
Greg Kurz
e1588bcdd2 spapr/irq: Drop spapr_irq_msi_reset()
PHBs already take care of clearing the MSIs from the bitmap during reset
or unplug. No need to do this globally from the machine code. Rather add
an assert to ensure that PHBs have acted as expected.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <156415228966.1064338.190189424190233355.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
[dwg: Fix crash in qtest case where spapr->irq_map can be NULL at the
 new assert()]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-21 17:17:39 +10:00
Nicholas Piggin
93eac7b8f4 spapr: Implement ibm,suspend-me
This has been useful to modify and test the Linux pseries suspend
code but it requires modification to the guest to call it (due to
being gated by other unimplemented features). It is not otherwise
used by Linux yet, but work is slowly progressing there.

This allows a (lightly modified) guest kernel to suspend with
`echo mem > /sys/power/state` and be resumed with system_wakeup
monitor command.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20190722061752.22114-2-npiggin@gmail.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-21 17:17:39 +10:00
Cédric Le Goater
c5e760e0f2 ppc/xive: Improve 'info pic' support
Provide a better output of the XIVE END structures including the
escalation information and extend the PowerNV machine 'info pic'
command with a dump of the END EAS table used for escalations.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190718115420.19919-9-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-21 17:17:39 +10:00
Cédric Le Goater
ad31e2d242 ppc/xive: Provide silent escalation support
When the 's' bit is set the escalation is said to be 'silent' or
'silent/gather'. In such configuration, the notification sequence is
skipped and only the escalation sequence is performed. This is used to
configure all the EQs of a vCPU to escalate on a single EQ which will
then target the hypervisor.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190718115420.19919-8-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-21 17:17:39 +10:00
Cédric Le Goater
53e934921d ppc/xive: Provide unconditional escalation support
When the 'u' bit is set the escalation is said to be 'unconditional'
which means that the ESe PQ bits are not used. Introduce a
xive_router_end_es_notify() routine to share code with the ESn
notification.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190718115420.19919-7-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-21 17:17:39 +10:00
Cédric Le Goater
1994d3aa47 ppc/xive: use an abstract type for XiveNotifier
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190718115420.19919-2-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-21 17:17:39 +10:00
Michael Roth
0fb6bd0732 spapr: initial implementation for H_TPM_COMM/spapr-tpm-proxy
This implements the H_TPM_COMM hypercall, which is used by an
Ultravisor to pass TPM commands directly to the host's TPM device, or
a TPM Resource Manager associated with the device.

This also introduces a new virtual device, spapr-tpm-proxy, which
is used to configure the host TPM path to be used to service
requests sent by H_TPM_COMM hcalls, for example:

  -device spapr-tpm-proxy,id=tpmp0,host-path=/dev/tpmrm0

By default, no spapr-tpm-proxy will be created, and hcalls will return
H_FUNCTION.

The full specification for this hypercall can be found in
docs/specs/ppc-spapr-uv-hcalls.txt

Since SVM-related hcalls like H_TPM_COMM use a reserved range of
0xEF00-0xEF80, we introduce a separate hcall table here to handle
them.

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com
Message-Id: <20190717205842.17827-3-mdroth@linux.vnet.ibm.com>
[dwg: Corrected #include for upstream change]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-21 17:17:12 +10:00
Nicholas Piggin
3a6e6224a9 spapr: Implement H_PROD
H_PROD is added, and H_CEDE is modified to test the prod bit
according to PAPR.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20190718034214.14948-3-npiggin@gmail.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-21 17:17:12 +10:00
Nicholas Piggin
03ef074c04 spapr: Implement dispatch tracking for tcg
Implement cpu_exec_enter/exit on ppc which calls into new methods of
the same name in PPCVirtualHypervisorClass. These are used by spapr
to implement the splpar VPA dispatch counter initially.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20190718034214.14948-2-npiggin@gmail.com>
[dwg: Removed unnecessary CONFIG_USER_ONLY checks as suggested by gkurz]
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-21 17:17:11 +10:00
Markus Armbruster
54d31236b9 sysemu: Split sysemu/runstate.h off sysemu/sysemu.h
sysemu/sysemu.h is a rather unfocused dumping ground for stuff related
to the system-emulator.  Evidence:

* It's included widely: in my "build everything" tree, changing
  sysemu/sysemu.h still triggers a recompile of some 1100 out of 6600
  objects (not counting tests and objects that don't depend on
  qemu/osdep.h, down from 5400 due to the previous two commits).

* It pulls in more than a dozen additional headers.

Split stuff related to run state management into its own header
sysemu/runstate.h.

Touching sysemu/sysemu.h now recompiles some 850 objects.  qemu/uuid.h
also drops from 1100 to 850, and qapi/qapi-types-run-state.h from 4400
to 4200.  Touching new sysemu/runstate.h recompiles some 500 objects.

Since I'm touching MAINTAINERS to add sysemu/runstate.h anyway, also
add qemu/main-loop.h.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190812052359.30071-30-armbru@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
[Unbreak OS-X build]
2019-08-16 13:37:36 +02:00
Markus Armbruster
2f780b6a91 sysemu: Move the VMChangeStateEntry typedef to qemu/typedefs.h
In my "build everything" tree, changing sysemu/sysemu.h triggers a
recompile of some 1800 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h, down from 5400 due to the
previous commit).

Several headers include sysemu/sysemu.h just to get typedef
VMChangeStateEntry.  Move it from sysemu/sysemu.h to qemu/typedefs.h.
Spell its structure tag the same while there.  Drop the now
superfluous includes of sysemu/sysemu.h from headers.

Touching sysemu/sysemu.h now recompiles some 1100 objects.
qemu/uuid.h also drops from 1800 to 1100, and
qapi/qapi-types-run-state.h from 5000 to 4400.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190812052359.30071-29-armbru@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-08-16 13:31:53 +02:00
Markus Armbruster
a27bd6c779 Include hw/qdev-properties.h less
In my "build everything" tree, changing hw/qdev-properties.h triggers
a recompile of some 2700 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h).

Many places including hw/qdev-properties.h (directly or via hw/qdev.h)
actually need only hw/qdev-core.h.  Include hw/qdev-core.h there
instead.

hw/qdev.h is actually pointless: all it does is include hw/qdev-core.h
and hw/qdev-properties.h, which in turn includes hw/qdev-core.h.
Replace the remaining uses of hw/qdev.h by hw/qdev-properties.h.

While there, delete a few superfluous inclusions of hw/qdev-core.h.

Touching hw/qdev-properties.h now recompiles some 1200 objects.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Daniel P. Berrangé" <berrange@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190812052359.30071-22-armbru@redhat.com>
2019-08-16 13:31:53 +02:00
Markus Armbruster
d484205210 Include exec/memory.h slightly less
Drop unnecessary inclusions from headers.  Downgrade a few more to
exec/hwaddr.h.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190812052359.30071-17-armbru@redhat.com>
2019-08-16 13:31:52 +02:00
Markus Armbruster
d645427057 Include migration/vmstate.h less
In my "build everything" tree, changing migration/vmstate.h triggers a
recompile of some 2700 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h).

hw/hw.h supposedly includes it for convenience.  Several other headers
include it just to get VMStateDescription.  The previous commit made
that unnecessary.

Include migration/vmstate.h only where it's still needed.  Touching it
now recompiles only some 1600 objects.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190812052359.30071-16-armbru@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-08-16 13:31:52 +02:00
Markus Armbruster
6a0acfff99 Clean up inclusion of exec/cpu-common.h
migration/qemu-file.h neglects to include it even though it needs
ram_addr_t.  Fix that.  Drop a few superfluous inclusions elsewhere.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190812052359.30071-14-armbru@redhat.com>
2019-08-16 13:31:52 +02:00
Markus Armbruster
64552b6be4 Include hw/irq.h a lot less
In my "build everything" tree, changing hw/irq.h triggers a recompile
of some 5400 out of 6600 objects (not counting tests and objects that
don't depend on qemu/osdep.h).

hw/hw.h supposedly includes it for convenience.  Several other headers
include it just to get qemu_irq and.or qemu_irq_handler.

Move the qemu_irq and qemu_irq_handler typedefs from hw/irq.h to
qemu/typedefs.h, and then include hw/irq.h only where it's still
needed.  Touching it now recompiles only some 500 objects.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190812052359.30071-13-armbru@redhat.com>
2019-08-16 13:31:52 +02:00
Markus Armbruster
2ae16a6aa4 Include generated QAPI headers less
Some of the generated qapi-types-MODULE.h are included all over the
place.  Changing a QAPI type can trigger massive recompiling.  Top
scorers recompile more than 1000 out of some 6600 objects (not
counting tests and objects that don't depend on qemu/osdep.h):

    6300 qapi/qapi-builtin-types.h
    5700 qapi/qapi-types-run-state.h
    3900 qapi/qapi-types-common.h
    3300 qapi/qapi-types-sockets.h
    3000 qapi/qapi-types-misc.h
    3000 qapi/qapi-types-crypto.h
    3000 qapi/qapi-types-job.h
    3000 qapi/qapi-types-block-core.h
    2800 qapi/qapi-types-block.h
    1300 qapi/qapi-types-net.h

Clean up headers to include generated QAPI headers only where needed.
Impact is negligible except for hw/qdev-properties.h.

This header includes qapi/qapi-types-block.h and
qapi/qapi-types-misc.h.  They are used only in expansions of property
definition macros such as DEFINE_PROP_BLOCKDEV_ON_ERROR() and
DEFINE_PROP_OFF_AUTO().  Moving their inclusion from
hw/qdev-properties.h to the users of these macros avoids pointless
recompiles.  This is how other property definition macros, such as
DEFINE_PROP_NETDEV(), already work.

Improves things for some of the top scorers:

    3600 qapi/qapi-types-common.h
    2800 qapi/qapi-types-sockets.h
     900 qapi/qapi-types-misc.h
    2200 qapi/qapi-types-crypto.h
    2100 qapi/qapi-types-job.h
    2100 qapi/qapi-types-block-core.h
     270 qapi/qapi-types-block.h

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190812052359.30071-3-armbru@redhat.com>
2019-08-16 13:31:51 +02:00
Markus Armbruster
ec150c7e09 include: Make headers more self-contained
Back in 2016, we discussed[1] rules for headers, and these were
generally liked:

1. Have a carefully curated header that's included everywhere first.  We
   got that already thanks to Peter: osdep.h.

2. Headers should normally include everything they need beyond osdep.h.
   If exceptions are needed for some reason, they must be documented in
   the header.  If all that's needed from a header is typedefs, put
   those into qemu/typedefs.h instead of including the header.

3. Cyclic inclusion is forbidden.

This patch gets include/ closer to obeying 2.

It's actually extracted from my "[RFC] Baby steps towards saner
headers" series[2], which demonstrates a possible path towards
checking 2 automatically.  It passes the RFC test there.

[1] Message-ID: <87h9g8j57d.fsf@blackfin.pond.sub.org>
    https://lists.nongnu.org/archive/html/qemu-devel/2016-03/msg03345.html
[2] Message-Id: <20190711122827.18970-1-armbru@redhat.com>
    https://lists.nongnu.org/archive/html/qemu-devel/2019-07/msg02715.html

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190812052359.30071-2-armbru@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-08-16 13:31:51 +02:00
Cédric Le Goater
310cda5b5e spapr/xive: Fix migration of hot-plugged CPUs
The migration sequence of a guest using the XIVE exploitation mode
relies on the fact that the states of all devices are restored before
the machine is. This is not true for hot-plug devices such as CPUs
which state come after the machine. This breaks migration because the
thread interrupt context registers are not correctly set.

Fix migration of hotplugged CPUs by restoring their context in the
'post_load' handler of the XiveTCTX model.

Fixes: 277dd3d771 ("spapr/xive: add migration support for KVM")
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190813064853.29310-1-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-13 16:50:30 +10:00
Cédric Le Goater
d0e9bc0407 spapr/xive: simplify spapr_irq_init_device() to remove the emulated init
The init_emu() handles are now empty. Remove them and rename
spapr_irq_init_device() to spapr_irq_init_kvm().

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190614165920.12670-3-clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Cédric Le Goater
981b1c6266 spapr/xive: rework the mapping the KVM memory regions
Today, the interrupt device is fully initialized at reset when the CAS
negotiation process has completed. Depending on the KVM capabilities,
the SpaprXive memory regions (ESB, TIMA) are initialized with a host
MMIO backend or a QEMU emulated backend. This results in a complex
initialization sequence partially done at realize and later at reset,
and some memory region leaks.

To simplify this sequence and to remove of the late initialization of
the emulated device which is required to be done only once, we
introduce new memory regions specific for KVM. These regions are
mapped as overlaps on top of the emulated device to make use of the
host MMIOs. Also provide proper cleanups of these regions when the
XIVE KVM device is destroyed to fix the leaks.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190614165920.12670-2-clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Greg Kurz
330a21e3c4 xics/kvm: Add error propagation to ic*_set_kvm_state() functions
This allows errors happening there to be propagated up to spapr_irq,
just like XIVE already does.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <156077921763.433243.4614327010172954196.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Greg Kurz
eab9f191a0 xics/spapr: Rename xics_kvm_init()
Switch to using the connect/disconnect terminology like we already do for
XIVE.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <156077920102.433243.6605099291134598170.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Greg Kurz
2fb4c6528e xics/spapr: Drop unused function declaration
Commit 9fb6eb7ca50c added the declaration of xics_spapr_connect(), which
has no implementation and no users.

This is a leftover from a previous iteration of this patch. Drop it.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <156077919546.433243.8748677531446035746.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Greg Kurz
7abc0c6d35 xics/spapr: Detect old KVM XICS on POWER9 hosts
Older KVMs on POWER9 don't support destroying/recreating a KVM XICS
device, which is required by 'dual' interrupt controller mode. This
causes QEMU to emit a warning when the guest is rebooted and to fall
back on XICS emulation:

qemu-system-ppc64: warning: kernel_irqchip allowed but unavailable:
 Error on KVM_CREATE_DEVICE for XICS: File exists

If kernel irqchip is required, QEMU will thus exit when the guest is
first rebooted. Failing QEMU this late may be a painful experience
for the user.

Detect that and exit at machine init instead.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <156044430517.125694.6207865998817342638.stgit@bahia.lab.toulouse-stg.fr.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Greg Kurz
d9293c4843 xics/spapr: Register RTAS/hypercalls once at machine init
QEMU may crash when running a spapr machine in 'dual' interrupt controller
mode on some older (but not that old, eg. ubuntu 18.04.2) KVMs with partial
XIVE support:

qemu-system-ppc64: hw/ppc/spapr_rtas.c:411: spapr_rtas_register:
 Assertion `!name || !rtas_table[token].name' failed.

XICS is controlled by the guest thanks to a set of RTAS calls. Depending
on whether KVM XICS is used or not, the RTAS calls are handled by KVM or
QEMU. In both cases, QEMU needs to expose the RTAS calls to the guest
through the "rtas" node of the device tree.

The spapr_rtas_register() helper takes care of all of that: it adds the
RTAS call token to the "rtas" node and registers a QEMU callback to be
invoked when the guest issues the RTAS call. In the KVM XICS case, QEMU
registers a dummy callback that just prints an error since it isn't
supposed to be invoked, ever.

Historically, the XICS controller was setup during machine init and
released during final teardown. This changed when the 'dual' interrupt
controller mode was added to the spapr machine: in this case we need
to tear the XICS down and set it up again during machine reset. The
crash happens because we indeed have an incompatibility with older
KVMs that forces QEMU to fallback on emulated XICS, which tries to
re-registers the same RTAS calls.

This could be fixed by adding proper rollback that would unregister
RTAS calls on error. But since the emulated RTAS calls in QEMU can
now detect when they are mistakenly called while KVM XICS is in
use, it seems simpler to register them once and for all at machine
init. This fixes the crash and allows to remove some now useless
lines of code.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <156044429963.125694.13710679451927268758.stgit@bahia.lab.toulouse-stg.fr.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Cédric Le Goater
c29a0b0fb3 ppc/pnv: remove xscom_base field from PnvChip
It has now became useless with the previous patch.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190612174345.9799-3-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Cédric Le Goater
709044fd2d ppc/pnv: fix XSCOM MMIO base address for P9 machines with multiple chips
The PNV_XSCOM_BASE and PNV_XSCOM_SIZE macros are specific to POWER8
and they are used when the device tree is populated and the MMIO
region created, even for POWER9 chips. This is not too much of a
problem today because we don't have important devices on the second
chip, but we might have oneday (PHBs).

Fix by using the appropriate macros in case of P9.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190612174345.9799-2-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Peter Maydell
a050901d4b ppc patch queue 2019-06-12
Next pull request against qemu-4.1.  The big thing here is adding
 support for hot plug of P2P bridges, and PCI devices under P2P bridges
 on the "pseries" machine (which doesn't use SHPC).  Other than that
 there's just a handful of fixes and small enhancements.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEdfRlhq5hpmzETofcbDjKyiDZs5IFAl0AkgwACgkQbDjKyiDZ
 s5Jyug//cwxP+t1t2CNHtffKwiXFzuEKx9YSNE1V0wog6aB40EbPKU72FzCq6FfA
 lev+pZWV9AwVMzFYe4VM/7Lqh7WFMYDT3DOXaZwfANs4471vYtgvPi21L2TBj80d
 hMszlyLWMLY9ByOzCxIq3xnbivGpA94G2q9rKbwXdK4T/5i62Pe3SIfgG+gXiiwW
 +YlHWCPX0I1cJz2bBs9ElXdl7ONWnn+7uDf7gNfWkTKuiUq6Ps7mxzy3GhJ1T7nz
 OFKmQ5dKzLJsgOULSSun8kWpXBmnPffkM3+fCE07edrWZVor09fMCk4HvtfaRy2K
 FFa2Kvzn/V/70TL+44dsSX4QcwdcHQztiaMO7UGPq9CMswx5L7gsNmfX6zvK1Nrb
 1t7ORZKNJ72hMyvDPSMiGU2DpVjO3ZbBlSL4/xG8Qeal4An0kgkN5NcFlB/XEfnz
 dsKu9XzuGSeD1bWz1Mgcf1x7lPDBoHIKLcX6notZ8epP/otu4ywNFvAkPu4fk8s0
 4jQGajIT7328SmzpjXClsmiEskpKsEr7hQjPRhu0hFGrhVc+i9PjkmbDl0TYRAf6
 N6k6gJQAi+StJde2rcua1iS7Ra+Tka6QRKy+EctLqfqOKPb2VmkZ6fswQ3nfRRlT
 LgcTHt2iJcLeud2klVXs1e4pKXzXchkVyFL4ucvmyYG5VeimMzU=
 =ERgu
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-4.1-20190612' into staging

ppc patch queue 2019-06-12

Next pull request against qemu-4.1.  The big thing here is adding
support for hot plug of P2P bridges, and PCI devices under P2P bridges
on the "pseries" machine (which doesn't use SHPC).  Other than that
there's just a handful of fixes and small enhancements.

# gpg: Signature made Wed 12 Jun 2019 06:47:56 BST
# gpg:                using RSA key 75F46586AE61A66CC44E87DC6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" [full]
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>" [full]
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" [full]
# gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>" [unknown]
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-4.1-20190612:
  ppc/xive: Make XIVE generate the proper interrupt types
  ppc/pnv: activate the "dumpdtb" option on the powernv machine
  target/ppc: Use tcg_gen_gvec_bitsel
  spapr: Allow hot plug/unplug of PCI bridges and devices under PCI bridges
  spapr: Direct all PCI hotplug to host bridge, rather than P2P bridge
  spapr: Don't use bus number for building DRC ids
  spapr: Clean up DRC index construction
  spapr: Clean up spapr_drc_populate_dt()
  spapr: Clean up dt creation for PCI buses
  spapr: Clean up device tree construction for PCI devices
  spapr: Clean up device node name generation for PCI devices
  target/ppc: Fix lxvw4x, lxvh8x and lxvb16x
  spapr_pci: Improve error message

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-06-12 14:43:47 +01:00
Markus Armbruster
a8d2532645 Include qemu-common.h exactly where needed
No header includes qemu-common.h after this commit, as prescribed by
qemu-common.h's file comment.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190523143508.25387-5-armbru@redhat.com>
[Rebased with conflicts resolved automatically, except for
include/hw/arm/xlnx-zynqmp.h hw/arm/nrf51_soc.c hw/arm/msf2-soc.c
block/qcow2-refcount.c block/qcow2-cluster.c block/qcow2-cache.c
target/arm/cpu.h target/lm32/cpu.h target/m68k/cpu.h target/mips/cpu.h
target/moxie/cpu.h target/nios2/cpu.h target/openrisc/cpu.h
target/riscv/cpu.h target/tilegx/cpu.h target/tricore/cpu.h
target/unicore32/cpu.h target/xtensa/cpu.h; bsd-user/main.c and
net/tap-bsd.c fixed up]
2019-06-12 13:20:20 +02:00
Benjamin Herrenschmidt
4aca978654 ppc/xive: Make XIVE generate the proper interrupt types
It should be generic Hypervisor Virtualization interrupts for HV
directed rings and traditional External Interrupts for the OS directed
ring.

Don't generate anything for the user ring as it isn't actually
supported.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190606174409.12502-1-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-06-12 10:41:50 +10:00
David Gibson
9e7d38e8a3 spapr: Clean up spapr_drc_populate_dt()
This makes some minor cleanups to spapr_drc_populate_dt(), renaming it to
the shorter and more idiomatic spapr_dt_drc() along the way.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
2019-06-12 10:41:49 +10:00
Greg Kurz
3725ef1a94 spapr: Don't migrate the hpt_maxpagesize cap to older machine types
Commit 0b8c89be7f7b added the hpt_maxpagesize capability to the migration
stream. This is okay for new machine types but it breaks backward migration
to older QEMUs, which don't expect the extra subsection.

Add a compatibility boolean flag to the sPAPR machine class and use it to
skip migration of the capability for machine types 4.0 and older. This
fixes migration to an older QEMU. Note that the destination will emit a
warning:

qemu-system-ppc64: warning: cap-hpt-max-page-size lower level (16) in incoming stream than on destination (24)

This is expected and harmless though. It is okay to migrate from a lower
HPT maximum page size (64k) to a greater one (16M).

Fixes: 0b8c89be7f7b "spapr: Add forgotten capability to migration stream"
Based-on: <20190522074016.10521-3-clg@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155853262675.1158324.17301777846476373459.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29 11:39:47 +10:00
Cédric Le Goater
3f777abc71 spapr/irq: add KVM support to the 'dual' machine
The interrupt mode is chosen by the CAS negotiation process and
activated after a reset to take into account the required changes in
the machine. This brings new constraints on how the associated KVM IRQ
device is initialized.

Currently, each model takes care of the initialization of the KVM
device in their realize method but this is not possible anymore as the
initialization needs to be done globaly when the interrupt mode is
known, i.e. when machine is reseted. It also means that we need a way
to delete a KVM device when another mode is chosen.

Also, to support migration, the QEMU objects holding the state to
transfer should always be available but not necessarily activated.

The overall approach of this proposal is to initialize both interrupt
mode at the QEMU level to keep the IRQ number space in sync and to
allow switching from one mode to another. For the KVM side of things,
the whole initialization of the KVM device, sources and presenters, is
grouped in a single routine. The XICS and XIVE sPAPR IRQ reset
handlers are modified accordingly to handle the init and the delete
sequences of the KVM device.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190513084245.25755-15-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29 11:39:46 +10:00
Cédric Le Goater
cf435df697 spapr/irq: initialize the IRQ device only once
Add a check to make sure that the routine initializing the emulated
IRQ device is called once. We don't have much to test on the XICS
side, so we introduce a 'init' boolean under ICSState.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190513084245.25755-13-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29 11:39:46 +10:00
Cédric Le Goater
ae805ea907 spapr/irq: introduce a spapr_irq_init_device() helper
The way the XICS and the XIVE devices are initialized follows the same
pattern. First, try to connect to the KVM device and if not possible
fallback on the emulated device, unless a kernel_irqchip is required.
The spapr_irq_init_device() routine implements this sequence in
generic way using new sPAPR IRQ handlers ->init_emu() and ->init_kvm().

The XIVE init sequence is moved under the associated sPAPR IRQ
->init() handler. This will change again when KVM support is added for
the dual interrupt mode.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190513084245.25755-12-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29 11:39:46 +10:00
Cédric Le Goater
56b11587df spapr: introduce routines to delete the KVM IRQ device
If a new interrupt mode is chosen by CAS, the machine generates a
reset to reconfigure. At this point, the connection with the previous
KVM device needs to be closed and a new connection needs to opened
with the KVM device operating the chosen interrupt mode.

New routines are introduced to destroy the XICS and the XIVE KVM
devices. They make use of a new KVM device ioctl which destroys the
device and also disconnects the IRQ presenters from the vCPUs.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190513084245.25755-10-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29 11:39:46 +10:00
Cédric Le Goater
277dd3d771 spapr/xive: add migration support for KVM
When the VM is stopped, the VM state handler stabilizes the XIVE IC
and marks the EQ pages dirty. These are then transferred to destination
before the transfer of the device vmstates starts.

The SpaprXive interrupt controller model captures the XIVE internal
tables, EAT and ENDT and the XiveTCTX model does the same for the
thread interrupt context registers.

At restart, the SpaprXive 'post_load' method restores all the XIVE
states. It is called by the sPAPR machine 'post_load' method, when all
XIVE states have been transferred and loaded.

Finally, the source states are restored in the VM change state handler
when the machine reaches the running state.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190513084245.25755-7-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29 11:39:46 +10:00
Cédric Le Goater
9b88cd7673 spapr/xive: introduce a VM state change handler
This handler is in charge of stabilizing the flow of event notifications
in the XIVE controller before migrating a guest. This is a requirement
before transferring the guest EQ pages to a destination.

When the VM is stopped, the handler sets the source PQs to PENDING to
stop the flow of events and to possibly catch a triggered interrupt
occuring while the VM is stopped. Their previous state is saved. The
XIVE controller is then synced through KVM to flush any in-flight
event notification and to stabilize the EQs. At this stage, the EQ
pages are marked dirty to make sure the EQ pages are transferred if a
migration sequence is in progress.

The previous configuration of the sources is restored when the VM
resumes, after a migration or a stop. If an interrupt was queued while
the VM was stopped, the handler simply generates the missing trigger.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190513084245.25755-6-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29 11:39:46 +10:00
Cédric Le Goater
7bfc759c02 spapr/xive: add state synchronization with KVM
This extends the KVM XIVE device backend with 'synchronize_state'
methods used to retrieve the state from KVM. The HW state of the
sources, the KVM device and the thread interrupt contexts are
collected for the monitor usage and also migration.

These get operations rely on their KVM counterpart in the host kernel
which acts as a proxy for OPAL, the host firmware. The set operations
will be added for migration support later.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190513084245.25755-5-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29 11:39:46 +10:00
Cédric Le Goater
0c575703e4 spapr/xive: add hcall support when under KVM
XIVE hcalls are all redirected to QEMU as none are on a fast path.
When necessary, QEMU invokes KVM through specific ioctls to perform
host operations. QEMU should have done the necessary checks before
calling KVM and, in case of failure, H_HARDWARE is simply returned.

H_INT_ESB is a special case that could have been handled under KVM
but the impact on performance was low when under QEMU. Here are some
figures :

    kernel irqchip      OFF          ON
    H_INT_ESB                    KVM   QEMU

    rtl8139 (LSI )      1.19     1.24  1.23  Gbits/sec
    virtio             31.80    42.30   --   Gbits/sec

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190513084245.25755-4-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29 11:39:45 +10:00
Cédric Le Goater
38afd772f8 spapr/xive: add KVM support
This introduces a set of helpers when KVM is in use, which create the
KVM XIVE device, initialize the interrupt sources at a KVM level and
connect the interrupt presenters to the vCPU.

They also handle the initialization of the TIMA and the source ESB
memory regions of the controller. These have a different type under
KVM. They are 'ram device' memory mappings, similarly to VFIO, exposed
to the guest and the associated VMAs on the host are populated
dynamically with the appropriate pages using a fault handler.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190513084245.25755-3-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29 11:39:45 +10:00
David Gibson
64d4a53431 spapr: Add forgotten capability to migration stream
spapr machine capabilities are supposed to be sent in the migration stream
so that we can sanity check the source and destination have compatible
configuration.  Unfortunately, when we added the hpt-max-page-size
capability, we forgot to add it to the migration state.  This means that we
can generate spurious warnings when both ends are configured for large
pages, or potentially fail to warn if the source is configured for huge
pages, but the destination is not.

Fixes: 2309832afd "spapr: Maximum (HPT) pagesize property"

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
2019-05-29 11:39:45 +10:00
Cédric Le Goater
13df93244e spapr/xive: fix EQ page addresses above 64GB
The high order bits of the address of the OS event queue is stored in
bits [4-31] of word2 of the XIVE END internal structures and the low
order bits in word3. This structure is using Big Endian ordering and
computing the value requires some simple arithmetic which happens to
be wrong. The mask removing bits [0-3] of word2 is applied to the
wrong value and the resulting address is bogus when above 64GB.

Guests with more than 64GB of RAM will allocate pages for the OS event
queues which will reside above the 64GB limit. In this case, the XIVE
device model will wake up the CPUs in case of a notification, such as
IPIs, but the update of the event queue will be written at the wrong
place in memory. The result is uncertain as the guest memory is
trashed and IPI are not delivered.

Introduce a helper xive_end_qaddr() to compute this value correctly in
all places where it is used.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190508171946.657-3-clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29 11:39:44 +10:00
Markus Armbruster
a8b991b52d Clean up ill-advised or unusual header guards
Leading underscores are ill-advised because such identifiers are
reserved.  Trailing underscores are merely ugly.  Strip both.

Our header guards commonly end in _H.  Normalize the exceptions.

Done with scripts/clean-header-guards.pl.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190315145123.28030-7-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
[Changes to slirp/ dropped, as we're about to spin it off]
2019-05-13 08:58:55 +02:00
Benjamin Herrenschmidt
a2dd4e83e7 ppc/hash64: Rework R and C bit updates
With MT-TCG, we are now running translation in a racy way, thus
we need to mimic hardware when it comes to updating the R and
C bits, by doing byte stores.

The current "store_hpte" abstraction is ill suited for this, we
replace it with two separate callbacks for setting R and C.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190411080004.8690-4-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-04-26 11:37:57 +10:00
Cédric Le Goater
64db6c70dc spapr/rtas: modify spapr_rtas_register() to remove RTAS handlers
Removing RTAS handlers will become necessary when the new pseries
machine supporting multiple interrupt mode is introduced.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190321144914.19934-9-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-04-26 10:41:23 +10:00
Alexey Kardashevskiy
ec132efaa8 spapr: Support NVIDIA V100 GPU with NVLink2
NVIDIA V100 GPUs have on-board RAM which is mapped into the host memory
space and accessible as normal RAM via an NVLink bus. The VFIO-PCI driver
implements special regions for such GPUs and emulates an NVLink bridge.
NVLink2-enabled POWER9 CPUs also provide address translation services
which includes an ATS shootdown (ATSD) register exported via the NVLink
bridge device.

This adds a quirk to VFIO to map the GPU memory and create an MR;
the new MR is stored in a PCI device as a QOM link. The sPAPR PCI uses
this to get the MR and map it to the system address space.
Another quirk does the same for ATSD.

This adds additional steps to sPAPR PHB setup:

1. Search for specific GPUs and NPUs, collect findings in
sPAPRPHBState::nvgpus, manage system address space mappings;

2. Add device-specific properties such as "ibm,npu", "ibm,gpu",
"memory-block", "link-speed" to advertise the NVLink2 function to
the guest;

3. Add "mmio-atsd" to vPHB to advertise the ATSD capability;

4. Add new memory blocks (with extra "linux,memory-usable" to prevent
the guest OS from accessing the new memory until it is onlined) and
npuphb# nodes representing an NPU unit for every vPHB as the GPU driver
uses it for link discovery.

This allocates space for GPU RAM and ATSD like we do for MMIOs by
adding 2 new parameters to the phb_placement() hook. Older machine types
set these to zero.

This puts new memory nodes in a separate NUMA node to as the GPU RAM
needs to be configured equally distant from any other node in the system.
Unlike the host setup which assigns numa ids from 255 downwards, this
adds new NUMA nodes after the user configures nodes or from 1 if none
were configured.

This adds requirement similar to EEH - one IOMMU group per vPHB.
The reason for this is that ATSD registers belong to a physical NPU
so they cannot invalidate translations on GPUs attached to another NPU.
It is guaranteed by the host platform as it does not mix NVLink bridges
or GPUs from different NPU in the same IOMMU group. If more than one
IOMMU group is detected on a vPHB, this disables ATSD support for that
vPHB and prints a warning.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[aw: for vfio portions]
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Message-Id: <20190312082103.130561-1-aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-04-26 10:41:23 +10:00