2016-06-29 14:47:03 +03:00
|
|
|
#ifndef HW_SPAPR_H
|
|
|
|
#define HW_SPAPR_H
|
2011-04-01 08:15:20 +04:00
|
|
|
|
2018-06-25 15:42:24 +03:00
|
|
|
#include "qemu/units.h"
|
2012-12-17 21:20:04 +04:00
|
|
|
#include "sysemu/dma.h"
|
2015-07-02 09:23:04 +03:00
|
|
|
#include "hw/boards.h"
|
2015-05-07 08:33:49 +03:00
|
|
|
#include "hw/ppc/spapr_drc.h"
|
2015-06-29 11:44:27 +03:00
|
|
|
#include "hw/mem/pc-dimm.h"
|
2016-10-25 07:47:28 +03:00
|
|
|
#include "hw/ppc/spapr_ovec.h"
|
2018-07-30 17:11:32 +03:00
|
|
|
#include "hw/ppc/spapr_irq.h"
|
2020-09-03 23:43:22 +03:00
|
|
|
#include "qom/object.h"
|
spapr: Use CamelCase properly
The qemu coding standard is to use CamelCase for type and structure names,
and the pseries code follows that... sort of. There are quite a lot of
places where we bend the rules in order to preserve the capitalization of
internal acronyms like "PHB", "TCE", "DIMM" and most commonly "sPAPR".
That was a bad idea - it frequently leads to names ending up with hard to
read clusters of capital letters, and means they don't catch the eye as
type identifiers, which is kind of the point of the CamelCase convention in
the first place.
In short, keeping type identifiers looking like CamelCase is more important
than preserving standard capitalization of internal "words". So, this
patch renames a heap of spapr internal type names to a more standard
CamelCase.
In addition to case changes, we also make some other identifier renames:
VIOsPAPR* -> SpaprVio*
The reverse word ordering was only ever used to mitigate the capital
cluster, so revert to the natural ordering.
VIOsPAPRVTYDevice -> SpaprVioVty
VIOsPAPRVLANDevice -> SpaprVioVlan
Brevity, since the "Device" didn't add useful information
sPAPRDRConnector -> SpaprDrc
sPAPRDRConnectorClass -> SpaprDrcClass
Brevity, and makes it clearer this is the same thing as a "DRC"
mentioned in many other places in the code
This is 100% a mechanical search-and-replace patch. It will, however,
conflict with essentially any and all outstanding patches touching the
spapr code.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-06 07:35:37 +03:00
|
|
|
#include "hw/ppc/spapr_xive.h" /* For SpaprXive */
|
2019-01-10 10:09:13 +03:00
|
|
|
#include "hw/ppc/xics.h" /* For ICSState */
|
spapr: initial implementation for H_TPM_COMM/spapr-tpm-proxy
This implements the H_TPM_COMM hypercall, which is used by an
Ultravisor to pass TPM commands directly to the host's TPM device, or
a TPM Resource Manager associated with the device.
This also introduces a new virtual device, spapr-tpm-proxy, which
is used to configure the host TPM path to be used to service
requests sent by H_TPM_COMM hcalls, for example:
-device spapr-tpm-proxy,id=tpmp0,host-path=/dev/tpmrm0
By default, no spapr-tpm-proxy will be created, and hcalls will return
H_FUNCTION.
The full specification for this hypercall can be found in
docs/specs/ppc-spapr-uv-hcalls.txt
Since SVM-related hcalls like H_TPM_COMM use a reserved range of
0xEF00-0xEF80, we introduce a separate hcall table here to handle
them.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Message-Id: <20190717205842.17827-3-mdroth@linux.vnet.ibm.com>
[dwg: Corrected #include for upstream change]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-17 23:58:42 +03:00
|
|
|
#include "hw/ppc/spapr_tpm_proxy.h"
|
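The separate hcall table for the reserved 0xEF00-0xEF80 range mentioned in the H_TPM_COMM commit message can be pictured as a small array indexed by an opcode's offset from the base of the range. The following is a minimal sketch, not QEMU's implementation. Only the range comes from the commit message. The names `svm_register_hcall` and `svm_dispatch_hcall`, the opcode spacing of 4, the numeric return codes, and the choice to answer unregistered opcodes with an H_FUNCTION-style code are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Range taken from the commit message; everything else is hypothetical. */
#define SVM_SKETCH_BASE  0xEF00u
#define SVM_SKETCH_MAX   0xEF80u
#define SVM_SKETCH_STEP  4u      /* assumed spacing between opcodes */
#define SVM_SKETCH_SLOTS \
    ((SVM_SKETCH_MAX - SVM_SKETCH_BASE) / SVM_SKETCH_STEP + 1)

/* Placeholder return codes for the sketch. */
#define SKETCH_H_SUCCESS   0L
#define SKETCH_H_FUNCTION  (-2L)  /* "hcall not implemented" */

typedef long (*svm_hcall_fn)(uint64_t arg);

/* One slot per possible opcode in the reserved range; NULL means unset. */
static svm_hcall_fn svm_table[SVM_SKETCH_SLOTS];

/* An opcode is usable if it lies in the range and on the assumed stride. */
static int svm_opcode_valid(uint32_t opcode)
{
    return opcode >= SVM_SKETCH_BASE && opcode <= SVM_SKETCH_MAX &&
           (opcode - SVM_SKETCH_BASE) % SVM_SKETCH_STEP == 0;
}

static void svm_register_hcall(uint32_t opcode, svm_hcall_fn fn)
{
    assert(svm_opcode_valid(opcode));
    svm_table[(opcode - SVM_SKETCH_BASE) / SVM_SKETCH_STEP] = fn;
}

/* Look up the handler; fall back to the "not implemented" code. */
static long svm_dispatch_hcall(uint32_t opcode, uint64_t arg)
{
    svm_hcall_fn fn;

    if (!svm_opcode_valid(opcode)) {
        return SKETCH_H_FUNCTION;
    }
    fn = svm_table[(opcode - SVM_SKETCH_BASE) / SVM_SKETCH_STEP];
    return fn ? fn(arg) : SKETCH_H_FUNCTION;
}

/* Stand-in handler used only to exercise the table. */
static long sketch_handler(uint64_t arg)
{
    (void)arg;
    return SKETCH_H_SUCCESS;
}
```

Keeping these opcodes in their own table means the main hcall table does not have to grow to cover a sparse, high-numbered range.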
spapr: Implement Open Firmware client interface
The PAPR platform describes an OS environment that's presented by
a combination of a hypervisor and firmware. The features it specifies
require collaboration between the firmware and the hypervisor.
Since the beginning, the runtime component of the firmware (RTAS) has
been implemented as a 20 byte shim which simply forwards it to
a hypercall implemented in qemu. The boot time firmware component is
SLOF - but a build that's specific to qemu, and has always needed to be
updated in sync with it. Even though we've managed to limit the amount
of runtime communication we need between qemu and SLOF, there's some,
and it has become increasingly awkward to handle as we've implemented
new features.
This implements a boot time OF client interface (CI) which is
enabled by a new "x-vof" pseries machine option (stands for "Virtual Open
Firmware"). When enabled, QEMU implements the custom H_OF_CLIENT hcall
which implements Open Firmware Client Interface (OF CI). This allows
using a smaller stateless firmware which does not have to manage
the device tree.
The new "vof.bin" firmware image is included with source code under
pc-bios/. It also includes an RTAS blob.
This implements a handful of CI methods just to get -kernel/-initrd
working. In particular, this implements the device tree fetching and
simple memory allocator - "claim" (an OF CI memory allocator) and updates
"/memory@0/available" to report available memory to the client.
This implements changing some device tree properties which we know how
to deal with, the rest is ignored. To allow changes, this skips
fdt_pack() when x-vof=on as not packing the blob leaves some room for
appending.
In absence of SLOF, this assigns phandles to device tree nodes to make
device tree traversing work.
When x-vof=on, this adds "/chosen" every time QEMU (re)builds a tree.
This adds basic instances support which are managed by a hash map
ihandle -> [phandle].
Before the guest starts, the memory in use is:
0..e60 - the initial firmware
8000..10000 - stack
400000.. - kernel
3ea0000.. - initramdisk
This OF CI does not implement "interpret".
Unlike SLOF, this does not format uninitialized nvram. Instead, this
includes a disk image with pre-formatted nvram.
With this basic support, this can only boot into kernel directly.
However this is just enough for the petitboot kernel and initramdisk to
boot from any possible source. Note this requires reasonably recent guest
kernel with:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=df5be5be8735
The immediate benefit is a much faster boot time, which is especially
crucial with fully emulated early CPU bring up environments. Also this
may come in handy when/if GRUB-in-the-userspace sees the light of day.
This separates VOF and sPAPR in a hope that VOF bits may be reused by
other POWERPC boards which do not support pSeries.
This assumes potential support for booting from QEMU backends
such as blockdev or netdev without devices/drivers used.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20210625055155.2252896-1-aik@ozlabs.ru>
Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu>
[dwg: Adjusted some includes which broke compile in some more obscure
compilation setups]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-06-25 08:51:55 +03:00
|
|
|
#include "hw/ppc/vof.h"
|
2011-05-26 13:52:44 +04:00
|
|
|
|
2019-03-06 07:35:37 +03:00
|
|
|
struct SpaprVioBus;
|
|
|
|
struct SpaprPhbState;
|
|
|
|
struct SpaprNvram;
|
2019-01-10 10:09:13 +03:00
|
|
|
|
2019-03-06 07:35:37 +03:00
|
|
|
typedef struct SpaprEventLogEntry SpaprEventLogEntry;
|
|
|
|
typedef struct SpaprEventSource SpaprEventSource;
|
|
|
|
typedef struct SpaprPendingHpt SpaprPendingHpt;
|
2011-04-01 08:15:21 +04:00
|
|
|
|
2013-07-18 23:33:01 +04:00
|
|
|
#define HPTE64_V_HPTE_DIRTY 0x0000000000000040ULL
|
2015-07-02 09:23:06 +03:00
|
|
|
#define SPAPR_ENTRY_POINT 0x100
|
2013-07-18 23:33:01 +04:00
|
|
|
|
2016-06-10 03:59:02 +03:00
|
|
|
#define SPAPR_TIMEBASE_FREQ 512000000ULL
|
|
|
|
|
2017-03-07 12:23:40 +03:00
|
|
|
#define TYPE_SPAPR_RTC "spapr-rtc"
|
|
|
|
|
2020-09-16 21:25:19 +03:00
|
|
|
OBJECT_DECLARE_SIMPLE_TYPE(SpaprRtcState, SPAPR_RTC)
|
2017-03-07 12:23:40 +03:00
|
|
|
|
2019-03-06 07:35:37 +03:00
|
|
|
struct SpaprRtcState {
|
2017-03-07 12:23:40 +03:00
|
|
|
/*< private >*/
|
|
|
|
DeviceState parent_obj;
|
|
|
|
int64_t ns_offset;
|
|
|
|
};
|
|
|
|
|
2019-03-06 07:35:37 +03:00
|
|
|
typedef struct SpaprDimmState SpaprDimmState;
|
2015-07-02 09:23:04 +03:00
|
|
|
|
|
|
|
#define TYPE_SPAPR_MACHINE "spapr-machine"
|
2020-09-16 21:25:18 +03:00
|
|
|
OBJECT_DECLARE_TYPE(SpaprMachineState, SpaprMachineClass, SPAPR_MACHINE)
|
2015-07-02 09:23:07 +03:00
|
|
|
|
2017-05-12 08:46:11 +03:00
|
|
|
typedef enum {
|
|
|
|
SPAPR_RESIZE_HPT_DEFAULT = 0,
|
|
|
|
SPAPR_RESIZE_HPT_DISABLED,
|
|
|
|
SPAPR_RESIZE_HPT_ENABLED,
|
|
|
|
SPAPR_RESIZE_HPT_REQUIRED,
|
2019-03-06 07:35:37 +03:00
|
|
|
} SpaprResizeHpt;
|
2017-05-12 08:46:11 +03:00
|
|
|
|
spapr: Capabilities infrastructure
Because PAPR is a paravirtual environment access to certain CPU (or other)
facilities can be blocked by the hypervisor. PAPR provides ways to
advertise in the device tree whether or not those features are available to
the guest.
In some places we automatically determine whether to make a feature
available based on whether our host can support it, in most cases this is
based on limitations in the available KVM implementation.
Although we correctly advertise this to the guest, it means that host
factors might make changes to the guest visible environment which is bad:
as well as generally reducing reproducibility, it means that a migration
between different host environments can easily go bad.
We've mostly gotten away with it because the environments considered mature
enough to be well supported (basically, KVM on POWER8) have had consistent
feature availability. But, it's still not right and some limitations on
POWER9 are going to make it more of an issue in future.
This introduces an infrastructure for defining "sPAPR capabilities". These
are set by default based on the machine version, masked by the capabilities
of the chosen cpu, but can be overridden with machine properties.
The intention is at reset time we verify that the requested capabilities
can be supported on the host (considering TCG, KVM and/or host cpu
limitations). If not we simply fail, rather than silently modifying the
advertised featureset to the guest.
This does mean that certain configurations that "worked" may now fail, but
such configurations were already more subtly broken.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
2017-12-08 02:35:35 +03:00
|
|
|
/**
|
|
|
|
* Capabilities
|
|
|
|
*/
|
|
|
|
|
2017-12-11 05:10:44 +03:00
|
|
|
/* Hardware Transactional Memory */
|
2018-01-12 08:33:43 +03:00
|
|
|
#define SPAPR_CAP_HTM 0x00
|
2017-12-07 09:08:47 +03:00
|
|
|
/* Vector Scalar Extensions */
|
2018-01-12 08:33:43 +03:00
|
|
|
#define SPAPR_CAP_VSX 0x01
|
2017-12-11 09:34:30 +03:00
|
|
|
/* Decimal Floating Point */
|
2018-01-12 08:33:43 +03:00
|
|
|
#define SPAPR_CAP_DFP 0x02
|
2018-01-19 08:00:02 +03:00
|
|
|
/* Cache Flush on Privilege Change */
|
|
|
|
#define SPAPR_CAP_CFPC 0x03
|
2018-01-19 08:00:03 +03:00
|
|
|
/* Speculation Barrier Bounds Checking */
|
|
|
|
#define SPAPR_CAP_SBBC 0x04
|
2018-01-19 08:00:04 +03:00
|
|
|
/* Indirect Branch Serialisation */
|
|
|
|
#define SPAPR_CAP_IBS 0x05
|
2018-03-16 11:19:13 +03:00
|
|
|
/* HPT Maximum Page Size (encoded as a shift) */
|
|
|
|
#define SPAPR_CAP_HPT_MAXPAGESIZE 0x06
|
2018-10-08 06:25:39 +03:00
|
|
|
/* Nested KVM-HV */
|
|
|
|
#define SPAPR_CAP_NESTED_KVM_HV 0x07
|
2019-03-01 05:43:14 +03:00
|
|
|
/* Large Decrementer */
|
|
|
|
#define SPAPR_CAP_LARGE_DECREMENTER 0x08
|
2019-03-01 06:19:12 +03:00
|
|
|
/* Count Cache Flush Assist HW Instruction */
|
|
|
|
#define SPAPR_CAP_CCF_ASSIST 0x09
|
2020-03-16 17:26:07 +03:00
|
|
|
/* Implements PAPR FWNMI option */
|
|
|
|
#define SPAPR_CAP_FWNMI 0x0A
|
2021-07-06 14:24:40 +03:00
|
|
|
/* Support H_RPT_INVALIDATE */
|
|
|
|
#define SPAPR_CAP_RPT_INVALIDATE 0x0B
|
2018-01-12 08:33:43 +03:00
|
|
|
/* Num Caps */
|
2021-07-06 14:24:40 +03:00
|
|
|
#define SPAPR_CAP_NUM (SPAPR_CAP_RPT_INVALIDATE + 1)
|
2018-01-12 08:33:43 +03:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Capability Values
|
|
|
|
*/
|
|
|
|
/* Bool Caps */
|
|
|
|
#define SPAPR_CAP_OFF 0x00
|
|
|
|
#define SPAPR_CAP_ON 0x01
|
2019-03-01 06:19:11 +03:00
|
|
|
|
2018-03-01 09:38:02 +03:00
|
|
|
/* Custom Caps */
|
2019-03-01 06:19:11 +03:00
|
|
|
|
|
|
|
/* Generic */
|
2018-01-19 08:00:01 +03:00
|
|
|
#define SPAPR_CAP_BROKEN 0x00
|
|
|
|
#define SPAPR_CAP_WORKAROUND 0x01
|
|
|
|
#define SPAPR_CAP_FIXED 0x02
|
2019-03-01 06:19:11 +03:00
|
|
|
/* SPAPR_CAP_IBS (cap-ibs) */
|
2018-03-01 09:38:02 +03:00
|
|
|
#define SPAPR_CAP_FIXED_IBS 0x02
|
|
|
|
#define SPAPR_CAP_FIXED_CCD 0x03
|
2019-03-01 06:19:11 +03:00
|
|
|
#define SPAPR_CAP_FIXED_NA 0x10 /* Let's leave a bit of a gap... */
|
2017-12-11 09:34:30 +03:00
|
|
|
|
2021-04-08 23:40:49 +03:00
|
|
|
#define FDT_MAX_SIZE 0x200000
|
2020-03-25 18:25:42 +03:00
|
|
|
|
2021-09-20 20:49:43 +03:00
|
|
|
/* Max number of GPUs per system */
|
|
|
|
#define NVGPU_MAX_NUM 6
|
|
|
|
|
|
|
|
/* Max number of NUMA nodes */
|
|
|
|
#define NUMA_NODES_MAX_NUM (MAX_NODES + NVGPU_MAX_NUM)
|
|
|
|
|
spapr: introduce SpaprMachineState::numa_assoc_array
The next step to centralize all NUMA/associativity handling in
the spapr machine is to create a 'one stop place' for all
things ibm,associativity.
This patch introduces numa_assoc_array, a 2 dimensional array
that will store all ibm,associativity arrays of all NUMA nodes.
This array is initialized in a new spapr_numa_associativity_init()
function, called in spapr_machine_init(). It is being initialized
with the same values used in other ibm,associativity properties
around spapr files (i.e. all zeros, last value is node_id).
The idea is to remove all hardcoded definitions and FDT writes
of ibm,associativity arrays and instead call the new
spapr_numa_write_associativity_dt() helper, which will
be able to write the DT with the correct values.
We'll start small, handling the trivial cases first. The
remaining instances of ibm,associativity will be handled
next.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20200903220639.563090-2-danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-09-04 01:06:33 +03:00
|
|
|
/*
|
2021-09-20 20:49:43 +03:00
|
|
|
* NUMA FORM1 macros. FORM1_DIST_REF_POINTS was taken from
|
|
|
|
* MAX_DISTANCE_REF_POINTS in arch/powerpc/mm/numa.h from Linux
|
|
|
|
* kernel source. It represents the number of associativity domains
|
|
|
|
* for non-CPU resources.
|
2020-09-04 01:06:33 +03:00
|
|
|
*
|
2021-09-20 20:49:43 +03:00
|
|
|
* FORM1_NUMA_ASSOC_SIZE is the base array size of an ibm,associativity
|
2020-09-04 01:06:33 +03:00
|
|
|
* array for any non-CPU resource.
|
|
|
|
*/
|
2021-09-20 20:49:43 +03:00
|
|
|
#define FORM1_DIST_REF_POINTS 4
|
|
|
|
#define FORM1_NUMA_ASSOC_SIZE (FORM1_DIST_REF_POINTS + 1)
|
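To make the FORM1 sizes concrete, here is a hedged sketch of how one ibm,associativity array for a non-CPU resource might be filled in. It follows the numa_assoc_array commit message ("all zeros, last value is node_id") and additionally assumes that the first cell holds the number of domains that follow. The function name `form1_fill_assoc` is hypothetical, and the byte-order conversion a real device tree property needs is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FORM1_DIST_REF_POINTS 4
#define FORM1_NUMA_ASSOC_SIZE (FORM1_DIST_REF_POINTS + 1)

/*
 * Fill one associativity array for a non-CPU resource.
 * Layout assumed by this sketch:
 *   cell 0       number of domains that follow (assumption)
 *   cells 1..3   zero
 *   cell 4       NUMA node id
 */
static void form1_fill_assoc(uint32_t assoc[FORM1_NUMA_ASSOC_SIZE],
                             uint32_t node_id)
{
    assert(assoc != NULL);
    memset(assoc, 0, FORM1_NUMA_ASSOC_SIZE * sizeof(assoc[0]));
    assoc[0] = FORM1_DIST_REF_POINTS;
    assoc[FORM1_DIST_REF_POINTS] = node_id;
}
```

With `FORM1_DIST_REF_POINTS` at 4, the array has five cells, which is where `FORM1_NUMA_ASSOC_SIZE` comes from.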
2020-12-18 16:53:24 +03:00
|
|
|
|
spapr_numa.c: FORM2 NUMA affinity support
The main feature of FORM2 affinity support is the separation of NUMA
distances from ibm,associativity information. This allows for a more
flexible and straightforward NUMA distance assignment without relying on
complex associations between several levels of NUMA via
ibm,associativity matches. Another feature is its extensibility. This base
support contains the facilities for NUMA distance assignment, but in the
future more facilities will be added for latency, performance, bandwidth
and so on.
This patch implements the base FORM2 affinity support as follows:
- the use of FORM2 associativity is indicated by using bit 2 of byte 5
of ibm,architecture-vec-5. A FORM2 aware guest can choose to use FORM1
or FORM2 affinity. Setting both forms will default to FORM2. We're not
advertising FORM2 for pseries-6.1 and older machine versions to prevent
guest visible changes in those;
- ibm,associativity-reference-points has a new semantic. Instead of
being used to calculate distances via NUMA levels, it's now used to
indicate the primary domain index in the ibm,associativity domain of
each resource. In our case it's set to {0x4}, matching the position
where we already place logical_domain_id;
- two new RTAS DT artifacts are introduced: ibm,numa-lookup-index-table
and ibm,numa-distance-table. The index table is used to list all the
NUMA logical domains of the platform, in ascending order, and allows for
spartial NUMA configurations (although QEMU ATM doesn't support that).
ibm,numa-distance-table is an array that contains all the distances from
the first NUMA node to all other nodes, then the second NUMA node
distances to all other nodes and so on;
- get_max_dist_ref_points(), get_numa_assoc_size() and get_associativity()
now check for OV5_FORM2_AFFINITY and return FORM2 values if the guest
selected FORM2 affinity during CAS.
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20210920174947.556324-7-danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-09-20 20:49:46 +03:00
|
|
|
/*
|
|
|
|
* FORM2 NUMA affinity has a single associativity domain, giving
|
|
|
|
* us an assoc size of 2.
|
|
|
|
*/
|
|
|
|
#define FORM2_DIST_REF_POINTS 1
|
|
|
|
#define FORM2_NUMA_ASSOC_SIZE (FORM2_DIST_REF_POINTS + 1)
|
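The ibm,numa-distance-table layout described in the FORM2 commit message (all distances from the first node, then all distances from the second node, and so on) reads as a row-major flattening of the node distance matrix. The sketch below shows that reading only. `SKETCH_NODES`, `flatten_distances` and the sample distances are made up for illustration, the sketch includes each node's distance to itself, and any length prefix or encoding the real property carries is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NODES 3  /* hypothetical node count for the example */

/*
 * Copy an N x N distance matrix into a flat array, one source node
 * at a time: row 0 first, then row 1, and so on.
 * Returns the number of entries written.
 */
static size_t flatten_distances(
    const uint8_t dist[SKETCH_NODES][SKETCH_NODES],
    uint8_t *out, size_t out_len)
{
    size_t n = 0;
    size_t src, dst;

    assert(out_len >= (size_t)SKETCH_NODES * SKETCH_NODES);
    for (src = 0; src < SKETCH_NODES; src++) {
        for (dst = 0; dst < SKETCH_NODES; dst++) {
            out[n++] = dist[src][dst];
        }
    }
    return n;
}
```

In this layout the distance from node `i` to node `j` sits at index `i * SKETCH_NODES + j`, so a consumer can look it up without walking associativity levels.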
|
|
|
|
2019-03-06 07:35:37 +03:00
|
|
|
typedef struct SpaprCapabilities SpaprCapabilities;
|
|
|
|
struct SpaprCapabilities {
|
2018-01-12 08:33:43 +03:00
|
|
|
uint8_t caps[SPAPR_CAP_NUM];
|
2017-12-08 02:35:35 +03:00
|
|
|
};
|
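The capabilities commit message describes values that default from the machine version and can be overridden with machine properties. One way to picture a `caps[]` array is as a table indexed by the capability numbers, with an override layer consulted before the defaults. This is only a sketch under those assumptions: `SketchCaps`, `SKETCH_CAP_UNSET`, `sketch_effective_cap` and the reduced three-entry capability list are hypothetical, and the CPU masking and reset-time host checks from the commit message are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reduced, hypothetical capability list mirroring the header's style. */
#define SKETCH_CAP_HTM 0x00
#define SKETCH_CAP_VSX 0x01
#define SKETCH_CAP_DFP 0x02
#define SKETCH_CAP_NUM (SKETCH_CAP_DFP + 1)

#define SKETCH_CAP_OFF   0x00
#define SKETCH_CAP_ON    0x01
#define SKETCH_CAP_UNSET 0xFF  /* sketch-only marker: no override given */

typedef struct SketchCaps {
    uint8_t caps[SKETCH_CAP_NUM];
} SketchCaps;

/* Return the override if one was given, otherwise the default. */
static uint8_t sketch_effective_cap(const SketchCaps *defaults,
                                    const SketchCaps *overrides,
                                    int cap)
{
    assert(defaults != NULL && overrides != NULL);
    assert(cap >= 0 && cap < SKETCH_CAP_NUM);
    if (overrides->caps[cap] != SKETCH_CAP_UNSET) {
        return overrides->caps[cap];
    }
    return defaults->caps[cap];
}
```

Defining the count as "last capability + 1" keeps the array size in step with the list whenever a capability is appended, which is the same pattern `SPAPR_CAP_NUM` uses.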
|
|
|
|
2015-07-02 09:23:07 +03:00
|
|
|
/**
|
2019-03-06 07:35:37 +03:00
|
|
|
* SpaprMachineClass:
|
2015-07-02 09:23:07 +03:00
|
|
|
*/
|
2019-03-06 07:35:37 +03:00
|
|
|
struct SpaprMachineClass {
|
2015-07-02 09:23:07 +03:00
|
|
|
/*< private >*/
|
|
|
|
MachineClass parent_class;
|
|
|
|
|
|
|
|
/*< public >*/
|
2015-12-09 15:34:13 +03:00
|
|
|
bool dr_lmb_enabled; /* enable dynamic-reconfig/hotplug of LMBs */
|
2019-02-19 20:18:23 +03:00
|
|
|
bool dr_phb_enabled; /* enable dynamic-reconfig/hotplug of PHBs */
|
2018-12-21 03:34:48 +03:00
|
|
|
bool update_dt_enabled; /* enable KVMPPC_H_UPDATE_DT */
|
2015-12-09 15:34:13 +03:00
|
|
|
bool use_ohci_by_default; /* use USB-OHCI instead of XHCI */
|
2017-06-14 16:29:19 +03:00
|
|
|
bool pre_2_10_has_unused_icps;
|
2018-07-30 17:11:32 +03:00
|
|
|
bool legacy_irq_allocation;
|
2019-09-27 06:54:23 +03:00
|
|
|
uint32_t nr_xirqs;
|
2019-03-27 05:54:11 +03:00
|
|
|
bool broken_host_serial_model; /* present real host info to the guest */
|
2019-05-22 16:43:46 +03:00
|
|
|
bool pre_4_1_migration; /* don't migrate hpt-max-page-size */
|
2019-07-19 07:37:34 +03:00
|
|
|
bool linux_pci_probe;
|
2019-10-03 15:02:00 +03:00
|
|
|
bool smp_threads_vsmt; /* set VSMT to smp_threads by default */
|
spapr: Don't clamp RMA to 16GiB on new machine types
In spapr_machine_init() we clamp the size of the RMA to 16GiB and the
comment saying why doesn't make a whole lot of sense. In fact, this was
done because the real mode handling code elsewhere limited the RMA in TCG
mode to the maximum value configurable in LPCR[RMLS], 16GiB.
But,
* Actually LPCR[RMLS] has been able to encode a 256GiB size for a very
long time, we just didn't implement it properly in the softmmu
* LPCR[RMLS] shouldn't really be relevant anyway, it only was because we
used to abuse the RMOR based translation mode in order to handle the
fact that we're not modelling the hypervisor parts of the cpu
We've now removed those limitations in the modelling so the 16GiB clamp no
longer serves a function. However, we can't just remove the limit
universally: that would break migration to earlier qemu versions, where
the 16GiB RMLS limit still applies, no matter how bad the reasons for it
are.
So, we replace the 16GiB clamp, with a clamp to a limit defined in the
machine type class. We set it to 16 GiB for machine types 4.2 and earlier,
but set it to 0 meaning unlimited for the new 5.0 machine type.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2020-02-19 12:53:13 +03:00
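The clamping rule described in this commit message can be sketched as follows. This is a minimal illustration, not the QEMU implementation: `clamp_rma()` and the local `GiB` macro are hypothetical names, and the only behaviour taken from the text is that a limit of 0 means "unlimited".

```c
#include <stdint.h>

/* Local stand-in for a GiB constant (assumption for this sketch). */
#define GiB (1024ULL * 1024 * 1024)

/*
 * Hypothetical helper: clamp an RMA size to a per-machine-type limit.
 * A limit of 0 means "no limit", as for the 5.0 machine type above.
 */
uint64_t clamp_rma(uint64_t rma_size, uint64_t rma_limit)
{
    if (rma_limit != 0 && rma_size > rma_limit) {
        return rma_limit;
    }
    return rma_size;
}
```

With a 16 GiB limit (machine types 4.2 and earlier), a 32 GiB request is clamped to 16 GiB; with a limit of 0 it is left unchanged.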
|
|
|
hwaddr rma_limit; /* clamp the RMA to this size */
|
2020-07-17 01:56:55 +03:00
|
|
|
bool pre_5_1_assoc_refpoints;
|
2020-10-07 20:28:45 +03:00
|
|
|
bool pre_5_2_numa_associativity;
|
spapr_numa.c: FORM2 NUMA affinity support
The main feature of FORM2 affinity support is the separation of NUMA
distances from ibm,associativity information. This allows for a more
flexible and straightforward NUMA distance assignment without relying on
complex associations between several levels of NUMA via
ibm,associativity matches. Another feature is its extensibility. This base
support contains the facilities for NUMA distance assignment, but in the
future more facilities will be added for latency, performance, bandwidth
and so on.
This patch implements the base FORM2 affinity support as follows:
- the use of FORM2 associativity is indicated by using bit 2 of byte 5
of ibm,architecture-vec-5. A FORM2 aware guest can choose to use FORM1
or FORM2 affinity. Setting both forms will default to FORM2. We're not
advertising FORM2 for pseries-6.1 and older machine versions to prevent
guest visible changes in those;
- ibm,associativity-reference-points has a new semantic. Instead of
being used to calculate distances via NUMA levels, it's now used to
indicate the primary domain index in the ibm,associativity domain of
each resource. In our case it's set to {0x4}, matching the position
where we already place logical_domain_id;
- two new RTAS DT artifacts are introduced: ibm,numa-lookup-index-table
and ibm,numa-distance-table. The index table is used to list all the
NUMA logical domains of the platform, in ascending order, and allows for
sparse NUMA configurations (although QEMU ATM doesn't support that).
ibm,numa-distance-table is an array that contains all the distances from
the first NUMA node to all other nodes, then the second NUMA node
distances to all other nodes and so on;
- get_max_dist_ref_points(), get_numa_assoc_size() and get_associativity()
now check for OV5_FORM2_AFFINITY and return FORM2 values if the guest
selected FORM2 affinity during CAS.
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20210920174947.556324-7-danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-09-20 20:49:46 +03:00
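The layout of the distance array described above (node 0's distances to every node, then node 1's, and so on) amounts to a row-by-row flattening of the distance matrix. The sketch below shows only that ordering. `build_distance_table()` and `MAX_NODES` are hypothetical names, and it is an assumption of this sketch that each row includes the node's distance to itself and that no length header is emitted.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_NODES 8 /* hypothetical cap, for this sketch only */

/*
 * Flatten an nb_nodes x nb_nodes distance matrix row by row:
 * all distances from node 0, then all distances from node 1, etc.
 * 'out' must have room for nb_nodes * nb_nodes entries.
 * Returns the number of entries written.
 */
size_t build_distance_table(uint8_t dist[][MAX_NODES], size_t nb_nodes,
                            uint8_t *out)
{
    size_t n = 0;

    for (size_t src = 0; src < nb_nodes; src++) {
        for (size_t dst = 0; dst < nb_nodes; dst++) {
            out[n++] = dist[src][dst];
        }
    }
    return n;
}
```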
|
|
|
bool pre_6_2_numa_affinity;
|
2018-07-30 17:11:32 +03:00
|
|
|
|
2020-11-21 02:42:05 +03:00
|
|
|
bool (*phb_placement)(SpaprMachineState *spapr, uint32_t index,
|
2021-01-14 21:06:22 +03:00
|
|
|
uint64_t *buid, hwaddr *pio,
|
spapr_pci: Add a 64-bit MMIO window
On real hardware, and under pHyp, the PCI host bridges on Power machines
typically advertise two outbound MMIO windows from the guest's physical
memory space to PCI memory space:
- A 32-bit window which maps onto 2GiB..4GiB in the PCI address space
- A 64-bit window which maps onto a large region somewhere high in PCI
address space (traditionally this used an identity mapping from guest
physical address to PCI address, but that's not always the case)
The qemu implementation in spapr-pci-host-bridge, however, only supports a
single outbound MMIO window. At least some Linux versions expect
the two windows, so we arranged this window to map onto the PCI
memory space from 2 GiB..~64 GiB, then advertised it as two contiguous
windows, the "32-bit" window from 2G..4G and the "64-bit" window from
4G..~64G.
This approach means, however, that the 64G window is not naturally aligned.
In turn this limits the size of the largest BAR we can map (which does have
to be naturally aligned) to roughly half of the total window. With some
large nVidia GPGPU cards which have huge memory BARs, this is starting to
be a problem.
This patch adds true support for separate 32-bit and 64-bit outbound MMIO
windows to the spapr-pci-host-bridge implementation, each of which can
be independently configured. The 32-bit window always maps to 2G.. in PCI
space, but the PCI address of the 64-bit window can be configured (it
defaults to the same as the guest physical address).
So as not to break possible existing configurations, as long as a 64-bit
window is not specified, a large single window can be specified. This
will appear the same way to the guest as the old approach, although it's
now implemented by two contiguous memory regions rather than a single one.
For now, this only adds the possibility of 64-bit windows. The default
configuration still uses the legacy mode.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2016-10-11 06:23:33 +03:00
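The alignment limitation described above can be checked with a small calculation: find the largest power-of-two region that fits inside a window while starting at a multiple of its own size. This is an illustrative sketch only. `largest_aligned_bar()` is a hypothetical helper, not part of spapr-pci-host-bridge, and treating the "~64 GiB" window end as exactly 64 GiB is a simplification.

```c
#include <stdint.h>

/*
 * Return the size of the largest naturally aligned power-of-two region
 * that fits in the window [start, end), or 0 if none fits.
 */
uint64_t largest_aligned_bar(uint64_t start, uint64_t end)
{
    for (uint64_t size = 1ULL << 63; size != 0; size >>= 1) {
        /* Round 'start' up to the next multiple of 'size'. */
        uint64_t base = start & ~(size - 1);

        if (base < start) {
            if (base > UINT64_MAX - size) {
                continue; /* rounding up would overflow */
            }
            base += size;
        }
        if (base < end && end - base >= size) {
            return size;
        }
    }
    return 0;
}
```

For a single window covering 2 GiB..64 GiB this yields 32 GiB (placed at 32 GiB), roughly half of the window, whereas a window starting at a 64 GiB-aligned address can hold a full 64 GiB region.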
|
|
|
hwaddr *mmio32, hwaddr *mmio64,
|
spapr: Support NVIDIA V100 GPU with NVLink2
NVIDIA V100 GPUs have on-board RAM which is mapped into the host memory
space and accessible as normal RAM via an NVLink bus. The VFIO-PCI driver
implements special regions for such GPUs and emulates an NVLink bridge.
NVLink2-enabled POWER9 CPUs also provide address translation services
which include an ATS shootdown (ATSD) register exported via the NVLink
bridge device.
This adds a quirk to VFIO to map the GPU memory and create an MR;
the new MR is stored in a PCI device as a QOM link. The sPAPR PCI uses
this to get the MR and map it to the system address space.
Another quirk does the same for ATSD.
This adds additional steps to sPAPR PHB setup:
1. Search for specific GPUs and NPUs, collect findings in
sPAPRPHBState::nvgpus, manage system address space mappings;
2. Add device-specific properties such as "ibm,npu", "ibm,gpu",
"memory-block", "link-speed" to advertise the NVLink2 function to
the guest;
3. Add "mmio-atsd" to vPHB to advertise the ATSD capability;
4. Add new memory blocks (with extra "linux,memory-usable" to prevent
the guest OS from accessing the new memory until it is onlined) and
npuphb# nodes representing an NPU unit for every vPHB as the GPU driver
uses it for link discovery.
This allocates space for GPU RAM and ATSD like we do for MMIOs by
adding 2 new parameters to the phb_placement() hook. Older machine types
set these to zero.
This puts new memory nodes in a separate NUMA node, as the GPU RAM
needs to be configured equally distant from any other node in the system.
Unlike the host setup which assigns numa ids from 255 downwards, this
adds new NUMA nodes after the user configures nodes or from 1 if none
were configured.
This adds a requirement similar to EEH - one IOMMU group per vPHB.
The reason for this is that ATSD registers belong to a physical NPU
so they cannot invalidate translations on GPUs attached to another NPU.
It is guaranteed by the host platform as it does not mix NVLink bridges
or GPUs from different NPU in the same IOMMU group. If more than one
IOMMU group is detected on a vPHB, this disables ATSD support for that
vPHB and prints a warning.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[aw: for vfio portions]
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Message-Id: <20190312082103.130561-1-aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 11:21:03 +03:00
|
|
|
unsigned n_dma, uint32_t *liobns, hwaddr *nv2gpa,
|
|
|
|
hwaddr *nv2atsd, Error **errp);
|
2019-03-06 07:35:37 +03:00
|
|
|
SpaprResizeHpt resize_hpt_default;
|
|
|
|
SpaprCapabilities default_caps;
|
|
|
|
SpaprIrq *irq;
|
2015-07-02 09:23:07 +03:00
|
|
|
};
|
2015-07-02 09:23:04 +03:00
|
|
|
|
2022-06-22 08:10:08 +03:00
|
|
|
#define WDT_MAX_WATCHDOGS 4 /* Maximum number of watchdog devices */
|
|
|
|
|
|
|
|
#define TYPE_SPAPR_WDT "spapr-wdt"
|
|
|
|
OBJECT_DECLARE_SIMPLE_TYPE(SpaprWatchdog, SPAPR_WDT)
|
|
|
|
|
|
|
|
typedef struct SpaprWatchdog {
|
|
|
|
/*< private >*/
|
|
|
|
DeviceState parent_obj;
|
|
|
|
/*< public >*/
|
|
|
|
|
|
|
|
QEMUTimer timer;
|
|
|
|
uint8_t action; /* One of PSERIES_WDTF_ACTION_xxx */
|
|
|
|
uint8_t leave_others; /* leaveOtherWatchdogsRunningOnTimeout */
|
|
|
|
} SpaprWatchdog;
|
|
|
|
|
2015-07-02 09:23:04 +03:00
|
|
|
/**
|
2019-03-06 07:35:37 +03:00
|
|
|
* SpaprMachineState:
|
2015-07-02 09:23:04 +03:00
|
|
|
*/
|
2019-03-06 07:35:37 +03:00
|
|
|
struct SpaprMachineState {
|
2015-07-02 09:23:04 +03:00
|
|
|
/*< private >*/
|
|
|
|
MachineState parent_obj;
|
|
|
|
|
2019-03-06 07:35:37 +03:00
|
|
|
struct SpaprVioBus *vio_bus;
|
|
|
|
QLIST_HEAD(, SpaprPhbState) phbs;
|
|
|
|
struct SpaprNvram *nvram;
|
|
|
|
SpaprRtcState rtc;
|
Delay creation of pseries device tree until reset
At present, the 'pseries' machine creates a flattened device tree in the
machine->init function to pass to either the guest kernel or to firmware.
However, the machine->init function runs before processing of -device
command line options, which means that the device tree so created will
be (incorrectly) missing devices specified that way.
Supplying a correct device tree is, in any case, part of the required
platform entry conditions. Therefore, this patch moves the creation and
loading of the device tree from machine->init to a reset callback. The
setup of entry point address and initial register state moves with it,
which leads to a slight cleanup.
This is not, alas, quite enough to make a fully working reset for pseries.
For that we would need to reload the firmware images, which on this
machine are loaded into RAM. It's a step in the right direction, though.
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-04-05 09:12:10 +04:00
|
|
|
|
2019-03-06 07:35:37 +03:00
|
|
|
SpaprResizeHpt resize_hpt;
|
2011-04-05 09:12:10 +04:00
|
|
|
void *htab;
|
2013-07-18 23:33:01 +04:00
|
|
|
uint32_t htab_shift;
|
2021-02-25 06:23:35 +03:00
|
|
|
uint64_t patb_entry; /* Process tbl registered in H_REGISTER_PROC_TBL */
|
2019-03-06 07:35:37 +03:00
|
|
|
SpaprPendingHpt *pending_hpt; /* in-progress resize */
|
pseries: Implement HPT resizing
This patch implements hypercalls allowing a PAPR guest to resize its own
hash page table. This will eventually allow for more flexible memory
hotplug.
The implementation is partially asynchronous, handled in a special thread
running the hpt_prepare_thread() function. The state of a pending resize
is stored in SPAPR_MACHINE->pending_hpt.
The H_RESIZE_HPT_PREPARE hypercall will kick off creation of a new HPT, or,
if one is already in progress, monitor it for completion. If there is an
existing HPT resize in progress that doesn't match the size specified in
the call, it will cancel it, replacing it with a new one matching the
given size.
The H_RESIZE_HPT_COMMIT completes transition to a resized HPT, and can only
be called successfully once H_RESIZE_HPT_PREPARE has successfully
completed initialization of a new HPT. The guest must ensure that there
are no concurrent accesses to the existing HPT while this is called (this
effectively means stop_machine() for Linux guests).
For now H_RESIZE_HPT_COMMIT goes through the whole old HPT, rehashing each
HPTE into the new HPT. This can have quite high latency, but it seems to
be of the order of typical migration downtime latencies for HPTs of size
up to ~2GiB (which would be used in a 256GiB guest).
In future we probably want to move more of the rehashing to the "prepare"
phase, by having H_ENTER and other hcalls update both current and
pending HPTs. That's a project for another day, but should be possible
without any changes to the guest interface.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-05-12 08:46:49 +03:00
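The prepare/commit protocol described above can be sketched as a small state machine. This is a simplified model under stated assumptions, not the QEMU code: the `PendingResize` type, the function names and the return codes are hypothetical, and the worker thread is replaced by a `complete` flag that the caller sets once the new HPT would be ready.

```c
#include <stdbool.h>

/* Hypothetical return codes for this sketch. */
enum { RESIZE_SUCCESS = 0, RESIZE_BUSY = 1, RESIZE_NOT_READY = -1 };

typedef struct {
    bool active;   /* a prepare has been started */
    bool complete; /* the new HPT has been fully initialized */
    int shift;     /* log2 of the requested HPT size */
} PendingResize;

/* Start a resize, or poll one that is already in progress. */
int resize_prepare(PendingResize *p, int shift)
{
    if (p->active && p->shift != shift) {
        p->active = false; /* cancel the mismatched pending resize */
    }
    if (!p->active) {
        p->active = true;
        p->complete = false;
        p->shift = shift;
        return RESIZE_BUSY; /* work has only just been kicked off */
    }
    return p->complete ? RESIZE_SUCCESS : RESIZE_BUSY;
}

/* Switch to the new HPT; only valid after a successful prepare. */
int resize_commit(PendingResize *p, int shift)
{
    if (!p->active || !p->complete || p->shift != shift) {
        return RESIZE_NOT_READY;
    }
    p->active = false; /* the pending HPT becomes the current one */
    return RESIZE_SUCCESS;
}
```

A guest would call the prepare step repeatedly until it reports success and only then commit; a prepare with a different size replaces the pending request.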
|
|
|
|
2012-10-23 14:30:10 +04:00
|
|
|
hwaddr rma_size;
|
2018-12-21 03:34:48 +03:00
|
|
|
uint32_t fdt_size;
|
|
|
|
uint32_t fdt_initial_size;
|
|
|
|
void *fdt_blob;
|
2016-10-20 07:31:45 +03:00
|
|
|
long kernel_size;
|
|
|
|
bool kernel_le;
|
2020-02-03 06:29:42 +03:00
|
|
|
uint64_t kernel_addr;
|
2016-10-20 07:31:45 +03:00
|
|
|
uint32_t initrd_base;
|
|
|
|
long initrd_size;
|
spapr: Implement Open Firmware client interface
The PAPR platform describes an OS environment that's presented by
a combination of a hypervisor and firmware. The features it specifies
require collaboration between the firmware and the hypervisor.
Since the beginning, the runtime component of the firmware (RTAS) has
been implemented as a 20 byte shim which simply forwards it to
a hypercall implemented in qemu. The boot time firmware component is
SLOF - but a build that's specific to qemu, and has always needed to be
updated in sync with it. Even though we've managed to limit the amount
of runtime communication we need between qemu and SLOF, there's some,
and it has become increasingly awkward to handle as we've implemented
new features.
This implements a boot time OF client interface (CI) which is
enabled by a new "x-vof" pseries machine option (stands for "Virtual Open
Firmware"). When enabled, QEMU implements the custom H_OF_CLIENT hcall
which implements Open Firmware Client Interface (OF CI). This allows
using a smaller stateless firmware which does not have to manage
the device tree.
The new "vof.bin" firmware image is included with source code under
pc-bios/. It also includes RTAS blob.
This implements a handful of CI methods just to get -kernel/-initrd
working. In particular, this implements the device tree fetching and
simple memory allocator - "claim" (an OF CI memory allocator) and updates
"/memory@0/available" to inform the client about available memory.
This implements changing some device tree properties which we know how
to deal with, the rest is ignored. To allow changes, this skips
fdt_pack() when x-vof=on as not packing the blob leaves some room for
appending.
In absence of SLOF, this assigns phandles to device tree nodes to make
device tree traversing work.
When x-vof=on, this adds "/chosen" every time QEMU (re)builds a tree.
This adds basic support for instances, which are managed by a hash map
ihandle -> [phandle].
Before the guest starts, the memory in use is:
0..e60 - the initial firmware
8000..10000 - stack
400000.. - kernel
3ea0000.. - initramdisk
This OF CI does not implement "interpret".
Unlike SLOF, this does not format uninitialized nvram. Instead, this
includes a disk image with pre-formatted nvram.
With this basic support, this can only boot into kernel directly.
However, this is just enough for the petitboot kernel and initramdisk to
boot from any possible source. Note this requires a reasonably recent guest
kernel with:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=df5be5be8735
The immediate benefit is a much faster boot time, which is especially
crucial with fully emulated early CPU bring-up environments. Also, this
may come in handy when/if GRUB-in-the-userspace sees the light of day.
This separates VOF and sPAPR in a hope that VOF bits may be reused by
other POWERPC boards which do not support pSeries.
This assumes potential support for booting from QEMU backends
such as blockdev or netdev without devices/drivers used.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20210625055155.2252896-1-aik@ozlabs.ru>
Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu>
[dwg: Adjusted some includes which broke compile in some more obscure
compilation setups]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-06-25 08:51:55 +03:00
|
|
|
Vof *vof;
|
2015-02-06 06:55:52 +03:00
|
|
|
uint64_t rtc_offset; /* Now used only during incoming migration */
|
2014-05-01 14:37:09 +04:00
|
|
|
struct PPCTimebase tb;
|
2022-05-07 08:48:26 +03:00
|
|
|
bool want_stdout_path;
|
2017-08-18 08:50:22 +03:00
|
|
|
uint32_t vsmt; /* Virtual SMT mode (KVM's "core stride") */
|
2012-10-08 22:17:39 +04:00
|
|
|
|
2022-02-18 10:34:14 +03:00
|
|
|
/* Nested HV support (TCG only) */
|
|
|
|
uint64_t nested_ptcr;
|
|
|
|
|
2012-10-08 22:17:39 +04:00
|
|
|
Notifier epow_notifier;
|
2019-03-06 07:35:37 +03:00
|
|
|
QTAILQ_HEAD(, SpaprEventLogEntry) pending_events;
|
2016-10-27 05:20:26 +03:00
|
|
|
bool use_hotplug_event_source;
|
2019-03-06 07:35:37 +03:00
|
|
|
SpaprEventSource *event_sources;
|
2013-07-18 23:33:01 +04:00
|
|
|
|
2017-06-11 15:33:59 +03:00
|
|
|
/* ibm,client-architecture-support option negotiation */
|
2019-08-28 06:59:27 +03:00
|
|
|
bool cas_pre_isa3_guest;
|
2019-03-06 07:35:37 +03:00
|
|
|
SpaprOptionVector *ov5; /* QEMU-supported option vectors */
|
|
|
|
SpaprOptionVector *ov5_cas; /* negotiated (via CAS) option vectors */
|
2017-06-11 15:33:59 +03:00
|
|
|
uint32_t max_compat_pvr;
|
|
|
|
|
2013-07-18 23:33:01 +04:00
|
|
|
/* Migration state */
|
|
|
|
int htab_save_index;
|
|
|
|
bool htab_first_pass;
|
2013-07-18 23:33:03 +04:00
|
|
|
int htab_fd;
|
2015-05-07 08:33:48 +03:00
|
|
|
|
2017-05-24 10:01:48 +03:00
|
|
|
/* Pending DIMM unplug cache. It is populated when an LMB
|
|
|
|
* unplug starts. It can be regenerated if a migration
|
|
|
|
* occurs during the unplug process. */
|
2019-03-06 07:35:37 +03:00
|
|
|
QTAILQ_HEAD(, SpaprDimmState) pending_dimm_unplugs;
|
2017-05-24 10:01:48 +03:00
|
|
|
|
2020-03-16 17:26:07 +03:00
|
|
|
/* State related to FWNMI option */
|
|
|
|
|
2020-03-16 17:26:08 +03:00
|
|
|
/* System Reset and Machine Check Notification Routine addresses
|
2020-03-16 17:26:07 +03:00
|
|
|
* registered by "ibm,nmi-register" RTAS call.
|
|
|
|
*/
|
2020-03-16 17:26:08 +03:00
|
|
|
target_ulong fwnmi_system_reset_addr;
|
2020-03-16 17:26:07 +03:00
|
|
|
target_ulong fwnmi_machine_check_addr;
|
|
|
|
|
|
|
|
/* Machine Check FWNMI synchronization, fwnmi_machine_check_interlock is
|
|
|
|
* set to -1 if a FWNMI machine check is not in progress, else is set to
|
|
|
|
* the CPU that was delivered the machine check, and is set back to -1
|
|
|
|
* when that CPU makes an "ibm,nmi-interlock" RTAS call. The cond is used
|
|
|
|
* to synchronize other CPUs.
|
2020-01-30 21:44:19 +03:00
|
|
|
*/
|
2020-03-16 17:26:07 +03:00
|
|
|
int fwnmi_machine_check_interlock;
|
|
|
|
QemuCond fwnmi_machine_check_interlock_cond;
|
2020-01-30 21:44:19 +03:00
|
|
|
|
2021-05-21 19:07:35 +03:00
|
|
|
/* Set by -boot */
|
|
|
|
char *boot_device;
|
|
|
|
|
2015-07-02 09:23:04 +03:00
|
|
|
/*< public >*/
|
|
|
|
char *kvm_type;
|
2019-02-18 21:13:49 +03:00
|
|
|
char *host_model;
|
|
|
|
char *host_serial;
|
2017-02-27 17:29:28 +03:00
|
|
|
|
2018-07-30 17:11:32 +03:00
|
|
|
int32_t irq_map_nr;
|
|
|
|
unsigned long *irq_map;
|
2019-03-06 07:35:37 +03:00
|
|
|
SpaprIrq *irq;
|
2019-01-02 08:57:40 +03:00
|
|
|
qemu_irq *qirqs;
|
2019-09-26 08:41:39 +03:00
|
|
|
SpaprInterruptController *active_intc;
|
|
|
|
ICSState *ics;
|
|
|
|
SpaprXive *xive;
|
spapr: Capabilities infrastructure
Because PAPR is a paravirtual environment access to certain CPU (or other)
facilities can be blocked by the hypervisor. PAPR provides ways to
advertise in the device tree whether or not those features are available to
the guest.
In some places we automatically determine whether to make a feature
available based on whether our host can support it, in most cases this is
based on limitations in the available KVM implementation.
Although we correctly advertise this to the guest, it means that host
factors might make changes to the guest visible environment which is bad:
as well as generally reducing reproducibility, it means that a migration
between different host environments can easily go bad.
We've mostly gotten away with it because the environments considered mature
enough to be well supported (basically, KVM on POWER8) have had consistent
feature availability. But, it's still not right and some limitations on
POWER9 are going to make it more of an issue in future.
This introduces an infrastructure for defining "sPAPR capabilities". These
are set by default based on the machine version, masked by the capabilities
of the chosen cpu, but can be overridden with machine properties.
The intention is at reset time we verify that the requested capabilities
can be supported on the host (considering TCG, KVM and/or host cpu
limitations). If not we simply fail, rather than silently modifying the
advertised featureset to the guest.
This does mean that certain configurations that "worked" may now fail, but
such configurations were already more subtly broken.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
2017-12-08 02:35:35 +03:00
|
|
|
|
2018-01-12 08:33:43 +03:00
|
|
|
bool cmd_line_caps[SPAPR_CAP_NUM];
|
2019-03-06 07:35:37 +03:00
|
|
|
SpaprCapabilities def, eff, mig;
|
spapr: Support NVIDIA V100 GPU with NVLink2
NVIDIA V100 GPUs have on-board RAM which is mapped into the host memory
space and accessible as normal RAM via an NVLink bus. The VFIO-PCI driver
implements special regions for such GPUs and emulates an NVLink bridge.
NVLink2-enabled POWER9 CPUs also provide address translation services
which include an ATS shootdown (ATSD) register exported via the NVLink
bridge device.
This adds a quirk to VFIO to map the GPU memory and create an MR;
the new MR is stored in a PCI device as a QOM link. The sPAPR PCI uses
this to get the MR and map it to the system address space.
Another quirk does the same for ATSD.
This adds additional steps to sPAPR PHB setup:
1. Search for specific GPUs and NPUs, collect findings in
sPAPRPHBState::nvgpus, manage system address space mappings;
2. Add device-specific properties such as "ibm,npu", "ibm,gpu",
"memory-block", "link-speed" to advertise the NVLink2 function to
the guest;
3. Add "mmio-atsd" to vPHB to advertise the ATSD capability;
4. Add new memory blocks (with extra "linux,memory-usable" to prevent
the guest OS from accessing the new memory until it is onlined) and
npuphb# nodes representing an NPU unit for every vPHB as the GPU driver
uses it for link discovery.
This allocates space for GPU RAM and ATSD like we do for MMIOs by
adding 2 new parameters to the phb_placement() hook. Older machine types
set these to zero.
This puts new memory nodes in a separate NUMA node as the GPU RAM
needs to be configured equally distant from any other node in the system.
Unlike the host setup which assigns numa ids from 255 downwards, this
adds new NUMA nodes after the user configures nodes or from 1 if none
were configured.
This adds a requirement similar to EEH - one IOMMU group per vPHB.
The reason for this is that ATSD registers belong to a physical NPU
so they cannot invalidate translations on GPUs attached to another NPU.
It is guaranteed by the host platform as it does not mix NVLink bridges
or GPUs from different NPUs in the same IOMMU group. If more than one
IOMMU group is detected on a vPHB, this disables ATSD support for that
vPHB and prints a warning.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[aw: for vfio portions]
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Message-Id: <20190312082103.130561-1-aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 11:21:03 +03:00
|
|
|
|
|
|
|
unsigned gpu_numa_id;
|
spapr: initial implementation for H_TPM_COMM/spapr-tpm-proxy
This implements the H_TPM_COMM hypercall, which is used by an
Ultravisor to pass TPM commands directly to the host's TPM device, or
a TPM Resource Manager associated with the device.
This also introduces a new virtual device, spapr-tpm-proxy, which
is used to configure the host TPM path to be used to service
requests sent by H_TPM_COMM hcalls, for example:
-device spapr-tpm-proxy,id=tpmp0,host-path=/dev/tpmrm0
By default, no spapr-tpm-proxy will be created, and hcalls will return
H_FUNCTION.
The full specification for this hypercall can be found in
docs/specs/ppc-spapr-uv-hcalls.txt
Since SVM-related hcalls like H_TPM_COMM use a reserved range of
0xEF00-0xEF80, we introduce a separate hcall table here to handle
them.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Message-Id: <20190717205842.17827-3-mdroth@linux.vnet.ibm.com>
[dwg: Corrected #include for upstream change]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-17 23:58:42 +03:00
|
|
|
SpaprTpmProxy *tpm_proxy;
|
2020-01-30 21:44:22 +03:00
|
|
|
|
2021-09-20 20:49:44 +03:00
|
|
|
uint32_t FORM1_assoc_array[NUMA_NODES_MAX_NUM][FORM1_NUMA_ASSOC_SIZE];
|
spapr_numa.c: FORM2 NUMA affinity support
The main feature of FORM2 affinity support is the separation of NUMA
distances from ibm,associativity information. This allows for a more
flexible and straightforward NUMA distance assignment without relying on
complex associations between several levels of NUMA via
ibm,associativity matches. Another feature is its extensibility. This base
support contains the facilities for NUMA distance assignment, but in the
future more facilities will be added for latency, performance, bandwidth
and so on.
This patch implements the base FORM2 affinity support as follows:
- the use of FORM2 associativity is indicated by using bit 2 of byte 5
of ibm,architecture-vec-5. A FORM2 aware guest can choose to use FORM1
or FORM2 affinity. Setting both forms will default to FORM2. We're not
advertising FORM2 for pseries-6.1 and older machine versions to prevent
guest visible changes in those;
- ibm,associativity-reference-points has a new semantic. Instead of
being used to calculate distances via NUMA levels, it's now used to
indicate the primary domain index in the ibm,associativity domain of
each resource. In our case it's set to {0x4}, matching the position
where we already place logical_domain_id;
- two new RTAS DT artifacts are introduced: ibm,numa-lookup-index-table
and ibm,numa-distance-table. The index table is used to list all the
NUMA logical domains of the platform, in ascending order, and allows for
sparse NUMA configurations (although QEMU ATM doesn't support that).
ibm,numa-distance-table is an array that contains all the distances from
the first NUMA node to all other nodes, then the second NUMA node
distances to all other nodes and so on;
- get_max_dist_ref_points(), get_numa_assoc_size() and get_associativity()
now check for OV5_FORM2_AFFINITY and return FORM2 values if the guest
selected FORM2 affinity during CAS.
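The layout described for ibm,numa-distance-table (all distances from the first node, then all distances from the second node, and so on) amounts to a row-major flattening of the distance matrix. The following is only a minimal sketch of that ordering; the function name, EXAMPLE_MAX_NODES and the one-byte distance type are assumptions made for illustration, and any length prefix or other encoding detail of the real property is deliberately left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bound for this sketch only, not a QEMU constant. */
#define EXAMPLE_MAX_NODES 4

/*
 * Copy an nb_nodes x nb_nodes distance matrix into 'out' in row-major
 * order: node 0 to every node, then node 1 to every node, and so on.
 * 'out' must have room for nb_nodes * nb_nodes entries.
 * Returns the number of entries written.
 */
static size_t example_flatten_distances(
    const uint8_t dist[][EXAMPLE_MAX_NODES], size_t nb_nodes, uint8_t *out)
{
    size_t src, dst, i = 0;

    for (src = 0; src < nb_nodes; src++) {
        for (dst = 0; dst < nb_nodes; dst++) {
            out[i++] = dist[src][dst];
        }
    }
    return i;
}
```

With two nodes whose local distance is 10 and remote distance is 20, the flattened table would read 10, 20, 20, 10.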
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20210920174947.556324-7-danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-09-20 20:49:46 +03:00
|
|
|
uint32_t FORM2_assoc_array[NUMA_NODES_MAX_NUM][FORM2_NUMA_ASSOC_SIZE];
|
spapr: introduce SpaprMachineState::numa_assoc_array
The next step to centralize all NUMA/associativity handling in
the spapr machine is to create a 'one stop place' for all
things ibm,associativity.
This patch introduces numa_assoc_array, a 2 dimensional array
that will store all ibm,associativity arrays of all NUMA nodes.
This array is initialized in a new spapr_numa_associativity_init()
function, called in spapr_machine_init(). It is being initialized
with the same values used in other ibm,associativity properties
around spapr files (i.e. all zeros, last value is node_id).
The idea is to remove all hardcoded definitions and FDT writes
of ibm,associativity arrays, calling instead the new
spapr_numa_write_associativity_dt() helper, which will
be able to write the DT with the correct values.
We'll start small, handling the trivial cases first. The
remaining instances of ibm,associativity will be handled
next.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20200903220639.563090-2-danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-09-04 01:06:33 +03:00
|
|
|
|
2020-01-30 21:44:22 +03:00
|
|
|
Error *fwnmi_migration_blocker;
|
2022-06-22 08:10:08 +03:00
|
|
|
|
|
|
|
SpaprWatchdog wds[WDT_MAX_WATCHDOGS];
|
2015-07-02 09:23:04 +03:00
|
|
|
};
|
2011-04-01 08:15:20 +04:00
|
|
|
|
|
|
|
#define H_SUCCESS 0
|
|
|
|
#define H_BUSY 1 /* Hardware busy -- retry later */
|
|
|
|
#define H_CLOSED 2 /* Resource closed */
|
|
|
|
#define H_NOT_AVAILABLE 3
|
|
|
|
#define H_CONSTRAINED 4 /* Resource request constrained to max allowed */
|
|
|
|
#define H_PARTIAL 5
|
|
|
|
#define H_IN_PROGRESS 14 /* Kind of like busy */
|
|
|
|
#define H_PAGE_REGISTERED 15
|
|
|
|
#define H_PARTIAL_STORE 16
|
|
|
|
#define H_PENDING 17 /* returned from H_POLL_PENDING */
|
|
|
|
#define H_CONTINUE 18 /* Returned from H_Join on success */
|
|
|
|
#define H_LONG_BUSY_START_RANGE 9900 /* Start of long busy range */
|
|
|
|
#define H_LONG_BUSY_ORDER_1_MSEC 9900 /* Long busy, hint that 1msec \
|
|
|
|
is a good time to retry */
|
|
|
|
#define H_LONG_BUSY_ORDER_10_MSEC 9901 /* Long busy, hint that 10msec \
|
|
|
|
is a good time to retry */
|
|
|
|
#define H_LONG_BUSY_ORDER_100_MSEC 9902 /* Long busy, hint that 100msec \
|
|
|
|
is a good time to retry */
|
|
|
|
#define H_LONG_BUSY_ORDER_1_SEC 9903 /* Long busy, hint that 1sec \
|
|
|
|
is a good time to retry */
|
|
|
|
#define H_LONG_BUSY_ORDER_10_SEC 9904 /* Long busy, hint that 10sec \
|
|
|
|
is a good time to retry */
|
|
|
|
#define H_LONG_BUSY_ORDER_100_SEC 9905 /* Long busy, hint that 100sec \
|
|
|
|
is a good time to retry */
|
|
|
|
#define H_LONG_BUSY_END_RANGE 9905 /* End of long busy range */
|
|
|
|
#define H_HARDWARE -1 /* Hardware error */
|
|
|
|
#define H_FUNCTION -2 /* Function not supported */
|
|
|
|
#define H_PRIVILEGE -3 /* Caller not privileged */
|
|
|
|
#define H_PARAMETER -4 /* Parameter invalid, out-of-range or conflicting */
|
|
|
|
#define H_BAD_MODE -5 /* Illegal msr value */
|
|
|
|
#define H_PTEG_FULL -6 /* PTEG is full */
|
|
|
|
#define H_NOT_FOUND -7 /* PTE was not found */
|
|
|
|
#define H_RESERVED_DABR -8 /* DABR address is reserved by the hypervisor on this processor */
|
|
|
|
#define H_NO_MEM -9
|
|
|
|
#define H_AUTHORITY -10
|
|
|
|
#define H_PERMISSION -11
|
|
|
|
#define H_DROPPED -12
|
|
|
|
#define H_SOURCE_PARM -13
|
|
|
|
#define H_DEST_PARM -14
|
|
|
|
#define H_REMOTE_PARM -15
|
|
|
|
#define H_RESOURCE -16
|
|
|
|
#define H_ADAPTER_PARM -17
|
|
|
|
#define H_RH_PARM -18
|
|
|
|
#define H_RCQ_PARM -19
|
|
|
|
#define H_SCQ_PARM -20
|
|
|
|
#define H_EQ_PARM -21
|
|
|
|
#define H_RT_PARM -22
|
|
|
|
#define H_ST_PARM -23
|
|
|
|
#define H_SIGT_PARM -24
|
|
|
|
#define H_TOKEN_PARM -25
|
|
|
|
#define H_MLENGTH_PARM -27
|
|
|
|
#define H_MEM_PARM -28
|
|
|
|
#define H_MEM_ACCESS_PARM -29
|
|
|
|
#define H_ATTR_PARM -30
|
|
|
|
#define H_PORT_PARM -31
|
|
|
|
#define H_MCG_PARM -32
|
|
|
|
#define H_VL_PARM -33
|
|
|
|
#define H_TSIZE_PARM -34
|
|
|
|
#define H_TRACE_PARM -35
|
|
|
|
|
|
|
|
#define H_MASK_PARM -37
|
|
|
|
#define H_MCG_FULL -38
|
|
|
|
#define H_ALIAS_EXIST -39
|
|
|
|
#define H_P_COUNTER -40
|
|
|
|
#define H_TABLE_FULL -41
|
|
|
|
#define H_ALT_TABLE -42
|
|
|
|
#define H_MR_CONDITION -43
|
|
|
|
#define H_NOT_ENOUGH_RESOURCES -44
|
|
|
|
#define H_R_STATE -45
|
|
|
|
#define H_RESCINDEND -46
|
2013-08-19 15:04:20 +04:00
|
|
|
#define H_P2 -55
|
|
|
|
#define H_P3 -56
|
|
|
|
#define H_P4 -57
|
|
|
|
#define H_P5 -58
|
|
|
|
#define H_P6 -59
|
|
|
|
#define H_P7 -60
|
|
|
|
#define H_P8 -61
|
|
|
|
#define H_P9 -62
|
2022-06-22 08:10:08 +03:00
|
|
|
#define H_NOOP -63
|
2022-02-18 10:34:14 +03:00
|
|
|
#define H_UNSUPPORTED -67
|
2020-02-10 07:56:42 +03:00
|
|
|
#define H_OVERLAP -68
|
2013-08-19 15:04:20 +04:00
|
|
|
#define H_UNSUPPORTED_FLAG -256
|
2011-04-01 08:15:20 +04:00
|
|
|
#define H_MULTI_THREADS_ACTIVE -9005
|
|
|
|
|
|
|
|
|
|
|
|
/* Long Busy is a condition that can be returned by the firmware
|
|
|
|
* when a call cannot be completed now, but the identical call
|
|
|
|
* should be retried later. This prevents calls blocking in the
|
|
|
|
* firmware for long periods of time. Annoyingly the firmware can return
|
|
|
|
* a range of return codes, hinting at how long we should wait before
|
|
|
|
* retrying. If you don't care for the hint, the macro below is a good
|
|
|
|
* way to check for the long_busy return codes
|
|
|
|
*/
|
|
|
|
#define H_IS_LONG_BUSY(x) (((x) >= H_LONG_BUSY_START_RANGE) \
|
|
|
|
&& ((x) <= H_LONG_BUSY_END_RANGE))
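For callers that do want the hint, the long-busy codes listed earlier step by a factor of ten: 9900 suggests 1 msec, 9901 suggests 10 msec, up to 9905 for 100 sec. A minimal sketch of turning a return code into a retry delay follows; the helper name is hypothetical and nothing like it is claimed to exist in this header.

```c
#include <stdint.h>

#define H_LONG_BUSY_START_RANGE 9900
#define H_LONG_BUSY_END_RANGE   9905
#define H_IS_LONG_BUSY(x) (((x) >= H_LONG_BUSY_START_RANGE) \
                           && ((x) <= H_LONG_BUSY_END_RANGE))

/*
 * Hypothetical helper: suggested retry delay in milliseconds for a
 * long-busy return code, or 0 if 'rc' is not in the long-busy range.
 */
static uint64_t example_long_busy_hint_ms(long rc)
{
    uint64_t ms = 1;
    long i;

    if (!H_IS_LONG_BUSY(rc)) {
        return 0;
    }
    /* Each step above the start of the range multiplies the hint by 10. */
    for (i = H_LONG_BUSY_START_RANGE; i < rc; i++) {
        ms *= 10;
    }
    return ms;
}
```

A caller could sleep for the returned number of milliseconds before reissuing the identical call, and treat a result of 0 as "not a long-busy code".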
|
|
|
|
|
|
|
|
/* Flags */
|
|
|
|
#define H_LARGE_PAGE (1ULL<<(63-16))
|
|
|
|
#define H_EXACT (1ULL<<(63-24)) /* Use exact PTE or return H_PTEG_FULL */
|
|
|
|
#define H_R_XLATE (1ULL<<(63-25)) /* include a valid logical page num in the pte if the valid bit is set */
|
|
|
|
#define H_READ_4 (1ULL<<(63-26)) /* Return 4 PTEs */
|
|
|
|
#define H_PAGE_STATE_CHANGE (1ULL<<(63-28))
|
|
|
|
#define H_PAGE_UNUSED ((1ULL<<(63-29)) | (1ULL<<(63-30)))
|
|
|
|
#define H_PAGE_SET_UNUSED (H_PAGE_STATE_CHANGE | H_PAGE_UNUSED)
|
|
|
|
#define H_PAGE_SET_LOANED (H_PAGE_SET_UNUSED | (1ULL<<(63-31)))
|
|
|
|
#define H_PAGE_SET_ACTIVE H_PAGE_STATE_CHANGE
|
|
|
|
#define H_AVPN (1ULL<<(63-32)) /* An avpn is provided as a sanity test */
|
|
|
|
#define H_ANDCOND (1ULL<<(63-33))
|
|
|
|
#define H_ICACHE_INVALIDATE (1ULL<<(63-40)) /* icbi, etc. (ignored for IO pages) */
|
|
|
|
#define H_ICACHE_SYNCHRONIZE (1ULL<<(63-41)) /* dcbst, icbi, etc (ignored for IO pages) */
|
|
|
|
#define H_ZERO_PAGE (1ULL<<(63-48)) /* zero the page before mapping (ignored for IO pages) */
|
|
|
|
#define H_COPY_PAGE (1ULL<<(63-49))
|
|
|
|
#define H_N (1ULL<<(63-61))
|
|
|
|
#define H_PP1 (1ULL<<(63-62))
|
|
|
|
#define H_PP2 (1ULL<<(63-63))
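The flag values above are written as (1ULL<<(63-n)), which numbers bits from the most significant end of a 64-bit word: bit 0 is the top bit and bit 63 is the least significant one. The sketch below restates a few of the flags through a helper macro to make that numbering visible; EXAMPLE_IBM_BIT64 and example_flag_is_set are names invented for this illustration only.

```c
#include <stdint.h>

/* Hypothetical helper: bit n counted from the most significant bit. */
#define EXAMPLE_IBM_BIT64(n) (1ULL << (63 - (n)))

/* A few of the flags above, restated with the helper. */
#define H_LARGE_PAGE EXAMPLE_IBM_BIT64(16)
#define H_N          EXAMPLE_IBM_BIT64(61)
#define H_PP1        EXAMPLE_IBM_BIT64(62)
#define H_PP2        EXAMPLE_IBM_BIT64(63)

/* Returns nonzero if every bit of 'flag' is set in 'flags'. */
static int example_flag_is_set(uint64_t flags, uint64_t flag)
{
    return (flags & flag) == flag;
}
```

Under this numbering H_PP2 is the lowest-order bit (value 1), H_PP1 is 2 and H_N is 4, while bit 0 would be 0x8000000000000000.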
|
|
|
|
|
2014-03-07 08:37:40 +04:00
|
|
|
/* Values for 2nd argument to H_SET_MODE */
|
|
|
|
#define H_SET_MODE_RESOURCE_SET_CIABR 1
|
2021-04-12 14:44:32 +03:00
|
|
|
#define H_SET_MODE_RESOURCE_SET_DAWR0 2
|
2014-03-07 08:37:40 +04:00
|
|
|
#define H_SET_MODE_RESOURCE_ADDR_TRANS_MODE 3
|
|
|
|
#define H_SET_MODE_RESOURCE_LE 4
|
|
|
|
|
|
|
|
/* Flags for H_SET_MODE_RESOURCE_LE */
|
2013-08-19 15:04:20 +04:00
|
|
|
#define H_SET_MODE_ENDIAN_BIG 0
|
|
|
|
#define H_SET_MODE_ENDIAN_LITTLE 1
|
|
|
|
|
2011-04-01 08:15:20 +04:00
|
|
|
/* VASI States */
|
|
|
|
#define H_VASI_INVALID 0
|
|
|
|
#define H_VASI_ENABLED 1
|
|
|
|
#define H_VASI_ABORTED 2
|
|
|
|
#define H_VASI_SUSPENDING 3
|
|
|
|
#define H_VASI_SUSPENDED 4
|
|
|
|
#define H_VASI_RESUMED 5
|
|
|
|
#define H_VASI_COMPLETED 6
|
|
|
|
|
|
|
|
/* DABRX flags */
|
|
|
|
#define H_DABRX_HYPERVISOR (1ULL<<(63-61))
|
|
|
|
#define H_DABRX_KERNEL (1ULL<<(63-62))
|
|
|
|
#define H_DABRX_USER (1ULL<<(63-63))
|
|
|
|
|
2018-01-19 07:59:59 +03:00
|
|
|
/* Values for KVM_PPC_GET_CPU_CHAR & H_GET_CPU_CHARACTERISTICS */
|
|
|
|
#define H_CPU_CHAR_SPEC_BAR_ORI31 PPC_BIT(0)
|
|
|
|
#define H_CPU_CHAR_BCCTRL_SERIALISED PPC_BIT(1)
|
|
|
|
#define H_CPU_CHAR_L1D_FLUSH_ORI30 PPC_BIT(2)
|
|
|
|
#define H_CPU_CHAR_L1D_FLUSH_TRIG2 PPC_BIT(3)
|
|
|
|
#define H_CPU_CHAR_L1D_THREAD_PRIV PPC_BIT(4)
|
|
|
|
#define H_CPU_CHAR_HON_BRANCH_HINTS PPC_BIT(5)
|
|
|
|
#define H_CPU_CHAR_THR_RECONF_TRIG PPC_BIT(6)
|
2018-03-01 09:38:02 +03:00
|
|
|
#define H_CPU_CHAR_CACHE_COUNT_DIS PPC_BIT(7)
|
2019-03-01 06:19:11 +03:00
|
|
|
#define H_CPU_CHAR_BCCTR_FLUSH_ASSIST PPC_BIT(9)
|
2021-06-15 07:41:07 +03:00
|
|
|
|
2018-01-19 07:59:59 +03:00
|
|
|
#define H_CPU_BEHAV_FAVOUR_SECURITY PPC_BIT(0)
|
|
|
|
#define H_CPU_BEHAV_L1D_FLUSH_PR PPC_BIT(1)
|
|
|
|
#define H_CPU_BEHAV_BNDS_CHK_SPEC_BAR PPC_BIT(2)
|
2019-03-01 06:19:11 +03:00
|
|
|
#define H_CPU_BEHAV_FLUSH_COUNT_CACHE PPC_BIT(5)
|
2021-06-15 07:41:07 +03:00
|
|
|
#define H_CPU_BEHAV_NO_L1D_FLUSH_ENTRY PPC_BIT(7)
|
|
|
|
#define H_CPU_BEHAV_NO_L1D_FLUSH_UACCESS PPC_BIT(8)
|
2018-01-19 07:59:59 +03:00
|
|
|
|
2011-11-29 12:52:39 +04:00
|
|
|
/* Each control block has to be on a 4K boundary */
|
2011-04-01 08:15:20 +04:00
|
|
|
#define H_CB_ALIGNMENT 4096
|
|
|
|
|
|
|
|
/* pSeries hypervisor opcodes */
|
|
|
|
#define H_REMOVE 0x04
|
|
|
|
#define H_ENTER 0x08
|
|
|
|
#define H_READ 0x0c
|
|
|
|
#define H_CLEAR_MOD 0x10
|
|
|
|
#define H_CLEAR_REF 0x14
|
|
|
|
#define H_PROTECT 0x18
|
|
|
|
#define H_GET_TCE 0x1c
|
|
|
|
#define H_PUT_TCE 0x20
|
|
|
|
#define H_SET_SPRG0 0x24
|
|
|
|
#define H_SET_DABR 0x28
|
|
|
|
#define H_PAGE_INIT 0x2c
|
|
|
|
#define H_SET_ASR 0x30
|
|
|
|
#define H_ASR_ON 0x34
|
|
|
|
#define H_ASR_OFF 0x38
|
|
|
|
#define H_LOGICAL_CI_LOAD 0x3c
|
|
|
|
#define H_LOGICAL_CI_STORE 0x40
|
|
|
|
#define H_LOGICAL_CACHE_LOAD 0x44
|
|
|
|
#define H_LOGICAL_CACHE_STORE 0x48
|
|
|
|
#define H_LOGICAL_ICBI 0x4c
|
|
|
|
#define H_LOGICAL_DCBF 0x50
|
|
|
|
#define H_GET_TERM_CHAR 0x54
|
|
|
|
#define H_PUT_TERM_CHAR 0x58
|
|
|
|
#define H_REAL_TO_LOGICAL 0x5c
|
|
|
|
#define H_HYPERVISOR_DATA 0x60
|
|
|
|
#define H_EOI 0x64
|
|
|
|
#define H_CPPR 0x68
|
|
|
|
#define H_IPI 0x6c
|
|
|
|
#define H_IPOLL 0x70
|
|
|
|
#define H_XIRR 0x74
|
|
|
|
#define H_PERFMON 0x7c
|
|
|
|
#define H_MIGRATE_DMA 0x78
|
|
|
|
#define H_REGISTER_VPA 0xDC
|
|
|
|
#define H_CEDE 0xE0
|
|
|
|
#define H_CONFER 0xE4
|
|
|
|
#define H_PROD 0xE8
|
|
|
|
#define H_GET_PPP 0xEC
|
|
|
|
#define H_SET_PPP 0xF0
|
|
|
|
#define H_PURR 0xF4
|
|
|
|
#define H_PIC 0xF8
|
|
|
|
#define H_REG_CRQ 0xFC
|
|
|
|
#define H_FREE_CRQ 0x100
|
|
|
|
#define H_VIO_SIGNAL 0x104
|
|
|
|
#define H_SEND_CRQ 0x108
|
|
|
|
#define H_COPY_RDMA 0x110
|
|
|
|
#define H_REGISTER_LOGICAL_LAN 0x114
|
|
|
|
#define H_FREE_LOGICAL_LAN 0x118
|
|
|
|
#define H_ADD_LOGICAL_LAN_BUFFER 0x11C
|
|
|
|
#define H_SEND_LOGICAL_LAN 0x120
|
|
|
|
#define H_BULK_REMOVE 0x124
|
|
|
|
#define H_MULTICAST_CTRL 0x130
|
|
|
|
#define H_SET_XDABR 0x134
|
|
|
|
#define H_STUFF_TCE 0x138
|
|
|
|
#define H_PUT_TCE_INDIRECT 0x13C
|
|
|
|
#define H_CHANGE_LOGICAL_LAN_MAC 0x14C
|
|
|
|
#define H_VTERM_PARTNER_INFO 0x150
|
|
|
|
#define H_REGISTER_VTERM 0x154
|
|
|
|
#define H_FREE_VTERM 0x158
|
|
|
|
#define H_RESET_EVENTS 0x15C
|
|
|
|
#define H_ALLOC_RESOURCE 0x160
|
|
|
|
#define H_FREE_RESOURCE 0x164
|
|
|
|
#define H_MODIFY_QP 0x168
|
|
|
|
#define H_QUERY_QP 0x16C
|
|
|
|
#define H_REREGISTER_PMR 0x170
|
|
|
|
#define H_REGISTER_SMR 0x174
|
|
|
|
#define H_QUERY_MR 0x178
|
|
|
|
#define H_QUERY_MW 0x17C
|
|
|
|
#define H_QUERY_HCA 0x180
|
|
|
|
#define H_QUERY_PORT 0x184
|
|
|
|
#define H_MODIFY_PORT 0x188
|
|
|
|
#define H_DEFINE_AQP1 0x18C
|
|
|
|
#define H_GET_TRACE_BUFFER 0x190
|
|
|
|
#define H_DEFINE_AQP0 0x194
|
|
|
|
#define H_RESIZE_MR 0x198
|
|
|
|
#define H_ATTACH_MCQP 0x19C
|
|
|
|
#define H_DETACH_MCQP 0x1A0
|
|
|
|
#define H_CREATE_RPT 0x1A4
|
|
|
|
#define H_REMOVE_RPT 0x1A8
|
|
|
|
#define H_REGISTER_RPAGES 0x1AC
|
|
|
|
#define H_DISABLE_AND_GETC 0x1B0
|
|
|
|
#define H_ERROR_DATA 0x1B4
|
|
|
|
#define H_GET_HCA_INFO 0x1B8
|
|
|
|
#define H_GET_PERF_COUNT 0x1BC
|
|
|
|
#define H_MANAGE_TRACE 0x1C0
|
2018-01-19 08:00:05 +03:00
|
|
|
#define H_GET_CPU_CHARACTERISTICS 0x1C8
|
2011-04-01 08:15:20 +04:00
|
|
|
#define H_FREE_LOGICAL_LAN_BUFFER 0x1D4
|
|
|
|
#define H_QUERY_INT_STATE 0x1E4
|
|
|
|
#define H_POLL_PENDING 0x1D8
|
|
|
|
#define H_ILLAN_ATTRIBUTES 0x244
|
|
|
|
#define H_MODIFY_HEA_QP 0x250
|
|
|
|
#define H_QUERY_HEA_QP 0x254
|
|
|
|
#define H_QUERY_HEA 0x258
|
|
|
|
#define H_QUERY_HEA_PORT 0x25C
|
|
|
|
#define H_MODIFY_HEA_PORT 0x260
|
|
|
|
#define H_REG_BCMC 0x264
|
|
|
|
#define H_DEREG_BCMC 0x268
|
|
|
|
#define H_REGISTER_HEA_RPAGES 0x26C
|
|
|
|
#define H_DISABLE_AND_GET_HEA 0x270
|
|
|
|
#define H_GET_HEA_INFO 0x274
|
|
|
|
#define H_ALLOC_HEA_RESOURCE 0x278
|
|
|
|
#define H_ADD_CONN 0x284
|
|
|
|
#define H_DEL_CONN 0x288
|
|
|
|
#define H_JOIN 0x298
|
|
|
|
#define H_VASI_STATE 0x2A4
|
|
|
|
#define H_ENABLE_CRQ 0x2B0
|
|
|
|
#define H_GET_EM_PARMS 0x2B8
|
|
|
|
#define H_SET_MPP 0x2D0
|
|
|
|
#define H_GET_MPP 0x2D4
|
2018-12-19 19:35:41 +03:00
|
|
|
#define H_HOME_NODE_ASSOCIATIVITY 0x2EC
|
2013-09-26 10:18:46 +04:00
|
|
|
#define H_XIRR_X 0x2FC
|
ppc/spapr: Implement H_RANDOM hypercall in QEMU
The PAPR interface defines a hypercall to pass high-quality
hardware generated random numbers to guests. Recent kernels can
already provide this hypercall to the guest if the right hardware
random number generator is available. But in case the user wants
to use another source like EGD, or QEMU is running with an older
kernel, we should also have this call in QEMU, so that guests that
do not support virtio-rng yet can get good random numbers, too.
This patch now adds a new pseudo-device to QEMU that either
directly provides this hypercall to the guest or is able to
enable the in-kernel hypercall if available. The in-kernel
hypercall can be enabled with the use-kvm property, e.g.:
qemu-system-ppc64 -device spapr-rng,use-kvm=true
For handling the hypercall in QEMU instead, a "RngBackend" is
required since the hypercall should provide "good" random data
instead of pseudo-random (like from a "simple" library function
like rand() or g_random_int()). Since there are multiple RngBackends
available, the user must select an appropriate back-end via the
"rng" property of the device, e.g.:
qemu-system-ppc64 -object rng-random,filename=/dev/hwrng,id=gid0 \
-device spapr-rng,rng=gid0 ...
See http://wiki.qemu-project.org/Features-Done/VirtIORNG for
other example of specifying RngBackends.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-09-17 11:49:41 +03:00
|
|
|
#define H_RANDOM 0x300
|
2013-08-19 15:04:20 +04:00
|
|
|
#define H_SET_MODE 0x31C
|
2017-05-12 08:46:11 +03:00
|
|
|
#define H_RESIZE_HPT_PREPARE 0x36C
|
|
|
|
#define H_RESIZE_HPT_COMMIT 0x370
|
2017-03-20 02:46:45 +03:00
|
|
|
#define H_CLEAN_SLB 0x374
|
|
|
|
#define H_INVALIDATE_PID 0x378
|
|
|
|
#define H_REGISTER_PROC_TBL 0x37C
|
2016-12-05 08:50:21 +03:00
|
|
|
#define H_SIGNAL_SYS_RESET 0x380
|
2018-12-12 01:38:13 +03:00
|
|
|
|
|
|
|
#define H_INT_GET_SOURCE_INFO 0x3A8
|
|
|
|
#define H_INT_SET_SOURCE_CONFIG 0x3AC
|
|
|
|
#define H_INT_GET_SOURCE_CONFIG 0x3B0
|
|
|
|
#define H_INT_GET_QUEUE_INFO 0x3B4
|
|
|
|
#define H_INT_SET_QUEUE_CONFIG 0x3B8
|
|
|
|
#define H_INT_GET_QUEUE_CONFIG 0x3BC
|
|
|
|
#define H_INT_SET_OS_REPORTING_LINE 0x3C0
|
|
|
|
#define H_INT_GET_OS_REPORTING_LINE 0x3C4
|
|
|
|
#define H_INT_ESB 0x3C8
|
|
|
|
#define H_INT_SYNC 0x3CC
|
|
|
|
#define H_INT_RESET 0x3D0
|
2020-02-10 07:56:42 +03:00
|
|
|
#define H_SCM_READ_METADATA 0x3E4
|
|
|
|
#define H_SCM_WRITE_METADATA 0x3E8
|
|
|
|
#define H_SCM_BIND_MEM 0x3EC
|
|
|
|
#define H_SCM_UNBIND_MEM 0x3F0
|
|
|
|
#define H_SCM_UNBIND_ALL 0x3FC
|
2021-04-02 13:21:28 +03:00
|
|
|
#define H_SCM_HEALTH 0x400
|
2021-07-06 14:24:40 +03:00
|
|
|
#define H_RPT_INVALIDATE 0x448
|
2022-02-18 10:34:14 +03:00
|
|
|
#define H_SCM_FLUSH 0x44C
|
2022-06-22 08:10:08 +03:00
|
|
|
#define H_WATCHDOG 0x45C
|
2018-12-12 01:38:13 +03:00
|
|
|
|
2022-06-22 08:10:08 +03:00
|
|
|
#define MAX_HCALL_OPCODE H_WATCHDOG
|
2011-04-01 08:15:20 +04:00
|
|
|
|
2011-04-01 08:15:23 +04:00
|
|
|
/* The hcalls above are standardized in PAPR and implemented by pHyp
|
|
|
|
* as well.
|
|
|
|
*
|
|
|
|
* We also need some hcalls which are specific to qemu / KVM-on-POWER.
|
2017-06-30 13:05:32 +03:00
|
|
|
* We put those into the 0xf000-0xfffc range which is reserved by PAPR
|
|
|
|
* for "platform-specific" hcalls.
|
2011-04-01 08:15:23 +04:00
|
|
|
*/
|
|
|
|
#define KVMPPC_HCALL_BASE 0xf000
|
|
|
|
#define KVMPPC_H_RTAS (KVMPPC_HCALL_BASE + 0x0)
|
2012-06-19 00:21:37 +04:00
|
|
|
#define KVMPPC_H_LOGICAL_MEMOP (KVMPPC_HCALL_BASE + 0x1)
|
2014-05-23 06:26:54 +04:00
|
|
|
/* Client Architecture support */
|
|
|
|
#define KVMPPC_H_CAS (KVMPPC_HCALL_BASE + 0x2)
|
2018-12-21 03:34:48 +03:00
|
|
|
#define KVMPPC_H_UPDATE_DT (KVMPPC_HCALL_BASE + 0x3)
|
spapr: Implement Open Firmware client interface
The PAPR platform describes an OS environment that's presented by
a combination of a hypervisor and firmware. The features it specifies
require collaboration between the firmware and the hypervisor.
Since the beginning, the runtime component of the firmware (RTAS) has
been implemented as a 20 byte shim which simply forwards each call to
a hypercall implemented in qemu. The boot time firmware component is
SLOF - but a build that's specific to qemu, and has always needed to be
updated in sync with it. Even though we've managed to limit the amount
of runtime communication we need between qemu and SLOF, there's some,
and it has become increasingly awkward to handle as we've implemented
new features.
This implements a boot time OF client interface (CI) which is
enabled by a new "x-vof" pseries machine option (stands for "Virtual Open
Firmware"). When enabled, QEMU implements the custom H_OF_CLIENT hcall
which implements Open Firmware Client Interface (OF CI). This allows
using a smaller stateless firmware which does not have to manage
the device tree.
The new "vof.bin" firmware image is included with source code under
pc-bios/. It also includes RTAS blob.
This implements a handful of CI methods just to get -kernel/-initrd
working. In particular, this implements the device tree fetching and
simple memory allocator - "claim" (an OF CI memory allocator) and updates
"/memory@0/available" to report available memory to the client.
This implements changing some device tree properties which we know how
to deal with, the rest is ignored. To allow changes, this skips
fdt_pack() when x-vof=on as not packing the blob leaves some room for
appending.
In absence of SLOF, this assigns phandles to device tree nodes to make
device tree traversing work.
When x-vof=on, this adds "/chosen" every time QEMU (re)builds a tree.
This adds basic support for instances, which are managed by a hash map
ihandle -> [phandle].
Before the guest starts, the memory in use is:
0..e60 - the initial firmware
8000..10000 - stack
400000.. - kernel
3ea0000.. - initramdisk
This OF CI does not implement "interpret".
Unlike SLOF, this does not format uninitialized nvram. Instead, this
includes a disk image with pre-formatted nvram.
With this basic support, this can only boot directly into a kernel.
However, this is just enough for the petitboot kernel and initramdisk to
boot from any possible source. Note that this requires a reasonably recent
guest kernel with:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=df5be5be8735
The immediate benefit is a much faster boot time, which is especially
crucial in fully emulated early CPU bring-up environments. This may also
come in handy when/if GRUB-in-the-userspace sees the light of day.
This separates VOF and sPAPR in the hope that VOF bits may be reused by
other POWERPC boards which do not support pSeries.
This assumes potential support for booting from QEMU backends
such as blockdev or netdev without devices/drivers used.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20210625055155.2252896-1-aik@ozlabs.ru>
Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu>
[dwg: Adjusted some includes which broke compile in some more obscure
compilation setups]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-06-25 08:51:55 +03:00
|
|
|
/* 0x4 was used for KVMPPC_H_UPDATE_PHANDLE in SLOF */
|
|
|
|
#define KVMPPC_H_VOF_CLIENT (KVMPPC_HCALL_BASE + 0x5)
|
2022-02-18 10:34:14 +03:00
|
|
|
|
|
|
|
/* Platform-specific hcalls used for nested HV KVM */
|
|
|
|
#define KVMPPC_H_SET_PARTITION_TABLE (KVMPPC_HCALL_BASE + 0x800)
|
|
|
|
#define KVMPPC_H_ENTER_NESTED (KVMPPC_HCALL_BASE + 0x804)
|
|
|
|
#define KVMPPC_H_TLB_INVALIDATE (KVMPPC_HCALL_BASE + 0x808)
|
|
|
|
#define KVMPPC_H_COPY_TOFROM_GUEST (KVMPPC_HCALL_BASE + 0x80C)
|
|
|
|
|
|
|
|
#define KVMPPC_HCALL_MAX KVMPPC_H_COPY_TOFROM_GUEST
|
2011-04-01 08:15:23 +04:00
|
|
|
|
spapr: initial implementation for H_TPM_COMM/spapr-tpm-proxy
This implements the H_TPM_COMM hypercall, which is used by an
Ultravisor to pass TPM commands directly to the host's TPM device, or
a TPM Resource Manager associated with the device.
This also introduces a new virtual device, spapr-tpm-proxy, which
is used to configure the host TPM path to be used to service
requests sent by H_TPM_COMM hcalls, for example:
-device spapr-tpm-proxy,id=tpmp0,host-path=/dev/tpmrm0
By default, no spapr-tpm-proxy will be created, and hcalls will return
H_FUNCTION.
The full specification for this hypercall can be found in
docs/specs/ppc-spapr-uv-hcalls.txt
Since SVM-related hcalls like H_TPM_COMM use a reserved range of
0xEF00-0xEF80, we introduce a separate hcall table here to handle
them.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Message-Id: <20190717205842.17827-3-mdroth@linux.vnet.ibm.com>
[dwg: Corrected #include for upstream change]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-17 23:58:42 +03:00
|
|
|
/*
|
|
|
|
* The hcall range 0xEF00 to 0xEF80 is reserved for use in facilitating
|
|
|
|
* Secure VM mode via an Ultravisor / Protected Execution Facility
|
|
|
|
*/
|
|
|
|
#define SVM_HCALL_BASE 0xEF00
|
|
|
|
#define SVM_H_TPM_COMM 0xEF10
|
|
|
|
#define SVM_HCALL_MAX SVM_H_TPM_COMM
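The three opcode ranges defined above (standard PAPR, qemu/KVM platform-specific, and Secure VM) can be illustrated with a small classifier. This is only a sketch, not QEMU's dispatch code: `enum hcall_range` and `classify_hcall()` are hypothetical names introduced here, and the constants are copied from the defines above.

```c
#include <stdint.h>

/* Values copied from the defines above. */
#define MAX_HCALL_OPCODE   0x45C                        /* H_WATCHDOG */
#define KVMPPC_HCALL_BASE  0xf000
#define KVMPPC_HCALL_MAX   (KVMPPC_HCALL_BASE + 0x80C)  /* KVMPPC_H_COPY_TOFROM_GUEST */
#define SVM_HCALL_BASE     0xEF00
#define SVM_HCALL_MAX      0xEF10                       /* SVM_H_TPM_COMM */

/* Hypothetical classification, for illustration only. */
enum hcall_range {
    HCALL_RANGE_PAPR,     /* standardized in PAPR */
    HCALL_RANGE_KVMPPC,   /* specific to qemu / KVM-on-POWER */
    HCALL_RANGE_SVM,      /* reserved for Secure VM mode */
    HCALL_RANGE_UNKNOWN,
};

static enum hcall_range classify_hcall(uint64_t opcode)
{
    /* PAPR range: a multiple of 4 no greater than MAX_HCALL_OPCODE. */
    if (opcode <= MAX_HCALL_OPCODE && (opcode & 0x3) == 0) {
        return HCALL_RANGE_PAPR;
    }
    if (opcode >= SVM_HCALL_BASE && opcode <= SVM_HCALL_MAX) {
        return HCALL_RANGE_SVM;
    }
    if (opcode >= KVMPPC_HCALL_BASE && opcode <= KVMPPC_HCALL_MAX) {
        return HCALL_RANGE_KVMPPC;
    }
    return HCALL_RANGE_UNKNOWN;
}
```

Keeping each range bounded by its own `*_MAX` constant means a separate lookup table can be sized per range instead of one sparse table covering everything up to 0xfffc.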
|
|
|
|
|
2022-02-18 10:34:14 +03:00
|
|
|
/*
|
|
|
|
* Register state for entering a nested guest with H_ENTER_NESTED.
|
|
|
|
* New members must be added at the end.
|
|
|
|
*/
|
|
|
|
struct kvmppc_hv_guest_state {
|
|
|
|
uint64_t version; /* version of this structure layout, must be first */
|
|
|
|
uint32_t lpid;
|
|
|
|
uint32_t vcpu_token;
|
|
|
|
/* These registers are hypervisor privileged (at least for writing) */
|
|
|
|
uint64_t lpcr;
|
|
|
|
uint64_t pcr;
|
|
|
|
uint64_t amor;
|
|
|
|
uint64_t dpdes;
|
|
|
|
uint64_t hfscr;
|
|
|
|
int64_t tb_offset;
|
|
|
|
uint64_t dawr0;
|
|
|
|
uint64_t dawrx0;
|
|
|
|
uint64_t ciabr;
|
|
|
|
uint64_t hdec_expiry;
|
|
|
|
uint64_t purr;
|
|
|
|
uint64_t spurr;
|
|
|
|
uint64_t ic;
|
|
|
|
uint64_t vtb;
|
|
|
|
uint64_t hdar;
|
|
|
|
uint64_t hdsisr;
|
|
|
|
uint64_t heir;
|
|
|
|
uint64_t asdr;
|
|
|
|
/* These are OS privileged but need to be set late in guest entry */
|
|
|
|
uint64_t srr0;
|
|
|
|
uint64_t srr1;
|
|
|
|
uint64_t sprg[4];
|
|
|
|
uint64_t pidr;
|
|
|
|
uint64_t cfar;
|
|
|
|
uint64_t ppr;
|
|
|
|
/* Version 1 ends here */
|
|
|
|
uint64_t dawr1;
|
|
|
|
uint64_t dawrx1;
|
|
|
|
/* Version 2 ends here */
|
|
|
|
};
|
|
|
|
|
|
|
|
/* Latest version of hv_guest_state structure */
|
|
|
|
#define HV_GUEST_STATE_VERSION 2
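Because new members are only ever appended, the number of valid bytes for an older layout version can be derived with `offsetof`. The sketch below assumes a reduced, hypothetical stand-in struct (`demo_guest_state`) rather than the full `kvmppc_hv_guest_state`, and `demo_state_len()` is an illustrative helper, not a QEMU function.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_STATE_VERSION 2

/* Reduced, hypothetical stand-in for kvmppc_hv_guest_state. */
struct demo_guest_state {
    uint64_t version;     /* version of this layout, must be first */
    uint32_t lpid;
    uint32_t vcpu_token;
    uint64_t lpcr;
    /* Version 1 ends here */
    uint64_t dawr1;
    uint64_t dawrx1;
    /* Version 2 ends here */
};

/*
 * Return how many bytes of the structure are valid for the given
 * layout version, or 0 if the version is not supported.
 */
static size_t demo_state_len(uint64_t version)
{
    switch (version) {
    case 1:
        /* Version 1 stops just before the first version 2 member. */
        return offsetof(struct demo_guest_state, dawr1);
    case DEMO_STATE_VERSION:
        return sizeof(struct demo_guest_state);
    default:
        return 0;
    }
}
```

A consumer would read `version` first, reject anything newer than it understands, and then copy only the number of bytes that version defines.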
|
|
|
|
|
|
|
|
/* Linux 64-bit powerpc pt_regs struct, used by nested HV */
|
|
|
|
struct kvmppc_pt_regs {
|
|
|
|
uint64_t gpr[32];
|
|
|
|
uint64_t nip;
|
|
|
|
uint64_t msr;
|
|
|
|
uint64_t orig_gpr3; /* Used for restarting system calls */
|
|
|
|
uint64_t ctr;
|
|
|
|
uint64_t link;
|
|
|
|
uint64_t xer;
|
|
|
|
uint64_t ccr;
|
|
|
|
uint64_t softe; /* Soft enabled/disabled */
|
|
|
|
uint64_t trap; /* Reason for being here */
|
|
|
|
uint64_t dar; /* Fault registers */
|
|
|
|
uint64_t dsisr; /* on 4xx/Book-E used for ESR */
|
|
|
|
uint64_t result; /* Result of a system call */
|
|
|
|
};
|
2019-07-17 23:58:42 +03:00
|
|
|
|
2019-03-06 07:35:37 +03:00
|
|
|
typedef struct SpaprDeviceTreeUpdateHeader {
|
2014-05-23 06:26:54 +04:00
|
|
|
uint32_t version_id;
|
2019-03-06 07:35:37 +03:00
|
|
|
} SpaprDeviceTreeUpdateHeader;
|
2014-05-23 06:26:54 +04:00
|
|
|
|
2011-04-01 08:15:20 +04:00
|
|
|
#define hcall_dprintf(fmt, ...) \
|
2015-09-01 04:29:02 +03:00
|
|
|
do { \
|
|
|
|
qemu_log_mask(LOG_GUEST_ERROR, "%s: " fmt, __func__, ## __VA_ARGS__); \
|
|
|
|
} while (0)
|
2011-04-01 08:15:20 +04:00
|
|
|
|
2019-03-06 07:35:37 +03:00
|
|
|
typedef target_ulong (*spapr_hcall_fn)(PowerPCCPU *cpu, SpaprMachineState *sm,
|
2011-04-01 08:15:20 +04:00
|
|
|
target_ulong opcode,
|
|
|
|
target_ulong *args);
|
|
|
|
|
|
|
|
void spapr_register_hypercall(target_ulong opcode, spapr_hcall_fn fn);
|
2012-05-03 08:13:14 +04:00
|
|
|
target_ulong spapr_hypercall(PowerPCCPU *cpu, target_ulong opcode,
|
2011-04-01 08:15:20 +04:00
|
|
|
target_ulong *args);
|
2022-02-18 10:34:14 +03:00
|
|
|
|
|
|
|
void spapr_exit_nested(PowerPCCPU *cpu, int excp);
|
|
|
|
|
2021-05-06 19:39:38 +03:00
|
|
|
target_ulong softmmu_resize_hpt_prepare(PowerPCCPU *cpu, SpaprMachineState *spapr,
|
|
|
|
target_ulong shift);
|
|
|
|
target_ulong softmmu_resize_hpt_commit(PowerPCCPU *cpu, SpaprMachineState *spapr,
|
|
|
|
target_ulong flags, target_ulong shift);
|
|
|
|
bool is_ram_address(SpaprMachineState *spapr, hwaddr addr);
|
|
|
|
void push_sregs_to_kvm_pr(SpaprMachineState *spapr);
|
2011-04-01 08:15:20 +04:00
|
|
|
|
2019-07-18 06:42:11 +03:00
|
|
|
/* Virtual Processor Area structure constants */
|
|
|
|
#define VPA_MIN_SIZE 640
|
|
|
|
#define VPA_SIZE_OFFSET 0x4
|
|
|
|
#define VPA_SHARED_PROC_OFFSET 0x9
|
|
|
|
#define VPA_SHARED_PROC_VAL 0x2
|
|
|
|
#define VPA_DISPATCH_COUNTER 0x100
|
|
|
|
|
2015-02-20 07:58:52 +03:00
|
|
|
/* ibm,set-eeh-option */
|
|
|
|
#define RTAS_EEH_DISABLE 0
|
|
|
|
#define RTAS_EEH_ENABLE 1
|
|
|
|
#define RTAS_EEH_THAW_IO 2
|
|
|
|
#define RTAS_EEH_THAW_DMA 3
|
|
|
|
|
|
|
|
/* ibm,get-config-addr-info2 */
|
|
|
|
#define RTAS_GET_PE_ADDR 0
|
|
|
|
#define RTAS_GET_PE_MODE 1
|
|
|
|
#define RTAS_PE_MODE_NONE 0
|
|
|
|
#define RTAS_PE_MODE_NOT_SHARED 1
|
|
|
|
#define RTAS_PE_MODE_SHARED 2
|
|
|
|
|
|
|
|
/* ibm,read-slot-reset-state2 */
|
|
|
|
#define RTAS_EEH_PE_STATE_NORMAL 0
|
|
|
|
#define RTAS_EEH_PE_STATE_RESET 1
|
|
|
|
#define RTAS_EEH_PE_STATE_STOPPED_IO_DMA 2
|
|
|
|
#define RTAS_EEH_PE_STATE_STOPPED_DMA 4
|
|
|
|
#define RTAS_EEH_PE_STATE_UNAVAIL 5
|
|
|
|
#define RTAS_EEH_NOT_SUPPORT 0
|
|
|
|
#define RTAS_EEH_SUPPORT 1
|
|
|
|
#define RTAS_EEH_PE_UNAVAIL_INFO 1000
|
|
|
|
#define RTAS_EEH_PE_RECOVER_INFO 0
|
|
|
|
|
|
|
|
/* ibm,set-slot-reset */
|
|
|
|
#define RTAS_SLOT_RESET_DEACTIVATE 0
|
|
|
|
#define RTAS_SLOT_RESET_HOT 1
|
|
|
|
#define RTAS_SLOT_RESET_FUNDAMENTAL 3
|
|
|
|
|
|
|
|
/* ibm,slot-error-detail */
|
|
|
|
#define RTAS_SLOT_TEMP_ERR_LOG 1
|
|
|
|
#define RTAS_SLOT_PERM_ERR_LOG 2
|
|
|
|
|
2013-11-19 08:28:54 +04:00
|
|
|
/* RTAS return codes */
|
2016-01-19 07:57:42 +03:00
|
|
|
#define RTAS_OUT_SUCCESS 0
|
|
|
|
#define RTAS_OUT_NO_ERRORS_FOUND 1
|
|
|
|
#define RTAS_OUT_HW_ERROR -1
|
|
|
|
#define RTAS_OUT_BUSY -2
|
|
|
|
#define RTAS_OUT_PARAM_ERROR -3
|
|
|
|
#define RTAS_OUT_NOT_SUPPORTED -3
|
|
|
|
#define RTAS_OUT_NO_SUCH_INDICATOR -3
|
|
|
|
#define RTAS_OUT_NOT_AUTHORIZED -9002
|
|
|
|
#define RTAS_OUT_SYSPARM_PARAM_ERROR -9999
|
2013-11-19 08:28:54 +04:00
|
|
|
|
2016-07-04 06:33:07 +03:00
|
|
|
/* DDW pagesize mask values from ibm,query-pe-dma-window */
|
|
|
|
#define RTAS_DDW_PGSIZE_4K 0x01
|
|
|
|
#define RTAS_DDW_PGSIZE_64K 0x02
|
|
|
|
#define RTAS_DDW_PGSIZE_16M 0x04
|
|
|
|
#define RTAS_DDW_PGSIZE_32M 0x08
|
|
|
|
#define RTAS_DDW_PGSIZE_64M 0x10
|
|
|
|
#define RTAS_DDW_PGSIZE_128M 0x20
|
|
|
|
#define RTAS_DDW_PGSIZE_256M 0x40
|
|
|
|
#define RTAS_DDW_PGSIZE_16G 0x80
|
2022-03-21 10:19:45 +03:00
|
|
|
#define RTAS_DDW_PGSIZE_2M 0x100
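Each DDW mask bit above corresponds to one page size. As a rough sketch, a set of supported page shifts could be turned into a query mask as follows; `demo_ddw_pgsize_flag()` and `demo_ddw_query_mask()` are hypothetical helpers written for this illustration, and the input bitmap convention (bit N set means 2^N-byte pages are supported) is an assumption.

```c
#include <stdint.h>

/* Mask bits copied from the defines above. */
#define RTAS_DDW_PGSIZE_4K    0x01
#define RTAS_DDW_PGSIZE_64K   0x02
#define RTAS_DDW_PGSIZE_16M   0x04
#define RTAS_DDW_PGSIZE_32M   0x08
#define RTAS_DDW_PGSIZE_64M   0x10
#define RTAS_DDW_PGSIZE_128M  0x20
#define RTAS_DDW_PGSIZE_256M  0x40
#define RTAS_DDW_PGSIZE_16G   0x80
#define RTAS_DDW_PGSIZE_2M    0x100

/* Map a page shift (log2 of the page size) to its mask bit, or 0. */
static uint32_t demo_ddw_pgsize_flag(unsigned page_shift)
{
    switch (page_shift) {
    case 12: return RTAS_DDW_PGSIZE_4K;
    case 16: return RTAS_DDW_PGSIZE_64K;
    case 21: return RTAS_DDW_PGSIZE_2M;
    case 24: return RTAS_DDW_PGSIZE_16M;
    case 25: return RTAS_DDW_PGSIZE_32M;
    case 26: return RTAS_DDW_PGSIZE_64M;
    case 27: return RTAS_DDW_PGSIZE_128M;
    case 28: return RTAS_DDW_PGSIZE_256M;
    case 34: return RTAS_DDW_PGSIZE_16G;
    default: return 0;
    }
}

/* Assumed input: bit N set means pages of 2^N bytes are supported. */
static uint32_t demo_ddw_query_mask(uint64_t page_shift_bitmap)
{
    uint32_t mask = 0;
    unsigned shift;

    for (shift = 0; shift < 64; shift++) {
        if (page_shift_bitmap & (1ULL << shift)) {
            mask |= demo_ddw_pgsize_flag(shift);
        }
    }
    return mask;
}
```

Note that the 2M bit (0x100) was added later than the others, so the mask bits are not ordered by page size.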
|
2016-07-04 06:33:07 +03:00
|
|
|
|
2014-06-23 17:26:32 +04:00
|
|
|
/* RTAS tokens */
|
|
|
|
#define RTAS_TOKEN_BASE 0x2000
|
|
|
|
|
|
|
|
#define RTAS_DISPLAY_CHARACTER (RTAS_TOKEN_BASE + 0x00)
|
|
|
|
#define RTAS_GET_TIME_OF_DAY (RTAS_TOKEN_BASE + 0x01)
|
|
|
|
#define RTAS_SET_TIME_OF_DAY (RTAS_TOKEN_BASE + 0x02)
|
|
|
|
#define RTAS_POWER_OFF (RTAS_TOKEN_BASE + 0x03)
|
|
|
|
#define RTAS_SYSTEM_REBOOT (RTAS_TOKEN_BASE + 0x04)
|
|
|
|
#define RTAS_QUERY_CPU_STOPPED_STATE (RTAS_TOKEN_BASE + 0x05)
|
|
|
|
#define RTAS_START_CPU (RTAS_TOKEN_BASE + 0x06)
|
|
|
|
#define RTAS_STOP_SELF (RTAS_TOKEN_BASE + 0x07)
|
|
|
|
#define RTAS_IBM_GET_SYSTEM_PARAMETER (RTAS_TOKEN_BASE + 0x08)
|
|
|
|
#define RTAS_IBM_SET_SYSTEM_PARAMETER (RTAS_TOKEN_BASE + 0x09)
|
|
|
|
#define RTAS_IBM_SET_XIVE (RTAS_TOKEN_BASE + 0x0A)
|
|
|
|
#define RTAS_IBM_GET_XIVE (RTAS_TOKEN_BASE + 0x0B)
|
|
|
|
#define RTAS_IBM_INT_OFF (RTAS_TOKEN_BASE + 0x0C)
|
|
|
|
#define RTAS_IBM_INT_ON (RTAS_TOKEN_BASE + 0x0D)
|
|
|
|
#define RTAS_CHECK_EXCEPTION (RTAS_TOKEN_BASE + 0x0E)
|
|
|
|
#define RTAS_EVENT_SCAN (RTAS_TOKEN_BASE + 0x0F)
|
|
|
|
#define RTAS_IBM_SET_TCE_BYPASS (RTAS_TOKEN_BASE + 0x10)
|
|
|
|
#define RTAS_QUIESCE (RTAS_TOKEN_BASE + 0x11)
|
|
|
|
#define RTAS_NVRAM_FETCH (RTAS_TOKEN_BASE + 0x12)
|
|
|
|
#define RTAS_NVRAM_STORE (RTAS_TOKEN_BASE + 0x13)
|
|
|
|
#define RTAS_READ_PCI_CONFIG (RTAS_TOKEN_BASE + 0x14)
|
|
|
|
#define RTAS_WRITE_PCI_CONFIG (RTAS_TOKEN_BASE + 0x15)
|
|
|
|
#define RTAS_IBM_READ_PCI_CONFIG (RTAS_TOKEN_BASE + 0x16)
|
|
|
|
#define RTAS_IBM_WRITE_PCI_CONFIG (RTAS_TOKEN_BASE + 0x17)
|
|
|
|
#define RTAS_IBM_QUERY_INTERRUPT_SOURCE_NUMBER (RTAS_TOKEN_BASE + 0x18)
|
|
|
|
#define RTAS_IBM_CHANGE_MSI (RTAS_TOKEN_BASE + 0x19)
|
|
|
|
#define RTAS_SET_INDICATOR (RTAS_TOKEN_BASE + 0x1A)
|
|
|
|
#define RTAS_SET_POWER_LEVEL (RTAS_TOKEN_BASE + 0x1B)
|
|
|
|
#define RTAS_GET_POWER_LEVEL (RTAS_TOKEN_BASE + 0x1C)
|
|
|
|
#define RTAS_GET_SENSOR_STATE (RTAS_TOKEN_BASE + 0x1D)
|
|
|
|
#define RTAS_IBM_CONFIGURE_CONNECTOR (RTAS_TOKEN_BASE + 0x1E)
|
|
|
|
#define RTAS_IBM_OS_TERM (RTAS_TOKEN_BASE + 0x1F)
|
2015-02-20 07:58:52 +03:00
|
|
|
#define RTAS_IBM_SET_EEH_OPTION (RTAS_TOKEN_BASE + 0x20)
|
|
|
|
#define RTAS_IBM_GET_CONFIG_ADDR_INFO2 (RTAS_TOKEN_BASE + 0x21)
|
|
|
|
#define RTAS_IBM_READ_SLOT_RESET_STATE2 (RTAS_TOKEN_BASE + 0x22)
|
|
|
|
#define RTAS_IBM_SET_SLOT_RESET (RTAS_TOKEN_BASE + 0x23)
|
|
|
|
#define RTAS_IBM_CONFIGURE_PE (RTAS_TOKEN_BASE + 0x24)
|
|
|
|
#define RTAS_IBM_SLOT_ERROR_DETAIL (RTAS_TOKEN_BASE + 0x25)
|
2016-07-04 06:33:07 +03:00
|
|
|
#define RTAS_IBM_QUERY_PE_DMA_WINDOW (RTAS_TOKEN_BASE + 0x26)
|
|
|
|
#define RTAS_IBM_CREATE_PE_DMA_WINDOW (RTAS_TOKEN_BASE + 0x27)
|
|
|
|
#define RTAS_IBM_REMOVE_PE_DMA_WINDOW (RTAS_TOKEN_BASE + 0x28)
|
|
|
|
#define RTAS_IBM_RESET_PE_DMA_WINDOW (RTAS_TOKEN_BASE + 0x29)
|
2019-07-22 09:17:52 +03:00
|
|
|
#define RTAS_IBM_SUSPEND_ME (RTAS_TOKEN_BASE + 0x2A)
|
ppc: spapr: Handle "ibm,nmi-register" and "ibm,nmi-interlock" RTAS calls
This patch adds support in QEMU to handle "ibm,nmi-register"
and "ibm,nmi-interlock" RTAS calls.
The machine check notification address is saved when the
OS issues "ibm,nmi-register" RTAS call.
This patch also handles the case when multiple processors
experience machine check at or about the same time by
handling "ibm,nmi-interlock" call. In such cases, as per
PAPR, subsequent processors serialize waiting for the first
processor to issue the "ibm,nmi-interlock" call. The second
processor that also received a machine check error waits
till the first processor is done reading the error log.
The first processor issues "ibm,nmi-interlock" call
when the error log is consumed.
Signed-off-by: Aravinda Prasad <arawinda.p@gmail.com>
[Register fwnmi RTAS calls in core_rtas_register_types()
where other RTAS calls are registered]
Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
Message-Id: <20200130184423.20519-6-ganeshgr@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-01-30 21:44:21 +03:00
|
|
|
#define RTAS_IBM_NMI_REGISTER (RTAS_TOKEN_BASE + 0x2B)
|
|
|
|
#define RTAS_IBM_NMI_INTERLOCK (RTAS_TOKEN_BASE + 0x2C)
|
2015-02-20 07:58:52 +03:00
|
|
|
|
2020-01-30 21:44:21 +03:00
|
|
|
#define RTAS_TOKEN_MAX (RTAS_TOKEN_BASE + 0x2D)
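Since the tokens are consecutive and `RTAS_TOKEN_MAX` is one past the last assigned token, a handler table can be indexed by `token - RTAS_TOKEN_BASE`. The following is a minimal sketch of that idea with a simplified, hypothetical handler signature; `demo_rtas_fn`, `demo_rtas_register()`, `demo_rtas_lookup()` and `demo_power_off()` are illustrative names, not the real `spapr_rtas_register()` implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Values copied from the defines above. */
#define RTAS_TOKEN_BASE  0x2000
#define RTAS_TOKEN_MAX   (RTAS_TOKEN_BASE + 0x2D)

/* Hypothetical, simplified handler type for this sketch. */
typedef int (*demo_rtas_fn)(uint32_t nargs);

/* One slot per token in [RTAS_TOKEN_BASE, RTAS_TOKEN_MAX). */
static demo_rtas_fn demo_rtas_table[RTAS_TOKEN_MAX - RTAS_TOKEN_BASE];

/* Returns 0 on success, -1 for a bad token, NULL handler or duplicate. */
static int demo_rtas_register(uint32_t token, demo_rtas_fn fn)
{
    if (token < RTAS_TOKEN_BASE || token >= RTAS_TOKEN_MAX || fn == NULL) {
        return -1;
    }
    token -= RTAS_TOKEN_BASE;
    if (demo_rtas_table[token] != NULL) {
        return -1;  /* slot already taken */
    }
    demo_rtas_table[token] = fn;
    return 0;
}

/* Returns the registered handler, or NULL if there is none. */
static demo_rtas_fn demo_rtas_lookup(uint32_t token)
{
    if (token < RTAS_TOKEN_BASE || token >= RTAS_TOKEN_MAX) {
        return NULL;
    }
    return demo_rtas_table[token - RTAS_TOKEN_BASE];
}

/* Example handler used only to exercise the table. */
static int demo_power_off(uint32_t nargs)
{
    (void)nargs;
    return 0;
}
```

For example, registering `demo_power_off` under token 0x2003 (the value of `RTAS_POWER_OFF`) stores it in slot 3 of the table.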
|
2014-06-23 17:26:32 +04:00
|
|
|
|
2014-06-25 07:54:30 +04:00
|
|
|
/* RTAS ibm,get-system-parameter token values */
|
2014-06-25 07:54:32 +04:00
|
|
|
#define RTAS_SYSPARM_SPLPAR_CHARACTERISTICS 20
|
2014-06-25 07:54:30 +04:00
|
|
|
#define RTAS_SYSPARM_DIAGNOSTICS_RUN_MODE 42
|
2014-06-25 07:54:31 +04:00
|
|
|
#define RTAS_SYSPARM_UUID 48
|
2014-06-25 07:54:30 +04:00
|
|
|
|
2015-05-07 08:33:45 +03:00
|
|
|
/* RTAS indicator/sensor types
|
|
|
|
*
|
|
|
|
* as defined by PAPR+ 2.7 7.3.5.4, Table 41
|
|
|
|
*
|
|
|
|
* NOTE: currently only DR-related sensors are implemented here
|
|
|
|
*/
|
|
|
|
#define RTAS_SENSOR_TYPE_ISOLATION_STATE 9001
|
|
|
|
#define RTAS_SENSOR_TYPE_DR 9002
|
|
|
|
#define RTAS_SENSOR_TYPE_ALLOCATION_STATE 9003
|
|
|
|
#define RTAS_SENSOR_TYPE_ENTITY_SENSE RTAS_SENSOR_TYPE_ALLOCATION_STATE
|
|
|
|
|
2014-06-25 07:54:30 +04:00
|
|
|
/* Possible values for the platform-processor-diagnostics-run-mode parameter
|
|
|
|
* of the RTAS ibm,get-system-parameter call.
|
|
|
|
*/
|
|
|
|
#define DIAGNOSTICS_RUN_MODE_DISABLED 0
|
|
|
|
#define DIAGNOSTICS_RUN_MODE_STAGGERED 1
|
|
|
|
#define DIAGNOSTICS_RUN_MODE_IMMEDIATE 2
|
|
|
|
#define DIAGNOSTICS_RUN_MODE_PERIODIC 3
|
|
|
|
|
2013-09-27 12:10:18 +04:00
|
|
|
static inline uint64_t ppc64_phys_to_real(uint64_t addr)
|
|
|
|
{
|
|
|
|
return addr & ~0xF000000000000000ULL;
|
|
|
|
}
|
|
|
|
|
2011-04-01 08:15:23 +04:00
|
|
|
static inline uint32_t rtas_ld(target_ulong phys, int n)
|
|
|
|
{
|
2013-11-15 17:46:38 +04:00
|
|
|
return ldl_be_phys(&address_space_memory, ppc64_phys_to_real(phys + 4*n));
|
2011-04-01 08:15:23 +04:00
|
|
|
}
|
|
|
|
|
2015-09-01 04:05:12 +03:00
|
|
|
static inline uint64_t rtas_ldq(target_ulong phys, int n)
|
|
|
|
{
|
|
|
|
return (uint64_t)rtas_ld(phys, n) << 32 | rtas_ld(phys, n + 1);
|
|
|
|
}
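The helpers above treat RTAS argument buffers as arrays of big-endian 32-bit cells, with a 64-bit value occupying two consecutive cells (high word first). The sketch below reproduces that arithmetic against a plain byte buffer instead of guest memory; the `demo_*` functions are hypothetical stand-ins and do not call `ldl_be_phys()`.

```c
#include <stdint.h>

/* Same masking as ppc64_phys_to_real(): clear the top four address bits. */
static uint64_t demo_phys_to_real(uint64_t addr)
{
    return addr & ~0xF000000000000000ULL;
}

/* Read the n-th big-endian 32-bit cell from a byte buffer. */
static uint32_t demo_ld(const uint8_t *buf, int n)
{
    const uint8_t *p = buf + 4 * n;

    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Read a 64-bit value stored as cells n (high word) and n + 1 (low word). */
static uint64_t demo_ldq(const uint8_t *buf, int n)
{
    return ((uint64_t)demo_ld(buf, n) << 32) | demo_ld(buf, n + 1);
}
```

The cast to `uint64_t` before the shift matters: shifting a 32-bit value left by 32 would otherwise be undefined.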
|
|
|
|
|
2011-04-01 08:15:23 +04:00
|
|
|
static inline void rtas_st(target_ulong phys, int n, uint32_t val)
|
|
|
|
{
|
2013-12-17 09:07:29 +04:00
|
|
|
stl_be_phys(&address_space_memory, ppc64_phys_to_real(phys + 4*n), val);
|
2011-04-01 08:15:23 +04:00
|
|
|
}
|
|
|
|
|
2019-03-06 07:35:37 +03:00
|
|
|
typedef void (*spapr_rtas_fn)(PowerPCCPU *cpu, SpaprMachineState *sm,
|
2013-06-20 00:40:30 +04:00
|
|
|
uint32_t token,
|
2011-04-01 08:15:23 +04:00
|
|
|
uint32_t nargs, target_ulong args,
|
|
|
|
uint32_t nret, target_ulong rets);
|
2014-06-23 17:26:32 +04:00
|
|
|
void spapr_rtas_register(int token, const char *name, spapr_rtas_fn fn);
|
2019-03-06 07:35:37 +03:00
|
|
|
target_ulong spapr_rtas_call(PowerPCCPU *cpu, SpaprMachineState *sm,
|
2011-04-01 08:15:23 +04:00
|
|
|
uint32_t token, uint32_t nargs, target_ulong args,
|
|
|
|
uint32_t nret, target_ulong rets);
|
2016-10-20 07:55:36 +03:00
|
|
|
void spapr_dt_rtas_tokens(void *fdt, int rtas);
|
2019-03-06 07:35:37 +03:00
|
|
|
void spapr_load_rtas(SpaprMachineState *spapr, void *fdt, hwaddr addr);
|
2011-04-01 08:15:23 +04:00
|
|
|
|
2012-06-27 08:50:44 +04:00
|
|
|
#define SPAPR_TCE_PAGE_SHIFT 12
|
|
|
|
#define SPAPR_TCE_PAGE_SIZE (1ULL << SPAPR_TCE_PAGE_SHIFT)
|
|
|
|
#define SPAPR_TCE_PAGE_MASK (SPAPR_TCE_PAGE_SIZE - 1)
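With a 4 KiB TCE page (shift of 12), an I/O bus address splits into a page index and an offset within the page. A minimal sketch of that split, using the three defines above; `demo_tce_index()` and `demo_tce_offset()` are hypothetical helper names.

```c
#include <stdint.h>

/* Copied from the defines above. */
#define SPAPR_TCE_PAGE_SHIFT  12
#define SPAPR_TCE_PAGE_SIZE   (1ULL << SPAPR_TCE_PAGE_SHIFT)
#define SPAPR_TCE_PAGE_MASK   (SPAPR_TCE_PAGE_SIZE - 1)

/* Index of the TCE page that contains the given I/O bus address. */
static uint64_t demo_tce_index(uint64_t ioba)
{
    return ioba >> SPAPR_TCE_PAGE_SHIFT;
}

/* Byte offset of the address within its TCE page. */
static uint64_t demo_tce_offset(uint64_t ioba)
{
    return ioba & SPAPR_TCE_PAGE_MASK;
}
```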
|
|
|
|
|
|
|
|
#define SPAPR_VIO_BASE_LIOBN 0x00000000
|
2015-05-07 08:33:31 +03:00
|
|
|
#define SPAPR_VIO_LIOBN(reg) (0x00000000 | (reg))
|
2015-05-07 08:33:30 +03:00
|
|
|
#define SPAPR_PCI_LIOBN(phb_index, window_num) \
|
|
|
|
(0x80000000 | ((phb_index) << 8) | (window_num))
|
2015-05-07 08:33:33 +03:00
|
|
|
#define SPAPR_IS_PCI_LIOBN(liobn) (!!((liobn) & 0x80000000))
|
2015-05-07 08:33:30 +03:00
|
|
|
#define SPAPR_PCI_DMA_WINDOW_NUM(liobn) ((liobn) & 0xff)
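The LIOBN macros above pack a PHB index and a DMA window number into one 32-bit value, with the top bit distinguishing PCI from VIO. A minimal sketch of a round trip follows; the macros are copied from the header, while `pci_liobn_phb_index` is a hypothetical helper added only for illustration, and it assumes the PHB index occupies bits 8 to 30.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Copied from the header above. */
#define SPAPR_VIO_LIOBN(reg)            (0x00000000 | (reg))
#define SPAPR_PCI_LIOBN(phb_index, window_num) \
    (0x80000000 | ((phb_index) << 8) | (window_num))
#define SPAPR_IS_PCI_LIOBN(liobn)       (!!((liobn) & 0x80000000))
#define SPAPR_PCI_DMA_WINDOW_NUM(liobn) ((liobn) & 0xff)

/*
 * Hypothetical helper (not in spapr.h): recover the PHB index from a PCI
 * LIOBN by clearing the PCI flag bit and dropping the window number byte.
 */
static inline uint32_t pci_liobn_phb_index(uint32_t liobn)
{
    assert(SPAPR_IS_PCI_LIOBN(liobn));
    return (liobn & 0x7fffffff) >> 8;
}
```

For example, PHB 2 with window 1 encodes to 0x80000201, and decoding that value yields window number 1 and PHB index 2 again.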
|
2012-06-27 08:50:44 +04:00
|
|
|
|
2021-06-22 10:03:36 +03:00
|
|
|
#define RTAS_MIN_SIZE 20 /* hv_rtas_size in SLOF */
|
2012-10-08 22:17:39 +04:00
|
|
|
#define RTAS_ERROR_LOG_MAX 2048
|
|
|
|
|
2020-01-30 21:44:20 +03:00
|
|
|
/* Offset from rtas-base where error log is placed */
|
|
|
|
#define RTAS_ERROR_LOG_OFFSET 0x30
|
|
|
|
|
2015-05-07 08:33:50 +03:00
|
|
|
#define RTAS_EVENT_SCAN_RATE 1
|
|
|
|
|
2017-12-06 11:13:16 +03:00
|
|
|
/* This helper should be used to encode interrupt specifiers when the related
|
|
|
|
* "interrupt-controller" node has its "#interrupt-cells" property set to 2 (i.e.,
|
|
|
|
* VIO devices, RTAS event sources and PHBs).
|
|
|
|
*/
|
2019-01-17 20:14:39 +03:00
|
|
|
static inline void spapr_dt_irq(uint32_t *intspec, int irq, bool is_lsi)
|
2017-12-06 11:13:16 +03:00
|
|
|
{
|
|
|
|
intspec[0] = cpu_to_be32(irq);
|
|
|
|
intspec[1] = is_lsi ? cpu_to_be32(1) : 0;
|
|
|
|
}
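To make the byte layout produced by spapr_dt_irq() concrete, here is a standalone sketch that can be compiled outside QEMU. `to_be32` is a stand-in for QEMU's cpu_to_be32() and `dt_irq_sketch` mirrors the helper above; both names are assumptions made for this example only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * Stand-in for cpu_to_be32(): returns a value whose in-memory bytes are in
 * big-endian order regardless of host endianness.
 */
static inline uint32_t to_be32(uint32_t v)
{
    uint8_t b[4] = {
        (uint8_t)(v >> 24), (uint8_t)(v >> 16), (uint8_t)(v >> 8), (uint8_t)v
    };
    uint32_t out;

    memcpy(&out, b, sizeof(out));
    return out;
}

/* Same logic as spapr_dt_irq(): cell 0 is the IRQ number, cell 1 flags an LSI. */
static inline void dt_irq_sketch(uint32_t *intspec, int irq, bool is_lsi)
{
    intspec[0] = to_be32((uint32_t)irq);
    intspec[1] = is_lsi ? to_be32(1) : 0;
}
```

Encoding IRQ 0x1234 as an LSI gives the eight bytes 00 00 12 34 00 00 00 01, which is the form a two-cell interrupt specifier takes in the flattened device tree.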
|
|
|
|
|
2012-10-08 22:17:39 +04:00
|
|
|
|
2013-07-18 23:32:58 +04:00
|
|
|
#define TYPE_SPAPR_TCE_TABLE "spapr-tce-table"
|
2020-09-16 21:25:19 +03:00
|
|
|
OBJECT_DECLARE_SIMPLE_TYPE(SpaprTceTable, SPAPR_TCE_TABLE)
|
2013-07-18 23:32:58 +04:00
|
|
|
|
2017-07-11 06:56:20 +03:00
|
|
|
#define TYPE_SPAPR_IOMMU_MEMORY_REGION "spapr-iommu-memory-region"
|
2020-09-01 00:07:33 +03:00
|
|
|
DECLARE_INSTANCE_CHECKER(IOMMUMemoryRegion, SPAPR_IOMMU_MEMORY_REGION,
|
|
|
|
TYPE_SPAPR_IOMMU_MEMORY_REGION)
|
2017-07-11 06:56:20 +03:00
|
|
|
|
2019-03-06 07:35:37 +03:00
|
|
|
struct SpaprTceTable {
|
2013-07-18 23:32:58 +04:00
|
|
|
DeviceState parent;
|
|
|
|
uint32_t liobn;
|
|
|
|
uint32_t nb_table;
|
2014-05-27 09:36:37 +04:00
|
|
|
uint64_t bus_offset;
|
2014-05-27 09:36:36 +04:00
|
|
|
uint32_t page_shift;
|
2013-07-18 23:32:58 +04:00
|
|
|
uint64_t *table;
|
2016-06-01 11:57:34 +03:00
|
|
|
uint32_t mig_nb_table;
|
|
|
|
uint64_t *mig_table;
|
2013-07-18 23:32:58 +04:00
|
|
|
bool bypass;
|
2015-09-30 06:42:55 +03:00
|
|
|
bool need_vfio;
|
2019-03-07 08:05:16 +03:00
|
|
|
bool skipping_replay;
|
2022-06-22 08:29:55 +03:00
|
|
|
bool def_win;
|
2013-07-18 23:32:58 +04:00
|
|
|
int fd;
|
2017-07-11 06:56:19 +03:00
|
|
|
MemoryRegion root;
|
|
|
|
IOMMUMemoryRegion iommu;
|
2019-03-06 07:35:37 +03:00
|
|
|
struct SpaprVioDevice *vdev; /* for @bypass migration compatibility only */
|
|
|
|
QLIST_ENTRY(SpaprTceTable) list;
|
2013-07-18 23:32:58 +04:00
|
|
|
};
|
|
|
|
|
2019-03-06 07:35:37 +03:00
|
|
|
SpaprTceTable *spapr_tce_find_by_liobn(target_ulong liobn);
|
2015-05-07 08:33:49 +03:00
|
|
|
|
2019-03-06 07:35:37 +03:00
|
|
|
struct SpaprEventLogEntry {
|
2017-07-11 21:07:55 +03:00
|
|
|
uint32_t summary;
|
|
|
|
uint32_t extended_length;
|
|
|
|
void *extended_log;
|
2019-03-06 07:35:37 +03:00
|
|
|
QTAILQ_ENTRY(SpaprEventLogEntry) next;
|
2015-05-07 08:33:49 +03:00
|
|
|
};
|
|
|
|
|
2019-11-29 07:00:58 +03:00
|
|
|
void *spapr_build_fdt(SpaprMachineState *spapr, bool reset, size_t space);
|
2019-03-06 07:35:37 +03:00
|
|
|
void spapr_events_init(SpaprMachineState *sm);
|
|
|
|
void spapr_dt_events(SpaprMachineState *sm, void *fdt);
|
|
|
|
void close_htab_fd(SpaprMachineState *spapr);
|
spapr: Don't attempt to clamp RMA to VRMA constraint
The Real Mode Area (RMA) is the part of memory which a guest can access
when in real (MMU off) mode. Of course, for a guest under KVM, the MMU
isn't really turned off, it's just in a special translation mode - Virtual
Real Mode Area (VRMA) - which looks like real mode in guest mode.
The mechanics of how this works when using the hash MMU (HPT) put a
constraint on the size of the RMA, which depends on the size of the
HPT. So, the latter part of spapr_setup_hpt_and_vrma() clamps the RMA
we advertise to the guest based on this VRMA limit.
There are several things wrong with this:
1) spapr_setup_hpt_and_vrma() doesn't actually clamp, it takes the minimum
of Node 0 memory size and the VRMA limit. That will *often* work the
same as clamping, but there can be other constraints on RMA size which
supersede Node 0 memory size. We have real bugs caused by this
(currently worked around in the guest kernel)
2) Some callers of spapr_setup_hpt_and_vrma() are in a situation where
we're past the point that we can actually advertise an RMA limit to the
guest
3) But most fundamentally, the VRMA limit depends on host configuration
(page size) which shouldn't be visible to the guest, but this partially
exposes it. This can cause problems with migration in certain edge
cases, although we will mostly get away with it.
In practice, this clamping is almost never applied anyway. With 64kiB
pages and the normal rules for sizing of the HPT, the theoretical VRMA
limit will be 4x(guest memory size) and so never hit. It will hit with
4kiB pages, where it will be (guest memory size)/4. However all mainstream
distro kernels for POWER have used a 64kiB page size for at least 10 years.
So, simply replace this logic with a check that the RMA we've calculated
based only on guest visible configuration will fit within the host implied
VRMA limit. This can break if running HPT guests on a host kernel with
4kiB page size. As noted that's very rare. There also exist several
possible workarounds:
* Change the host kernel to use 64kiB pages
* Use radix MMU (RPT) guests instead of HPT
* Use 64kiB hugepages on the host to back guest memory
* Increase the guest memory size so that the RMA hits one of the fixed
limits before the RMA limit. This is relatively easy on POWER8 which
has a 16GiB limit, harder on POWER9 which has a 1TiB limit.
* Use a guest NUMA configuration which artificially constrains the RMA
within the VRMA limit (the RMA must always fit within Node 0).
Previously, on KVM, we also temporarily reduced the rma_size to 256M so
that we'd load the kernel and initrd safely, regardless of the VRMA
limit. This was a) confusing, b) could significantly limit the size of
images we could load and c) introduced a behavioural difference between
KVM and TCG. So we remove that as well.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Greg Kurz <groug@kaod.org>
2019-11-28 08:37:04 +03:00
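The "4x guest memory" and "guest memory / 4" figures quoted in the commit message above can be checked with a short worked calculation. The sketch below rests on two assumptions that are not stated in the message itself: that the HPT is sized at 1/128 of RAM (rounded up to a power of two, with a minimum of 2^18 bytes), and that the VRMA limit is 2^(page_shift + hpt_shift - 7) bytes. Both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed sizing rule: HPT size is RAM/128, rounded up to a power of two,
 * and never smaller than 2^18 bytes. Returns log2 of the HPT size.
 */
static inline int hpt_shift_sketch(uint64_t ramsize)
{
    int shift = 0;

    /* Compute ceil(log2(ramsize)). */
    while (shift < 63 && (1ULL << shift) < ramsize) {
        shift++;
    }
    shift -= 7; /* divide by 128 */
    return shift < 18 ? 18 : shift;
}

/* Assumed VRMA limit in bytes for a given host page shift and HPT shift. */
static inline uint64_t vrma_limit_sketch(int page_shift, int hpt_shift)
{
    return 1ULL << (page_shift + hpt_shift - 7);
}
```

For an 8 GiB guest the HPT shift comes out as 26. With 64 KiB host pages (shift 16) the limit is 2^35 bytes, which is 32 GiB or four times guest memory; with 4 KiB pages (shift 12) it is 2^31 bytes, which is 2 GiB or one quarter of guest memory.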
|
|
|
void spapr_setup_hpt(SpaprMachineState *spapr);
|
2019-03-06 07:35:37 +03:00
|
|
|
void spapr_free_hpt(SpaprMachineState *spapr);
|
2021-05-05 03:11:29 +03:00
|
|
|
void spapr_check_mmu_mode(bool guest_radix);
|
2019-03-06 07:35:37 +03:00
|
|
|
SpaprTceTable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn);
|
|
|
|
void spapr_tce_table_enable(SpaprTceTable *tcet,
|
2016-06-01 11:57:33 +03:00
|
|
|
uint32_t page_shift, uint64_t bus_offset,
|
|
|
|
uint32_t nb_table);
|
2019-03-06 07:35:37 +03:00
|
|
|
void spapr_tce_table_disable(SpaprTceTable *tcet);
|
|
|
|
void spapr_tce_set_need_vfio(SpaprTceTable *tcet, bool need_vfio);
|
2015-10-01 03:46:10 +03:00
|
|
|
|
2019-03-06 07:35:37 +03:00
|
|
|
MemoryRegion *spapr_tce_get_iommu(SpaprTceTable *tcet);
|
2012-06-27 08:50:44 +04:00
|
|
|
int spapr_dma_dt(void *fdt, int node_off, const char *propname,
|
2012-08-07 20:10:38 +04:00
|
|
|
uint32_t liobn, uint64_t window, uint32_t size);
|
|
|
|
int spapr_tcet_dma_dt(void *fdt, int node_off, const char *propname,
|
2019-03-06 07:35:37 +03:00
|
|
|
SpaprTceTable *tcet);
|
2020-12-09 20:00:49 +03:00
|
|
|
void spapr_pci_switch_vga(SpaprMachineState *spapr, bool big_endian);
|
2019-03-06 07:35:37 +03:00
|
|
|
void spapr_hotplug_req_add_by_index(SpaprDrc *drc);
|
|
|
|
void spapr_hotplug_req_remove_by_index(SpaprDrc *drc);
|
|
|
|
void spapr_hotplug_req_add_by_count(SpaprDrcType drc_type,
|
2015-08-03 08:35:42 +03:00
|
|
|
uint32_t count);
|
2019-03-06 07:35:37 +03:00
|
|
|
void spapr_hotplug_req_remove_by_count(SpaprDrcType drc_type,
|
2015-08-03 08:35:42 +03:00
|
|
|
uint32_t count);
|
2019-03-06 07:35:37 +03:00
|
|
|
void spapr_hotplug_req_add_by_count_indexed(SpaprDrcType drc_type,
|
2016-10-27 05:20:28 +03:00
|
|
|
uint32_t count, uint32_t index);
|
2019-03-06 07:35:37 +03:00
|
|
|
void spapr_hotplug_req_remove_by_count_indexed(SpaprDrcType drc_type,
|
2016-10-27 05:20:28 +03:00
|
|
|
uint32_t count, uint32_t index);
|
pseries: Implement HPT resizing
This patch implements hypercalls allowing a PAPR guest to resize its own
hash page table. This will eventually allow for more flexible memory
hotplug.
The implementation is partially asynchronous, handled in a special thread
running the hpt_prepare_thread() function. The state of a pending resize
is stored in SPAPR_MACHINE->pending_hpt.
The H_RESIZE_HPT_PREPARE hypercall will kick off creation of a new HPT, or,
if one is already in progress, monitor it for completion. If there is an
existing HPT resize in progress that doesn't match the size specified in
the call, it will cancel it, replacing it with a new one matching the
given size.
The H_RESIZE_HPT_COMMIT completes transition to a resized HPT, and can only
be called successfully once H_RESIZE_HPT_PREPARE has successfully
completed initialization of a new HPT. The guest must ensure that there
are no concurrent accesses to the existing HPT while this is called (this
effectively means stop_machine() for Linux guests).
For now H_RESIZE_HPT_COMMIT goes through the whole old HPT, rehashing each
HPTE into the new HPT. This can have quite high latency, but it seems to
be of the order of typical migration downtime latencies for HPTs of size
up to ~2GiB (which would be used in a 256GiB guest).
In future we probably want to move more of the rehashing to the "prepare"
phase, by having H_ENTER and other hcalls update both current and
pending HPTs. That's a project for another day, but should be possible
without any changes to the guest interface.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-05-12 08:46:49 +03:00
|
|
|
int spapr_hpt_shift_for_ramsize(uint64_t ramsize);
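The commit message above quotes a ~2GiB HPT for a 256GiB guest, i.e. a ratio of 1/128. A minimal sketch of a sizing rule consistent with that figure follows. The function name, the rounding up to a power of two, and the 18/46 clamp values are assumptions made for illustration; they are not taken from the header.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bounds on the HPT size (log2 of bytes); not stated in the text. */
#define SKETCH_HPT_SHIFT_MIN 18
#define SKETCH_HPT_SHIFT_MAX 46

/* Smallest n such that (1 << n) >= x, for x up to 2^63. */
static int sketch_ceil_log2(uint64_t x)
{
    int n = 0;

    while (n < 63 && (1ULL << n) < x) {
        n++;
    }
    return n;
}

/* Hypothetical stand-in for spapr_hpt_shift_for_ramsize(): HPT = RAM / 128. */
int sketch_hpt_shift_for_ramsize(uint64_t ramsize)
{
    int shift = sketch_ceil_log2(ramsize) - 7; /* 2^7 == 128 */

    if (shift < SKETCH_HPT_SHIFT_MIN) {
        shift = SKETCH_HPT_SHIFT_MIN;
    }
    if (shift > SKETCH_HPT_SHIFT_MAX) {
        shift = SKETCH_HPT_SHIFT_MAX;
    }
    assert(shift >= SKETCH_HPT_SHIFT_MIN && shift <= SKETCH_HPT_SHIFT_MAX);
    return shift;
}
```

With these assumptions a 256GiB guest gets shift 31, and `1ULL << 31` is the 2GiB table mentioned in the message.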
|
2020-10-26 15:40:54 +03:00
|
|
|
int spapr_reallocate_hpt(SpaprMachineState *spapr, int shift, Error **errp);
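The prepare/commit protocol described in the HPT resizing commit message can be modelled as a small state machine. The sketch below is only an illustration: all type names, function names and return codes are hypothetical, the worker thread is replaced by an explicit `sketch_worker_done()` call, and no rehashing is performed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical return codes; the real hcall return values are not shown. */
enum { SKETCH_SUCCESS = 0, SKETCH_IN_PROGRESS = 1, SKETCH_NOT_READY = -1 };

typedef struct SketchPendingHpt {
    bool active;   /* a resize has been requested */
    bool complete; /* the new HPT has been fully prepared */
    int shift;     /* log2 of the requested HPT size */
} SketchPendingHpt;

typedef struct SketchMachine {
    int htab_shift;           /* log2 of the current HPT size */
    SketchPendingHpt pending; /* plays the role of pending_hpt */
} SketchMachine;

/* PREPARE: start a resize, or report on the one already in progress. */
int sketch_prepare(SketchMachine *m, int shift)
{
    if (m->pending.active && m->pending.shift != shift) {
        m->pending.active = false; /* size mismatch: cancel the old request */
    }
    if (!m->pending.active) {
        m->pending = (SketchPendingHpt){ .active = true, .shift = shift };
        return SKETCH_IN_PROGRESS;
    }
    return m->pending.complete ? SKETCH_SUCCESS : SKETCH_IN_PROGRESS;
}

/* Stand-in for the preparation thread finishing its work. */
void sketch_worker_done(SketchMachine *m)
{
    if (m->pending.active) {
        m->pending.complete = true;
    }
}

/* COMMIT: only succeeds once a matching PREPARE has completed. */
int sketch_commit(SketchMachine *m, int shift)
{
    if (!m->pending.active || !m->pending.complete ||
        m->pending.shift != shift) {
        return SKETCH_NOT_READY;
    }
    m->htab_shift = shift;
    m->pending.active = false;
    return SKETCH_SUCCESS;
}
```

A guest would poll the prepare step until it reports success and only then call commit; in the real protocol the guest must also stop all concurrent HPT accesses around the commit.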
|
2019-03-06 07:35:37 +03:00
|
|
|
void spapr_clear_pending_events(SpaprMachineState *spapr);
|
2020-02-24 22:23:43 +03:00
|
|
|
void spapr_clear_pending_hotplug_events(SpaprMachineState *spapr);
|
2021-03-02 17:10:19 +03:00
|
|
|
void spapr_memory_unplug_rollback(SpaprMachineState *spapr, DeviceState *dev);
|
2019-03-06 07:35:37 +03:00
|
|
|
int spapr_max_server_number(SpaprMachineState *spapr);
|
2019-04-11 11:00:01 +03:00
|
|
|
void spapr_store_hpte(PowerPCCPU *cpu, hwaddr ptex,
|
|
|
|
uint64_t pte0, uint64_t pte1);
|
2020-01-30 21:44:20 +03:00
|
|
|
void spapr_mce_req_event(PowerPCCPU *cpu, bool recovered);
|
2015-02-06 06:55:51 +03:00
|
|
|
|
2019-02-19 20:17:43 +03:00
|
|
|
/* DRC callbacks. */
|
2017-05-22 22:35:48 +03:00
|
|
|
void spapr_core_release(DeviceState *dev);
|
2019-03-06 07:35:37 +03:00
|
|
|
int spapr_core_dt_populate(SpaprDrc *drc, SpaprMachineState *spapr,
|
2019-02-19 20:17:48 +03:00
|
|
|
void *fdt, int *fdt_start_offset, Error **errp);
|
2017-05-22 22:35:48 +03:00
|
|
|
void spapr_lmb_release(DeviceState *dev);
|
2019-03-06 07:35:37 +03:00
|
|
|
int spapr_lmb_dt_populate(SpaprDrc *drc, SpaprMachineState *spapr,
|
2019-02-19 20:17:43 +03:00
|
|
|
void *fdt, int *fdt_start_offset, Error **errp);
|
2019-02-19 20:18:49 +03:00
|
|
|
void spapr_phb_release(DeviceState *dev);
|
2019-03-06 07:35:37 +03:00
|
|
|
int spapr_phb_dt_populate(SpaprDrc *drc, SpaprMachineState *spapr,
|
2019-02-19 20:18:49 +03:00
|
|
|
void *fdt, int *fdt_start_offset, Error **errp);
|
2017-05-22 22:35:48 +03:00
|
|
|
|
2019-03-06 07:35:37 +03:00
|
|
|
void spapr_rtc_read(SpaprRtcState *rtc, struct tm *tm, uint32_t *ns);
|
|
|
|
int spapr_rtc_import_offset(SpaprRtcState *rtc, int64_t legacy_offset);
|
2015-02-06 06:55:51 +03:00
|
|
|
|
2017-03-07 12:23:40 +03:00
|
|
|
#define TYPE_SPAPR_RNG "spapr-rng"
|
2012-06-27 08:50:44 +04:00
|
|
|
|
2019-03-06 06:15:26 +03:00
|
|
|
#define SPAPR_MEMORY_BLOCK_SIZE ((hwaddr)1 << 28) /* 256MB */
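As a small illustration of how a fixed 256MB block size can be used, the sketch below counts the blocks needed for a memory region. It assumes that memory is handled in whole blocks only, and it uses `uint64_t` in place of `hwaddr`; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors SPAPR_MEMORY_BLOCK_SIZE, with uint64_t standing in for hwaddr. */
#define SKETCH_MEMORY_BLOCK_SIZE ((uint64_t)1 << 28) /* 256MB */

/*
 * Returns the number of blocks covering 'size' bytes, or 0 if 'size' is
 * not a whole multiple of the block size (an assumed policy).
 */
uint64_t sketch_block_count(uint64_t size)
{
    if (size % SKETCH_MEMORY_BLOCK_SIZE != 0) {
        return 0;
    }
    return size / SKETCH_MEMORY_BLOCK_SIZE;
}
```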
|
2015-07-02 09:23:15 +03:00
|
|
|
|
2015-06-29 11:44:27 +03:00
|
|
|
/*
|
|
|
|
* This defines the maximum number of DIMM slots we can have for an sPAPR
|
|
|
|
* guest. This is not defined by sPAPR, but we set it to 32 slots
|
|
|
|
* based on the default number of slots provided by the PowerPC kernel.
|
|
|
|
*/
|
|
|
|
#define SPAPR_MAX_RAM_SLOTS 32
|
|
|
|
|
2018-06-25 15:42:24 +03:00
|
|
|
/* 1GB alignment for hotplug memory region */
|
|
|
|
#define SPAPR_DEVICE_MEM_ALIGN (1 * GiB)
|
2015-06-29 11:44:27 +03:00
|
|
|
|
2015-07-13 03:34:00 +03:00
|
|
|
/*
|
|
|
|
* Number of 32-bit words in each LMB list entry in ibm,dynamic-memory
|
|
|
|
* property under ibm,dynamic-reconfiguration-memory node.
|
|
|
|
*/
|
|
|
|
#define SPAPR_DR_LMB_LIST_ENTRY_SIZE 6
|
|
|
|
|
|
|
|
/*
|
2016-06-10 08:14:48 +03:00
|
|
|
* Defines for flag value in ibm,dynamic-memory property under
|
|
|
|
* ibm,dynamic-reconfiguration-memory node.
|
2015-07-13 03:34:00 +03:00
|
|
|
*/
|
|
|
|
#define SPAPR_LMB_FLAGS_ASSIGNED 0x00000008
|
2016-06-10 08:14:48 +03:00
|
|
|
#define SPAPR_LMB_FLAGS_DRC_INVALID 0x00000020
|
|
|
|
#define SPAPR_LMB_FLAGS_RESERVED 0x00000080
|
2020-05-11 23:02:02 +03:00
|
|
|
#define SPAPR_LMB_FLAGS_HOTREMOVABLE 0x00000100
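The entry size and flag values above can be tied together with a short sketch that fills one six-word LMB entry. The word layout used here (address high/low, DRC index, a reserved word, associativity index, flags) is an assumption for illustration, as are the `sketch_` names; the header itself only fixes the entry size and the flag bits. Values are kept in host byte order, whereas a real device tree property would be big-endian.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the definitions above. */
#define SPAPR_DR_LMB_LIST_ENTRY_SIZE 6
#define SPAPR_LMB_FLAGS_ASSIGNED     0x00000008

/* Assumed position of each word inside one entry. */
enum {
    SKETCH_LMB_ADDR_HI = 0,
    SKETCH_LMB_ADDR_LO = 1,
    SKETCH_LMB_DRC_INDEX = 2,
    SKETCH_LMB_RESERVED = 3,
    SKETCH_LMB_ASSOC_INDEX = 4,
    SKETCH_LMB_FLAGS = 5,
};

/* Fill one entry; 'assigned' selects whether the ASSIGNED flag is set. */
void sketch_fill_lmb_entry(uint32_t entry[SPAPR_DR_LMB_LIST_ENTRY_SIZE],
                           uint64_t addr, uint32_t drc_index,
                           uint32_t assoc_index, bool assigned)
{
    entry[SKETCH_LMB_ADDR_HI] = (uint32_t)(addr >> 32);
    entry[SKETCH_LMB_ADDR_LO] = (uint32_t)(addr & 0xffffffffu);
    entry[SKETCH_LMB_DRC_INDEX] = drc_index;
    entry[SKETCH_LMB_RESERVED] = 0;
    entry[SKETCH_LMB_ASSOC_INDEX] = assoc_index;
    entry[SKETCH_LMB_FLAGS] = assigned ? SPAPR_LMB_FLAGS_ASSIGNED : 0;
}

/* Test the ASSIGNED bit in the flags word of an entry. */
bool sketch_lmb_is_assigned(const uint32_t entry[SPAPR_DR_LMB_LIST_ENTRY_SIZE])
{
    return (entry[SKETCH_LMB_FLAGS] & SPAPR_LMB_FLAGS_ASSIGNED) != 0;
}
```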
|
2015-07-13 03:34:00 +03:00
|
|
|
|
2016-12-05 08:50:21 +03:00
|
|
|
void spapr_do_system_reset_on_cpu(CPUState *cs, run_on_cpu_data arg);
|
|
|
|
|
2017-05-12 08:46:49 +03:00
|
|
|
#define HTAB_SIZE(spapr) (1ULL << ((spapr)->htab_shift))
|
|
|
|
|
2018-02-14 22:40:44 +03:00
|
|
|
int spapr_get_vcpu_id(PowerPCCPU *cpu);
|
2020-09-14 15:35:00 +03:00
|
|
|
bool spapr_set_vcpu_id(PowerPCCPU *cpu, int cpu_index, Error **errp);
|
2017-08-09 08:38:56 +03:00
|
|
|
PowerPCCPU *spapr_find_cpu(int vcpu_id);
|
|
|
|
|
2018-01-12 08:33:43 +03:00
|
|
|
int spapr_caps_pre_load(void *opaque);
|
|
|
|
int spapr_caps_pre_save(void *opaque);
|
|
|
|
|
spapr: Capabilities infrastructure
Because PAPR is a paravirtual environment, access to certain CPU (or other)
facilities can be blocked by the hypervisor. PAPR provides ways to
advertise in the device tree whether or not those features are available to
the guest.
In some places we automatically determine whether to make a feature
available based on whether our host can support it, in most cases this is
based on limitations in the available KVM implementation.
Although we correctly advertise this to the guest, it means that host
factors might make changes to the guest visible environment which is bad:
as well as generally reducing reproducibility, it means that a migration
between different host environments can easily go bad.
We've mostly gotten away with it because the environments considered mature
enough to be well supported (basically, KVM on POWER8) have had consistent
feature availability. But, it's still not right and some limitations on
POWER9 are going to make it more of an issue in future.
This introduces an infrastructure for defining "sPAPR capabilities". These
are set by default based on the machine version, masked by the capabilities
of the chosen cpu, but can be overridden with machine properties.
The intention is at reset time we verify that the requested capabilities
can be supported on the host (considering TCG, KVM and/or host cpu
limitations). If not we simply fail, rather than silently modifying the
advertised featureset to the guest.
This does mean that certain configurations that "worked" may now fail, but
such configurations were already more subtly broken.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
2017-12-08 02:35:35 +03:00
|
|
|
/*
|
|
|
|
* Handling of optional capabilities
|
|
|
|
*/
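The capability scheme described in the commit message (defaults from the machine version, masked by the CPU, overridable by machine properties, then verified against the host at reset) can be sketched as follows. Everything here is hypothetical: the capability names, the struct layout, and the convention that a larger value means "more of the feature" are assumptions for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Two placeholder capabilities; the real list is not reproduced here. */
enum { SKETCH_CAP_A, SKETCH_CAP_B, SKETCH_CAP_NUM };

typedef struct SketchCaps {
    uint8_t caps[SKETCH_CAP_NUM];
} SketchCaps;

typedef struct SketchOverride {
    bool set;      /* true if the user gave an explicit value */
    uint8_t value; /* the value requested by the user */
} SketchOverride;

/* Effective value: machine default capped by the CPU, unless overridden. */
SketchCaps sketch_effective_caps(const SketchCaps *machine_default,
                                 const SketchCaps *cpu_max,
                                 const SketchOverride overrides[SKETCH_CAP_NUM])
{
    SketchCaps eff;

    for (int i = 0; i < SKETCH_CAP_NUM; i++) {
        uint8_t v = machine_default->caps[i];

        if (v > cpu_max->caps[i]) {
            v = cpu_max->caps[i];
        }
        if (overrides[i].set) {
            v = overrides[i].value;
        }
        eff.caps[i] = v;
    }
    return eff;
}

/* Reset-time check: fail rather than silently lowering what is advertised. */
bool sketch_caps_apply(const SketchCaps *eff, const SketchCaps *host_max)
{
    for (int i = 0; i < SKETCH_CAP_NUM; i++) {
        if (eff->caps[i] > host_max->caps[i]) {
            return false;
        }
    }
    return true;
}
```

The point of the second function is the design choice stated in the message: a request the host cannot satisfy is an error, so the guest-visible environment never depends on host details.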
|
2018-01-12 08:33:43 +03:00
|
|
|
extern const VMStateDescription vmstate_spapr_cap_htm;
|
|
|
|
extern const VMStateDescription vmstate_spapr_cap_vsx;
|
|
|
|
extern const VMStateDescription vmstate_spapr_cap_dfp;
|
2018-01-19 08:00:02 +03:00
|
|
|
extern const VMStateDescription vmstate_spapr_cap_cfpc;
|
2018-01-19 08:00:03 +03:00
|
|
|
extern const VMStateDescription vmstate_spapr_cap_sbbc;
|
2018-01-19 08:00:04 +03:00
|
|
|
extern const VMStateDescription vmstate_spapr_cap_ibs;
|
2019-05-17 07:10:44 +03:00
|
|
|
extern const VMStateDescription vmstate_spapr_cap_hpt_maxpagesize;
|
2018-10-08 06:25:39 +03:00
|
|
|
extern const VMStateDescription vmstate_spapr_cap_nested_kvm_hv;
|
2019-03-01 05:43:14 +03:00
|
|
|
extern const VMStateDescription vmstate_spapr_cap_large_decr;
|
2019-03-01 06:19:12 +03:00
|
|
|
extern const VMStateDescription vmstate_spapr_cap_ccf_assist;
|
2020-01-30 21:44:18 +03:00
|
|
|
extern const VMStateDescription vmstate_spapr_cap_fwnmi;
|
2021-07-06 14:24:40 +03:00
|
|
|
extern const VMStateDescription vmstate_spapr_cap_rpt_invalidate;
|
2022-06-22 08:10:08 +03:00
|
|
|
extern const VMStateDescription vmstate_spapr_wdt;
|
2017-12-08 02:35:35 +03:00
|
|
|
|
2019-03-06 07:35:37 +03:00
|
|
|
static inline uint8_t spapr_get_cap(SpaprMachineState *spapr, int cap)
|
2017-12-08 02:35:35 +03:00
|
|
|
{
|
2018-01-12 08:33:43 +03:00
|
|
|
return spapr->eff.caps[cap];
|
2017-12-08 02:35:35 +03:00
|
|
|
}
|
|
|
|
|
2019-03-06 07:35:37 +03:00
|
|
|
void spapr_caps_init(SpaprMachineState *spapr);
|
|
|
|
void spapr_caps_apply(SpaprMachineState *spapr);
|
|
|
|
void spapr_caps_cpu_apply(SpaprMachineState *spapr, PowerPCCPU *cpu);
|
2020-05-05 18:29:23 +03:00
|
|
|
void spapr_caps_add_properties(SpaprMachineClass *smc);
|
2019-03-06 07:35:37 +03:00
|
|
|
int spapr_caps_post_migration(SpaprMachineState *spapr);
|
2017-12-08 02:35:35 +03:00
|
|
|
|
2020-09-14 15:35:03 +03:00
|
|
|
bool spapr_check_pagesize(SpaprMachineState *spapr, hwaddr pagesize,
|
2018-04-18 07:21:45 +03:00
|
|
|
Error **errp);
|
2018-12-18 01:34:42 +03:00
|
|
|
/*
|
|
|
|
* XIVE definitions
|
|
|
|
*/
|
|
|
|
#define SPAPR_OV5_XIVE_LEGACY 0x0
|
|
|
|
#define SPAPR_OV5_XIVE_EXPLOIT 0x40
|
|
|
|
#define SPAPR_OV5_XIVE_BOTH 0x80 /* Only to advertise on the platform */
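The three values above fit in the top two bits of a byte, and the comment says BOTH is only for the platform to advertise. A possible compatibility check built on that is sketched below. The 0xC0 mask and the acceptance rule are assumptions for illustration, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the definitions above. */
#define SPAPR_OV5_XIVE_LEGACY  0x0
#define SPAPR_OV5_XIVE_EXPLOIT 0x40
#define SPAPR_OV5_XIVE_BOTH    0x80

/* Assumption: the mode occupies the top two bits of one byte. */
#define SKETCH_OV5_XIVE_MASK 0xC0

/*
 * Returns true if the mode requested by the guest is one the platform
 * advertised. The guest may only pick LEGACY or EXPLOIT, never BOTH.
 */
bool sketch_xive_mode_acceptable(uint8_t platform_byte, uint8_t guest_byte)
{
    uint8_t platform = platform_byte & SKETCH_OV5_XIVE_MASK;
    uint8_t guest = guest_byte & SKETCH_OV5_XIVE_MASK;

    if (guest != SPAPR_OV5_XIVE_LEGACY && guest != SPAPR_OV5_XIVE_EXPLOIT) {
        return false;
    }
    return platform == SPAPR_OV5_XIVE_BOTH || platform == guest;
}
```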
|
2018-04-18 07:21:45 +03:00
|
|
|
|
2019-02-15 20:00:18 +03:00
|
|
|
void spapr_set_all_lpcrs(target_ulong value, target_ulong mask);
|
2020-01-30 21:44:20 +03:00
|
|
|
hwaddr spapr_get_rtas_addr(void);
|
2021-01-08 20:31:27 +03:00
|
|
|
bool spapr_memory_hot_unplug_supported(SpaprMachineState *spapr);
|
spapr: Implement Open Firmware client interface
The PAPR platform describes an OS environment that's presented by
a combination of a hypervisor and firmware. The features it specifies
require collaboration between the firmware and the hypervisor.
Since the beginning, the runtime component of the firmware (RTAS) has
been implemented as a 20-byte shim which simply forwards calls to
a hypercall implemented in qemu. The boot time firmware component is
SLOF - but a build that's specific to qemu, and has always needed to be
updated in sync with it. Even though we've managed to limit the amount
of runtime communication we need between qemu and SLOF, there's some,
and it has become increasingly awkward to handle as we've implemented
new features.
This implements a boot time OF client interface (CI) which is
enabled by a new "x-vof" pseries machine option (stands for "Virtual Open
Firmware"). When enabled, QEMU implements the custom H_OF_CLIENT hcall
which implements Open Firmware Client Interface (OF CI). This allows
using a smaller stateless firmware which does not have to manage
the device tree.
The new "vof.bin" firmware image is included with source code under
pc-bios/. It also includes the RTAS blob.
This implements a handful of CI methods just to get -kernel/-initrd
working. In particular, this implements device tree fetching and a
simple memory allocator - "claim" (an OF CI memory allocator) - and updates
"/memory@0/available" to report available memory to the client.
This implements changing some device tree properties which we know how
to deal with, the rest is ignored. To allow changes, this skips
fdt_pack() when x-vof=on as not packing the blob leaves some room for
appending.
In absence of SLOF, this assigns phandles to device tree nodes to make
device tree traversing work.
When x-vof=on, this adds "/chosen" every time QEMU (re)builds a tree.
This adds basic instance support; instances are managed by a hash map
ihandle -> [phandle].
Before the guest starts, the used memory is:
0..e60 - the initial firmware
8000..10000 - stack
400000.. - kernel
3ea0000.. - initramdisk
This OF CI does not implement "interpret".
Unlike SLOF, this does not format uninitialized nvram. Instead, this
includes a disk image with pre-formatted nvram.
With this basic support, this can only boot into a kernel directly.
However, this is just enough for the petitboot kernel and initramdisk to
boot from any possible source. Note this requires a reasonably recent guest
kernel with:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=df5be5be8735
The immediate benefit is much faster booting time, which is especially
crucial with fully emulated early CPU bring up environments. Also this
may come in handy when/if GRUB-in-the-userspace sees the light of day.
This separates VOF and sPAPR in the hope that VOF bits may be reused by
other POWERPC boards which do not support pSeries.
This assumes potential support for booting from QEMU backends
such as blockdev or netdev without devices/drivers used.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20210625055155.2252896-1-aik@ozlabs.ru>
Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu>
[dwg: Adjusted some includes which broke compile in some more obscure
compilation setups]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-06-25 08:51:55 +03:00
|
|
|
|
2021-07-08 09:56:25 +03:00
|
|
|
void spapr_vof_reset(SpaprMachineState *spapr, void *fdt, Error **errp);
|
2021-06-25 08:51:55 +03:00
|
|
|
void spapr_vof_quiesce(MachineState *ms);
|
|
|
|
bool spapr_vof_setprop(MachineState *ms, const char *path, const char *propname,
|
|
|
|
void *val, int vallen);
|
|
|
|
target_ulong spapr_h_vof_client(PowerPCCPU *cpu, SpaprMachineState *spapr,
|
|
|
|
target_ulong opcode, target_ulong *args);
|
|
|
|
target_ulong spapr_vof_client_architecture_support(MachineState *ms,
|
|
|
|
CPUState *cs,
|
|
|
|
target_ulong ovec_addr);
|
|
|
|
void spapr_vof_client_dt_finalize(SpaprMachineState *spapr, void *fdt);
|
|
|
|
|
2022-06-22 08:10:08 +03:00
|
|
|
/* H_WATCHDOG */
|
|
|
|
void spapr_watchdog_init(SpaprMachineState *spapr);
|
|
|
|
|
2016-06-29 14:47:03 +03:00
|
|
|
#endif /* HW_SPAPR_H */
|