2011-10-30 21:16:46 +04:00
/*
 * QEMU sPAPR PCI host originated from Uninorth PCI host
 *
 * Copyright (c) 2011 Alexey Kardashevskiy, IBM Corporation.
 * Copyright (C) 2011 David Gibson, IBM Corporation.
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
 * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 * THE SOFTWARE.
 */
2019-05-23 17:35:07 +03:00
|
|
|
|
2016-01-26 21:16:58 +03:00
|
|
|
#include "qemu/osdep.h"
|
include/qemu/osdep.h: Don't include qapi/error.h
Commit 57cb38b included qapi/error.h into qemu/osdep.h to get the
Error typedef. Since then, we've moved to include qemu/osdep.h
everywhere. Its file comment explains: "To avoid getting into
possible circular include dependencies, this file should not include
any other QEMU headers, with the exceptions of config-host.h,
compiler.h, os-posix.h and os-win32.h, all of which are doing a
similar job to this file and are under similar constraints."
qapi/error.h doesn't do a similar job, and it doesn't adhere to
similar constraints: it includes qapi-types.h. That's in excess of
100KiB of crap most .c files don't actually need.
Add the typedef to qemu/typedefs.h, and include that instead of
qapi/error.h. Include qapi/error.h in .c files that need it and don't
get it now. Include qapi-types.h in qom/object.h for uint16List.
Update scripts/clean-includes accordingly. Update it further to match
reality: replace config.h by config-target.h, add sysemu/os-posix.h,
sysemu/os-win32.h. Update the list of includes in the qemu/osdep.h
comment quoted above similarly.
This reduces the number of objects depending on qapi/error.h from "all
of them" to less than a third. Unfortunately, the number depending on
qapi-types.h shrinks only a little. More work is needed for that one.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
[Fix compilation without the spice devel packages. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-14 11:01:28 +03:00
|
|
|
#include "qapi/error.h"
|
2019-08-12 08:23:42 +03:00
|
|
|
#include "hw/irq.h"
|
2015-07-02 09:23:21 +03:00
|
|
|
#include "hw/sysbus.h"
|
2019-08-12 08:23:45 +03:00
|
|
|
#include "migration/vmstate.h"
|
2013-02-04 18:40:22 +04:00
#include "hw/pci/pci.h"
#include "hw/pci/msi.h"
#include "hw/pci/msix.h"
#include "hw/pci/pci_host.h"
2013-02-05 20:06:20 +04:00
#include "hw/ppc/spapr.h"
#include "hw/pci-host/spapr.h"
2016-07-04 06:33:07 +03:00
|
|
|
#include "exec/ram_addr.h"
|
2011-10-30 21:16:46 +04:00
|
|
|
#include <libfdt.h>
|
2012-08-07 20:10:36 +04:00
|
|
|
#include "trace.h"
|
2013-11-21 08:08:58 +04:00
|
|
|
#include "qemu/error-report.h"
|
2019-05-23 17:35:07 +03:00
|
|
|
#include "qemu/module.h"
|
2015-05-07 08:33:55 +03:00
|
|
|
#include "qapi/qmp/qerror.h"
|
2017-09-09 18:06:25 +03:00
|
|
|
#include "hw/ppc/fdt.h"
|
2015-07-02 09:23:21 +03:00
|
|
|
#include "hw/pci/pci_bridge.h"
|
2012-12-12 17:00:45 +04:00
|
|
|
#include "hw/pci/pci_bus.h"
|
2017-02-17 16:31:34 +03:00
|
|
|
#include "hw/pci/pci_ids.h"
|
2015-05-07 08:33:53 +03:00
|
|
|
#include "hw/ppc/spapr_drc.h"
|
2019-08-12 08:23:51 +03:00
|
|
|
#include "hw/qdev-properties.h"
|
2015-05-07 08:33:55 +03:00
|
|
|
#include "sysemu/device_tree.h"
|
2014-09-17 14:21:29 +04:00
|
|
|
#include "sysemu/kvm.h"
|
2016-07-04 06:33:07 +03:00
|
|
|
#include "sysemu/hostmem.h"
|
2016-07-27 11:03:38 +03:00
|
|
|
#include "sysemu/numa.h"
|
spapr: introduce SpaprMachineState::numa_assoc_array
The next step to centralize all NUMA/associativity handling in
the spapr machine is to create a 'one stop place' for all
things ibm,associativity.
This patch introduces numa_assoc_array, a 2 dimensional array
that will store all ibm,associativity arrays of all NUMA nodes.
This array is initialized in a new spapr_numa_associativity_init()
function, called in spapr_machine_init(). It is being initialized
with the same values used in other ibm,associativity properties
around spapr files (i.e. all zeros, last value is node_id).
The idea is to remove all hardcoded definitions and FDT writes
of ibm,associativity arrays, doing instead a call to the new
helper spapr_numa_write_associativity_dt() helper, that will
be able to write the DT with the correct values.
We'll start small, handling the trivial cases first. The
remaining instances of ibm,associativity will be handled
next.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20200903220639.563090-2-danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-09-04 01:06:33 +03:00
|
|
|
#include "hw/ppc/spapr_numa.h"
|
2020-08-11 14:41:30 +03:00
|
|
|
#include "qemu/log.h"
|
2011-10-30 21:16:46 +04:00
|
|
|
|
2012-08-07 20:10:37 +04:00
/* Copied from the kernel arch/powerpc/platforms/pseries/msi.c */
#define RTAS_QUERY_FN           0
#define RTAS_CHANGE_FN          1
#define RTAS_RESET_FN           2
#define RTAS_CHANGE_MSI_FN      3
#define RTAS_CHANGE_MSIX_FN     4

/* Interrupt types to return on RTAS_CHANGE_* */
#define RTAS_TYPE_MSI           1
#define RTAS_TYPE_MSIX          2

spapr: Use CamelCase properly
The qemu coding standard is to use CamelCase for type and structure names,
and the pseries code follows that... sort of. There are quite a lot of
places where we bend the rules in order to preserve the capitalization of
internal acronyms like "PHB", "TCE", "DIMM" and most commonly "sPAPR".
That was a bad idea - it frequently leads to names ending up with hard to
read clusters of capital letters, and means they don't catch the eye as
type identifiers, which is kind of the point of the CamelCase convention in
the first place.
In short, keeping type identifiers look like CamelCase is more important
than preserving standard capitalization of internal "words". So, this
patch renames a heap of spapr internal type names to a more standard
CamelCase.
In addition to case changes, we also make some other identifier renames:
VIOsPAPR* -> SpaprVio*
The reverse word ordering was only ever used to mitigate the capital
cluster, so revert to the natural ordering.
VIOsPAPRVTYDevice -> SpaprVioVty
VIOsPAPRVLANDevice -> SpaprVioVlan
Brevity, since the "Device" didn't add useful information
sPAPRDRConnector -> SpaprDrc
sPAPRDRConnectorClass -> SpaprDrcClass
Brevity, and makes it clearer this is the same thing as a "DRC"
mentioned in many other places in the code
This is 100% a mechanical search-and-replace patch. It will, however,
conflict with essentially any and all outstanding patches touching the
spapr code.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-06 07:35:37 +03:00
|
|
|
SpaprPhbState *spapr_pci_find_phb(SpaprMachineState *spapr, uint64_t buid)
|
2011-10-30 21:16:46 +04:00
|
|
|
{
|
2019-03-06 07:35:37 +03:00
|
|
|
SpaprPhbState *sphb;
|
2011-10-30 21:16:46 +04:00
|
|
|
|
2012-08-20 21:08:05 +04:00
|
|
|
QLIST_FOREACH(sphb, &spapr->phbs, list) {
|
|
|
|
if (sphb->buid != buid) {
|
2011-10-30 21:16:46 +04:00
|
|
|
continue;
|
|
|
|
}
|
2012-08-20 21:08:05 +04:00
|
|
|
return sphb;
|
2012-08-07 20:10:35 +04:00
|
|
|
}
|
|
|
|
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
|
2019-03-06 07:35:37 +03:00
|
|
|
PCIDevice *spapr_pci_find_dev(SpaprMachineState *spapr, uint64_t buid,
|
2015-05-07 08:33:34 +03:00
|
|
|
uint32_t config_addr)
|
2012-08-07 20:10:35 +04:00
|
|
|
{
|
2019-03-06 07:35:37 +03:00
|
|
|
SpaprPhbState *sphb = spapr_pci_find_phb(spapr, buid);
|
2012-08-20 21:08:08 +04:00
|
|
|
PCIHostState *phb = PCI_HOST_BRIDGE(sphb);
|
2013-08-21 10:02:15 +04:00
|
|
|
int bus_num = (config_addr >> 16) & 0xFF;
|
2012-08-07 20:10:35 +04:00
|
|
|
int devfn = (config_addr >> 8) & 0xFF;
|
|
|
|
|
|
|
|
if (!phb) {
|
|
|
|
return NULL;
|
|
|
|
}
|
2011-10-30 21:16:46 +04:00
|
|
|
|
2013-08-21 10:02:15 +04:00
|
|
|
return pci_find_device(phb->bus, bus_num, devfn);
|
2011-10-30 21:16:46 +04:00
|
|
|
}
|
|
|
|
|
2012-01-11 23:46:25 +04:00
|
|
|
static uint32_t rtas_pci_cfgaddr(uint32_t arg)
|
|
|
|
{
|
2012-04-02 08:17:35 +04:00
|
|
|
/* This handles the encoding of extended config space addresses */
|
2012-01-11 23:46:25 +04:00
|
|
|
return ((arg >> 20) & 0xf00) | (arg & 0xff);
|
|
|
|
}
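
The two helpers above slice one 32-bit RTAS config address into a bus number, a devfn and a register offset. The following is a minimal sketch of that layout, assuming only the shifts and masks visible in `spapr_pci_find_dev()` and `rtas_pci_cfgaddr()`. `CfgAddrFields` and `decode_cfgaddr()` are hypothetical names used for illustration and are not part of QEMU.

```c
#include <stdint.h>

/* Hypothetical container for the fields packed into an RTAS config address. */
typedef struct {
    uint8_t bus;    /* taken from bits 16..23 of the argument */
    uint8_t devfn;  /* taken from bits 8..15 of the argument */
    uint16_t reg;   /* bits 28..31 become offset bits 8..11, bits 0..7 stay */
} CfgAddrFields;

/* Hypothetical helper: applies the same shifts and masks as the code above. */
CfgAddrFields decode_cfgaddr(uint32_t arg)
{
    CfgAddrFields f;

    f.bus = (arg >> 16) & 0xff;
    f.devfn = (arg >> 8) & 0xff;
    /* Extended config space: the top nibble supplies offset bits 8..11. */
    f.reg = ((arg >> 20) & 0xf00) | (arg & 0xff);
    return f;
}
```

With this layout, an argument of `0x10012804` would decode to bus `0x01`, devfn `0x28` and register offset `0x104`.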
|
|
|
|
|
2019-03-06 07:35:37 +03:00
|
|
|
static void finish_read_pci_config(SpaprMachineState *spapr, uint64_t buid,
|
2012-04-02 08:17:35 +04:00
|
|
|
uint32_t addr, uint32_t size,
|
|
|
|
target_ulong rets)
|
2012-01-18 19:42:09 +04:00
|
|
|
{
|
2012-04-02 08:17:35 +04:00
|
|
|
PCIDevice *pci_dev;
|
|
|
|
uint32_t val;
|
|
|
|
|
|
|
|
if ((size != 1) && (size != 2) && (size != 4)) {
|
|
|
|
/* access must be 1, 2 or 4 bytes */
|
2013-11-19 08:28:54 +04:00
|
|
|
rtas_st(rets, 0, RTAS_OUT_HW_ERROR);
|
2012-04-02 08:17:35 +04:00
|
|
|
return;
|
2012-01-18 19:42:09 +04:00
|
|
|
}
|
|
|
|
|
2015-05-07 08:33:34 +03:00
|
|
|
pci_dev = spapr_pci_find_dev(spapr, buid, addr);
|
2012-04-02 08:17:35 +04:00
|
|
|
addr = rtas_pci_cfgaddr(addr);
|
|
|
|
|
|
|
|
if (!pci_dev || (addr % size) || (addr >= pci_config_size(pci_dev))) {
|
|
|
|
/* Access must be to a valid device, within bounds and
|
|
|
|
* naturally aligned */
|
2013-11-19 08:28:54 +04:00
|
|
|
rtas_st(rets, 0, RTAS_OUT_HW_ERROR);
|
2012-04-02 08:17:35 +04:00
|
|
|
return;
|
2012-01-18 19:42:09 +04:00
|
|
|
}
|
2012-04-02 08:17:35 +04:00
|
|
|
|
|
|
|
val = pci_host_config_read_common(pci_dev, addr,
|
|
|
|
pci_config_size(pci_dev), size);
|
|
|
|
|
2013-11-19 08:28:54 +04:00
|
|
|
rtas_st(rets, 0, RTAS_OUT_SUCCESS);
|
2012-04-02 08:17:35 +04:00
|
|
|
rtas_st(rets, 1, val);
|
2012-01-18 19:42:09 +04:00
|
|
|
}
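
The read path above rejects an access unless it is 1, 2 or 4 bytes wide, naturally aligned, and starts inside the device's config space. A stand-alone sketch of those three checks follows, assuming `cfg_size` stands in for the value returned by `pci_config_size()`. `cfg_access_ok()` is a hypothetical name, not a QEMU function.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical predicate mirroring the checks in finish_read_pci_config().
 * The size test runs first, so the modulo below never divides by zero.
 */
bool cfg_access_ok(uint32_t offset, uint32_t size, uint32_t cfg_size)
{
    if (size != 1 && size != 2 && size != 4) {
        return false;           /* only 1-, 2- or 4-byte accesses */
    }
    if (offset % size) {
        return false;           /* access must be naturally aligned */
    }
    return offset < cfg_size;   /* access must start within config space */
}
```

Checking the size before the alignment matters: a guest-supplied size of 0 is filtered out before it can reach the `%` operator.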
|
|
|
|
|
2019-03-06 07:35:37 +03:00
|
|
|
static void rtas_ibm_read_pci_config(PowerPCCPU *cpu, SpaprMachineState *spapr,
|
2011-10-30 21:16:46 +04:00
|
|
|
uint32_t token, uint32_t nargs,
|
|
|
|
target_ulong args,
|
|
|
|
uint32_t nret, target_ulong rets)
|
|
|
|
{
|
2012-04-02 08:17:35 +04:00
|
|
|
uint64_t buid;
|
|
|
|
uint32_t size, addr;
|
2011-10-30 21:16:46 +04:00
|
|
|
|
2012-04-02 08:17:35 +04:00
|
|
|
if ((nargs != 4) || (nret != 2)) {
|
2013-11-19 08:28:54 +04:00
|
|
|
rtas_st(rets, 0, RTAS_OUT_HW_ERROR);
|
2011-10-30 21:16:46 +04:00
|
|
|
return;
|
|
|
|
}
|
2012-04-02 08:17:35 +04:00
|
|
|
|
2015-09-01 04:05:12 +03:00
|
|
|
buid = rtas_ldq(args, 1);
|
2011-10-30 21:16:46 +04:00
|
|
|
size = rtas_ld(args, 3);
|
2012-04-02 08:17:35 +04:00
|
|
|
addr = rtas_ld(args, 0);
|
|
|
|
|
|
|
|
finish_read_pci_config(spapr, buid, addr, size, rets);
|
2011-10-30 21:16:46 +04:00
|
|
|
}
|
|
|
|
|
2019-03-06 07:35:37 +03:00
|
|
|
static void rtas_read_pci_config(PowerPCCPU *cpu, SpaprMachineState *spapr,
|
2011-10-30 21:16:46 +04:00
|
|
|
uint32_t token, uint32_t nargs,
|
|
|
|
target_ulong args,
|
|
|
|
uint32_t nret, target_ulong rets)
|
|
|
|
{
|
2012-04-02 08:17:35 +04:00
|
|
|
uint32_t size, addr;
|
2011-10-30 21:16:46 +04:00
|
|
|
|
2012-04-02 08:17:35 +04:00
|
|
|
if ((nargs != 2) || (nret != 2)) {
|
2013-11-19 08:28:54 +04:00
|
|
|
rtas_st(rets, 0, RTAS_OUT_HW_ERROR);
|
2011-10-30 21:16:46 +04:00
|
|
|
return;
|
|
|
|
}
|
2012-04-02 08:17:35 +04:00
|
|
|
|
2011-10-30 21:16:46 +04:00
|
|
|
size = rtas_ld(args, 1);
|
2012-04-02 08:17:35 +04:00
|
|
|
addr = rtas_ld(args, 0);
|
|
|
|
|
|
|
|
finish_read_pci_config(spapr, 0, addr, size, rets);
|
|
|
|
}
|
|
|
|
|
2019-03-06 07:35:37 +03:00
|
|
|
static void finish_write_pci_config(SpaprMachineState *spapr, uint64_t buid,
|
2012-04-02 08:17:35 +04:00
|
|
|
uint32_t addr, uint32_t size,
|
|
|
|
uint32_t val, target_ulong rets)
|
|
|
|
{
|
|
|
|
PCIDevice *pci_dev;
|
|
|
|
|
|
|
|
if ((size != 1) && (size != 2) && (size != 4)) {
|
|
|
|
/* access must be 1, 2 or 4 bytes */
|
2013-11-19 08:28:54 +04:00
|
|
|
rtas_st(rets, 0, RTAS_OUT_HW_ERROR);
|
2012-04-02 08:17:35 +04:00
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
2015-05-07 08:33:34 +03:00
|
|
|
pci_dev = spapr_pci_find_dev(spapr, buid, addr);
|
2012-04-02 08:17:35 +04:00
|
|
|
addr = rtas_pci_cfgaddr(addr);
|
|
|
|
|
|
|
|
if (!pci_dev || (addr % size) || (addr >= pci_config_size(pci_dev))) {
|
|
|
|
/* Access must be to a valid device, within bounds and
|
|
|
|
* naturally aligned */
|
2013-11-19 08:28:54 +04:00
|
|
|
rtas_st(rets, 0, RTAS_OUT_HW_ERROR);
|
2012-04-02 08:17:35 +04:00
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
|
|
|
pci_host_config_write_common(pci_dev, addr, pci_config_size(pci_dev),
|
|
|
|
val, size);
|
|
|
|
|
2013-11-19 08:28:54 +04:00
|
|
|
rtas_st(rets, 0, RTAS_OUT_SUCCESS);
|
2011-10-30 21:16:46 +04:00
|
|
|
}
|
|
|
|
|
2019-03-06 07:35:37 +03:00
|
|
|
static void rtas_ibm_write_pci_config(PowerPCCPU *cpu, SpaprMachineState *spapr,
|
2011-10-30 21:16:46 +04:00
|
|
|
uint32_t token, uint32_t nargs,
|
|
|
|
target_ulong args,
|
|
|
|
uint32_t nret, target_ulong rets)
|
|
|
|
{
|
2012-04-02 08:17:35 +04:00
|
|
|
uint64_t buid;
|
2011-10-30 21:16:46 +04:00
|
|
|
uint32_t val, size, addr;
|
|
|
|
|
2012-04-02 08:17:35 +04:00
|
|
|
if ((nargs != 5) || (nret != 1)) {
|
2013-11-19 08:28:54 +04:00
|
|
|
rtas_st(rets, 0, RTAS_OUT_HW_ERROR);
|
2011-10-30 21:16:46 +04:00
|
|
|
return;
|
|
|
|
}
|
2012-04-02 08:17:35 +04:00
|
|
|
|
2015-09-01 04:05:12 +03:00
|
|
|
buid = rtas_ldq(args, 1);
|
2011-10-30 21:16:46 +04:00
|
|
|
val = rtas_ld(args, 4);
|
|
|
|
size = rtas_ld(args, 3);
|
2012-04-02 08:17:35 +04:00
|
|
|
addr = rtas_ld(args, 0);
|
|
|
|
|
|
|
|
finish_write_pci_config(spapr, buid, addr, size, val, rets);
|
2011-10-30 21:16:46 +04:00
|
|
|
}
|
|
|
|
|
2019-03-06 07:35:37 +03:00
|
|
|
static void rtas_write_pci_config(PowerPCCPU *cpu, SpaprMachineState *spapr,
|
2011-10-30 21:16:46 +04:00
|
|
|
uint32_t token, uint32_t nargs,
|
|
|
|
target_ulong args,
|
|
|
|
uint32_t nret, target_ulong rets)
|
|
|
|
{
|
|
|
|
uint32_t val, size, addr;
|
|
|
|
|
2012-04-02 08:17:35 +04:00
|
|
|
if ((nargs != 3) || (nret != 1)) {
|
2013-11-19 08:28:54 +04:00
|
|
|
rtas_st(rets, 0, RTAS_OUT_HW_ERROR);
|
2011-10-30 21:16:46 +04:00
|
|
|
return;
|
|
|
|
}
|
2012-04-02 08:17:35 +04:00
|
|
|
|
|
|
|
|
2011-10-30 21:16:46 +04:00
|
|
|
val = rtas_ld(args, 2);
|
|
|
|
size = rtas_ld(args, 1);
|
2012-04-02 08:17:35 +04:00
|
|
|
addr = rtas_ld(args, 0);
|
|
|
|
|
|
|
|
finish_write_pci_config(spapr, 0, addr, size, val, rets);
|
2011-10-30 21:16:46 +04:00
|
|
|
}
|
|
|
|
|
2012-08-07 20:10:37 +04:00
|
|
|
/*
|
|
|
|
* Set MSI/MSIX message data.
|
|
|
|
* This is required for msi_notify()/msix_notify() which
|
|
|
|
* will write at the addresses via spapr_msi_write().
|
2014-05-30 13:34:20 +04:00
|
|
|
*
|
|
|
|
* If addr == 0, all entries will have .data == first_irq, i.e. the
|
|
|
|
* table will be reset.
|
2012-08-07 20:10:37 +04:00
|
|
|
*/
|
2013-07-12 11:38:24 +04:00
|
|
|
static void spapr_msi_setmsg(PCIDevice *pdev, hwaddr addr, bool msix,
|
|
|
|
unsigned first_irq, unsigned req_num)
|
2012-08-07 20:10:37 +04:00
|
|
|
{
|
|
|
|
unsigned i;
|
2013-07-12 11:38:24 +04:00
|
|
|
MSIMessage msg = { .address = addr, .data = first_irq };
|
2012-08-07 20:10:37 +04:00
|
|
|
|
|
|
|
if (!msix) {
|
|
|
|
msi_set_message(pdev, msg);
|
|
|
|
trace_spapr_pci_msi_setup(pdev->name, 0, msg.address);
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
2014-05-30 13:34:20 +04:00
|
|
|
for (i = 0; i < req_num; ++i) {
|
2012-08-07 20:10:37 +04:00
|
|
|
msix_set_message(pdev, i, msg);
|
|
|
|
trace_spapr_pci_msi_setup(pdev->name, i, msg.address);
|
2014-05-30 13:34:20 +04:00
|
|
|
if (addr) {
|
|
|
|
++msg.data;
|
|
|
|
}
|
2012-08-07 20:10:37 +04:00
|
|
|
}
|
|
|
|
}
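
For MSI-X, the loop above gives each vector its own message data: consecutive IRQ numbers starting at `first_irq` when an address is set, and `first_irq` for every vector when the address is 0. The sketch below models only that numbering, on a plain array rather than a PCI device. `fill_msix_data()` is a hypothetical name, and the real code writes through `msix_set_message()` instead.

```c
#include <stdint.h>

/*
 * Hypothetical model of the MSI-X branch of spapr_msi_setmsg():
 * data[i] receives the value that msg.data holds for vector i.
 */
void fill_msix_data(uint64_t addr, unsigned first_irq,
                    unsigned req_num, uint32_t *data)
{
    uint32_t next = first_irq;
    unsigned i;

    for (i = 0; i < req_num; ++i) {
        data[i] = next;
        if (addr) {
            ++next;     /* consecutive IRQ numbers when an address is set */
        }
    }
}
```

With a non-zero address, `first_irq` 20 and three vectors, the array would hold 20, 21, 22; with address 0 all three entries would be 20.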
|
|
|
|
|
2019-03-06 07:35:37 +03:00
|
|
|
static void rtas_ibm_change_msi(PowerPCCPU *cpu, SpaprMachineState *spapr,
|
2012-08-07 20:10:37 +04:00
|
|
|
uint32_t token, uint32_t nargs,
|
|
|
|
target_ulong args, uint32_t nret,
|
|
|
|
target_ulong rets)
|
|
|
|
{
|
2019-03-06 07:35:37 +03:00
|
|
|
SpaprMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
|
2012-08-07 20:10:37 +04:00
|
|
|
uint32_t config_addr = rtas_ld(args, 0);
|
2015-09-01 04:05:12 +03:00
|
|
|
uint64_t buid = rtas_ldq(args, 1);
|
2012-08-07 20:10:37 +04:00
|
|
|
unsigned int func = rtas_ld(args, 3);
|
|
|
|
unsigned int req_num = rtas_ld(args, 4); /* 0 == remove all */
|
|
|
|
unsigned int seq_num = rtas_ld(args, 5);
|
|
|
|
unsigned int ret_intr_type;
|
2016-02-25 21:02:12 +03:00
|
|
|
unsigned int irq, max_irqs = 0;
|
2019-03-06 07:35:37 +03:00
|
|
|
SpaprPhbState *phb = NULL;
|
2012-08-07 20:10:37 +04:00
|
|
|
PCIDevice *pdev = NULL;
|
2019-08-28 21:20:44 +03:00
|
|
|
SpaprPciMsi *msi;
|
2014-05-30 13:34:20 +04:00
|
|
|
int *config_addr_key;
|
2016-02-26 12:44:07 +03:00
|
|
|
Error *err = NULL;
|
2018-06-18 20:34:00 +03:00
|
|
|
int i;
|
2012-08-07 20:10:37 +04:00
|
|
|
|
2019-03-06 07:35:37 +03:00
|
|
|
/* Find SpaprPhbState */
|
spapr_pci: fix MSI/MSIX selection
In various places we don't correctly check if the device supports MSI or
MSI-X. This can cause devices to be advertised with MSI support, even
if they only support MSI-X (like virtio-pci-* devices for example):
    ethernet@0 {
        ibm,req#msi = <0x1>; <--- wrong!
        .
        ibm,loc-code = "qemu_virtio-net-pci:0000:00:00.0";
        .
        ibm,req#msi-x = <0x3>;
    };
Worse, this can also cause the "ibm,change-msi" RTAS call to corrupt the
PCI status and cause migration to fail:
qemu-system-ppc64: get_pci_config_device: Bad config data: i=0x6
read: 0 device: 10 cmask: 10 wmask: 0 w1cmask:0
^^
PCI_STATUS_CAP_LIST bit which is assumed to be constant
This patch changes spapr_populate_pci_child_dt() to properly check for
MSI support using msi_present(): this ensures that PCIDevice::msi_cap
was set by msi_init() and that msi_nr_vectors_allocated() will look at
the right place in the config space.
Checking PCIDevice::msix_entries_nr is enough for MSI-X but let's add
a call to msix_present() there as well for consistency.
It also changes rtas_ibm_change_msi() to select the appropriate MSI
type in Function 1 instead of always selecting plain MSI. This new
behaviour is compliant with LoPAPR 1.1, as described in "Table 71.
ibm,change-msi Argument Call Buffer":
Function 1: If Number Outputs is equal to 3, request to set to a new
number of MSIs (including set to 0).
If the “ibm,change-msix-capable” property exists and Number
Outputs is equal to 4, request is to set to a new number of
MSI or MSI-X (platform choice) interrupts (including set to
0).
Since MSI is the platform default (LoPAPR 6.2.3 MSI Option), let's
check for MSI support first.
And finally, it checks the input parameters are valid, as described in
LoPAPR 1.1 "R1–7.3.10.5.1–3":
For the MSI option: The platform must return a Status of -3 (Parameter
error) from ibm,change-msi, with no change in interrupt assignments if
the PCI configuration address does not support MSI and Function 3 was
requested (that is, the “ibm,req#msi” property must exist for the PCI
configuration address in order to use Function 3), or does not support
MSI-X and Function 4 is requested (that is, the “ibm,req#msi-x” property
must exist for the PCI configuration address in order to use Function 4),
or if neither MSIs nor MSI-Xs are supported and Function 1 is requested.
This ensures that the ret_intr_type variable contains a valid MSI type
for this device, and that spapr_msi_setmsg() won't corrupt the PCI status.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-01-27 01:25:24 +03:00
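The selection rule described in the commit message above can be sketched in isolation as follows. This is a minimal illustration, not the QEMU code: the `sketch_*` names and the numeric values of the type constants are hypothetical, and only the function numbers 1, 3 and 4 are taken from the LoPAPR text quoted above. The two booleans stand in for the results of `msi_present()` and `msix_present()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; only the numbers 1, 3 and 4 come from the text. */
enum sketch_func {
    SKETCH_FN_CHANGE = 1,       /* platform chooses MSI or MSI-X */
    SKETCH_FN_CHANGE_MSI = 3,   /* caller insists on MSI */
    SKETCH_FN_CHANGE_MSIX = 4,  /* caller insists on MSI-X */
};

/* Illustrative values only. */
enum sketch_type {
    SKETCH_TYPE_MSI = 1,
    SKETCH_TYPE_MSIX = 2,
};

/*
 * Returns true and stores the selected interrupt type in *out, or returns
 * false when the request should be answered with a parameter error.
 */
static bool sketch_select_intr_type(int func, bool has_msi, bool has_msix,
                                    enum sketch_type *out)
{
    assert(out != NULL);

    switch (func) {
    case SKETCH_FN_CHANGE:
        /* MSI is the platform default, so it is checked first. */
        if (has_msi) {
            *out = SKETCH_TYPE_MSI;
            return true;
        }
        if (has_msix) {
            *out = SKETCH_TYPE_MSIX;
            return true;
        }
        return false;
    case SKETCH_FN_CHANGE_MSI:
        if (has_msi) {
            *out = SKETCH_TYPE_MSI;
            return true;
        }
        return false;
    case SKETCH_FN_CHANGE_MSIX:
        if (has_msix) {
            *out = SKETCH_TYPE_MSIX;
            return true;
        }
        return false;
    default:
        /* Unknown function number. */
        return false;
    }
}
```

Returning a boolean and writing the type through a pointer keeps the "no valid type" case separate from the type values, which is the property the commit relies on so that later code never acts on an unset interrupt type.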
|
|
|
phb = spapr_pci_find_phb(spapr, buid);
|
|
|
|
if (phb) {
|
|
|
|
pdev = spapr_pci_find_dev(spapr, buid, config_addr);
|
|
|
|
}
|
|
|
|
if (!phb || !pdev) {
|
|
|
|
rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
2012-08-07 20:10:37 +04:00
|
|
|
switch (func) {
|
|
|
|
case RTAS_CHANGE_FN:
|
2018-01-27 01:25:24 +03:00
|
|
|
if (msi_present(pdev)) {
|
|
|
|
ret_intr_type = RTAS_TYPE_MSI;
|
|
|
|
} else if (msix_present(pdev)) {
|
|
|
|
ret_intr_type = RTAS_TYPE_MSIX;
|
|
|
|
} else {
|
|
|
|
rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
break;
|
|
|
|
case RTAS_CHANGE_MSI_FN:
|
|
|
|
if (msi_present(pdev)) {
|
|
|
|
ret_intr_type = RTAS_TYPE_MSI;
|
|
|
|
} else {
|
|
|
|
rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
|
|
|
|
return;
|
|
|
|
}
|
2012-08-07 20:10:37 +04:00
|
|
|
break;
|
|
|
|
case RTAS_CHANGE_MSIX_FN:
|
2018-01-27 01:25:24 +03:00
|
|
|
if (msix_present(pdev)) {
|
|
|
|
ret_intr_type = RTAS_TYPE_MSIX;
|
|
|
|
} else {
|
|
|
|
rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
|
|
|
|
return;
|
|
|
|
}
|
2012-08-07 20:10:37 +04:00
|
|
|
break;
|
|
|
|
default:
|
2013-11-21 08:08:58 +04:00
|
|
|
error_report("rtas_ibm_change_msi(%u) is not implemented", func);
|
2013-11-19 08:28:54 +04:00
|
|
|
rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
|
2012-08-07 20:10:37 +04:00
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
2019-08-28 21:20:44 +03:00
|
|
|
msi = (SpaprPciMsi *) g_hash_table_lookup(phb->msi, &config_addr);
|
2016-02-25 21:02:18 +03:00
|
|
|
|
2012-08-07 20:10:37 +04:00
|
|
|
/* Releasing MSIs */
|
|
|
|
if (!req_num) {
|
2014-05-30 13:34:20 +04:00
|
|
|
if (!msi) {
|
|
|
|
trace_spapr_pci_msi("Releasing wrong config", config_addr);
|
2013-11-19 08:28:54 +04:00
|
|
|
rtas_st(rets, 0, RTAS_OUT_HW_ERROR);
|
2012-08-07 20:10:37 +04:00
|
|
|
return;
|
|
|
|
}
|
2014-05-30 13:34:20 +04:00
|
|
|
|
2014-08-13 11:20:53 +04:00
|
|
|
if (msi_present(pdev)) {
|
2016-02-25 21:02:12 +03:00
|
|
|
spapr_msi_setmsg(pdev, 0, false, 0, 0);
|
2014-08-13 11:20:53 +04:00
|
|
|
}
|
|
|
|
if (msix_present(pdev)) {
|
2016-02-25 21:02:12 +03:00
|
|
|
spapr_msi_setmsg(pdev, 0, true, 0, 0);
|
2014-08-13 11:20:53 +04:00
|
|
|
}
|
2014-05-30 13:34:20 +04:00
|
|
|
g_hash_table_remove(phb->msi, &config_addr);
|
|
|
|
|
|
|
|
trace_spapr_pci_msi("Released MSIs", config_addr);
|
2013-11-19 08:28:54 +04:00
|
|
|
rtas_st(rets, 0, RTAS_OUT_SUCCESS);
|
2012-08-07 20:10:37 +04:00
|
|
|
rtas_st(rets, 1, 0);
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* Enabling MSI */
|
|
|
|
|
2014-05-04 18:09:48 +04:00
|
|
|
/* Check if the device supports as many IRQs as requested */
|
|
|
|
if (ret_intr_type == RTAS_TYPE_MSI) {
|
|
|
|
max_irqs = msi_nr_vectors_allocated(pdev);
|
|
|
|
} else if (ret_intr_type == RTAS_TYPE_MSIX) {
|
|
|
|
max_irqs = pdev->msix_entries_nr;
|
|
|
|
}
|
|
|
|
if (!max_irqs) {
|
2014-05-30 13:34:20 +04:00
|
|
|
error_report("Requested interrupt type %d is not enabled for device %x",
|
|
|
|
ret_intr_type, config_addr);
|
2014-05-04 18:09:48 +04:00
|
|
|
rtas_st(rets, 0, RTAS_OUT_HW_ERROR);
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
/* Correct the number if the guest asked for too many */
|
|
|
|
if (req_num > max_irqs) {
|
2014-05-30 13:34:20 +04:00
|
|
|
trace_spapr_pci_msi_retry(config_addr, req_num, max_irqs);
|
2014-05-04 18:09:48 +04:00
|
|
|
req_num = max_irqs;
|
2014-05-30 13:34:20 +04:00
|
|
|
irq = 0; /* to avoid misleading trace */
|
|
|
|
goto out;
|
2014-05-04 18:09:48 +04:00
|
|
|
}
|
|
|
|
|
2014-05-30 13:34:20 +04:00
|
|
|
/* Allocate MSIs */
|
2018-08-10 11:00:26 +03:00
|
|
|
if (smc->legacy_irq_allocation) {
|
2018-07-30 17:11:32 +03:00
|
|
|
irq = spapr_irq_find(spapr, req_num, ret_intr_type == RTAS_TYPE_MSI,
|
|
|
|
&err);
|
|
|
|
} else {
|
|
|
|
irq = spapr_irq_msi_alloc(spapr, req_num,
|
|
|
|
ret_intr_type == RTAS_TYPE_MSI, &err);
|
|
|
|
}
|
2016-02-26 12:44:07 +03:00
|
|
|
if (err) {
|
|
|
|
error_reportf_err(err, "Can't allocate MSIs for device %x: ",
|
|
|
|
config_addr);
|
2013-11-19 08:28:54 +04:00
|
|
|
rtas_st(rets, 0, RTAS_OUT_HW_ERROR);
|
2012-08-07 20:10:37 +04:00
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
2018-06-18 20:34:00 +03:00
|
|
|
for (i = 0; i < req_num; i++) {
|
|
|
|
spapr_irq_claim(spapr, irq + i, false, &err);
|
|
|
|
if (err) {
|
2019-02-07 20:28:37 +03:00
|
|
|
if (i) {
|
|
|
|
spapr_irq_free(spapr, irq, i);
|
|
|
|
}
|
|
|
|
if (!smc->legacy_irq_allocation) {
|
|
|
|
spapr_irq_msi_free(spapr, irq, req_num);
|
|
|
|
}
|
2018-06-18 20:34:00 +03:00
|
|
|
error_reportf_err(err, "Can't allocate MSIs for device %x: ",
|
|
|
|
config_addr);
|
|
|
|
rtas_st(rets, 0, RTAS_OUT_HW_ERROR);
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2016-02-25 21:02:18 +03:00
|
|
|
/* Release previous MSIs */
|
|
|
|
if (msi) {
|
|
|
|
g_hash_table_remove(phb->msi, &config_addr);
|
|
|
|
}
|
|
|
|
|
2012-08-07 20:10:37 +04:00
|
|
|
/* Setup MSI/MSIX vectors in the device (via cfgspace or MSIX BAR) */
|
2014-08-27 20:17:12 +04:00
|
|
|
spapr_msi_setmsg(pdev, SPAPR_PCI_MSI_WINDOW, ret_intr_type == RTAS_TYPE_MSIX,
|
2014-05-30 13:34:20 +04:00
|
|
|
irq, req_num);
|
2012-08-07 20:10:37 +04:00
|
|
|
|
2014-05-30 13:34:20 +04:00
|
|
|
/* Add MSI device to cache */
|
2019-08-28 21:20:44 +03:00
|
|
|
msi = g_new(SpaprPciMsi, 1);
|
2014-05-30 13:34:20 +04:00
|
|
|
msi->first_irq = irq;
|
|
|
|
msi->num = req_num;
|
|
|
|
config_addr_key = g_new(int, 1);
|
|
|
|
*config_addr_key = config_addr;
|
|
|
|
g_hash_table_insert(phb->msi, config_addr_key, msi);
|
|
|
|
|
|
|
|
out:
|
2013-11-19 08:28:54 +04:00
|
|
|
rtas_st(rets, 0, RTAS_OUT_SUCCESS);
|
2012-08-07 20:10:37 +04:00
|
|
|
rtas_st(rets, 1, req_num);
|
|
|
|
rtas_st(rets, 2, ++seq_num);
|
2015-09-01 04:23:47 +03:00
|
|
|
if (nret > 3) {
|
|
|
|
rtas_st(rets, 3, ret_intr_type);
|
|
|
|
}
|
2012-08-07 20:10:37 +04:00
|
|
|
|
2014-05-30 13:34:20 +04:00
|
|
|
trace_spapr_pci_rtas_ibm_change_msi(config_addr, func, req_num, irq);
|
2012-08-07 20:10:37 +04:00
|
|
|
}
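The vector-count check in rtas_ibm_change_msi() above can be sketched on its own as follows. This is a simplified illustration under stated assumptions: `struct sketch_grant` and `sketch_check_req_num()` are hypothetical names, and the sketch assumes `req_num` is non-zero because the "release all" case is handled earlier in the function. When the guest asks for more vectors than the device supports, the function above reports the smaller count with a success status and skips allocation; judging by the trace point name, the guest is then expected to retry with that count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result of the check; not a QEMU type. */
struct sketch_grant {
    unsigned int granted;   /* vector count reported back to the guest */
    bool allocate;          /* true if IRQs should be allocated now */
};

/*
 * Returns false when the selected interrupt type has no vectors available
 * (max_irqs == 0), which the caller reports as a hardware error.
 */
static bool sketch_check_req_num(unsigned int req_num, unsigned int max_irqs,
                                 struct sketch_grant *g)
{
    assert(g != NULL);

    if (!max_irqs) {
        return false;
    }
    if (req_num > max_irqs) {
        /* Too many requested: report the limit and allocate nothing. */
        g->granted = max_irqs;
        g->allocate = false;
        return true;
    }
    g->granted = req_num;
    g->allocate = true;
    return true;
}
```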
|
|
|
|
|
2013-06-20 00:40:30 +04:00
|
|
|
static void rtas_ibm_query_interrupt_source_number(PowerPCCPU *cpu,
|
2019-03-06 07:35:37 +03:00
|
|
|
SpaprMachineState *spapr,
|
2012-08-07 20:10:37 +04:00
|
|
|
uint32_t token,
|
|
|
|
uint32_t nargs,
|
|
|
|
target_ulong args,
|
|
|
|
uint32_t nret,
|
|
|
|
target_ulong rets)
|
|
|
|
{
|
|
|
|
uint32_t config_addr = rtas_ld(args, 0);
|
2015-09-01 04:05:12 +03:00
|
|
|
uint64_t buid = rtas_ldq(args, 1);
|
2012-08-07 20:10:37 +04:00
|
|
|
unsigned int intr_src_num = -1, ioa_intr_num = rtas_ld(args, 3);
|
2019-03-06 07:35:37 +03:00
|
|
|
SpaprPhbState *phb = NULL;
|
2014-05-30 13:34:20 +04:00
|
|
|
PCIDevice *pdev = NULL;
|
2019-08-28 21:20:44 +03:00
|
|
|
SpaprPciMsi *msi;
|
2012-08-07 20:10:37 +04:00
|
|
|
|
2019-03-06 07:35:37 +03:00
|
|
|
/* Find SpaprPhbState */
|
2015-05-07 08:33:34 +03:00
|
|
|
phb = spapr_pci_find_phb(spapr, buid);
|
2014-05-30 13:34:20 +04:00
|
|
|
if (phb) {
|
2015-05-07 08:33:34 +03:00
|
|
|
pdev = spapr_pci_find_dev(spapr, buid, config_addr);
|
2014-05-30 13:34:20 +04:00
|
|
|
}
|
|
|
|
if (!phb || !pdev) {
|
2013-11-19 08:28:54 +04:00
|
|
|
rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
|
2012-08-07 20:10:37 +04:00
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* Find device descriptor and start IRQ */
|
2019-08-28 21:20:44 +03:00
|
|
|
msi = (SpaprPciMsi *) g_hash_table_lookup(phb->msi, &config_addr);
|
2014-05-30 13:34:20 +04:00
|
|
|
if (!msi || !msi->first_irq || !msi->num || (ioa_intr_num >= msi->num)) {
|
|
|
|
trace_spapr_pci_msi("Failed to return vector", config_addr);
|
2013-11-19 08:28:54 +04:00
|
|
|
rtas_st(rets, 0, RTAS_OUT_HW_ERROR);
|
2012-08-07 20:10:37 +04:00
|
|
|
return;
|
|
|
|
}
|
2014-05-30 13:34:20 +04:00
|
|
|
intr_src_num = msi->first_irq + ioa_intr_num;
|
2012-08-07 20:10:37 +04:00
|
|
|
trace_spapr_pci_rtas_ibm_query_interrupt_source_number(ioa_intr_num,
|
|
|
|
intr_src_num);
|
|
|
|
|
2013-11-19 08:28:54 +04:00
|
|
|
rtas_st(rets, 0, RTAS_OUT_SUCCESS);
|
2012-08-07 20:10:37 +04:00
|
|
|
rtas_st(rets, 1, intr_src_num);
|
|
|
|
rtas_st(rets, 2, 1); /* 0 == level; 1 == edge */
|
|
|
|
}
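The lookup performed by rtas_ibm_query_interrupt_source_number() above reduces to a bounds check followed by an offset into the device's IRQ range. A minimal sketch, assuming a hypothetical `struct sketch_msi` that mirrors only the `first_irq` and `num` fields used above; the helper name is also an assumption:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of the two cached fields used by the lookup. */
struct sketch_msi {
    unsigned int first_irq;  /* first interrupt source number of the range */
    unsigned int num;        /* number of vectors in the range */
};

/*
 * Maps a per-device vector index to a global interrupt source number.
 * Returns false if there is no usable cache entry or the index is out of
 * range, which the caller reports as a hardware error.
 */
static bool sketch_query_source(const struct sketch_msi *msi,
                                unsigned int ioa_intr_num,
                                unsigned int *intr_src_num)
{
    assert(intr_src_num != NULL);

    if (!msi || !msi->first_irq || !msi->num || ioa_intr_num >= msi->num) {
        return false;
    }
    *intr_src_num = msi->first_irq + ioa_intr_num;
    return true;
}
```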
|
|
|
|
|
2015-02-20 07:58:52 +03:00
|
|
|
static void rtas_ibm_set_eeh_option(PowerPCCPU *cpu,
|
2019-03-06 07:35:37 +03:00
|
|
|
SpaprMachineState *spapr,
|
2015-02-20 07:58:52 +03:00
|
|
|
uint32_t token, uint32_t nargs,
|
|
|
|
target_ulong args, uint32_t nret,
|
|
|
|
target_ulong rets)
|
|
|
|
{
|
2019-03-06 07:35:37 +03:00
|
|
|
SpaprPhbState *sphb;
|
2015-02-20 07:58:52 +03:00
|
|
|
uint32_t addr, option;
|
|
|
|
uint64_t buid;
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
if ((nargs != 4) || (nret != 1)) {
|
|
|
|
goto param_error_exit;
|
|
|
|
}
|
|
|
|
|
2015-09-01 04:05:12 +03:00
|
|
|
buid = rtas_ldq(args, 1);
|
2015-02-20 07:58:52 +03:00
|
|
|
addr = rtas_ld(args, 0);
|
|
|
|
option = rtas_ld(args, 3);
|
|
|
|
|
2015-05-07 08:33:34 +03:00
|
|
|
sphb = spapr_pci_find_phb(spapr, buid);
|
2015-02-20 07:58:52 +03:00
|
|
|
if (!sphb) {
|
|
|
|
goto param_error_exit;
|
|
|
|
}
|
|
|
|
|
2016-02-29 09:45:05 +03:00
|
|
|
if (!spapr_phb_eeh_available(sphb)) {
|
2015-02-20 07:58:52 +03:00
|
|
|
goto param_error_exit;
|
|
|
|
}
|
|
|
|
|
2016-02-29 09:45:05 +03:00
|
|
|
ret = spapr_phb_vfio_eeh_set_option(sphb, addr, option);
|
2015-02-20 07:58:52 +03:00
|
|
|
rtas_st(rets, 0, ret);
|
|
|
|
return;
|
|
|
|
|
|
|
|
param_error_exit:
|
|
|
|
rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
|
|
|
|
}
|
|
|
|
|
|
|
|
static void rtas_ibm_get_config_addr_info2(PowerPCCPU *cpu,
|
2019-03-06 07:35:37 +03:00
|
|
|
SpaprMachineState *spapr,
|
2015-02-20 07:58:52 +03:00
|
|
|
uint32_t token, uint32_t nargs,
|
|
|
|
target_ulong args, uint32_t nret,
|
|
|
|
target_ulong rets)
|
|
|
|
{
|
2019-03-06 07:35:37 +03:00
|
|
|
SpaprPhbState *sphb;
|
2015-02-20 07:58:52 +03:00
|
|
|
PCIDevice *pdev;
|
|
|
|
uint32_t addr, option;
|
|
|
|
uint64_t buid;
|
|
|
|
|
|
|
|
if ((nargs != 4) || (nret != 2)) {
|
|
|
|
goto param_error_exit;
|
|
|
|
}
|
|
|
|
|
2015-09-01 04:05:12 +03:00
|
|
|
buid = rtas_ldq(args, 1);
|
2015-05-07 08:33:34 +03:00
|
|
|
sphb = spapr_pci_find_phb(spapr, buid);
|
2015-02-20 07:58:52 +03:00
|
|
|
if (!sphb) {
|
|
|
|
goto param_error_exit;
|
|
|
|
}
|
|
|
|
|
2016-02-29 09:45:05 +03:00
|
|
|
if (!spapr_phb_eeh_available(sphb)) {
|
2015-02-20 07:58:52 +03:00
|
|
|
goto param_error_exit;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* We always have PE address of form "00BB0001". "BB"
|
|
|
|
* represents the bus number of PE's primary bus.
|
|
|
|
*/
|
|
|
|
option = rtas_ld(args, 3);
|
|
|
|
switch (option) {
|
|
|
|
case RTAS_GET_PE_ADDR:
|
|
|
|
addr = rtas_ld(args, 0);
|
2015-05-07 08:33:34 +03:00
|
|
|
pdev = spapr_pci_find_dev(spapr, buid, addr);
|
2015-02-20 07:58:52 +03:00
|
|
|
if (!pdev) {
|
|
|
|
goto param_error_exit;
|
|
|
|
}
|
|
|
|
|
2017-11-29 11:46:27 +03:00
|
|
|
rtas_st(rets, 1, (pci_bus_num(pci_get_bus(pdev)) << 16) + 1);
|
2015-02-20 07:58:52 +03:00
|
|
|
break;
|
|
|
|
case RTAS_GET_PE_MODE:
|
|
|
|
rtas_st(rets, 1, RTAS_PE_MODE_SHARED);
|
|
|
|
break;
|
|
|
|
default:
|
|
|
|
goto param_error_exit;
|
|
|
|
}
|
|
|
|
|
|
|
|
rtas_st(rets, 0, RTAS_OUT_SUCCESS);
|
|
|
|
return;
|
|
|
|
|
|
|
|
param_error_exit:
|
|
|
|
rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
|
|
|
|
}
|
|
|
|
|
|
|
|
static void rtas_ibm_read_slot_reset_state2(PowerPCCPU *cpu,
                                            SpaprMachineState *spapr,
                                            uint32_t token, uint32_t nargs,
                                            target_ulong args, uint32_t nret,
                                            target_ulong rets)
{
    SpaprPhbState *sphb;
    uint64_t buid;
    int state, ret;

    if ((nargs != 3) || (nret != 4 && nret != 5)) {
        goto param_error_exit;
    }

    buid = rtas_ldq(args, 1);
    sphb = spapr_pci_find_phb(spapr, buid);
    if (!sphb) {
        goto param_error_exit;
    }

    if (!spapr_phb_eeh_available(sphb)) {
        goto param_error_exit;
    }

    ret = spapr_phb_vfio_eeh_get_state(sphb, &state);
    rtas_st(rets, 0, ret);
    if (ret != RTAS_OUT_SUCCESS) {
        return;
    }

    rtas_st(rets, 1, state);
    rtas_st(rets, 2, RTAS_EEH_SUPPORT);
    rtas_st(rets, 3, RTAS_EEH_PE_UNAVAIL_INFO);
    if (nret >= 5) {
        rtas_st(rets, 4, RTAS_EEH_PE_RECOVER_INFO);
    }
    return;

param_error_exit:
    rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
}

static void rtas_ibm_set_slot_reset(PowerPCCPU *cpu,
                                    SpaprMachineState *spapr,
                                    uint32_t token, uint32_t nargs,
                                    target_ulong args, uint32_t nret,
                                    target_ulong rets)
{
    SpaprPhbState *sphb;
    uint32_t option;
    uint64_t buid;
    int ret;

    if ((nargs != 4) || (nret != 1)) {
        goto param_error_exit;
    }

    buid = rtas_ldq(args, 1);
    option = rtas_ld(args, 3);
    sphb = spapr_pci_find_phb(spapr, buid);
    if (!sphb) {
        goto param_error_exit;
    }

    if (!spapr_phb_eeh_available(sphb)) {
        goto param_error_exit;
    }

    ret = spapr_phb_vfio_eeh_reset(sphb, option);
    rtas_st(rets, 0, ret);
    return;

param_error_exit:
    rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
}

static void rtas_ibm_configure_pe(PowerPCCPU *cpu,
                                  SpaprMachineState *spapr,
                                  uint32_t token, uint32_t nargs,
                                  target_ulong args, uint32_t nret,
                                  target_ulong rets)
{
    SpaprPhbState *sphb;
    uint64_t buid;
    int ret;

    if ((nargs != 3) || (nret != 1)) {
        goto param_error_exit;
    }

    buid = rtas_ldq(args, 1);
    sphb = spapr_pci_find_phb(spapr, buid);
    if (!sphb) {
        goto param_error_exit;
    }

    if (!spapr_phb_eeh_available(sphb)) {
        goto param_error_exit;
    }

    ret = spapr_phb_vfio_eeh_configure(sphb);
    rtas_st(rets, 0, ret);
    return;

param_error_exit:
    rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
}

/* To support it later */
static void rtas_ibm_slot_error_detail(PowerPCCPU *cpu,
                                       SpaprMachineState *spapr,
                                       uint32_t token, uint32_t nargs,
                                       target_ulong args, uint32_t nret,
                                       target_ulong rets)
{
    SpaprPhbState *sphb;
    int option;
    uint64_t buid;

    if ((nargs != 8) || (nret != 1)) {
        goto param_error_exit;
    }

    buid = rtas_ldq(args, 1);
    sphb = spapr_pci_find_phb(spapr, buid);
    if (!sphb) {
        goto param_error_exit;
    }

    if (!spapr_phb_eeh_available(sphb)) {
        goto param_error_exit;
    }

    option = rtas_ld(args, 7);
    switch (option) {
    case RTAS_SLOT_TEMP_ERR_LOG:
    case RTAS_SLOT_PERM_ERR_LOG:
        break;
    default:
        goto param_error_exit;
    }

    /* We don't have error log yet */
    rtas_st(rets, 0, RTAS_OUT_NO_ERRORS_FOUND);
    return;

param_error_exit:
    rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
}

static void pci_spapr_set_irq(void *opaque, int irq_num, int level)
{
    /*
     * Here we use the number returned by pci_swizzle_map_irq_fn to find a
     * corresponding qemu_irq.
     */
    SpaprPhbState *phb = opaque;
    SpaprMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());

    trace_spapr_pci_lsi_set(phb->dtbusname, irq_num, phb->lsi_table[irq_num].irq);
    qemu_set_irq(spapr_qirq(spapr, phb->lsi_table[irq_num].irq), level);
}

static PCIINTxRoute spapr_route_intx_pin_to_irq(void *opaque, int pin)
{
    SpaprPhbState *sphb = SPAPR_PCI_HOST_BRIDGE(opaque);
    PCIINTxRoute route;

    route.mode = PCI_INTX_ENABLED;
    route.irq = sphb->lsi_table[pin].irq;

    return route;
}

static uint64_t spapr_msi_read(void *opaque, hwaddr addr, unsigned size)
{
    qemu_log_mask(LOG_GUEST_ERROR, "%s: invalid access\n", __func__);
    return 0;
}

/*
 * MSI/MSIX memory region implementation.
 * The handler handles both MSI and MSIX.
 * The vector number is encoded in least bits in data.
 */
static void spapr_msi_write(void *opaque, hwaddr addr,
                            uint64_t data, unsigned size)
{
    SpaprMachineState *spapr = opaque;
    uint32_t irq = data;

    trace_spapr_pci_msi_write(addr, data, irq);

    qemu_irq_pulse(spapr_qirq(spapr, irq));
}

static const MemoryRegionOps spapr_msi_ops = {
    /*
     * .read result is undefined by PCI spec.
     * define .read method to avoid assert failure in memory_region_init_io
     */
    .read = spapr_msi_read,
    .write = spapr_msi_write,
    .endianness = DEVICE_LITTLE_ENDIAN
};

/*
 * PHB PCI device
 */
static AddressSpace *spapr_pci_dma_iommu(PCIBus *bus, void *opaque, int devfn)
{
    SpaprPhbState *phb = opaque;

    return &phb->iommu_as;
}

static char *spapr_phb_vfio_get_loc_code(SpaprPhbState *sphb, PCIDevice *pdev)
{
    g_autofree char *path = NULL;
    g_autofree char *host = NULL;
    g_autofree char *devspec = NULL;
    char *buf = NULL;

    /* Get the PCI VFIO host id */
    host = object_property_get_str(OBJECT(pdev), "host", NULL);
    if (!host) {
        return NULL;
    }

    /* Construct the path of the file that will give us the DT location */
    path = g_strdup_printf("/sys/bus/pci/devices/%s/devspec", host);
    if (!g_file_get_contents(path, &devspec, NULL, NULL)) {
        return NULL;
    }

    /* Construct and read from host device tree the loc-code */
    /* Free the devspec path first: g_autofree only releases the final value */
    g_free(path);
    path = g_strdup_printf("/proc/device-tree%s/ibm,loc-code", devspec);
    if (!g_file_get_contents(path, &buf, NULL, NULL)) {
        return NULL;
    }
    return buf;
}

static char *spapr_phb_get_loc_code(SpaprPhbState *sphb, PCIDevice *pdev)
|
2015-07-02 09:23:22 +03:00
|
|
|
{
|
|
|
|
char *buf;
|
|
|
|
const char *devtype = "qemu";
|
|
|
|
uint32_t busnr = pci_bus_num(PCI_BUS(qdev_get_parent_bus(DEVICE(pdev))));
|
|
|
|
|
|
|
|
if (object_dynamic_cast(OBJECT(pdev), "vfio-pci")) {
|
|
|
|
buf = spapr_phb_vfio_get_loc_code(sphb, pdev);
|
|
|
|
if (buf) {
|
|
|
|
return buf;
|
|
|
|
}
|
|
|
|
devtype = "vfio";
|
|
|
|
}
|
|
|
|
/*
|
|
|
|
* For emulated devices and VFIO-failure case, make up
|
|
|
|
* the loc-code.
|
|
|
|
*/
|
|
|
|
buf = g_strdup_printf("%s_%s:%04x:%02x:%02x.%x",
|
|
|
|
devtype, pdev->name, sphb->index, busnr,
|
|
|
|
PCI_SLOT(pdev->devfn), PCI_FUNC(pdev->devfn));
|
|
|
|
return buf;
|
|
|
|
}
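
/*
 * Illustrative sketch, not part of the QEMU sources: the made-up
 * loc-code produced by spapr_phb_get_loc_code() has the shape
 * "<devtype>_<device name>:<phb index>:<bus>:<slot>.<func>".
 * sketch_format_loc_code() and its parameters are hypothetical names,
 * and plain snprintf() stands in for g_strdup_printf() so that the
 * sketch builds without GLib.
 */

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format a loc-code into buf using the same format string as the
 * function above. Returns snprintf()'s result, i.e. the length the
 * full string needs (excluding the terminating NUL).
 */
static int sketch_format_loc_code(char *buf, size_t len,
                                  const char *devtype, const char *name,
                                  unsigned phb_index, unsigned busnr,
                                  unsigned slot, unsigned func)
{
    return snprintf(buf, len, "%s_%s:%04x:%02x:%02x.%x",
                    devtype, name, phb_index, busnr, slot, func);
}
```

/*
 * With devtype "qemu", name "virtio-net-pci" and all numbers zero this
 * yields the string quoted in the MSI commit message further down.
 */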

/* Macros to operate with address in OF binding to PCI */
#define b_x(x, p, l)    (((x) & ((1<<(l))-1)) << (p))
#define b_n(x)          b_x((x), 31, 1) /* 0 if relocatable */
#define b_p(x)          b_x((x), 30, 1) /* 1 if prefetchable */
#define b_t(x)          b_x((x), 29, 1) /* 1 if the address is aliased */
#define b_ss(x)         b_x((x), 24, 2) /* the space code */
#define b_bbbbbbbb(x)   b_x((x), 16, 8) /* bus number */
#define b_ddddd(x)      b_x((x), 11, 5) /* device number */
#define b_fff(x)        b_x((x), 8, 3)  /* function number */
#define b_rrrrrrrr(x)   b_x((x), 0, 8)  /* register number */

/* for 'reg' OF properties */
#define RESOURCE_CELLS_SIZE 2
#define RESOURCE_CELLS_ADDRESS 3

typedef struct ResourceFields {
    uint32_t phys_hi;
    uint32_t phys_mid;
    uint32_t phys_lo;
    uint32_t size_hi;
    uint32_t size_lo;
} QEMU_PACKED ResourceFields;

typedef struct ResourceProps {
    ResourceFields reg[8];
    uint32_t reg_len;
} ResourceProps;

/* fill in the 'reg' OF properties for
 * a PCI device. 'reg' describes resource requirements for a
 * device's IO/MEM regions.
 *
 * the property is an array of ('phys-addr', 'size') pairs describing
 * the addressable regions of the PCI device, where 'phys-addr' is a
 * RESOURCE_CELLS_ADDRESS-tuple of 32-bit integers corresponding to
 * (phys.hi, phys.mid, phys.lo), and 'size' is a
 * RESOURCE_CELLS_SIZE-tuple corresponding to (size.hi, size.lo).
 *
 * phys.hi = 0xYYXXXXZZ, where:
 *   0xYY = npt000ss
 *          |||   |
 *          |||   +-- space code
 *          |||               |
 *          |||               +  00 if configuration space
 *          |||               +  01 if IO region,
 *          |||               +  10 if 32-bit MEM region
 *          |||               +  11 if 64-bit MEM region
 *          |||
 *          ||+------ for non-relocatable IO: 1 if aliased
 *          ||        for relocatable IO: 1 if below 64KB
 *          ||        for MEM: 1 if below 1MB
 *          |+------- 1 if region is prefetchable
 *          +-------- 1 if region is non-relocatable
 *   0xXXXX = bbbbbbbb dddddfff, encoding bus, slot, and function
 *            bits respectively
 *   0xZZ = rrrrrrrr, the register number of the BAR corresponding
 *          to the region
 *
 * phys.mid and phys.lo correspond respectively to the hi/lo portions
 * of the actual address of the region.
 *
 * note also that addresses defined in this property are, at least
 * for PAPR guests, relative to the PHB's IO/MEM windows, and
 * correspond directly to the addresses in the BARs.
 *
 * in accordance with PCI Bus Binding to Open Firmware,
 * IEEE Std 1275-1994, section 4.1.1, as implemented by PAPR+ v2.7,
 * Appendix C.
 */
static void populate_resource_props(PCIDevice *d, ResourceProps *rp)
{
    int bus_num = pci_bus_num(PCI_BUS(qdev_get_parent_bus(DEVICE(d))));
    uint32_t dev_id = (b_bbbbbbbb(bus_num) |
                       b_ddddd(PCI_SLOT(d->devfn)) |
                       b_fff(PCI_FUNC(d->devfn)));
    ResourceFields *reg;
    int i, reg_idx = 0;

    /* config space region */
    reg = &rp->reg[reg_idx++];
    reg->phys_hi = cpu_to_be32(dev_id);
    reg->phys_mid = 0;
    reg->phys_lo = 0;
    reg->size_hi = 0;
    reg->size_lo = 0;

    for (i = 0; i < PCI_NUM_REGIONS; i++) {
        if (!d->io_regions[i].size) {
            continue;
        }

        reg = &rp->reg[reg_idx++];

        reg->phys_hi = cpu_to_be32(dev_id | b_rrrrrrrr(pci_bar(d, i)));
        if (d->io_regions[i].type & PCI_BASE_ADDRESS_SPACE_IO) {
            reg->phys_hi |= cpu_to_be32(b_ss(1));
        } else if (d->io_regions[i].type & PCI_BASE_ADDRESS_MEM_TYPE_64) {
            reg->phys_hi |= cpu_to_be32(b_ss(3));
        } else {
            reg->phys_hi |= cpu_to_be32(b_ss(2));
        }
        reg->phys_mid = 0;
        reg->phys_lo = 0;
        reg->size_hi = cpu_to_be32(d->io_regions[i].size >> 32);
        reg->size_lo = cpu_to_be32(d->io_regions[i].size);
    }

    rp->reg_len = reg_idx * sizeof(ResourceFields);
}
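
/*
 * Illustrative sketch, not part of the QEMU sources: how the phys.hi
 * cell and the (size.hi, size.lo) pair described above are put together
 * for one region. All sketch_* names are hypothetical. The values are
 * kept in host byte order here; the real code additionally converts
 * each cell with cpu_to_be32().
 */

```c
#include <stdint.h>

/* Space codes for the 'ss' field, as listed in the comment above. */
enum {
    SKETCH_SS_CONFIG = 0,
    SKETCH_SS_IO     = 1,
    SKETCH_SS_MEM32  = 2,
    SKETCH_SS_MEM64  = 3,
};

/* Place the low 'len' bits of 'val' at bit position 'pos' (len < 32). */
static uint32_t sketch_field(uint32_t val, unsigned pos, unsigned len)
{
    return (val & ((UINT32_C(1) << len) - 1)) << pos;
}

/* Build a host-endian phys.hi cell from its individual fields. */
static uint32_t sketch_phys_hi(unsigned ss, unsigned bus, unsigned slot,
                               unsigned func, unsigned reg)
{
    return sketch_field(ss, 24, 2) |
           sketch_field(bus, 16, 8) |
           sketch_field(slot, 11, 5) |
           sketch_field(func, 8, 3) |
           sketch_field(reg, 0, 8);
}

/* Split a 64-bit region size into the (size.hi, size.lo) cell pair. */
static void sketch_size_cells(uint64_t size, uint32_t *hi, uint32_t *lo)
{
    *hi = (uint32_t)(size >> 32);
    *lo = (uint32_t)size;
}
```

/*
 * For example, a 32-bit MEM BAR at config offset 0x10 of device 00:01.0
 * encodes as sketch_phys_hi(SKETCH_SS_MEM32, 0, 1, 0, 0x10), which is
 * 0x02000810.
 */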

typedef struct PCIClass PCIClass;
typedef struct PCISubClass PCISubClass;
typedef struct PCIIFace PCIIFace;

struct PCIIFace {
    int iface;
    const char *name;
};

struct PCISubClass {
    int subclass;
    const char *name;
    const PCIIFace *iface;
};

struct PCIClass {
    const char *name;
    const PCISubClass *subc;
};

static const PCISubClass undef_subclass[] = {
    { PCI_CLASS_NOT_DEFINED_VGA, "display", NULL },
    { 0xFF, NULL, NULL },
};

static const PCISubClass mass_subclass[] = {
    { PCI_CLASS_STORAGE_SCSI, "scsi", NULL },
    { PCI_CLASS_STORAGE_IDE, "ide", NULL },
    { PCI_CLASS_STORAGE_FLOPPY, "fdc", NULL },
    { PCI_CLASS_STORAGE_IPI, "ipi", NULL },
    { PCI_CLASS_STORAGE_RAID, "raid", NULL },
    { PCI_CLASS_STORAGE_ATA, "ata", NULL },
    { PCI_CLASS_STORAGE_SATA, "sata", NULL },
    { PCI_CLASS_STORAGE_SAS, "sas", NULL },
    { 0xFF, NULL, NULL },
};

static const PCISubClass net_subclass[] = {
    { PCI_CLASS_NETWORK_ETHERNET, "ethernet", NULL },
    { PCI_CLASS_NETWORK_TOKEN_RING, "token-ring", NULL },
    { PCI_CLASS_NETWORK_FDDI, "fddi", NULL },
    { PCI_CLASS_NETWORK_ATM, "atm", NULL },
    { PCI_CLASS_NETWORK_ISDN, "isdn", NULL },
    { PCI_CLASS_NETWORK_WORLDFIP, "worldfip", NULL },
    { PCI_CLASS_NETWORK_PICMG214, "picmg", NULL },
    { 0xFF, NULL, NULL },
};

static const PCISubClass displ_subclass[] = {
    { PCI_CLASS_DISPLAY_VGA, "vga", NULL },
    { PCI_CLASS_DISPLAY_XGA, "xga", NULL },
    { PCI_CLASS_DISPLAY_3D, "3d-controller", NULL },
    { 0xFF, NULL, NULL },
};

static const PCISubClass media_subclass[] = {
    { PCI_CLASS_MULTIMEDIA_VIDEO, "video", NULL },
    { PCI_CLASS_MULTIMEDIA_AUDIO, "sound", NULL },
    { PCI_CLASS_MULTIMEDIA_PHONE, "telephony", NULL },
    { 0xFF, NULL, NULL },
};

static const PCISubClass mem_subclass[] = {
    { PCI_CLASS_MEMORY_RAM, "memory", NULL },
    { PCI_CLASS_MEMORY_FLASH, "flash", NULL },
    { 0xFF, NULL, NULL },
};

static const PCISubClass bridg_subclass[] = {
    { PCI_CLASS_BRIDGE_HOST, "host", NULL },
    { PCI_CLASS_BRIDGE_ISA, "isa", NULL },
    { PCI_CLASS_BRIDGE_EISA, "eisa", NULL },
    { PCI_CLASS_BRIDGE_MC, "mca", NULL },
    { PCI_CLASS_BRIDGE_PCI, "pci", NULL },
    { PCI_CLASS_BRIDGE_PCMCIA, "pcmcia", NULL },
    { PCI_CLASS_BRIDGE_NUBUS, "nubus", NULL },
    { PCI_CLASS_BRIDGE_CARDBUS, "cardbus", NULL },
    { PCI_CLASS_BRIDGE_RACEWAY, "raceway", NULL },
    { PCI_CLASS_BRIDGE_PCI_SEMITP, "semi-transparent-pci", NULL },
    { PCI_CLASS_BRIDGE_IB_PCI, "infiniband", NULL },
    { 0xFF, NULL, NULL },
};

static const PCISubClass comm_subclass[] = {
    { PCI_CLASS_COMMUNICATION_SERIAL, "serial", NULL },
    { PCI_CLASS_COMMUNICATION_PARALLEL, "parallel", NULL },
    { PCI_CLASS_COMMUNICATION_MULTISERIAL, "multiport-serial", NULL },
    { PCI_CLASS_COMMUNICATION_MODEM, "modem", NULL },
    { PCI_CLASS_COMMUNICATION_GPIB, "gpib", NULL },
    { PCI_CLASS_COMMUNICATION_SC, "smart-card", NULL },
    { 0xFF, NULL, NULL, },
};

static const PCIIFace pic_iface[] = {
    { PCI_CLASS_SYSTEM_PIC_IOAPIC, "io-apic" },
    { PCI_CLASS_SYSTEM_PIC_IOXAPIC, "io-xapic" },
    { 0xFF, NULL },
};

static const PCISubClass sys_subclass[] = {
    { PCI_CLASS_SYSTEM_PIC, "interrupt-controller", pic_iface },
    { PCI_CLASS_SYSTEM_DMA, "dma-controller", NULL },
    { PCI_CLASS_SYSTEM_TIMER, "timer", NULL },
    { PCI_CLASS_SYSTEM_RTC, "rtc", NULL },
    { PCI_CLASS_SYSTEM_PCI_HOTPLUG, "hot-plug-controller", NULL },
    { PCI_CLASS_SYSTEM_SDHCI, "sd-host-controller", NULL },
    { 0xFF, NULL, NULL },
};

static const PCISubClass inp_subclass[] = {
    { PCI_CLASS_INPUT_KEYBOARD, "keyboard", NULL },
    { PCI_CLASS_INPUT_PEN, "pen", NULL },
    { PCI_CLASS_INPUT_MOUSE, "mouse", NULL },
    { PCI_CLASS_INPUT_SCANNER, "scanner", NULL },
    { PCI_CLASS_INPUT_GAMEPORT, "gameport", NULL },
    { 0xFF, NULL, NULL },
};

static const PCISubClass dock_subclass[] = {
    { PCI_CLASS_DOCKING_GENERIC, "dock", NULL },
    { 0xFF, NULL, NULL },
};

static const PCISubClass cpu_subclass[] = {
    { PCI_CLASS_PROCESSOR_PENTIUM, "pentium", NULL },
    { PCI_CLASS_PROCESSOR_POWERPC, "powerpc", NULL },
    { PCI_CLASS_PROCESSOR_MIPS, "mips", NULL },
    { PCI_CLASS_PROCESSOR_CO, "co-processor", NULL },
    { 0xFF, NULL, NULL },
};

static const PCIIFace usb_iface[] = {
    { PCI_CLASS_SERIAL_USB_UHCI, "usb-uhci" },
    { PCI_CLASS_SERIAL_USB_OHCI, "usb-ohci", },
    { PCI_CLASS_SERIAL_USB_EHCI, "usb-ehci" },
    { PCI_CLASS_SERIAL_USB_XHCI, "usb-xhci" },
    { PCI_CLASS_SERIAL_USB_UNKNOWN, "usb-unknown" },
    { PCI_CLASS_SERIAL_USB_DEVICE, "usb-device" },
    { 0xFF, NULL },
};

static const PCISubClass ser_subclass[] = {
    { PCI_CLASS_SERIAL_FIREWIRE, "firewire", NULL },
    { PCI_CLASS_SERIAL_ACCESS, "access-bus", NULL },
    { PCI_CLASS_SERIAL_SSA, "ssa", NULL },
    { PCI_CLASS_SERIAL_USB, "usb", usb_iface },
    { PCI_CLASS_SERIAL_FIBER, "fibre-channel", NULL },
    { PCI_CLASS_SERIAL_SMBUS, "smb", NULL },
    { PCI_CLASS_SERIAL_IB, "infiniband", NULL },
    { PCI_CLASS_SERIAL_IPMI, "ipmi", NULL },
    { PCI_CLASS_SERIAL_SERCOS, "sercos", NULL },
    { PCI_CLASS_SERIAL_CANBUS, "canbus", NULL },
    { 0xFF, NULL, NULL },
};

static const PCISubClass wrl_subclass[] = {
    { PCI_CLASS_WIRELESS_IRDA, "irda", NULL },
    { PCI_CLASS_WIRELESS_CIR, "consumer-ir", NULL },
    { PCI_CLASS_WIRELESS_RF_CONTROLLER, "rf-controller", NULL },
    { PCI_CLASS_WIRELESS_BLUETOOTH, "bluetooth", NULL },
    { PCI_CLASS_WIRELESS_BROADBAND, "broadband", NULL },
    { 0xFF, NULL, NULL },
};

static const PCISubClass sat_subclass[] = {
    { PCI_CLASS_SATELLITE_TV, "satellite-tv", NULL },
    { PCI_CLASS_SATELLITE_AUDIO, "satellite-audio", NULL },
    { PCI_CLASS_SATELLITE_VOICE, "satellite-voice", NULL },
    { PCI_CLASS_SATELLITE_DATA, "satellite-data", NULL },
    { 0xFF, NULL, NULL },
};

static const PCISubClass crypt_subclass[] = {
    { PCI_CLASS_CRYPT_NETWORK, "network-encryption", NULL },
    { PCI_CLASS_CRYPT_ENTERTAINMENT,
      "entertainment-encryption", NULL },
    { 0xFF, NULL, NULL },
};

static const PCISubClass spc_subclass[] = {
    { PCI_CLASS_SP_DPIO, "dpio", NULL },
    { PCI_CLASS_SP_PERF, "counter", NULL },
    { PCI_CLASS_SP_SYNCH, "measurement", NULL },
    { PCI_CLASS_SP_MANAGEMENT, "management-card", NULL },
    { 0xFF, NULL, NULL },
};

static const PCIClass pci_classes[] = {
    { "legacy-device", undef_subclass },
    { "mass-storage", mass_subclass },
    { "network", net_subclass },
    { "display", displ_subclass, },
    { "multimedia-device", media_subclass },
    { "memory-controller", mem_subclass },
    { "unknown-bridge", bridg_subclass },
    { "communication-controller", comm_subclass },
    { "system-peripheral", sys_subclass },
    { "input-controller", inp_subclass },
    { "docking-station", dock_subclass },
    { "cpu", cpu_subclass },
    { "serial-bus", ser_subclass },
    { "wireless-controller", wrl_subclass },
    { "intelligent-io", NULL },
    { "satellite-device", sat_subclass },
    { "encryption", crypt_subclass },
    { "data-processing-controller", spc_subclass },
};

static const char *dt_name_from_class(uint8_t class, uint8_t subclass,
                                      uint8_t iface)
{
    const PCIClass *pclass;
    const PCISubClass *psubclass;
    const PCIIFace *piface;
    const char *name;

    if (class >= ARRAY_SIZE(pci_classes)) {
        return "pci";
    }

    pclass = pci_classes + class;
    name = pclass->name;

    if (pclass->subc == NULL) {
        return name;
    }

    psubclass = pclass->subc;
    while ((psubclass->subclass & 0xff) != 0xff) {
        if ((psubclass->subclass & 0xff) == subclass) {
            name = psubclass->name;
            break;
        }
        psubclass++;
    }

    piface = psubclass->iface;
    if (piface == NULL) {
        return name;
    }
    while ((piface->iface & 0xff) != 0xff) {
        if ((piface->iface & 0xff) == iface) {
            name = piface->name;
            break;
        }
        piface++;
    }

    return name;
}
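
/*
 * Illustrative sketch, not part of the QEMU sources: the lookup above
 * walks 0xFF-terminated tables and keeps the most specific name it
 * finds (interface, else subclass, else class). The Sketch* types, the
 * two small tables and sketch_name_lookup() are hypothetical. The
 * numeric codes are the standard PCI ones for the serial-bus class
 * (subclass 0x03 is USB; programming interfaces 0x00/0x10/0x20/0x30 are
 * UHCI/OHCI/EHCI/xHCI). Unlike the QEMU tables, the sketch stores plain
 * 8-bit codes, so it needs no "& 0xff" masking.
 */

```c
#include <stddef.h>
#include <stdint.h>

typedef struct SketchIFace {
    int iface;
    const char *name;
} SketchIFace;

typedef struct SketchSubClass {
    int subclass;
    const char *name;
    const SketchIFace *iface;
} SketchSubClass;

static const SketchIFace sketch_usb_iface[] = {
    { 0x00, "usb-uhci" },
    { 0x10, "usb-ohci" },
    { 0x20, "usb-ehci" },
    { 0x30, "usb-xhci" },
    { 0xFF, NULL },             /* sentinel */
};

static const SketchSubClass sketch_serial_bus[] = {
    { 0x00, "firewire", NULL },
    { 0x03, "usb", sketch_usb_iface },
    { 0xFF, NULL, NULL },       /* sentinel, iface stays NULL */
};

/* Return the most specific name known for (subclass, iface). */
static const char *sketch_name_lookup(const SketchSubClass *sub,
                                      const char *class_name,
                                      uint8_t subclass, uint8_t iface)
{
    const char *name = class_name;
    const SketchIFace *pif;

    while (sub->subclass != 0xFF) {
        if (sub->subclass == subclass) {
            name = sub->name;
            break;
        }
        sub++;
    }

    /* On a miss 'sub' points at the sentinel, whose iface is NULL. */
    pif = sub->iface;
    if (pif == NULL) {
        return name;
    }
    while (pif->iface != 0xFF) {
        if (pif->iface == iface) {
            name = pif->name;
            break;
        }
        pif++;
    }
    return name;
}
```

/*
 * An unknown interface falls back to the subclass name, and an unknown
 * subclass falls back to the class name passed in by the caller.
 */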

/*
 * DRC helper functions
 */

static uint32_t drc_id_from_devfn(SpaprPhbState *phb,
                                  uint8_t chassis, int32_t devfn)
{
    return (phb->index << 16) | (chassis << 8) | devfn;
}
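
/*
 * Illustrative sketch, not part of the QEMU sources: the DRC id packs
 * the PHB index, the chassis number and the devfn into one 32-bit
 * value, and the devfn itself packs a 5-bit slot and a 3-bit function.
 * sketch_drc_id() and the SKETCH_* macros are hypothetical stand-ins
 * that mirror the expression above and the usual PCI_SLOT()/PCI_FUNC()
 * definitions.
 */

```c
#include <stdint.h>

#define SKETCH_PCI_SLOT(devfn)  (((devfn) >> 3) & 0x1f)
#define SKETCH_PCI_FUNC(devfn)  ((devfn) & 0x07)

/* Pack (phb index, chassis, devfn) the same way as drc_id_from_devfn(). */
static uint32_t sketch_drc_id(uint32_t phb_index, uint8_t chassis,
                              uint8_t devfn)
{
    return (phb_index << 16) | ((uint32_t)chassis << 8) | devfn;
}
```

/*
 * For PHB 1, chassis 2, slot 1, function 3 (devfn 0x0b) the id is
 * 0x0001020b.
 */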

static SpaprDrc *drc_from_devfn(SpaprPhbState *phb,
                                uint8_t chassis, int32_t devfn)
{
    return spapr_drc_by_id(TYPE_SPAPR_DRC_PCI,
                           drc_id_from_devfn(phb, chassis, devfn));
}

static uint8_t chassis_from_bus(PCIBus *bus)
{
    if (pci_bus_is_root(bus)) {
        return 0;
    } else {
        PCIDevice *bridge = pci_bridge_get_device(bus);

        return object_property_get_uint(OBJECT(bridge), "chassis_nr",
                                        &error_abort);
    }
}

static SpaprDrc *drc_from_dev(SpaprPhbState *phb, PCIDevice *dev)
{
    uint8_t chassis = chassis_from_bus(pci_get_bus(dev));

    return drc_from_devfn(phb, chassis, dev->devfn);
}

static void add_drcs(SpaprPhbState *phb, PCIBus *bus)
{
    Object *owner;
    int i;
    uint8_t chassis;

    if (!phb->dr_enabled) {
        return;
    }

    chassis = chassis_from_bus(bus);

    if (pci_bus_is_root(bus)) {
        owner = OBJECT(phb);
    } else {
        owner = OBJECT(pci_bridge_get_device(bus));
    }

    for (i = 0; i < PCI_SLOT_MAX * PCI_FUNC_MAX; i++) {
        spapr_dr_connector_new(owner, TYPE_SPAPR_DRC_PCI,
                               drc_id_from_devfn(phb, chassis, i));
    }
}

static void remove_drcs(SpaprPhbState *phb, PCIBus *bus)
{
    int i;
    uint8_t chassis;

    if (!phb->dr_enabled) {
        return;
    }

    chassis = chassis_from_bus(bus);

    for (i = PCI_SLOT_MAX * PCI_FUNC_MAX - 1; i >= 0; i--) {
        SpaprDrc *drc = drc_from_devfn(phb, chassis, i);

        if (drc) {
            object_unparent(OBJECT(drc));
        }
    }
}

typedef struct PciWalkFdt {
    void *fdt;
    int offset;
    SpaprPhbState *sphb;
    int err;
} PciWalkFdt;

static int spapr_dt_pci_device(SpaprPhbState *sphb, PCIDevice *dev,
                               void *fdt, int parent_offset);

static void spapr_dt_pci_device_cb(PCIBus *bus, PCIDevice *pdev,
                                   void *opaque)
{
    PciWalkFdt *p = opaque;
    int err;

    if (p->err) {
        /* Something's already broken, don't keep going */
        return;
    }

    err = spapr_dt_pci_device(p->sphb, pdev, p->fdt, p->offset);
    if (err < 0) {
        p->err = err;
    }
}
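
/*
 * Illustrative sketch, not part of the QEMU sources: the callback above
 * cannot return a value to the bus walker, so it parks the first error
 * in the opaque state and turns every later invocation into a no-op.
 * The same pattern in isolation, with hypothetical names (SketchWalk,
 * sketch_process, sketch_walk_cb, sketch_walk) and integers standing in
 * for PCI devices:
 */

```c
#include <stddef.h>

typedef struct SketchWalk {
    int visited;    /* number of items actually processed */
    int err;        /* first negative error seen, 0 if none */
} SketchWalk;

/* Hypothetical per-item step: a negative item stands for a failure. */
static int sketch_process(int item)
{
    return item < 0 ? item : 0;
}

/* Void callback: records the first error, then skips everything else. */
static void sketch_walk_cb(int item, void *opaque)
{
    SketchWalk *w = opaque;
    int err;

    if (w->err) {
        return;
    }

    w->visited++;
    err = sketch_process(item);
    if (err < 0) {
        w->err = err;
    }
}

/* Stand-in for the bus walker: calls the callback once per item. */
static void sketch_walk(const int *items, size_t n, SketchWalk *w)
{
    size_t i;

    for (i = 0; i < n; i++) {
        sketch_walk_cb(items[i], w);
    }
}
```

/*
 * The caller checks w.err once after the walk, much as
 * spapr_dt_pci_bus() checks cbinfo.err.
 */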

/* Augment PCI device node with bridge specific information */
static int spapr_dt_pci_bus(SpaprPhbState *sphb, PCIBus *bus,
                            void *fdt, int offset)
{
    Object *owner;
    PciWalkFdt cbinfo = {
        .fdt = fdt,
        .offset = offset,
        .sphb = sphb,
        .err = 0,
    };
    int ret;

    _FDT(fdt_setprop_cell(fdt, offset, "#address-cells",
                          RESOURCE_CELLS_ADDRESS));
    _FDT(fdt_setprop_cell(fdt, offset, "#size-cells",
                          RESOURCE_CELLS_SIZE));

    assert(bus);
    pci_for_each_device_under_bus_reverse(bus, spapr_dt_pci_device_cb, &cbinfo);
    if (cbinfo.err) {
        return cbinfo.err;
    }

    if (pci_bus_is_root(bus)) {
        owner = OBJECT(sphb);
    } else {
        owner = OBJECT(pci_bridge_get_device(bus));
    }

    ret = spapr_dt_drc(fdt, offset, owner,
                       SPAPR_DR_CONNECTOR_TYPE_PCI);
    if (ret) {
        return ret;
    }

    return offset;
}

spapr: Adjust firmware path of PCI devices
It is currently not possible to perform a strict boot from USB storage:
$ qemu-system-ppc64 -accel kvm -nodefaults -nographic -serial stdio \
-boot strict=on \
-device qemu-xhci \
-device usb-storage,drive=disk,bootindex=0 \
-blockdev driver=file,node-name=disk,filename=fedora-ppc64le.qcow2
SLOF **********************************************************************
QEMU Starting
Build Date = Jul 17 2020 11:15:24
FW Version = git-e18ddad8516ff2cf
Press "s" to enter Open Firmware.
Populating /vdevice methods
Populating /vdevice/vty@71000000
Populating /vdevice/nvram@71000001
Populating /pci@800000020000000
00 0000 (D) : 1b36 000d serial bus [ usb-xhci ]
No NVRAM common partition, re-initializing...
Scanning USB
XHCI: Initializing
USB Storage
SCSI: Looking for devices
101000000000000 DISK : "QEMU QEMU HARDDISK 2.5+"
Using default console: /vdevice/vty@71000000
Welcome to Open Firmware
Copyright (c) 2004, 2017 IBM Corporation All rights reserved.
This program and the accompanying materials are made available
under the terms of the BSD License available at
http://www.opensource.org/licenses/bsd-license.php
Trying to load: from: /pci@800000020000000/usb@0/storage@1/disk@101000000000000 ...
E3405: No such device
E3407: Load failed
Type 'boot' and press return to continue booting the system.
Type 'reset-all' and press return to reboot the system.
Ready!
0 >
The device tree handed over by QEMU to SLOF indeed contains:
qemu,boot-list =
"/pci@800000020000000/usb@0/storage@1/disk@101000000000000 HALT";
but the device node is named usb-xhci@0, not usb@0.
This happens because the firmware names of PCI devices returned
by get_boot_devices_list() come from pcibus_get_fw_dev_path(),
while the sPAPR PHB code uses a different naming scheme for
device nodes. This inconsistency has always been there but it was
hidden for a long time because SLOF used to rename USB device
nodes, until this commit, merged in QEMU 4.2.0:
commit 85164ad4ed9960cac842fa4cc067c6b6699b0994
Author: Alexey Kardashevskiy <aik@ozlabs.ru>
Date: Wed Sep 11 16:24:32 2019 +1000
pseries: Update SLOF firmware image
This fixes USB host bus adapter name in the device tree to match QEMU's
one.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Fortunately, sPAPR implements the firmware path provider interface.
This provides a way to override the default firmware paths.
Just factor out the sPAPR PHB naming logic from spapr_dt_pci_device()
to a helper, and use it in the sPAPR firmware path provider hook.
Fixes: 85164ad4ed99 ("pseries: Update SLOF firmware image")
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20210122170157.246374-1-groug@kaod.org>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-01-22 20:01:57 +03:00
char *spapr_pci_fw_dev_name(PCIDevice *dev)
{
    const gchar *basename;
    int slot = PCI_SLOT(dev->devfn);
    int func = PCI_FUNC(dev->devfn);
    uint32_t ccode = pci_default_read_config(dev, PCI_CLASS_PROG, 3);

    basename = dt_name_from_class((ccode >> 16) & 0xff, (ccode >> 8) & 0xff,
                                  ccode & 0xff);

    if (func != 0) {
        return g_strdup_printf("%s@%x,%x", basename, slot, func);
    } else {
        return g_strdup_printf("%s@%x", basename, slot);
    }
}
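
/*
 * Illustrative sketch, not part of the QEMU sources: the node name is
 * "<basename>@<slot>" for function 0 and "<basename>@<slot>,<func>"
 * otherwise, with both numbers printed in hex. sketch_node_name() is a
 * hypothetical helper that takes the base name and devfn directly and
 * uses snprintf() instead of g_strdup_printf() so that it builds
 * without GLib.
 */

```c
#include <stddef.h>
#include <stdio.h>

/* Write the device-tree node name for 'devfn' into buf. */
static int sketch_node_name(char *buf, size_t len, const char *basename,
                            unsigned devfn)
{
    unsigned slot = (devfn >> 3) & 0x1f;    /* same split as PCI_SLOT() */
    unsigned func = devfn & 0x07;           /* same split as PCI_FUNC() */

    if (func != 0) {
        return snprintf(buf, len, "%s@%x,%x", basename, slot, func);
    }
    return snprintf(buf, len, "%s@%x", basename, slot);
}
```

/*
 * An xHCI controller in slot 0, function 0 is therefore named
 * "usb-xhci@0", the node name mentioned in the commit message above.
 */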

/* create OF node for pci device and required OF DT properties */
static int spapr_dt_pci_device(SpaprPhbState *sphb, PCIDevice *dev,
                               void *fdt, int parent_offset)
{
    int offset;
    g_autofree gchar *nodename = spapr_pci_fw_dev_name(dev);
    PCIDeviceClass *pc = PCI_DEVICE_GET_CLASS(dev);
    ResourceProps rp;
    SpaprDrc *drc = drc_from_dev(sphb, dev);
    uint32_t vendor_id = pci_default_read_config(dev, PCI_VENDOR_ID, 2);
    uint32_t device_id = pci_default_read_config(dev, PCI_DEVICE_ID, 2);
    uint32_t revision_id = pci_default_read_config(dev, PCI_REVISION_ID, 1);
    uint32_t ccode = pci_default_read_config(dev, PCI_CLASS_PROG, 3);
    uint32_t irq_pin = pci_default_read_config(dev, PCI_INTERRUPT_PIN, 1);
    uint32_t subsystem_id = pci_default_read_config(dev, PCI_SUBSYSTEM_ID, 2);
    uint32_t subsystem_vendor_id =
        pci_default_read_config(dev, PCI_SUBSYSTEM_VENDOR_ID, 2);
    uint32_t cache_line_size =
        pci_default_read_config(dev, PCI_CACHE_LINE_SIZE, 1);
    uint32_t pci_status = pci_default_read_config(dev, PCI_STATUS, 2);
    gchar *loc_code;

    _FDT(offset = fdt_add_subnode(fdt, parent_offset, nodename));

    /* in accordance with PAPR+ v2.7 13.6.3, Table 181 */
    _FDT(fdt_setprop_cell(fdt, offset, "vendor-id", vendor_id));
    _FDT(fdt_setprop_cell(fdt, offset, "device-id", device_id));
    _FDT(fdt_setprop_cell(fdt, offset, "revision-id", revision_id));

    _FDT(fdt_setprop_cell(fdt, offset, "class-code", ccode));
    if (irq_pin) {
        _FDT(fdt_setprop_cell(fdt, offset, "interrupts", irq_pin));
    }

    if (subsystem_id) {
        _FDT(fdt_setprop_cell(fdt, offset, "subsystem-id", subsystem_id));
    }

    if (subsystem_vendor_id) {
        _FDT(fdt_setprop_cell(fdt, offset, "subsystem-vendor-id",
                              subsystem_vendor_id));
    }

    _FDT(fdt_setprop_cell(fdt, offset, "cache-line-size", cache_line_size));

    /* the following fdt cells are masked off the pci status register */
    _FDT(fdt_setprop_cell(fdt, offset, "devsel-speed",
                          PCI_STATUS_DEVSEL_MASK & pci_status));

    if (pci_status & PCI_STATUS_FAST_BACK) {
        _FDT(fdt_setprop(fdt, offset, "fast-back-to-back", NULL, 0));
    }
    if (pci_status & PCI_STATUS_66MHZ) {
        _FDT(fdt_setprop(fdt, offset, "66mhz-capable", NULL, 0));
    }
    if (pci_status & PCI_STATUS_UDF) {
        _FDT(fdt_setprop(fdt, offset, "udf-supported", NULL, 0));
    }

    loc_code = spapr_phb_get_loc_code(sphb, dev);
    _FDT(fdt_setprop_string(fdt, offset, "ibm,loc-code", loc_code));
    g_free(loc_code);

    if (drc) {
        _FDT(fdt_setprop_cell(fdt, offset, "ibm,my-drc-index",
                              spapr_drc_index(drc)));
    }

spapr_pci: fix MSI/MSIX selection
In various places we don't correctly check if the device supports MSI or
MSI-X. This can cause devices to be advertised with MSI support, even
if they only support MSI-X (like virtio-pci-* devices for example):
ethernet@0 {
ibm,req#msi = <0x1>; <--- wrong!
.
ibm,loc-code = "qemu_virtio-net-pci:0000:00:00.0";
.
ibm,req#msi-x = <0x3>;
};
Worse, this can also cause the "ibm,change-msi" RTAS call to corrupt the
PCI status and cause migration to fail:
qemu-system-ppc64: get_pci_config_device: Bad config data: i=0x6
read: 0 device: 10 cmask: 10 wmask: 0 w1cmask:0
^^
PCI_STATUS_CAP_LIST bit which is assumed to be constant
This patch changes spapr_populate_pci_child_dt() to properly check for
MSI support using msi_present(): this ensures that PCIDevice::msi_cap
was set by msi_init() and that msi_nr_vectors_allocated() will look at
the right place in the config space.
Checking PCIDevice::msix_entries_nr is enough for MSI-X but let's add
a call to msix_present() there as well for consistency.
It also changes rtas_ibm_change_msi() to select the appropriate MSI
type in Function 1 instead of always selecting plain MSI. This new
behaviour is compliant with LoPAPR 1.1, as described in "Table 71.
ibm,change-msi Argument Call Buffer":
Function 1: If Number Outputs is equal to 3, request to set to a new
number of MSIs (including set to 0).
If the “ibm,change-msix-capable” property exists and Number
Outputs is equal to 4, request is to set to a new number of
MSI or MSI-X (platform choice) interrupts (including set to
0).
Since MSI is the platform default (LoPAPR 6.2.3 MSI Option), let's
check for MSI support first.
And finally, it checks the input parameters are valid, as described in
LoPAPR 1.1 "R1–7.3.10.5.1–3":
For the MSI option: The platform must return a Status of -3 (Parameter
error) from ibm,change-msi, with no change in interrupt assignments if
the PCI configuration address does not support MSI and Function 3 was
requested (that is, the “ibm,req#msi” property must exist for the PCI
configuration address in order to use Function 3), or does not support
MSI-X and Function 4 is requested (that is, the “ibm,req#msi-x” property
must exist for the PCI configuration address in order to use Function 4),
or if neither MSIs nor MSI-Xs are supported and Function 1 is requested.
This ensures that the ret_intr_type variable contains a valid MSI type
for this device, and that spapr_msi_setmsg() won't corrupt the PCI status.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-01-27 01:25:24 +03:00
|
|
|
if (msi_present(dev)) {
|
2019-03-22 08:17:14 +03:00
|
|
|
uint32_t max_msi = msi_nr_vectors_allocated(dev);
|
2018-01-27 01:25:24 +03:00
|
|
|
if (max_msi) {
|
|
|
|
_FDT(fdt_setprop_cell(fdt, offset, "ibm,req#msi", max_msi));
|
|
|
|
}
|
spapr_pci: fix device tree props for MSI/MSI-X
PAPR requires ibm,req#msi and ibm,req#msi-x to be present in the
device node to define the number of msi/msi-x interrupts the device
supports, respectively.
Currently we have ibm,req#msi-x hardcoded to a nonsensical constant
that happens to be 2, and are missing ibm,req#msi entirely. The result
of that is that msi-x capable devices get limited to 2 msi-x
interrupts (which can impact performance), and msi-only devices likely
wouldn't work at all. Additionally, if devices expect a minimum that
exceeds 2, the guest driver may fail to load entirely.
SLOF still owns the generation of these properties at boot-time
(although other device properties have since been offloaded to QEMU),
but for hotplugged devices we rely on the values generated by QEMU
and thus hit the limitations above.
Fix this by generating these properties in QEMU as expected by guests.
In the future it may make sense to modify SLOF to pass through these
values directly as we do with other props since we're duplicating SLOF
code.
Cc: qemu-ppc@nongnu.org
Cc: qemu-stable@nongnu.org
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-09-16 00:34:59 +03:00
|
|
|
}
|
2018-01-27 01:25:24 +03:00
|
|
|
if (msix_present(dev)) {
|
2019-03-22 08:17:14 +03:00
|
|
|
uint32_t max_msix = dev->msix_entries_nr;
|
2018-01-27 01:25:24 +03:00
|
|
|
if (max_msix) {
|
|
|
|
_FDT(fdt_setprop_cell(fdt, offset, "ibm,req#msi-x", max_msix));
|
|
|
|
}
|
2015-09-16 00:34:59 +03:00
|
|
|
}
|
2015-05-07 08:33:55 +03:00
|
|
|
|
|
|
|
populate_resource_props(dev, &rp);
|
|
|
|
_FDT(fdt_setprop(fdt, offset, "reg", (uint8_t *)rp.reg, rp.reg_len));
|
|
|
|
|
2017-03-14 03:54:17 +03:00
|
|
|
if (sphb->pcie_ecs && pci_is_express(dev)) {
|
2017-03-01 08:23:12 +03:00
|
|
|
_FDT(fdt_setprop_cell(fdt, offset, "ibm,pci-config-space-type", 0x1));
|
|
|
|
}
|
spapr: Support NVIDIA V100 GPU with NVLink2
NVIDIA V100 GPUs have on-board RAM which is mapped into the host memory
space and accessible as normal RAM via an NVLink bus. The VFIO-PCI driver
implements special regions for such GPUs and emulates an NVLink bridge.
NVLink2-enabled POWER9 CPUs also provide address translation services
which include an ATS shootdown (ATSD) register exported via the NVLink
bridge device.
This adds a quirk to VFIO to map the GPU memory and create an MR;
the new MR is stored in a PCI device as a QOM link. The sPAPR PCI uses
this to get the MR and map it to the system address space.
Another quirk does the same for ATSD.
This adds additional steps to sPAPR PHB setup:
1. Search for specific GPUs and NPUs, collect findings in
sPAPRPHBState::nvgpus, manage system address space mappings;
2. Add device-specific properties such as "ibm,npu", "ibm,gpu",
"memory-block", "link-speed" to advertise the NVLink2 function to
the guest;
3. Add "mmio-atsd" to vPHB to advertise the ATSD capability;
4. Add new memory blocks (with extra "linux,memory-usable" to prevent
the guest OS from accessing the new memory until it is onlined) and
npuphb# nodes representing an NPU unit for every vPHB as the GPU driver
uses it for link discovery.
This allocates space for GPU RAM and ATSD like we do for MMIOs by
adding 2 new parameters to the phb_placement() hook. Older machine types
set these to zero.
This puts new memory nodes in a separate NUMA node, as the GPU RAM
needs to be configured equally distant from any other node in the system.
Unlike the host setup which assigns numa ids from 255 downwards, this
adds new NUMA nodes after the user configures nodes or from 1 if none
were configured.
This adds a requirement similar to EEH: one IOMMU group per vPHB.
The reason for this is that ATSD registers belong to a physical NPU
so they cannot invalidate translations on GPUs attached to another NPU.
It is guaranteed by the host platform as it does not mix NVLink bridges
or GPUs from different NPUs in the same IOMMU group. If more than one
IOMMU group is detected on a vPHB, this disables ATSD support for that
vPHB and prints a warning.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[aw: for vfio portions]
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Message-Id: <20190312082103.130561-1-aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 11:21:03 +03:00
|
|
|
|
|
|
|
spapr_phb_nvgpu_populate_pcidev_dt(dev, fdt, offset, sphb);
|
2015-05-07 08:33:55 +03:00
|
|
|
|
2019-04-05 05:31:48 +03:00
|
|
|
if (!pc->is_bridge) {
|
|
|
|
/* Properties only for non-bridges */
|
|
|
|
uint32_t min_grant = pci_default_read_config(dev, PCI_MIN_GNT, 1);
|
|
|
|
uint32_t max_latency = pci_default_read_config(dev, PCI_MAX_LAT, 1);
|
|
|
|
_FDT(fdt_setprop_cell(fdt, offset, "min-grant", min_grant));
|
|
|
|
_FDT(fdt_setprop_cell(fdt, offset, "max-latency", max_latency));
|
|
|
|
return offset;
|
|
|
|
} else {
|
|
|
|
PCIBus *sec_bus = pci_bridge_get_sec_bus(PCI_BRIDGE(dev));
|
2015-07-02 09:23:23 +03:00
|
|
|
|
2019-04-05 05:31:48 +03:00
|
|
|
return spapr_dt_pci_bus(sphb, sec_bus, fdt, offset);
|
|
|
|
}
|
2015-05-07 08:33:55 +03:00
|
|
|
}
|
|
|
|
|
2017-05-22 22:35:48 +03:00
|
|
|
/* Callback to be called during DRC release. */
|
|
|
|
void spapr_phb_remove_pci_device_cb(DeviceState *dev)
|
2015-05-07 08:33:55 +03:00
|
|
|
{
|
2018-12-12 12:16:23 +03:00
|
|
|
HotplugHandler *hotplug_ctrl = qdev_get_hotplug_handler(dev);
|
|
|
|
|
|
|
|
hotplug_handler_unplug(hotplug_ctrl, dev, &error_abort);
|
qdev: Let the hotplug_handler_unplug() caller delete the device
When unplugging a device, at one point the device will be destroyed
via object_unparent(). This will, on the one hand, unrealize the
removed device hierarchy, and on the other hand, destroy/free the
device hierarchy.
When chaining hotplug handlers, we want to overwrite a bus hotplug
handler by the machine hotplug handler, to be able to perform
some part of the plug/unplug and to forward the calls to the bus hotplug
handler.
For now, the bus hotplug handler would trigger an object_unparent(), not
allowing us to perform some unplug action on a device after we forwarded
the call to the bus hotplug handler. The device would be gone at that
point.
machine_unplug_handler(dev)
/* eventually do unplug stuff */
bus_unplug_handler(dev)
/* dev is gone, we can't do more unplug stuff */
So move the object_unparent() to the original caller of the unplug. For
now, keep the unrealize() at the original places of the
object_unparent(). For implicitly chained hotplug handlers (e.g. pc
code calling acpi hotplug handlers), the object_unparent() has to be
done by the outermost caller. So when calling hotplug_handler_unplug()
from inside an unplug handler, nothing is to be done.
hotplug_handler_unplug(dev) -> calls machine_unplug_handler()
machine_unplug_handler(dev) {
/* eventually do unplug stuff */
bus_unplug_handler(dev) -> calls unrealize(dev)
/* we can do more unplug stuff but device already unrealized */
}
object_unparent(dev)
In the long run, every unplug action should be factored out of the
unrealize() function into the unplug handler (especially for PCI). Then
we can get rid of the additional unrealize() calls and object_unparent()
will properly unrealize the device hierarchy after the device has been
unplugged.
hotplug_handler_unplug(dev) -> calls machine_unplug_handler()
machine_unplug_handler(dev) {
/* eventually do unplug stuff */
bus_unplug_handler(dev) -> only unplugs, does not unrealize
/* we can do more unplug stuff */
}
object_unparent(dev) -> will unrealize
The original approach was suggested by Igor Mammedov for the PCI
part, but I extended it to all hotplug handlers. I consider this one
step in the right direction.
To summarize:
- object_unparent() on synchronous unplugs is done by common code
-- "Caller of hotplug_handler_unplug"
- object_unparent() on asynchronous unplugs ("unplug requests") has to
be done manually
-- "Caller of hotplug_handler_unplug"
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190228122849.4296-2-david@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
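The ownership rule summarized in this commit message (chained handlers only unplug, the outermost caller deletes) can be modelled with a toy device. This is a minimal sketch, not QEMU code: `struct toy_dev` and the `toy_*` functions are hypothetical, and clearing `parented` merely stands in for object_unparent().

```c
#include <assert.h>
#include <stdbool.h>

/* Toy device model; every name here is hypothetical, not QEMU's. */
struct toy_dev {
    bool realized;
    bool parented;      /* false once the device has been deleted */
    int cleanup_done;   /* unplug steps that ran while the device existed */
};

/* Bus-level handler: unrealizes the device but no longer deletes it. */
static void toy_bus_unplug(struct toy_dev *dev)
{
    dev->realized = false;
}

/* Machine-level handler that chains to the bus handler. */
static void toy_machine_unplug(struct toy_dev *dev)
{
    assert(dev->parented);
    dev->cleanup_done++;        /* unplug work before forwarding */
    toy_bus_unplug(dev);
    assert(dev->parented);      /* still alive after the inner handler */
    dev->cleanup_done++;        /* more unplug work is now possible */
}

/* Outermost caller: the only place that drops the device. */
static void toy_unplug_and_delete(struct toy_dev *dev)
{
    toy_machine_unplug(dev);
    dev->parented = false;      /* stands in for object_unparent() */
}
```

The second assertion in `toy_machine_unplug()` is the point of the change: had the bus handler deleted the device, the follow-up unplug work could not run.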
2019-02-28 15:28:47 +03:00
|
|
|
object_unparent(OBJECT(dev));
|
2015-05-07 08:33:55 +03:00
|
|
|
}
|
|
|
|
|
spapr: Use CamelCase properly
The qemu coding standard is to use CamelCase for type and structure names,
and the pseries code follows that... sort of. There are quite a lot of
places where we bend the rules in order to preserve the capitalization of
internal acronyms like "PHB", "TCE", "DIMM" and most commonly "sPAPR".
That was a bad idea - it frequently leads to names ending up with hard to
read clusters of capital letters, and means they don't catch the eye as
type identifiers, which is kind of the point of the CamelCase convention in
the first place.
In short, keeping type identifiers looking like CamelCase is more important
than preserving standard capitalization of internal "words". So, this
patch renames a heap of spapr internal type names to a more standard
CamelCase.
In addition to case changes, we also make some other identifier renames:
VIOsPAPR* -> SpaprVio*
The reverse word ordering was only ever used to mitigate the capital
cluster, so revert to the natural ordering.
VIOsPAPRVTYDevice -> SpaprVioVty
VIOsPAPRVLANDevice -> SpaprVioVlan
Brevity, since the "Device" didn't add useful information
sPAPRDRConnector -> SpaprDrc
sPAPRDRConnectorClass -> SpaprDrcClass
Brevity, and makes it clearer this is the same thing as a "DRC"
mentioned in many other places in the code
This is 100% a mechanical search-and-replace patch. It will, however,
conflict with essentially any and all outstanding patches touching the
spapr code.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-06 07:35:37 +03:00
|
|
|
int spapr_pci_dt_populate(SpaprDrc *drc, SpaprMachineState *spapr,
|
2019-02-19 20:17:53 +03:00
|
|
|
void *fdt, int *fdt_start_offset, Error **errp)
|
|
|
|
{
|
|
|
|
HotplugHandler *plug_handler = qdev_get_hotplug_handler(drc->dev);
|
2019-03-06 07:35:37 +03:00
|
|
|
SpaprPhbState *sphb = SPAPR_PCI_HOST_BRIDGE(plug_handler);
|
2019-02-19 20:17:53 +03:00
|
|
|
PCIDevice *pdev = PCI_DEVICE(drc->dev);
|
|
|
|
|
2019-03-22 08:17:14 +03:00
|
|
|
*fdt_start_offset = spapr_dt_pci_device(sphb, pdev, fdt, 0);
|
2019-02-19 20:17:53 +03:00
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2019-04-05 08:34:19 +03:00
|
|
|
static void spapr_pci_bridge_plug(SpaprPhbState *phb,
|
2020-05-05 18:29:25 +03:00
|
|
|
PCIBridge *bridge)
|
2019-04-05 08:34:19 +03:00
|
|
|
{
|
|
|
|
PCIBus *bus = pci_bridge_get_sec_bus(bridge);
|
|
|
|
|
2020-05-05 18:29:25 +03:00
|
|
|
add_drcs(phb, bus);
|
2019-04-05 08:34:19 +03:00
|
|
|
}
|
|
|
|
|
spapr_pci: Robustify support of PCI bridges
Some recent error handling cleanups unveiled issues with our support of
PCI bridges:
1) QEMU aborts when using non-standard PCI bridge types,
unveiled by commit 7ef1553dac "spapr_pci: Drop some dead error handling"
$ qemu-system-ppc64 -M pseries -device pcie-pci-bridge
Unexpected error in object_property_find() at qom/object.c:1240:
qemu-system-ppc64: -device pcie-pci-bridge: Property '.chassis_nr' not found
Aborted (core dumped)
This happens because we assume all PCI bridge types to have a "chassis_nr"
property. This property only exists with the standard PCI bridge type
"pci-bridge" actually. We could possibly revert 7ef1553dac but it seems
much simpler to check the presence of "chassis_nr" earlier.
2) QEMU aborts if the same "chassis_nr" value is used several times,
unveiled by commit d2623129a7de "qom: Drop parameter @errp of
object_property_add() & friends"
$ qemu-system-ppc64 -M pseries -device pci-bridge,chassis_nr=1 \
-device pci-bridge,chassis_nr=1
Unexpected error in object_property_try_add() at qom/object.c:1167:
qemu-system-ppc64: -device pci-bridge,chassis_nr=1: attempt to add duplicate property '40000100' to object (type 'container')
Aborted (core dumped)
This happens because we assume that "chassis_nr" values are unique, but
nobody enforces that and we end up generating duplicate DRC ids. The PCI
code doesn't really care for duplicate "chassis_nr" properties since it
is only used to initialize the "Chassis Number Register" of the bridge,
with no functional impact on QEMU. So, even if passing the same value
several times might look weird, it never broke anything before, so
I guess we don't necessarily want to enforce strict checking in the PCI
code now.
Work around both issues in the PAPR code: check that the bridge has a
unique and non-null "chassis_nr" when plugging it into its parent bus.
Fixes: 05929a6c5dfe ("spapr: Don't use bus number for building DRC ids")
Fixes: 7ef1553dac ("spapr_pci: Drop some dead error handling")
Fixes: d2623129a7de ("qom: Drop parameter @errp of object_property_add() & friends")
Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159431476748.407044.16711294833569014964.stgit@bahia.lan>
[dwg: Move check slightly to a better place]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
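The two conditions checked by this fix (non-null and unique "chassis_nr") can be illustrated without the QOM object tree. The sketch below assumes a flat array of bridges in place of the recursive object walk; `struct toy_bridge` and `toy_chassis_nr_is_valid()` are hypothetical names, and a value of 0 models a bridge type that has no "chassis_nr" property.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flat model of the bridges known to the machine. */
struct toy_bridge {
    unsigned chassis_nr;    /* 0 models "property absent" */
};

/*
 * Returns true if 'candidate' has a non-zero chassis_nr that no other
 * bridge in 'bridges' already uses. 'candidate' may itself be an element
 * of the array; it is skipped during the comparison.
 */
static bool toy_chassis_nr_is_valid(const struct toy_bridge *bridges,
                                    size_t n,
                                    const struct toy_bridge *candidate)
{
    size_t i;

    assert(candidate != NULL);
    if (candidate->chassis_nr == 0) {
        return false;               /* unsupported bridge type */
    }

    for (i = 0; i < n; i++) {
        const struct toy_bridge *b = &bridges[i];

        if (b == candidate || b->chassis_nr == 0) {
            continue;               /* skip self and unsupported types */
        }
        if (b->chassis_nr == candidate->chassis_nr) {
            return false;           /* value already in use */
        }
    }
    return true;
}
```

Rejecting the bridge at this stage is what allows an error to be reported to the user instead of an abort later on.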
2020-07-09 20:12:47 +03:00
|
|
|
/* Returns non-zero if the value of "chassis_nr" is already in use */
|
|
|
|
static int check_chassis_nr(Object *obj, void *opaque)
|
|
|
|
{
|
|
|
|
int new_chassis_nr =
|
|
|
|
object_property_get_uint(opaque, "chassis_nr", &error_abort);
|
|
|
|
int chassis_nr =
|
|
|
|
object_property_get_uint(obj, "chassis_nr", NULL);
|
|
|
|
|
|
|
|
if (!object_dynamic_cast(obj, TYPE_PCI_BRIDGE)) {
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* Skip unsupported bridge types */
|
|
|
|
if (!chassis_nr) {
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* Skip self */
|
|
|
|
if (obj == opaque) {
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
return chassis_nr == new_chassis_nr;
|
|
|
|
}
|
|
|
|
|
|
|
|
static bool bridge_has_valid_chassis_nr(Object *bridge, Error **errp)
|
|
|
|
{
|
|
|
|
int chassis_nr =
|
|
|
|
object_property_get_uint(bridge, "chassis_nr", NULL);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* slotid_cap_init() already ensures that "chassis_nr" isn't null for
|
|
|
|
* standard PCI bridges, so this really tells if "chassis_nr" is present
|
|
|
|
* or not.
|
|
|
|
*/
|
|
|
|
if (!chassis_nr) {
|
|
|
|
error_setg(errp, "PCI Bridge lacks a \"chassis_nr\" property");
|
|
|
|
error_append_hint(errp, "Try -device pci-bridge instead.\n");
|
|
|
|
return false;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* We want unique values for "chassis_nr" */
|
|
|
|
if (object_child_foreach_recursive(object_get_root(), check_chassis_nr,
|
|
|
|
bridge)) {
|
|
|
|
error_setg(errp, "Bridge chassis %d already in use", chassis_nr);
|
|
|
|
return false;
|
|
|
|
}
|
|
|
|
|
|
|
|
return true;
|
|
|
|
}
|
|
|
|
|
2020-11-21 02:42:00 +03:00
|
|
|
static void spapr_pci_pre_plug(HotplugHandler *plug_handler,
|
|
|
|
DeviceState *plugged_dev, Error **errp)
|
2015-05-07 08:33:55 +03:00
|
|
|
{
|
2019-03-06 07:35:37 +03:00
|
|
|
SpaprPhbState *phb = SPAPR_PCI_HOST_BRIDGE(DEVICE(plug_handler));
|
2015-05-07 08:33:55 +03:00
|
|
|
PCIDevice *pdev = PCI_DEVICE(plugged_dev);
|
2019-04-05 08:34:19 +03:00
|
|
|
PCIDeviceClass *pc = PCI_DEVICE_GET_CLASS(plugged_dev);
|
2019-04-05 07:51:00 +03:00
|
|
|
SpaprDrc *drc = drc_from_dev(phb, pdev);
|
2016-03-04 00:55:36 +03:00
|
|
|
PCIBus *bus = PCI_BUS(qdev_get_parent_bus(DEVICE(pdev)));
|
|
|
|
uint32_t slotnr = PCI_SLOT(pdev->devfn);
|
2015-05-07 08:33:55 +03:00
|
|
|
|
|
|
|
if (!phb->dr_enabled) {
|
|
|
|
/* if this is a hotplug operation initiated by the user
|
|
|
|
* we need to let them know it's not enabled
|
|
|
|
*/
|
|
|
|
if (plugged_dev->hotplugged) {
|
error: Avoid unnecessary error_propagate() after error_setg()
Replace
error_setg(&err, ...);
error_propagate(errp, err);
by
error_setg(errp, ...);
Related pattern:
if (...) {
error_setg(&err, ...);
goto out;
}
...
out:
error_propagate(errp, err);
return;
When all paths to label out are that way, replace by
if (...) {
error_setg(errp, ...);
return;
}
and delete the label along with the error_propagate().
When we have at most one other path that actually needs to propagate,
and maybe one at the end where propagation is unnecessary, e.g.
foo(..., &err);
if (err) {
goto out;
}
...
bar(..., &err);
out:
error_propagate(errp, err);
return;
move the error_propagate() to where it's needed, like
if (...) {
foo(..., &err);
error_propagate(errp, err);
return;
}
...
bar(..., errp);
return;
and transform the error_setg() as above.
In some places, the transformation results in obviously unnecessary
error_propagate(). The next few commits will eliminate them.
Bonus: the elimination of gotos will make later patches in this series
easier to review.
Candidates for conversion tracked down with this Coccinelle script:
@@
identifier err, errp;
expression list args;
@@
- error_setg(&err, args);
+ error_setg(errp, args);
... when != err
error_propagate(errp, err);
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200707160613.848843-34-armbru@redhat.com>
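The first transformation in this commit message can be tried out with a toy error type. This is a sketch under loud assumptions: `ToyError`, `toy_error_set()` and `toy_error_propagate()` are simplified, hypothetical stand-ins and do not reproduce the real Error API; they only keep the two properties that matter here (a NULL `errp` means the caller ignores errors, and an existing error is never overwritten).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for an error object; hypothetical, not QEMU's Error. */
typedef struct ToyError {
    char msg[64];
} ToyError;

/* Store a new error in *errp unless the caller passed NULL. */
static void toy_error_set(ToyError **errp, const char *msg)
{
    ToyError *e;

    if (!errp) {
        return;                 /* caller ignores errors */
    }
    assert(*errp == NULL);      /* never overwrite an earlier error */
    e = calloc(1, sizeof(*e));
    assert(e != NULL);
    snprintf(e->msg, sizeof(e->msg), "%s", msg);
    *errp = e;
}

/* Hand a local error over to the caller, or drop it. */
static void toy_error_propagate(ToyError **dst, ToyError *src)
{
    if (!src) {
        return;
    }
    if (dst && !*dst) {
        *dst = src;
    } else {
        free(src);
    }
}

/* Before: local error, goto, and a propagate at the label. */
static void check_old(int value, ToyError **errp)
{
    ToyError *err = NULL;

    if (value < 0) {
        toy_error_set(&err, "negative value");
        goto out;
    }
out:
    toy_error_propagate(errp, err);
}

/* After: set the caller's error directly and return. */
static void check_new(int value, ToyError **errp)
{
    if (value < 0) {
        toy_error_set(errp, "negative value");
        return;
    }
}
```

Both variants leave the caller with the same error; the second simply has no label and no local variable to review.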
2020-07-07 19:06:01 +03:00
|
|
|
error_setg(errp, QERR_BUS_NO_HOTPLUG,
|
2015-03-17 13:54:50 +03:00
|
|
|
object_get_typename(OBJECT(phb)));
|
2020-11-21 02:42:00 +03:00
|
|
|
return;
|
2015-05-07 08:33:55 +03:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2019-04-05 08:34:19 +03:00
|
|
|
if (pc->is_bridge) {
|
spapr_pci: Robustify support of PCI bridges
Some recent error handling cleanups unveiled issues with our support of
PCI bridges:
1) QEMU aborts when using non-standard PCI bridge types,
unveiled by commit 7ef1553dac "spapr_pci: Drop some dead error handling"
$ qemu-system-ppc64 -M pseries -device pcie-pci-bridge
Unexpected error in object_property_find() at qom/object.c:1240:
qemu-system-ppc64: -device pcie-pci-bridge: Property '.chassis_nr' not found
Aborted (core dumped)
This happens because we assume all PCI bridge types to have a "chassis_nr"
property. This property only exists with the standard PCI bridge type
"pci-bridge" actually. We could possibly revert 7ef1553dac but it seems
much simpler to check the presence of "chassis_nr" earlier.
2) QEMU abort if same "chassis_nr" value is used several times,
unveiled by commit d2623129a7de "qom: Drop parameter @errp of
object_property_add() & friends"
$ qemu-system-ppc64 -M pseries -device pci-bridge,chassis_nr=1 \
-device pci-bridge,chassis_nr=1
Unexpected error in object_property_try_add() at qom/object.c:1167:
qemu-system-ppc64: -device pci-bridge,chassis_nr=1: attempt to add duplicate property '40000100' to object (type 'container')
Aborted (core dumped)
This happens because we assume that "chassis_nr" values are unique, but
nobody enforces that and we end up generating duplicate DRC ids. The PCI
code doesn't really care for duplicate "chassis_nr" properties since it
is only used to initialize the "Chassis Number Register" of the bridge,
with no functional impact on QEMU. So, even if passing the same value
several times might look weird, it never broke anything before, so
I guess we don't necessarily want to enforce strict checking in the PCI
code now.
Work around both issues in the PAPR code: check that the bridge has a
unique and non-zero "chassis_nr" when plugging it into its parent bus.
Fixes: 05929a6c5dfe ("spapr: Don't use bus number for building DRC ids")
Fixes: 7ef1553dac ("spapr_pci: Drop some dead error handling")
Fixes: d2623129a7de ("qom: Drop parameter @errp of object_property_add() & friends")
Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159431476748.407044.16711294833569014964.stgit@bahia.lan>
[dwg: Move check slightly to a better place]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
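The check described above can be sketched in isolation. This is a toy model only: toy_chassis_nr_is_valid() and its array of numbers already in use are hypothetical and are not the bridge_has_valid_chassis_nr() helper from spapr_pci.c, which works on QOM objects. The sketch shows just the two conditions the message names, non-zero and unique.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the check described above. A candidate
 * chassis number is accepted only if it is non-zero and is not already
 * used by a bridge that was plugged earlier.
 */
static bool toy_chassis_nr_is_valid(const uint8_t *in_use, size_t n_in_use,
                                    uint8_t candidate)
{
    size_t i;

    assert(in_use != NULL || n_in_use == 0);

    if (candidate == 0) {
        return false;       /* zero is rejected */
    }
    for (i = 0; i < n_in_use; i++) {
        if (in_use[i] == candidate) {
            return false;   /* a duplicate would yield a duplicate DRC id */
        }
    }
    return true;
}
```

With numbers 1 and 2 already in use, candidates 0 and 1 are rejected and 3 is accepted.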
2020-07-09 20:12:47 +03:00
|
|
|
if (!bridge_has_valid_chassis_nr(OBJECT(plugged_dev), errp)) {
|
|
|
|
return;
|
|
|
|
}
|
2019-04-05 08:34:19 +03:00
|
|
|
}
|
|
|
|
|
2016-03-04 00:55:36 +03:00
|
|
|
/* Following the QEMU convention used for PCIe multifunction
|
|
|
|
* hotplug, we do not allow functions to be hotplugged to a
|
|
|
|
* slot that already has function 0 present
|
|
|
|
*/
|
|
|
|
if (plugged_dev->hotplugged && bus->devices[PCI_DEVFN(slotnr, 0)] &&
|
|
|
|
PCI_FUNC(pdev->devfn) != 0) {
|
2020-10-06 16:39:58 +03:00
|
|
|
error_setg(errp, "PCI: slot %d function 0 already occupied by %s,"
|
2016-03-04 00:55:36 +03:00
|
|
|
" additional functions can no longer be exposed to guest.",
|
|
|
|
slotnr, bus->devices[PCI_DEVFN(slotnr, 0)]->name);
|
2020-11-21 02:42:00 +03:00
|
|
|
}
|
|
|
|
|
|
|
|
if (drc && drc->dev) {
|
|
|
|
error_setg(errp, "PCI: slot %d already occupied by %s", slotnr,
|
|
|
|
pci_get_function_0(PCI_DEVICE(drc->dev))->name);
|
error: Avoid unnecessary error_propagate() after error_setg()
Replace
error_setg(&err, ...);
error_propagate(errp, err);
by
error_setg(errp, ...);
Related pattern:
if (...) {
error_setg(&err, ...);
goto out;
}
...
out:
error_propagate(errp, err);
return;
When all paths to label out are that way, replace by
if (...) {
error_setg(errp, ...);
return;
}
and delete the label along with the error_propagate().
When we have at most one other path that actually needs to propagate,
and maybe one at the end where propagation is unnecessary, e.g.
foo(..., &err);
if (err) {
goto out;
}
...
bar(..., &err);
out:
error_propagate(errp, err);
return;
move the error_propagate() to where it's needed, like
if (...) {
foo(..., &err);
error_propagate(errp, err);
return;
}
...
bar(..., errp);
return;
and transform the error_setg() as above.
In some places, the transformation results in obviously unnecessary
error_propagate(). The next few commits will eliminate them.
Bonus: the elimination of gotos will make later patches in this series
easier to review.
Candidates for conversion tracked down with this Coccinelle script:
@@
identifier err, errp;
expression list args;
@@
- error_setg(&err, args);
+ error_setg(errp, args);
... when != err
error_propagate(errp, err);
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200707160613.848843-34-armbru@redhat.com>
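A self-contained sketch of the first transformation follows. It does not use QEMU's Error API: ToyError, toy_error_set() and toy_error_propagate() are simplified, hypothetical stand-ins written only for this illustration, and check_before()/check_after() are made-up callers. The point is that both callers leave the same error in *errp, while the second needs neither the local variable nor the label.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Toy error object; an assumption for this sketch, not QEMU's Error. */
typedef struct ToyError {
    char msg[64];
} ToyError;

/* Store a new error in *errp; a NULL errp means the caller ignores errors. */
static void toy_error_set(ToyError **errp, const char *msg)
{
    ToyError *err;

    assert(strlen(msg) < sizeof(err->msg));
    if (!errp) {
        return;
    }
    assert(*errp == NULL);
    err = calloc(1, sizeof(*err));
    if (!err) {
        abort();
    }
    snprintf(err->msg, sizeof(err->msg), "%s", msg);
    *errp = err;
}

/* Hand a local error over to the caller, or drop it if nobody wants it. */
static void toy_error_propagate(ToyError **dst, ToyError *local)
{
    if (!local) {
        return;
    }
    if (dst && !*dst) {
        *dst = local;
    } else {
        free(local);
    }
}

/* Before: set a local error, jump to a label, propagate there. */
static void check_before(int slot_busy, ToyError **errp)
{
    ToyError *err = NULL;

    if (slot_busy) {
        toy_error_set(&err, "slot busy");
        goto out;
    }
out:
    toy_error_propagate(errp, err);
}

/* After: set the error directly in errp and return. */
static void check_after(int slot_busy, ToyError **errp)
{
    if (slot_busy) {
        toy_error_set(errp, "slot busy");
        return;
    }
}
```

Both variants also accept a NULL errp, in which case no error object is left behind.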
2020-07-07 19:06:01 +03:00
|
|
|
return;
|
2017-06-07 04:35:03 +03:00
|
|
|
}
|
2020-11-21 02:42:00 +03:00
|
|
|
}
|
|
|
|
|
|
|
|
static void spapr_pci_plug(HotplugHandler *plug_handler,
|
|
|
|
DeviceState *plugged_dev, Error **errp)
|
|
|
|
{
|
|
|
|
SpaprPhbState *phb = SPAPR_PCI_HOST_BRIDGE(DEVICE(plug_handler));
|
|
|
|
PCIDevice *pdev = PCI_DEVICE(plugged_dev);
|
|
|
|
PCIDeviceClass *pc = PCI_DEVICE_GET_CLASS(plugged_dev);
|
|
|
|
SpaprDrc *drc = drc_from_dev(phb, pdev);
|
|
|
|
uint32_t slotnr = PCI_SLOT(pdev->devfn);
|
2017-06-07 04:35:03 +03:00
|
|
|
|
2020-11-21 02:42:00 +03:00
|
|
|
/*
|
|
|
|
* If DR is disabled we don't need to do anything in the case of
|
|
|
|
* hotplug or coldplug callbacks.
|
|
|
|
*/
|
|
|
|
if (!phb->dr_enabled) {
|
2020-07-07 19:06:01 +03:00
|
|
|
return;
|
2015-05-07 08:33:55 +03:00
|
|
|
}
|
2016-03-04 00:55:36 +03:00
|
|
|
|
2020-11-21 02:42:00 +03:00
|
|
|
g_assert(drc);
|
|
|
|
|
|
|
|
if (pc->is_bridge) {
|
|
|
|
spapr_pci_bridge_plug(phb, PCI_BRIDGE(plugged_dev));
|
|
|
|
}
|
|
|
|
|
|
|
|
/* spapr_pci_pre_plug() already checked the DRC is attachable */
|
2020-12-01 14:37:28 +03:00
|
|
|
spapr_drc_attach(drc, DEVICE(pdev));
|
2020-11-21 02:42:00 +03:00
|
|
|
|
2016-03-04 00:55:36 +03:00
|
|
|
/* If this is function 0, signal hotplug for all the device functions.
|
|
|
|
* Otherwise defer sending the hotplug event.
|
|
|
|
*/
|
spapr: Treat devices added before inbound migration as coldplugged
When migrating a guest which has already had devices hotplugged,
libvirt typically starts the destination QEMU with -incoming defer,
adds those hotplugged devices with QMP, then initiates the incoming
migration.
This causes problems for the management of spapr DRC state. Because
the device is treated as hotplugged, it goes into a DRC state for a
device immediately after it's plugged, but before the guest has
acknowledged its presence. However, chances are the guest on the
source machine *has* acknowledged the device's presence and configured
it.
If the source has fully configured the device, then DRC state won't be
sent in the migration stream: for maximum migration compatibility with
earlier versions we don't migrate DRCs in coldplug-equivalent state.
That means that the DRC effectively changes state over the migrate,
causing problems later on.
In addition, logging hotplug events for these devices isn't what we
want because a) those events should already have been issued on the
source host and b) the event queue should get wiped out by the
incoming state anyway.
In short, what we really want is to treat devices added before an
incoming migration as if they were coldplugged.
To do this, we first add a spapr_drc_hotplugged() helper which
determines if the device is hotplugged in the sense relevant for DRC
state management. We only send hotplug events when this is true.
Second, when we add a device which isn't hotplugged in this sense, we
force a reset of the DRC state - this ensures the DRC is in a
coldplug-equivalent state (there isn't usually a system reset between
these device adds and the incoming migration).
This is based on an earlier patch by Laurent Vivier, cleaned up and
extended.
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Daniel Barboza <danielhb@linux.vnet.ibm.com>
2017-06-09 14:08:10 +03:00
|
|
|
if (!spapr_drc_hotplugged(plugged_dev)) {
|
|
|
|
spapr_drc_reset(drc);
|
|
|
|
} else if (PCI_FUNC(pdev->devfn) == 0) {
|
2016-03-04 00:55:36 +03:00
|
|
|
int i;
|
2020-05-05 18:29:25 +03:00
|
|
|
uint8_t chassis = chassis_from_bus(pci_get_bus(pdev));
|
2016-03-04 00:55:36 +03:00
|
|
|
|
|
|
|
for (i = 0; i < 8; i++) {
|
spapr: Use CamelCase properly
The QEMU coding standard is to use CamelCase for type and structure names,
and the pseries code follows that... sort of. There are quite a lot of
places where we bend the rules in order to preserve the capitalization of
internal acronyms like "PHB", "TCE", "DIMM" and most commonly "sPAPR".
That was a bad idea - it frequently leads to names ending up with hard to
read clusters of capital letters, and means they don't catch the eye as
type identifiers, which is kind of the point of the CamelCase convention in
the first place.
In short, keeping type identifiers looking like CamelCase is more important
than preserving standard capitalization of internal "words". So, this
patch renames a heap of spapr internal type names to a more standard
CamelCase.
In addition to case changes, we also make some other identifier renames:
VIOsPAPR* -> SpaprVio*
The reverse word ordering was only ever used to mitigate the capital
cluster, so revert to the natural ordering.
VIOsPAPRVTYDevice -> SpaprVioVty
VIOsPAPRVLANDevice -> SpaprVioVlan
Brevity, since the "Device" didn't add useful information
sPAPRDRConnector -> SpaprDrc
sPAPRDRConnectorClass -> SpaprDrcClass
Brevity, and makes it clearer this is the same thing as a "DRC"
mentioned in many other places in the code
This is 100% a mechanical search-and-replace patch. It will, however,
conflict with essentially any and all outstanding patches touching the
spapr code.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-06 07:35:37 +03:00
|
|
|
SpaprDrc *func_drc;
|
|
|
|
SpaprDrcClass *func_drck;
|
|
|
|
SpaprDREntitySense state;
|
2016-03-04 00:55:36 +03:00
|
|
|
|
2019-04-10 04:49:28 +03:00
|
|
|
func_drc = drc_from_devfn(phb, chassis, PCI_DEVFN(slotnr, i));
|
2016-03-04 00:55:36 +03:00
|
|
|
func_drck = SPAPR_DR_CONNECTOR_GET_CLASS(func_drc);
|
2017-06-07 04:26:52 +03:00
|
|
|
state = func_drck->dr_entity_sense(func_drc);
|
2016-03-04 00:55:36 +03:00
|
|
|
|
|
|
|
if (state == SPAPR_DR_ENTITY_SENSE_PRESENT) {
|
|
|
|
spapr_hotplug_req_add_by_index(func_drc);
|
|
|
|
}
|
|
|
|
}
|
2015-05-07 08:33:56 +03:00
|
|
|
}
|
2015-05-07 08:33:55 +03:00
|
|
|
}
|
|
|
|
|
2019-04-05 08:34:19 +03:00
|
|
|
static void spapr_pci_bridge_unplug(SpaprPhbState *phb,
|
2020-05-05 18:29:25 +03:00
|
|
|
PCIBridge *bridge)
|
2019-04-05 08:34:19 +03:00
|
|
|
{
|
|
|
|
PCIBus *bus = pci_bridge_get_sec_bus(bridge);
|
|
|
|
|
2020-05-05 18:29:25 +03:00
|
|
|
remove_drcs(phb, bus);
|
2019-04-05 08:34:19 +03:00
|
|
|
}
|
|
|
|
|
2018-12-12 12:16:23 +03:00
|
|
|
static void spapr_pci_unplug(HotplugHandler *plug_handler,
|
|
|
|
DeviceState *plugged_dev, Error **errp)
|
|
|
|
{
|
2019-04-05 08:34:19 +03:00
|
|
|
PCIDeviceClass *pc = PCI_DEVICE_GET_CLASS(plugged_dev);
|
|
|
|
SpaprPhbState *phb = SPAPR_PCI_HOST_BRIDGE(DEVICE(plug_handler));
|
|
|
|
|
2018-12-12 12:16:23 +03:00
|
|
|
/* Some guest versions do not wait for completion of a device
|
|
|
|
* cleanup (generally done asynchronously by the kernel) before
|
|
|
|
* signaling to QEMU that the device is safe, but instead sleep
|
|
|
|
* for some 'safe' period of time. unfortunately on a busy host
|
|
|
|
* this sleep isn't guaranteed to be long enough, resulting in
|
|
|
|
* bad things like IRQ lines being left asserted during final
|
|
|
|
* device removal. to deal with this we call reset just prior
|
|
|
|
* to finalizing the device, which will put the device back into
|
|
|
|
* an 'idle' state, as the device cleanup code expects.
|
|
|
|
*/
|
|
|
|
pci_device_reset(PCI_DEVICE(plugged_dev));
|
2019-04-05 08:34:19 +03:00
|
|
|
|
|
|
|
if (pc->is_bridge) {
|
2020-05-05 18:29:25 +03:00
|
|
|
spapr_pci_bridge_unplug(phb, PCI_BRIDGE(plugged_dev));
|
2019-04-05 08:34:19 +03:00
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
2020-06-10 08:31:56 +03:00
|
|
|
qdev_unrealize(plugged_dev);
|
2018-12-12 12:16:23 +03:00
|
|
|
}
|
|
|
|
|
2017-07-03 09:34:28 +03:00
|
|
|
static void spapr_pci_unplug_request(HotplugHandler *plug_handler,
|
|
|
|
DeviceState *plugged_dev, Error **errp)
|
2015-05-07 08:33:55 +03:00
|
|
|
{
|
2019-03-06 07:35:37 +03:00
|
|
|
SpaprPhbState *phb = SPAPR_PCI_HOST_BRIDGE(DEVICE(plug_handler));
|
2015-05-07 08:33:55 +03:00
|
|
|
PCIDevice *pdev = PCI_DEVICE(plugged_dev);
|
2019-04-05 07:51:00 +03:00
|
|
|
SpaprDrc *drc = drc_from_dev(phb, pdev);
|
2015-05-07 08:33:55 +03:00
|
|
|
|
|
|
|
if (!phb->dr_enabled) {
|
2015-03-17 13:54:50 +03:00
|
|
|
error_setg(errp, QERR_BUS_NO_HOTPLUG,
|
|
|
|
object_get_typename(OBJECT(phb)));
|
2015-05-07 08:33:55 +03:00
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
|
|
|
g_assert(drc);
|
2017-07-03 09:34:28 +03:00
|
|
|
g_assert(drc->dev == plugged_dev);
|
2015-05-07 08:33:55 +03:00
|
|
|
|
2017-06-20 16:02:41 +03:00
|
|
|
if (!spapr_drc_unplug_requested(drc)) {
|
2019-04-05 08:34:19 +03:00
|
|
|
PCIDeviceClass *pc = PCI_DEVICE_GET_CLASS(plugged_dev);
|
2016-03-04 00:55:36 +03:00
|
|
|
uint32_t slotnr = PCI_SLOT(pdev->devfn);
|
2019-03-06 07:35:37 +03:00
|
|
|
SpaprDrc *func_drc;
|
|
|
|
SpaprDrcClass *func_drck;
|
|
|
|
SpaprDREntitySense state;
|
2016-03-04 00:55:36 +03:00
|
|
|
int i;
|
2020-05-05 18:29:25 +03:00
|
|
|
uint8_t chassis = chassis_from_bus(pci_get_bus(pdev));
|
2016-03-04 00:55:36 +03:00
|
|
|
|
2019-04-05 08:34:19 +03:00
|
|
|
if (pc->is_bridge) {
|
|
|
|
error_setg(errp, "PCI: Hot unplug of PCI bridges not supported");
|
2020-03-26 08:12:40 +03:00
|
|
|
return;
|
2019-04-05 08:34:19 +03:00
|
|
|
}
|
2020-03-26 08:27:37 +03:00
|
|
|
if (object_property_get_uint(OBJECT(pdev), "nvlink2-tgt", NULL)) {
|
|
|
|
error_setg(errp, "PCI: Cannot unplug NVLink2 devices");
|
|
|
|
return;
|
|
|
|
}
|
2016-03-04 00:55:36 +03:00
|
|
|
|
|
|
|
/* ensure any other present functions are pending unplug */
|
|
|
|
if (PCI_FUNC(pdev->devfn) == 0) {
|
|
|
|
for (i = 1; i < 8; i++) {
|
2019-04-10 04:49:28 +03:00
|
|
|
func_drc = drc_from_devfn(phb, chassis, PCI_DEVFN(slotnr, i));
|
2016-03-04 00:55:36 +03:00
|
|
|
func_drck = SPAPR_DR_CONNECTOR_GET_CLASS(func_drc);
|
2017-06-07 04:26:52 +03:00
|
|
|
state = func_drck->dr_entity_sense(func_drc);
|
2016-03-04 00:55:36 +03:00
|
|
|
if (state == SPAPR_DR_ENTITY_SENSE_PRESENT
|
2017-06-20 16:02:41 +03:00
|
|
|
&& !spapr_drc_unplug_requested(func_drc)) {
|
spapr_pci: remove all child functions in function zero unplug
There is nothing wrong with how sPAPR handles multifunction PCI
hot unplugs. The problem is that x86 does it simpler. Instead of
removing each non-zero function and then removing function zero,
x86 can remove any function of the slot to trigger the hot unplug.
Libvirt will be directly impacted by this difference, in the
(hopefully soon) PCI Multifunction hot plug/unplug support. For
hot plugs, both x86 and sPAPR will operate the same way: an XML
with all desired functions to be added, then consecutive hotplugs
of all non-zero functions first, zero last. For hot unplugs, at
least in the current state, an XML with the devices to be removed
must also be provided because of how sPAPR operates - x86 does
not need it - since any function unplug will unplug the whole
PCIe slot. This difference puts extra strain on the management
layer, which needs to either handle both archs differently in
the unplug scenario or choose to treat x86 like sPAPR, forcing x86
users to cope with sPAPR internals.
This patch changes spapr_pci_unplug_request to handle the
unplug of function zero differently. When removing function zero,
instead of erroring out if there are any remaining function
DRCs which need detaching, detach those. This has no effect on
any existing scripts that are detaching the non-zero functions
before function zero, and can be used by management as a shortcut
to remove the whole PCI multifunction device without specifying
each child function.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20190822195918.3307-1-danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
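The cascade can be illustrated with a toy slot of eight functions. Everything here is hypothetical (ToyFuncDrc, toy_unplug_request(), the return value); the real code works on SpaprDrc objects and sends events to the guest. The sketch only shows the ordering: a request on function 0 first marks every other present function as pending unplug, then counts the functions for which removal would be signalled, while a request on a non-zero function just marks that function and defers signalling.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_FUNCS_PER_SLOT 8

/* Toy per-function connector state; an assumption for this sketch. */
typedef struct ToyFuncDrc {
    bool present;
    bool unplug_requested;
} ToyFuncDrc;

/*
 * Request unplug of one function. Returns how many functions would have
 * a removal event signalled as a result (0 when signalling is deferred).
 */
static int toy_unplug_request(ToyFuncDrc *slot, unsigned func)
{
    int i, signalled = 0;

    assert(func < TOY_FUNCS_PER_SLOT && slot[func].present);

    if (func == 0) {
        /* Cascade: pull in every other present function. */
        for (i = 1; i < TOY_FUNCS_PER_SLOT; i++) {
            if (slot[i].present && !slot[i].unplug_requested) {
                slot[i].unplug_requested = true;
            }
        }
    }
    slot[func].unplug_requested = true;

    if (func == 0) {
        /* Signal removal for all present functions, highest first. */
        for (i = TOY_FUNCS_PER_SLOT - 1; i >= 0; i--) {
            if (slot[i].present) {
                signalled++;
            }
        }
    }
    return signalled;
}
```

Scripts that unplug the non-zero functions first still work in this model: by the time function 0 is requested, the cascade loop simply finds nothing left to mark.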
2019-08-22 22:59:18 +03:00
|
|
|
/*
|
|
|
|
* Attempting to remove function 0 of a multifunction
|
|
|
|
* device will cascade into removing all child
|
|
|
|
* functions, even if their unplug wasn't requested
|
|
|
|
* beforehand.
|
|
|
|
*/
|
spapr: rename spapr_drc_detach() to spapr_drc_unplug_request()
spapr_drc_detach() is not the best name for what the function does. The
function does not detach the DRC, it makes an uncommitted attempt to do
it. It'll mark the DRC as pending unplug, via the 'unplug_request'
flag, and only if the DRC state is drck->empty_state it will detach the
DRC, via spapr_drc_release().
This is a contrast with its pair spapr_drc_attach(), where the function
is indeed creating the DRC QOM object. If you know what
spapr_drc_attach() does, you can be misled into thinking that
spapr_drc_detach() is removing the DRC from QEMU internal state, which
isn't true.
The current role of this function is better described as a request for
detach, since there's no guarantee that we're going to detach the DRC in
the end. Rename the function to spapr_drc_unplug_request to reflect
what it is doing.
The initial idea was to change the name to spapr_drc_detach_request(),
and later on change the unplug_request flag to detach_request. However,
unplug_request has been a migratable boolean for a long time now and
renaming it is not worth the trouble. spapr_drc_unplug_request() setting
drc->unplug_request is more natural than spapr_drc_detach_request
setting drc->unplug_request.
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20210222194531.62717-3-danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-02-22 22:45:28 +03:00
|
|
|
spapr_drc_unplug_request(func_drc);
|
2016-03-04 00:55:36 +03:00
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2021-02-22 22:45:28 +03:00
|
|
|
spapr_drc_unplug_request(drc);
|
2016-03-04 00:55:36 +03:00
|
|
|
|
|
|
|
/* if this isn't func 0, defer unplug event. otherwise signal removal
|
|
|
|
* for all present functions
|
|
|
|
*/
|
|
|
|
if (PCI_FUNC(pdev->devfn) == 0) {
|
|
|
|
for (i = 7; i >= 0; i--) {
|
2019-04-10 04:49:28 +03:00
|
|
|
func_drc = drc_from_devfn(phb, chassis, PCI_DEVFN(slotnr, i));
|
2016-03-04 00:55:36 +03:00
|
|
|
func_drck = SPAPR_DR_CONNECTOR_GET_CLASS(func_drc);
|
2017-06-07 04:26:52 +03:00
|
|
|
state = func_drck->dr_entity_sense(func_drc);
|
2016-03-04 00:55:36 +03:00
|
|
|
if (state == SPAPR_DR_ENTITY_SENSE_PRESENT) {
|
|
|
|
spapr_hotplug_req_remove_by_index(func_drc);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
2021-02-26 19:33:00 +03:00
|
|
|
} else {
|
|
|
|
error_setg(errp,
|
|
|
|
"PCI device unplug already in progress for device %s",
|
|
|
|
drc->dev->id);
|
2015-05-07 08:33:55 +03:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2019-02-19 20:18:18 +03:00
|
|
|
static void spapr_phb_finalizefn(Object *obj)
|
|
|
|
{
|
2019-03-06 07:35:37 +03:00
|
|
|
SpaprPhbState *sphb = SPAPR_PCI_HOST_BRIDGE(obj);
|
2019-02-19 20:18:18 +03:00
|
|
|
|
|
|
|
g_free(sphb->dtbusname);
|
|
|
|
sphb->dtbusname = NULL;
|
|
|
|
}
|
|
|
|
|
qdev: Unrealize must not fail
Devices may have component devices and buses.
Device realization may fail. Realization is recursive: a device's
realize() method realizes its components, and device_set_realized()
realizes its buses (which should in turn realize the devices on that
bus, except bus_set_realized() doesn't implement that, yet).
When realization of a component or bus fails, we need to roll back:
unrealize everything we realized so far. If any of these unrealizes
failed, the device would be left in an inconsistent state. Must not
happen.
device_set_realized() lets it happen: it ignores errors in the roll
back code starting at label child_realize_fail.
Since realization is recursive, unrealization must be recursive, too.
But how could a partly failed unrealize be rolled back? We'd have to
re-realize, which can fail. This design is fundamentally broken.
device_set_realized() does not roll back at all. Instead, it keeps
unrealizing, ignoring further errors.
It can screw up even for a device with no buses: if the lone
dc->unrealize() fails, it still unregisters vmstate, and calls
listeners' unrealize() callback.
bus_set_realized() does not roll back either. Instead, it stops
unrealizing.
Fortunately, no unrealize method can fail, as we'll see below.
To fix the design error, drop parameter @errp from all the unrealize
methods.
Any unrealize method that uses @errp now needs an update. This leads
us to unrealize() methods that can fail. Merely passing it to another
unrealize method cannot cause failure, though. Here are the ones that
do other things with @errp:
* virtio_serial_device_unrealize()
Fails when qbus_set_hotplug_handler() fails, but still does all the
other work. On failure, the device would stay realized with its
resources completely gone. Oops. Can't happen, because
qbus_set_hotplug_handler() can't actually fail here. Pass
&error_abort to qbus_set_hotplug_handler() instead.
* hw/ppc/spapr_drc.c's unrealize()
Fails when object_property_del() fails, but all the other work is
already done. On failure, the device would stay realized with its
vmstate registration gone. Oops. Can't happen, because
object_property_del() can't actually fail here. Pass &error_abort
to object_property_del() instead.
* spapr_phb_unrealize()
Fails and bails out when remove_drcs() fails, but other work is
already done. On failure, the device would stay realized with some
of its resources gone. Oops. remove_drcs() fails only when
chassis_from_bus()'s object_property_get_uint() fails, and it can't
here. Pass &error_abort to remove_drcs() instead.
Therefore, no unrealize method can fail before this patch.
device_set_realized()'s recursive unrealization via bus uses
object_property_set_bool(). Can't drop @errp there, so pass
&error_abort.
We similarly unrealize with object_property_set_bool() elsewhere,
always ignoring errors. Pass &error_abort instead.
Several unrealize methods no longer handle errors from other unrealize
methods: virtio_9p_device_unrealize(),
virtio_input_device_unrealize(), scsi_qdev_unrealize(), ...
Much of the deleted error handling looks wrong anyway.
One unrealize method no longer ignores such errors:
usb_ehci_pci_exit().
Several realize methods no longer ignore errors when rolling back:
v9fs_device_realize_common(), pci_qdev_unrealize(),
spapr_phb_realize(), usb_qdev_realize(), vfio_ccw_realize(),
virtio_device_realize().
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20200505152926.18877-17-armbru@redhat.com>
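Why an infallible unrealize makes rollback simple can be seen in a toy model. ToyPart, toy_part_realize(), toy_part_unrealize() and toy_device_realize() are hypothetical names and do not correspond to qdev functions. Realize may fail and reports that through its return value; unrealize returns void, so the rollback loop has no error path of its own and cannot leave the device half undone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy component; fail_on_realize lets a test inject a failure. */
typedef struct ToyPart {
    bool realized;
    bool fail_on_realize;
} ToyPart;

/* Realization may fail. */
static bool toy_part_realize(ToyPart *p)
{
    if (p->fail_on_realize) {
        return false;
    }
    p->realized = true;
    return true;
}

/* Unrealization cannot fail, so there is nothing to report. */
static void toy_part_unrealize(ToyPart *p)
{
    p->realized = false;
}

/* Realize all parts in order; on failure undo the ones already done. */
static bool toy_device_realize(ToyPart *parts, size_t n)
{
    size_t i;

    assert(parts != NULL || n == 0);

    for (i = 0; i < n; i++) {
        if (!toy_part_realize(&parts[i])) {
            while (i > 0) {
                toy_part_unrealize(&parts[--i]);
            }
            return false;
        }
    }
    return true;
}
```

If the third of three parts fails to realize, the first two are unrealized again and the call reports failure.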
2020-05-05 18:29:24 +03:00
|
|
|
static void spapr_phb_unrealize(DeviceState *dev)
|
2019-02-19 20:18:18 +03:00
|
|
|
{
|
2019-03-06 07:35:37 +03:00
|
|
|
SpaprMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
|
2019-02-19 20:18:18 +03:00
|
|
|
SysBusDevice *s = SYS_BUS_DEVICE(dev);
|
|
|
|
PCIHostState *phb = PCI_HOST_BRIDGE(s);
|
2019-03-06 07:35:37 +03:00
|
|
|
SpaprPhbState *sphb = SPAPR_PCI_HOST_BRIDGE(phb);
|
|
|
|
SpaprTceTable *tcet;
|
2019-02-19 20:18:18 +03:00
|
|
|
int i;
|
|
|
|
const unsigned windows_supported = spapr_phb_windows_supported(sphb);
|
|
|
|
|
spapr: Support NVIDIA V100 GPU with NVLink2
NVIDIA V100 GPUs have on-board RAM which is mapped into the host memory
space and accessible as normal RAM via an NVLink bus. The VFIO-PCI driver
implements special regions for such GPUs and emulates an NVLink bridge.
NVLink2-enabled POWER9 CPUs also provide address translation services
which include an ATS shootdown (ATSD) register exported via the NVLink
bridge device.
This adds a quirk to VFIO to map the GPU memory and create an MR;
the new MR is stored in a PCI device as a QOM link. The sPAPR PCI uses
this to get the MR and map it to the system address space.
Another quirk does the same for ATSD.
This adds additional steps to sPAPR PHB setup:
1. Search for specific GPUs and NPUs, collect findings in
sPAPRPHBState::nvgpus, manage system address space mappings;
2. Add device-specific properties such as "ibm,npu", "ibm,gpu",
"memory-block", "link-speed" to advertise the NVLink2 function to
the guest;
3. Add "mmio-atsd" to vPHB to advertise the ATSD capability;
4. Add new memory blocks (with extra "linux,memory-usable" to prevent
the guest OS from accessing the new memory until it is onlined) and
npuphb# nodes representing an NPU unit for every vPHB as the GPU driver
uses it for link discovery.
This allocates space for GPU RAM and ATSD like we do for MMIOs by
adding 2 new parameters to the phb_placement() hook. Older machine types
set these to zero.
This puts new memory nodes in a separate NUMA node, as the GPU RAM
needs to be configured equally distant from any other node in the system.
Unlike the host setup which assigns numa ids from 255 downwards, this
adds new NUMA nodes after the user configures nodes or from 1 if none
were configured.
This adds a requirement similar to EEH - one IOMMU group per vPHB.
The reason for this is that ATSD registers belong to a physical NPU
so they cannot invalidate translations on GPUs attached to another NPU.
It is guaranteed by the host platform as it does not mix NVLink bridges
or GPUs from different NPU in the same IOMMU group. If more than one
IOMMU group is detected on a vPHB, this disables ATSD support for that
vPHB and prints a warning.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[aw: for vfio portions]
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Message-Id: <20190312082103.130561-1-aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 11:21:03 +03:00
|
|
|
spapr_phb_nvgpu_free(sphb);
|
|
|
|
|
2019-02-19 20:18:18 +03:00
|
|
|
if (sphb->msi) {
|
|
|
|
g_hash_table_unref(sphb->msi);
|
|
|
|
sphb->msi = NULL;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Remove IO/MMIO subregions and aliases, rest should get cleaned
|
|
|
|
* via PHB's unrealize->object_finalize
|
|
|
|
*/
|
|
|
|
for (i = windows_supported - 1; i >= 0; i--) {
|
|
|
|
tcet = spapr_tce_find_by_liobn(sphb->dma_liobn[i]);
|
|
|
|
if (tcet) {
|
|
|
|
memory_region_del_subregion(&sphb->iommu_root,
|
|
|
|
spapr_tce_get_iommu(tcet));
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2020-05-05 18:29:25 +03:00
|
|
|
remove_drcs(sphb, phb->bus);
|
2019-02-19 20:18:18 +03:00
|
|
|
|
|
|
|
for (i = PCI_NUM_PINS - 1; i >= 0; i--) {
|
|
|
|
if (sphb->lsi_table[i].irq) {
|
|
|
|
spapr_irq_free(spapr, sphb->lsi_table[i].irq, 1);
|
|
|
|
sphb->lsi_table[i].irq = 0;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
QLIST_REMOVE(sphb, list);
|
|
|
|
|
|
|
|
memory_region_del_subregion(&sphb->iommu_root, &sphb->msiwindow);
|
|
|
|
|
2019-06-21 12:27:33 +03:00
|
|
|
/*
|
|
|
|
* An attached PCI device may have memory listeners, e.g. VFIO PCI. We have
|
|
|
|
* unmapped all sections. Remove the listeners now, before destroying the
|
|
|
|
* address space.
|
|
|
|
*/
|
|
|
|
address_space_remove_listeners(&sphb->iommu_as);
|
2019-02-19 20:18:18 +03:00
|
|
|
address_space_destroy(&sphb->iommu_as);
|
|
|
|
|
qdev: Drop qbus_set_hotplug_handler() parameter @errp
qbus_set_hotplug_handler() is a simple wrapper around
object_property_set_link().
object_property_set_link() fails when the property doesn't exist, is
not settable, or its .check() method fails. These are all programming
errors here, so passing &error_abort to qbus_set_hotplug_handler() is
appropriate.
Most of its callers do. Exceptions:
* pcie_cap_slot_init(), shpc_init(), spapr_phb_realize() pass NULL,
i.e. they ignore errors.
* spapr_machine_init() passes &error_fatal.
* s390_pcihost_realize(), virtio_serial_device_realize(),
s390_pcihost_plug() pass the error to their callers. The latter two
keep going after the error, which looks wrong.
Drop the @errp parameter, and instead pass &error_abort to
object_property_set_link().
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Daniel P. Berrangé" <berrange@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20200630090351.1247703-15-armbru@redhat.com>
2020-06-30 12:03:39 +03:00
|
|
|
qbus_set_hotplug_handler(BUS(phb->bus), NULL);
|
2019-02-19 20:18:18 +03:00
|
|
|
pci_unregister_root_bus(phb->bus);
|
|
|
|
|
|
|
|
memory_region_del_subregion(get_system_memory(), &sphb->iowindow);
|
|
|
|
if (sphb->mem64_win_pciaddr != (hwaddr)-1) {
|
|
|
|
memory_region_del_subregion(get_system_memory(), &sphb->mem64window);
|
|
|
|
}
|
|
|
|
memory_region_del_subregion(get_system_memory(), &sphb->mem32window);
|
|
|
|
}
|
|
|
|
|
2019-07-26 17:44:38 +03:00
|
|
|
static void spapr_phb_destroy_msi(gpointer opaque)
|
|
|
|
{
|
|
|
|
SpaprMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
|
|
|
|
SpaprMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
|
2019-08-28 21:20:44 +03:00
|
|
|
SpaprPciMsi *msi = opaque;
|
2019-07-26 17:44:38 +03:00
|
|
|
|
|
|
|
if (!smc->legacy_irq_allocation) {
|
|
|
|
spapr_irq_msi_free(spapr, msi->first_irq, msi->num);
|
|
|
|
}
|
|
|
|
spapr_irq_free(spapr, msi->first_irq, msi->num);
|
|
|
|
g_free(msi);
|
|
|
|
}
|
|
|
|
|
2013-11-21 08:08:55 +04:00
|
|
|
static void spapr_phb_realize(DeviceState *dev, Error **errp)
|
2011-10-30 21:16:46 +04:00
|
|
|
{
|
2020-08-10 19:53:58 +03:00
|
|
|
ERRP_GUARD();
|
2017-10-12 19:30:14 +03:00
|
|
|
/* We don't use SPAPR_MACHINE() in order to exit gracefully if the user
|
|
|
|
* tries to add a sPAPR PHB to a non-pseries machine.
|
|
|
|
*/
|
2019-03-06 07:35:37 +03:00
|
|
|
SpaprMachineState *spapr =
|
|
|
|
(SpaprMachineState *) object_dynamic_cast(qdev_get_machine(),
|
2017-10-12 19:30:14 +03:00
|
|
|
TYPE_SPAPR_MACHINE);
|
2019-03-06 07:35:37 +03:00
|
|
|
SpaprMachineClass *smc = spapr ? SPAPR_MACHINE_GET_CLASS(spapr) : NULL;
|
2013-11-21 08:08:55 +04:00
|
|
|
SysBusDevice *s = SYS_BUS_DEVICE(dev);
|
2019-03-06 07:35:37 +03:00
|
|
|
SpaprPhbState *sphb = SPAPR_PCI_HOST_BRIDGE(s);
|
2012-08-20 21:08:08 +04:00
|
|
|
PCIHostState *phb = PCI_HOST_BRIDGE(s);
|
2019-08-09 09:57:24 +03:00
|
|
|
MachineState *ms = MACHINE(spapr);
|
2012-03-12 21:50:24 +04:00
|
|
|
char *namebuf;
|
|
|
|
int i;
|
2011-10-30 21:16:46 +04:00
|
|
|
PCIBus *bus;
|
2014-08-27 20:17:12 +04:00
|
|
|
uint64_t msi_window_size = 4096;
|
2019-03-06 07:35:37 +03:00
|
|
|
SpaprTceTable *tcet;
|
2019-02-19 20:18:18 +03:00
|
|
|
const unsigned windows_supported = spapr_phb_windows_supported(sphb);
|
2011-10-30 21:16:46 +04:00
|
|
|
|
2017-10-12 19:30:14 +03:00
|
|
|
if (!spapr) {
|
|
|
|
error_setg(errp, TYPE_SPAPR_PCI_HOST_BRIDGE " needs a pseries machine");
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
2019-02-19 20:18:49 +03:00
|
|
|
assert(sphb->index != (uint32_t)-1); /* checked in spapr_phb_pre_plug() */
|
2013-01-23 21:20:39 +04:00
|
|
|
|
spapr_pci: Add a 64-bit MMIO window
On real hardware, and under pHyp, the PCI host bridges on Power machines
typically advertise two outbound MMIO windows from the guest's physical
memory space to PCI memory space:
- A 32-bit window which maps onto 2GiB..4GiB in the PCI address space
- A 64-bit window which maps onto a large region somewhere high in PCI
address space (traditionally this used an identity mapping from guest
physical address to PCI address, but that's not always the case)
The qemu implementation in spapr-pci-host-bridge, however, only supports a
single outbound MMIO window. At least some Linux versions expect
the two windows, so we arranged this window to map onto the PCI
memory space from 2 GiB..~64 GiB, then advertised it as two contiguous
windows, the "32-bit" window from 2G..4G and the "64-bit" window from
4G..~64G.
This approach means, however, that the 64G window is not naturally aligned.
In turn this limits the size of the largest BAR we can map (which does have
to be naturally aligned) to roughly half of the total window. With some
large nVidia GPGPU cards which have huge memory BARs, this is starting to
be a problem.
This patch adds true support for separate 32-bit and 64-bit outbound MMIO
windows to the spapr-pci-host-bridge implementation, each of which can
be independently configured. The 32-bit window always maps to 2G.. in PCI
space, but the PCI address of the 64-bit window can be configured (it
defaults to the same as the guest physical address).
So as not to break possible existing configurations, as long as a 64-bit
window is not specified, a large single window can be specified. This
will appear the same way to the guest as the old approach, although it's
now implemented by two contiguous memory regions rather than a single one.
For now, this only adds the possibility of 64-bit windows. The default
configuration still uses the legacy mode.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2016-10-11 06:23:33 +03:00
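The alignment argument in the commit message above can be checked with a small standalone sketch. It is not QEMU code: `largest_aligned_bar()` is a hypothetical helper, and it assumes a BAR must be a power of two in size and placed at a multiple of its own size. Under those assumptions, a single 2G..64G window can hold at most a 32 GiB BAR, roughly half of the 62 GiB window.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of QEMU: return the size of the largest
 * power-of-two block that fits in the window [start, end) at an address
 * that is a multiple of its own size (a "naturally aligned" BAR).
 */
static uint64_t largest_aligned_bar(uint64_t start, uint64_t end)
{
    assert(start < end);

    for (uint64_t size = 1ULL << 63; size != 0; size >>= 1) {
        /* Round start up to the next multiple of size. */
        uint64_t base = (start + size - 1) & ~(size - 1);

        /* base < start means the round-up wrapped around 2^64. */
        if (base >= start && base < end && end - base >= size) {
            return size;
        }
    }
    return 0; /* not reached: a 1-byte block always fits when start < end */
}
```

Calling `largest_aligned_bar(2ULL << 30, 64ULL << 30)` models the old single window, while `largest_aligned_bar(0, 64ULL << 30)` models a window whose base is itself 64 GiB aligned and can therefore hold a BAR as large as the window.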
|
|
|
if (sphb->mem64_win_size != 0) {
|
|
|
|
if (sphb->mem_win_size > SPAPR_PCI_MEM32_WIN_SIZE) {
|
|
|
|
error_setg(errp, "32-bit memory window of size 0x%"HWADDR_PRIx
|
|
|
|
" (max 2 GiB)", sphb->mem_win_size);
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
spapr_pci: make index property mandatory
PHBs can be created with an index property, in which case the machine
code automatically sets all the MMIO windows at addresses derived from
the index. Alternatively, they can be manually created without index,
but the user has to provide addresses for all MMIO windows.
The non-index way happens to be more trouble than it's worth: it's
difficult to use, keeps requiring (potentially incompatible) changes
when some new parameter needs adding, and is awkward to check for
collisions. It currently even has a bug that prevents the use of two
non-index PHBs because their child DRCs are all derived from the
same index == -1 value, and, thus, collide.
This patch hence makes the index property mandatory. As a consequence,
the PHB's memory regions and BUID are now always configured according
to the index, and it is no longer possible to set them from the command
line.
This DOES BREAK backwards compat, but we don't think the non-index
PHB feature was used in practice (at least libvirt doesn't) and the
simplification is worth it.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-20 17:46:20 +03:00
|
|
|
/* 64-bit window defaults to identity mapping */
|
|
|
|
sphb->mem64_win_pciaddr = sphb->mem64_win_addr;
|
2016-10-11 06:23:33 +03:00
|
|
|
} else if (sphb->mem_win_size > SPAPR_PCI_MEM32_WIN_SIZE) {
|
|
|
|
/*
|
|
|
|
* For compatibility with old configuration, if no 64-bit MMIO
|
|
|
|
* window is specified, but the ordinary (32-bit) memory
|
|
|
|
* window is specified as > 2GiB, we treat it as a 2GiB 32-bit
|
|
|
|
* window, with a 64-bit MMIO window following on immediately
|
|
|
|
* afterwards
|
|
|
|
*/
|
|
|
|
sphb->mem64_win_size = sphb->mem_win_size - SPAPR_PCI_MEM32_WIN_SIZE;
|
|
|
|
sphb->mem64_win_addr = sphb->mem_win_addr + SPAPR_PCI_MEM32_WIN_SIZE;
|
|
|
|
sphb->mem64_win_pciaddr =
|
|
|
|
SPAPR_PCI_MEM_WIN_BUS_OFFSET + SPAPR_PCI_MEM32_WIN_SIZE;
|
|
|
|
sphb->mem_win_size = SPAPR_PCI_MEM32_WIN_SIZE;
|
|
|
|
}
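The window fix-up performed by the if/else chain above can be exercised outside QEMU with the following minimal sketch. The struct, the function name `phb_fixup_windows()` and both macro values are assumptions made for illustration: the 2 GiB limit is taken from the "(max 2 GiB)" error text, and the 2G PCI base from the commit message stating that the 32-bit window always maps to 2G.. in PCI space.

```c
#include <stdint.h>

/* Assumed stand-ins for SPAPR_PCI_MEM32_WIN_SIZE and
 * SPAPR_PCI_MEM_WIN_BUS_OFFSET; the real values live in QEMU headers. */
#define MEM32_WIN_SIZE      (2ULL << 30)
#define MEM_WIN_BUS_OFFSET  (2ULL << 30)

/* Hypothetical reduction of the window fields used by the realize code. */
struct PhbWindows {
    uint64_t mem_win_addr, mem_win_size;
    uint64_t mem64_win_addr, mem64_win_size, mem64_win_pciaddr;
};

/* Returns 0 on success, -1 if the 32-bit window is too large. */
static int phb_fixup_windows(struct PhbWindows *w)
{
    if (w->mem64_win_size != 0) {
        if (w->mem_win_size > MEM32_WIN_SIZE) {
            return -1;
        }
        /* 64-bit window defaults to identity mapping. */
        w->mem64_win_pciaddr = w->mem64_win_addr;
    } else if (w->mem_win_size > MEM32_WIN_SIZE) {
        /* Legacy mode: split one large window into 2 GiB + remainder. */
        w->mem64_win_size = w->mem_win_size - MEM32_WIN_SIZE;
        w->mem64_win_addr = w->mem_win_addr + MEM32_WIN_SIZE;
        w->mem64_win_pciaddr = MEM_WIN_BUS_OFFSET + MEM32_WIN_SIZE;
        w->mem_win_size = MEM32_WIN_SIZE;
    }
    return 0;
}
```

With a single 62 GiB window and no 64-bit window configured, the sketch yields a 2 GiB 32-bit window followed immediately by a 60 GiB 64-bit window that starts at 4G in PCI space, which is the layout the guest saw under the old single-window approach.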
|
|
|
|
|
2015-05-07 08:33:34 +03:00
|
|
|
if (spapr_pci_find_phb(spapr, sphb->buid)) {
|
2019-05-29 20:15:09 +03:00
|
|
|
SpaprPhbState *s;
|
|
|
|
|
|
|
|
error_setg(errp, "PCI host bridges must have unique indexes");
|
|
|
|
error_append_hint(errp, "The following indexes are already in use:");
|
|
|
|
QLIST_FOREACH(s, &spapr->phbs, list) {
|
|
|
|
error_append_hint(errp, " %d", s->index);
|
|
|
|
}
|
|
|
|
error_append_hint(errp, "\nTry another value for the index property\n");
|
2013-11-21 08:08:55 +04:00
|
|
|
return;
|
2013-01-23 21:20:39 +04:00
|
|
|
}
|
|
|
|
|
2016-10-18 23:50:23 +03:00
|
|
|
if (sphb->numa_node != -1 &&
|
2019-08-09 09:57:24 +03:00
|
|
|
(sphb->numa_node >= MAX_NODES ||
|
|
|
|
!ms->numa_state->nodes[sphb->numa_node].present)) {
|
2016-10-18 23:50:23 +03:00
|
|
|
error_setg(errp, "Invalid NUMA node ID for PCI host bridge");
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
2012-08-20 21:08:05 +04:00
|
|
|
sphb->dtbusname = g_strdup_printf("pci@%" PRIx64, sphb->buid);
|
2013-01-23 21:20:39 +04:00
|
|
|
|
2012-03-12 21:50:24 +04:00
|
|
|
/* Initialize memory regions */
|
2017-09-11 13:14:12 +03:00
|
|
|
namebuf = g_strdup_printf("%s.mmio", sphb->dtbusname);
|
2013-11-06 22:25:21 +04:00
|
|
|
memory_region_init(&sphb->memspace, OBJECT(sphb), namebuf, UINT64_MAX);
|
2017-09-11 13:14:12 +03:00
|
|
|
g_free(namebuf);
|
2011-10-30 21:16:46 +04:00
|
|
|
|
2017-09-11 13:14:12 +03:00
|
|
|
namebuf = g_strdup_printf("%s.mmio32-alias", sphb->dtbusname);
|
2016-10-11 06:23:33 +03:00
|
|
|
memory_region_init_alias(&sphb->mem32window, OBJECT(sphb),
|
2013-06-07 05:25:08 +04:00
|
|
|
namebuf, &sphb->memspace,
|
2012-08-20 21:08:05 +04:00
|
|
|
SPAPR_PCI_MEM_WIN_BUS_OFFSET, sphb->mem_win_size);
|
2017-09-11 13:14:12 +03:00
|
|
|
g_free(namebuf);
|
2012-08-20 21:08:05 +04:00
|
|
|
memory_region_add_subregion(get_system_memory(), sphb->mem_win_addr,
|
2016-10-11 06:23:33 +03:00
|
|
|
&sphb->mem32window);
|
|
|
|
|
2017-09-20 17:46:20 +03:00
|
|
|
if (sphb->mem64_win_size != 0) {
|
2017-09-11 13:14:19 +03:00
|
|
|
namebuf = g_strdup_printf("%s.mmio64-alias", sphb->dtbusname);
|
|
|
|
memory_region_init_alias(&sphb->mem64window, OBJECT(sphb),
|
|
|
|
namebuf, &sphb->memspace,
|
|
|
|
sphb->mem64_win_pciaddr, sphb->mem64_win_size);
|
|
|
|
g_free(namebuf);
|
|
|
|
|
2017-09-20 17:46:20 +03:00
|
|
|
memory_region_add_subregion(get_system_memory(),
|
|
|
|
sphb->mem64_win_addr,
|
|
|
|
&sphb->mem64window);
|
2017-09-11 13:14:19 +03:00
|
|
|
}
|
2011-10-30 21:16:46 +04:00
|
|
|
|
2014-02-07 17:44:17 +04:00
|
|
|
/* Initialize IO regions */
|
2017-09-11 13:14:12 +03:00
|
|
|
namebuf = g_strdup_printf("%s.io", sphb->dtbusname);
|
2013-06-07 05:25:08 +04:00
|
|
|
memory_region_init(&sphb->iospace, OBJECT(sphb),
|
|
|
|
namebuf, SPAPR_PCI_IO_WIN_SIZE);
|
2017-09-11 13:14:12 +03:00
|
|
|
g_free(namebuf);
|
2011-10-30 21:16:46 +04:00
|
|
|
|
2017-09-11 13:14:12 +03:00
|
|
|
namebuf = g_strdup_printf("%s.io-alias", sphb->dtbusname);
|
2013-07-22 17:54:14 +04:00
|
|
|
memory_region_init_alias(&sphb->iowindow, OBJECT(sphb), namebuf,
|
2014-02-07 17:44:17 +04:00
|
|
|
&sphb->iospace, 0, SPAPR_PCI_IO_WIN_SIZE);
|
2017-09-11 13:14:12 +03:00
|
|
|
g_free(namebuf);
|
2012-08-20 21:08:05 +04:00
|
|
|
memory_region_add_subregion(get_system_memory(), sphb->io_win_addr,
|
2012-10-29 21:24:57 +04:00
|
|
|
&sphb->iowindow);
|
2014-03-06 07:11:00 +04:00
|
|
|
|
spapr_pci: Fix broken naming of PCI bus
Recent commit 5cf0d326a0fe fixed a regression which was preventing the
guest from accessing the extended config space of a PCIe device. This was
done by introducing a new PCI bus subtype for PAPR. The original fix
was causing PCI busses to be named "spapr-pci-host-bridge-root-bus.N"
instead of "pci.N", which was making upper layers unhappy of course.
This got worked around by hardcoding the PCI bus name to "pci.0", but
this only works for the default PHB. And we're now hitting:
# qemu-system-ppc64 \
-device spapr-pci-host-bridge,index=1 \
-device e1000e,bus=pci.0 \
-device e1000e,bus=pci.1
qemu-system-ppc64: -device e1000e,bus=pci.1: Bus 'pci.1' not found
David already posted some patches [1] to control PCI extended config
space accesses with a new flag in the base PCI bus class instead of
subtyping. These patches are a bit more intrusive though, and
are targeted for 4.1.
When no name is passed to pci_register_bus(), the core device code
generates a lowercase name based on the QOM typename. The typename
for the base PCI bus class is "PCI", hence the "pci.0", "pci.1"
bus names. Rename the type of the PAPR PCI bus to "pci", so that
the QOM code can generate proper names. This is a hack but it is
enough to fix the regression. And all this will be reworked properly
in 4.1.
[1] https://patchwork.ozlabs.org/project/qemu-devel/list/?series=100486
Fixes: 5cf0d326a0fe
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155500034416.646888.1307366522340665522.stgit@bahia.lab.toulouse-stg.fr.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
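The naming rule described in this commit message (a lowercased type name followed by a per-type index) can be sketched roughly as follows. This is a standalone illustration, not the QEMU core code: `make_bus_name` and its parameters are hypothetical names introduced here.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not a QEMU function): write
 * "<lowercased type name>.<index>" into buf, truncating if needed.
 */
static void make_bus_name(char *buf, size_t len, const char *type_name,
                          int index)
{
    size_t i;
    size_t used;

    assert(len > 0);

    /* Copy the type name in lowercase, leaving room for the NUL. */
    for (i = 0; type_name[i] != '\0' && i + 1 < len; i++) {
        buf[i] = (char)tolower((unsigned char)type_name[i]);
    }
    buf[i] = '\0';

    /* Append ".<index>" in the remaining space. */
    used = strlen(buf);
    snprintf(buf + used, len - used, ".%d", index);
}
```

With a type name of "PCI" and index 1 this yields "pci.1", while a subtype named "spapr-pci-host-bridge-root-bus" yields the unwanted "spapr-pci-host-bridge-root-bus.N" form that the message describes.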
2019-04-11 19:32:24 +03:00
|
|
|
bus = pci_register_root_bus(dev, NULL,
|
2019-04-05 19:30:48 +03:00
|
|
|
pci_spapr_set_irq, pci_swizzle_map_irq_fn, sphb,
|
2017-11-29 11:46:22 +03:00
|
|
|
&sphb->memspace, &sphb->iospace,
|
2019-04-01 20:55:08 +03:00
|
|
|
PCI_DEVFN(0, 0), PCI_NUM_PINS,
|
2019-05-13 09:19:37 +03:00
|
|
|
TYPE_PCI_BUS);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Despite resembling a vanilla PCI bus in most ways, the PAPR
|
|
|
|
* para-virtualized PCI bus *does* permit PCI-E extended config
|
|
|
|
* space access
|
|
|
|
*/
|
|
|
|
if (sphb->pcie_ecs) {
|
|
|
|
bus->flags |= PCI_BUS_EXTENDED_CONFIG_SPACE;
|
|
|
|
}
|
2012-08-20 21:08:05 +04:00
|
|
|
phb->bus = bus;
|
qdev: Drop qbus_set_hotplug_handler() parameter @errp
qbus_set_hotplug_handler() is a simple wrapper around
object_property_set_link().
object_property_set_link() fails when the property doesn't exist, is
not settable, or its .check() method fails. These are all programming
errors here, so passing &error_abort to qbus_set_hotplug_handler() is
appropriate.
Most of its callers do. Exceptions:
* pcie_cap_slot_init(), shpc_init(), spapr_phb_realize() pass NULL,
i.e. they ignore errors.
* spapr_machine_init() passes &error_fatal.
* s390_pcihost_realize(), virtio_serial_device_realize(),
s390_pcihost_plug() pass the error to their callers. The latter two
keep going after the error, which looks wrong.
Drop the @errp parameter, and instead pass &error_abort to
object_property_set_link().
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Daniel P. Berrangé" <berrange@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20200630090351.1247703-15-armbru@redhat.com>
2020-06-30 12:03:39 +03:00
|
|
|
qbus_set_hotplug_handler(BUS(phb->bus), OBJECT(sphb));
|
2012-03-12 21:50:24 +04:00
|
|
|
|
2014-05-27 09:36:32 +04:00
|
|
|
/*
|
|
|
|
* Initialize PHB address space.
|
|
|
|
* By default there will be at least one subregion for default
|
|
|
|
* 32bit DMA window.
|
|
|
|
* Later the guest might want to create another DMA window
|
|
|
|
* which will become another memory subregion.
|
|
|
|
*/
|
2017-09-11 13:14:12 +03:00
|
|
|
namebuf = g_strdup_printf("%s.iommu-root", sphb->dtbusname);
|
2014-05-27 09:36:32 +04:00
|
|
|
memory_region_init(&sphb->iommu_root, OBJECT(sphb),
|
|
|
|
namebuf, UINT64_MAX);
|
2017-09-11 13:14:12 +03:00
|
|
|
g_free(namebuf);
|
2014-05-27 09:36:32 +04:00
|
|
|
address_space_init(&sphb->iommu_as, &sphb->iommu_root,
|
|
|
|
sphb->dtbusname);
|
|
|
|
|
2014-08-27 20:17:12 +04:00
|
|
|
/*
|
|
|
|
* As MSI/MSIX interrupts trigger by writing at MSI/MSIX vectors,
|
|
|
|
* we need to allocate some memory to catch those writes coming
|
|
|
|
* from msi_notify()/msix_notify().
|
|
|
|
* As MSIMessage:addr is going to be the same and MSIMessage:data
|
|
|
|
* is going to be a VIRQ number, 4 bytes of the MSI MR will only
|
|
|
|
* be used.
|
|
|
|
*
|
|
|
|
* For KVM we want to ensure that this memory is a full page so that
|
|
|
|
* our memory slot is of page size granularity.
|
|
|
|
*/
|
|
|
|
if (kvm_enabled()) {
|
2019-10-13 05:11:45 +03:00
|
|
|
msi_window_size = qemu_real_host_page_size;
|
2014-08-27 20:17:12 +04:00
|
|
|
}
|
|
|
|
|
2017-07-25 20:59:18 +03:00
|
|
|
memory_region_init_io(&sphb->msiwindow, OBJECT(sphb), &spapr_msi_ops, spapr,
|
2014-08-27 20:17:12 +04:00
|
|
|
"msi", msi_window_size);
|
|
|
|
memory_region_add_subregion(&sphb->iommu_root, SPAPR_PCI_MSI_WINDOW,
|
|
|
|
&sphb->msiwindow);
|
|
|
|
|
2012-10-30 15:47:48 +04:00
|
|
|
pci_setup_iommu(bus, spapr_pci_dma_iommu, sphb);
|
2012-06-27 08:50:46 +04:00
|
|
|
|
2013-09-26 10:18:48 +04:00
|
|
|
pci_bus_set_route_irq_fn(bus, spapr_route_intx_pin_to_irq);
|
|
|
|
|
2012-08-20 21:08:05 +04:00
|
|
|
QLIST_INSERT_HEAD(&spapr->phbs, sphb, list);
|
2012-03-12 21:50:24 +04:00
|
|
|
|
|
|
|
/* Initialize the LSI table */
|
2012-04-25 21:55:42 +04:00
|
|
|
for (i = 0; i < PCI_NUM_PINS; i++) {
|
2020-08-10 19:53:58 +03:00
|
|
|
int irq = SPAPR_IRQ_PCI_LSI + sphb->index * PCI_NUM_PINS + i;
|
2012-03-12 21:50:24 +04:00
|
|
|
|
2018-08-10 11:00:26 +03:00
|
|
|
if (smc->legacy_irq_allocation) {
|
2020-08-10 19:53:58 +03:00
|
|
|
irq = spapr_irq_findone(spapr, errp);
|
|
|
|
if (irq < 0) {
|
|
|
|
error_prepend(errp, "can't allocate LSIs: ");
|
2019-02-19 20:18:18 +03:00
|
|
|
/*
|
|
|
|
* Older machines will never support PHB hotplug, i.e., this is an
|
|
|
|
* init only path and QEMU will terminate. No need to rollback.
|
|
|
|
*/
|
2018-07-30 17:11:32 +03:00
|
|
|
return;
|
|
|
|
}
|
2018-06-18 20:34:00 +03:00
|
|
|
}
|
|
|
|
|
2020-08-10 19:53:58 +03:00
|
|
|
if (spapr_irq_claim(spapr, irq, true, errp) < 0) {
|
|
|
|
error_prepend(errp, "can't allocate LSIs: ");
|
2019-02-19 20:18:18 +03:00
|
|
|
goto unrealize;
|
2012-03-12 21:50:24 +04:00
|
|
|
}
|
|
|
|
|
2012-08-20 21:08:05 +04:00
|
|
|
sphb->lsi_table[i].irq = irq;
|
2012-03-12 21:50:24 +04:00
|
|
|
}
|
2014-05-27 09:36:31 +04:00
|
|
|
|
2015-05-07 08:33:53 +03:00
|
|
|
/* allocate connectors for child PCI devices */
|
2020-05-05 18:29:25 +03:00
|
|
|
add_drcs(sphb, phb->bus);
|
2015-05-07 08:33:53 +03:00
|
|
|
|
2016-07-04 06:33:07 +03:00
|
|
|
/* DMA setup */
|
|
|
|
for (i = 0; i < windows_supported; ++i) {
|
|
|
|
tcet = spapr_tce_new_table(DEVICE(sphb), sphb->dma_liobn[i]);
|
|
|
|
if (!tcet) {
|
|
|
|
error_setg(errp, "Creating window#%d failed for %s",
|
|
|
|
i, sphb->dtbusname);
|
2019-02-19 20:18:18 +03:00
|
|
|
goto unrealize;
|
2016-07-04 06:33:07 +03:00
|
|
|
}
|
2017-07-25 20:58:28 +03:00
|
|
|
memory_region_add_subregion(&sphb->iommu_root, 0,
|
|
|
|
spapr_tce_get_iommu(tcet));
|
2014-05-27 09:36:31 +04:00
|
|
|
}
|
2014-05-27 09:36:32 +04:00
|
|
|
|
2019-07-26 17:44:38 +03:00
|
|
|
sphb->msi = g_hash_table_new_full(g_int_hash, g_int_equal, g_free,
|
|
|
|
spapr_phb_destroy_msi);
|
2019-02-19 20:18:18 +03:00
|
|
|
return;
|
|
|
|
|
|
|
|
unrealize:
|
qdev: Unrealize must not fail
Devices may have component devices and buses.
Device realization may fail. Realization is recursive: a device's
realize() method realizes its components, and device_set_realized()
realizes its buses (which should in turn realize the devices on that
bus, except bus_set_realized() doesn't implement that, yet).
When realization of a component or bus fails, we need to roll back:
unrealize everything we realized so far. If any of these unrealizes
failed, the device would be left in an inconsistent state. Must not
happen.
device_set_realized() lets it happen: it ignores errors in the roll
back code starting at label child_realize_fail.
Since realization is recursive, unrealization must be recursive, too.
But how could a partly failed unrealize be rolled back? We'd have to
re-realize, which can fail. This design is fundamentally broken.
device_set_realized() does not roll back at all. Instead, it keeps
unrealizing, ignoring further errors.
It can screw up even for a device with no buses: if the lone
dc->unrealize() fails, it still unregisters vmstate, and calls
listeners' unrealize() callback.
bus_set_realized() does not roll back either. Instead, it stops
unrealizing.
Fortunately, no unrealize method can fail, as we'll see below.
To fix the design error, drop parameter @errp from all the unrealize
methods.
Any unrealize method that uses @errp now needs an update. This leads
us to unrealize() methods that can fail. Merely passing it to another
unrealize method cannot cause failure, though. Here are the ones that
do other things with @errp:
* virtio_serial_device_unrealize()
Fails when qbus_set_hotplug_handler() fails, but still does all the
other work. On failure, the device would stay realized with its
resources completely gone. Oops. Can't happen, because
qbus_set_hotplug_handler() can't actually fail here. Pass
&error_abort to qbus_set_hotplug_handler() instead.
* hw/ppc/spapr_drc.c's unrealize()
Fails when object_property_del() fails, but all the other work is
already done. On failure, the device would stay realized with its
vmstate registration gone. Oops. Can't happen, because
object_property_del() can't actually fail here. Pass &error_abort
to object_property_del() instead.
* spapr_phb_unrealize()
Fails and bails out when remove_drcs() fails, but other work is
already done. On failure, the device would stay realized with some
of its resources gone. Oops. remove_drcs() fails only when
chassis_from_bus()'s object_property_get_uint() fails, and it can't
here. Pass &error_abort to remove_drcs() instead.
Therefore, no unrealize method can fail before this patch.
device_set_realized()'s recursive unrealization via bus uses
object_property_set_bool(). Can't drop @errp there, so pass
&error_abort.
We similarly unrealize with object_property_set_bool() elsewhere,
always ignoring errors. Pass &error_abort instead.
Several unrealize methods no longer handle errors from other unrealize
methods: virtio_9p_device_unrealize(),
virtio_input_device_unrealize(), scsi_qdev_unrealize(), ...
Much of the deleted error handling looks wrong anyway.
One unrealize method no longer ignores such errors:
usb_ehci_pci_exit().
Several realize methods no longer ignore errors when rolling back:
v9fs_device_realize_common(), pci_qdev_unrealize(),
spapr_phb_realize(), usb_qdev_realize(), vfio_ccw_realize(),
virtio_device_realize().
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20200505152926.18877-17-armbru@redhat.com>
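The rollback shape this commit message argues for (a realize step that can fail, paired with an unrealize step that returns nothing and cannot fail) might look like the toy model below. It is a sketch under stated assumptions: `ToyDevice`, `toy_realize`, `toy_unrealize`, and the `fail_at` parameter are hypothetical and do not exist in QEMU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_RESOURCES 4

/* Hypothetical device model: a set of resources claimed during realize. */
typedef struct ToyDevice {
    bool claimed[NUM_RESOURCES];
    bool realized;
} ToyDevice;

/* Unrealize returns void: it releases whatever is held and cannot fail. */
static void toy_unrealize(ToyDevice *dev)
{
    for (size_t i = 0; i < NUM_RESOURCES; i++) {
        dev->claimed[i] = false;
    }
    dev->realized = false;
}

/*
 * Claim resources in order. fail_at is the index of the resource whose
 * claim fails, or -1 for no failure. Returns true on success.
 */
static bool toy_realize(ToyDevice *dev, int fail_at)
{
    assert(dev != NULL);

    for (int i = 0; i < NUM_RESOURCES; i++) {
        if (i == fail_at) {
            goto unrealize;
        }
        dev->claimed[i] = true;
    }
    dev->realized = true;
    return true;

unrealize:
    /* Roll back everything claimed so far; no error can come back. */
    toy_unrealize(dev);
    return false;
}
```

Because the rollback path has no error to report, the caller only ever sees the original realize failure, which is the property the message is after.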
2020-05-05 18:29:24 +03:00
|
|
|
spapr_phb_unrealize(dev);
|
2012-03-12 21:50:24 +04:00
|
|
|
}
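The LSI table loop in the function above assigns each PHB a contiguous block of PCI_NUM_PINS interrupt numbers, offset by the PHB index. A minimal sketch of that arithmetic follows; `TOY_IRQ_PCI_LSI_BASE` and `toy_lsi_irq` are hypothetical names, and the base value is a placeholder rather than a value taken from this file.

```c
#include <assert.h>

#define TOY_PCI_NUM_PINS     4       /* INTA..INTD */
#define TOY_IRQ_PCI_LSI_BASE 0x1100  /* placeholder base, assumed here */

/*
 * Hypothetical helper mirroring the expression
 *   SPAPR_IRQ_PCI_LSI + sphb->index * PCI_NUM_PINS + i
 * used on the non-legacy allocation path.
 */
static int toy_lsi_irq(int phb_index, int pin)
{
    assert(phb_index >= 0);
    assert(pin >= 0 && pin < TOY_PCI_NUM_PINS);

    return TOY_IRQ_PCI_LSI_BASE + phb_index * TOY_PCI_NUM_PINS + pin;
}
```

Under these assumptions PHB 0 covers the four numbers starting at the base, PHB 1 the next four, and so on, so no two PHBs share an LSI number.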
|
|
|
|
|
2014-05-27 09:36:33 +04:00
|
|
|
static int spapr_phb_children_reset(Object *child, void *opaque)
|
2012-09-12 20:57:14 +04:00
|
|
|
{
|
2014-05-27 09:36:33 +04:00
|
|
|
DeviceState *dev = (DeviceState *) object_dynamic_cast(child, TYPE_DEVICE);
|
|
|
|
|
|
|
|
if (dev) {
|
2020-01-30 19:02:03 +03:00
|
|
|
device_legacy_reset(dev);
|
2014-05-27 09:36:33 +04:00
|
|
|
}
|
2012-09-12 20:57:14 +04:00
|
|
|
|
2014-05-27 09:36:33 +04:00
|
|
|
return 0;
|
|
|
|
}
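The callback above is meant to be driven by a child iterator that calls it once per child and keeps going while it returns 0. The toy model below shows that pattern in isolation; `ToyChild`, `toy_child_foreach`, and `toy_children_reset` are hypothetical stand-ins, and the stop-on-nonzero behaviour is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical child object: only some children are devices. */
typedef struct ToyChild {
    bool is_device;
    int resets;
} ToyChild;

typedef int (*ToyChildFn)(ToyChild *child, void *opaque);

/* Call fn on each child; stop early if fn returns non-zero. */
static int toy_child_foreach(ToyChild *children, size_t n,
                             ToyChildFn fn, void *opaque)
{
    assert(fn != NULL);

    for (size_t i = 0; i < n; i++) {
        int ret = fn(&children[i], opaque);
        if (ret != 0) {
            return ret;
        }
    }
    return 0;
}

/* Reset only children that are devices; always continue iterating. */
static int toy_children_reset(ToyChild *child, void *opaque)
{
    (void)opaque;
    if (child->is_device) {
        child->resets++;
    }
    return 0;
}
```

Returning 0 unconditionally, as the original callback does, means every child is visited and non-device children are skipped silently.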
|
|
|
|
|
spapr: Use CamelCase properly
The qemu coding standard is to use CamelCase for type and structure names,
and the pseries code follows that... sort of. There are quite a lot of
places where we bend the rules in order to preserve the capitalization of
internal acronyms like "PHB", "TCE", "DIMM" and most commonly "sPAPR".
That was a bad idea - it frequently leads to names ending up with hard to
read clusters of capital letters, and means they don't catch the eye as
type identifiers, which is kind of the point of the CamelCase convention in
the first place.
In short, keeping type identifiers looking like CamelCase is more important
than preserving standard capitalization of internal "words". So, this
patch renames a heap of spapr internal type names to a more standard
CamelCase.
In addition to case changes, we also make some other identifier renames:
VIOsPAPR* -> SpaprVio*
The reverse word ordering was only ever used to mitigate the capital
cluster, so revert to the natural ordering.
VIOsPAPRVTYDevice -> SpaprVioVty
VIOsPAPRVLANDevice -> SpaprVioVlan
Brevity, since the "Device" didn't add useful information
sPAPRDRConnector -> SpaprDrc
sPAPRDRConnectorClass -> SpaprDrcClass
Brevity, and makes it clearer this is the same thing as a "DRC"
mentioned in many other places in the code
This is 100% a mechanical search-and-replace patch. It will, however,
conflict with essentially any and all outstanding patches touching the
spapr code.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-06 07:35:37 +03:00
|
|
|
void spapr_phb_dma_reset(SpaprPhbState *sphb)
|
2014-05-27 09:36:33 +04:00
|
|
|
{
|
2016-07-04 06:33:07 +03:00
|
|
|
int i;
|
2019-03-06 07:35:37 +03:00
|
|
|
SpaprTceTable *tcet;
|
2016-07-04 06:33:07 +03:00
|
|
|
|
|
|
|
for (i = 0; i < SPAPR_PCI_DMA_MAX_WINDOWS; ++i) {
|
|
|
|
tcet = spapr_tce_find_by_liobn(sphb->dma_liobn[i]);
|
2016-06-01 11:57:36 +03:00
|
|
|
|
2016-07-04 06:33:07 +03:00
|
|
|
if (tcet && tcet->nb_table) {
|
|
|
|
spapr_tce_table_disable(tcet);
|
|
|
|
}
|
2016-06-01 11:57:36 +03:00
|
|
|
}
|
|
|
|
|
|
|
|
/* Register default 32bit DMA window */
|
2016-07-04 06:33:07 +03:00
|
|
|
tcet = spapr_tce_find_by_liobn(sphb->dma_liobn[0]);
|
2016-06-01 11:57:36 +03:00
|
|
|
spapr_tce_table_enable(tcet, SPAPR_TCE_PAGE_SHIFT, sphb->dma_win_addr,
|
|
|
|
sphb->dma_win_size >> SPAPR_TCE_PAGE_SHIFT);
|
2016-06-01 11:57:39 +03:00
|
|
|
}
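The last argument passed to spapr_tce_table_enable() above is the number of TCE entries, obtained by shifting the window size right by the page shift. A small worked sketch of that calculation follows; `toy_tce_entries` is a hypothetical helper, and a page shift of 12 (4 KiB pages) is assumed for the example rather than taken from this file.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of translation entries needed to cover a
 * DMA window of win_size bytes with pages of (1 << page_shift) bytes.
 */
static uint64_t toy_tce_entries(uint64_t win_size, unsigned page_shift)
{
    assert(page_shift < 64);

    return win_size >> page_shift;
}
```

For a 1 GiB window (0x40000000 bytes) and 4 KiB pages this gives 262144 entries; with 64 KiB pages the same window needs 16384.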
|
|
|
|
|
|
|
|
static void spapr_phb_reset(DeviceState *qdev)
|
|
|
|
{
|
2019-03-06 07:35:37 +03:00
|
|
|
SpaprPhbState *sphb = SPAPR_PCI_HOST_BRIDGE(qdev);
|
2019-12-04 12:36:23 +03:00
|
|
|
Error *err = NULL;
|
2016-06-01 11:57:39 +03:00
|
|
|
|
|
|
|
spapr_phb_dma_reset(sphb);
|
spapr: Support NVIDIA V100 GPU with NVLink2
NVIDIA V100 GPUs have on-board RAM which is mapped into the host memory
space and accessible as normal RAM via an NVLink bus. The VFIO-PCI driver
implements special regions for such GPUs and emulates an NVLink bridge.
NVLink2-enabled POWER9 CPUs also provide address translation services
which include an ATS shootdown (ATSD) register exported via the NVLink
bridge device.
This adds a quirk to VFIO to map the GPU memory and create an MR;
the new MR is stored in a PCI device as a QOM link. The sPAPR PCI uses
this to get the MR and map it to the system address space.
Another quirk does the same for ATSD.
This adds additional steps to sPAPR PHB setup:
1. Search for specific GPUs and NPUs, collect findings in
sPAPRPHBState::nvgpus, manage system address space mappings;
2. Add device-specific properties such as "ibm,npu", "ibm,gpu",
"memory-block", "link-speed" to advertise the NVLink2 function to
the guest;
3. Add "mmio-atsd" to vPHB to advertise the ATSD capability;
4. Add new memory blocks (with extra "linux,memory-usable" to prevent
the guest OS from accessing the new memory until it is onlined) and
npuphb# nodes representing an NPU unit for every vPHB as the GPU driver
uses it for link discovery.
This allocates space for GPU RAM and ATSD like we do for MMIOs by
adding 2 new parameters to the phb_placement() hook. Older machine types
set these to zero.
This puts new memory nodes in a separate NUMA node, as the GPU RAM
needs to be configured equally distant from any other node in the system.
Unlike the host setup which assigns numa ids from 255 downwards, this
adds new NUMA nodes after the user configures nodes or from 1 if none
were configured.
This adds a requirement similar to EEH - one IOMMU group per vPHB.
The reason for this is that ATSD registers belong to a physical NPU
so they cannot invalidate translations on GPUs attached to another NPU.
It is guaranteed by the host platform as it does not mix NVLink bridges
or GPUs from different NPUs in the same IOMMU group. If more than one
IOMMU group is detected on a vPHB, this disables ATSD support for that
vPHB and prints a warning.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[aw: for vfio portions]
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Message-Id: <20190312082103.130561-1-aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 11:21:03 +03:00
|
|
|
spapr_phb_nvgpu_free(sphb);
|
2019-12-04 12:36:23 +03:00
|
|
|
spapr_phb_nvgpu_setup(sphb, &err);
|
|
|
|
if (err) {
|
|
|
|
error_report_err(err);
|
2019-03-12 11:21:03 +03:00
|
|
|
}
|
2016-06-01 11:57:36 +03:00
|
|
|
|
2012-09-12 20:57:14 +04:00
|
|
|
/* Reset the IOMMU state */
|
2014-05-27 09:36:33 +04:00
|
|
|
object_child_foreach(OBJECT(qdev), spapr_phb_children_reset, NULL);
|
2016-02-29 09:45:05 +03:00
|
|
|
|
|
|
|
if (spapr_phb_eeh_available(SPAPR_PCI_HOST_BRIDGE(qdev))) {
|
|
|
|
spapr_phb_vfio_reset(qdev);
|
|
|
|
}
|
2019-07-26 17:44:44 +03:00
|
|
|
|
|
|
|
g_hash_table_remove_all(sphb->msi);
|
2012-09-12 20:57:14 +04:00
|
|
|
}
|
|
|
|
|
2012-03-12 21:50:24 +04:00
|
|
|
static Property spapr_phb_properties[] = {
|
2019-03-06 07:35:37 +03:00
|
|
|
DEFINE_PROP_UINT32("index", SpaprPhbState, index, -1),
|
|
|
|
DEFINE_PROP_UINT64("mem_win_size", SpaprPhbState, mem_win_size,
|
2016-10-16 04:04:15 +03:00
|
|
|
SPAPR_PCI_MEM32_WIN_SIZE),
|
2019-03-06 07:35:37 +03:00
|
|
|
DEFINE_PROP_UINT64("mem64_win_size", SpaprPhbState, mem64_win_size,
|
2016-10-16 04:04:15 +03:00
|
|
|
SPAPR_PCI_MEM64_WIN_SIZE),
|
2019-03-06 07:35:37 +03:00
|
|
|
DEFINE_PROP_UINT64("io_win_size", SpaprPhbState, io_win_size,
|
2014-02-08 14:01:53 +04:00
|
|
|
SPAPR_PCI_IO_WIN_SIZE),
|
2019-03-06 07:35:37 +03:00
|
|
|
DEFINE_PROP_BOOL("dynamic-reconfiguration", SpaprPhbState, dr_enabled,
|
2015-05-07 08:33:52 +03:00
|
|
|
true),
|
2015-09-24 02:56:44 +03:00
|
|
|
/* Default DMA window is 0..1GB */
|
2019-03-06 07:35:37 +03:00
|
|
|
DEFINE_PROP_UINT64("dma_win_addr", SpaprPhbState, dma_win_addr, 0),
|
|
|
|
DEFINE_PROP_UINT64("dma_win_size", SpaprPhbState, dma_win_size, 0x40000000),
|
|
|
|
DEFINE_PROP_UINT64("dma64_win_addr", SpaprPhbState, dma64_win_addr,
|
2016-07-04 06:33:07 +03:00
|
|
|
0x800000000000000ULL),
|
2019-03-06 07:35:37 +03:00
|
|
|
DEFINE_PROP_BOOL("ddw", SpaprPhbState, ddw_enabled, true),
|
|
|
|
DEFINE_PROP_UINT64("pgsz", SpaprPhbState, page_size_mask,
|
2019-07-05 08:03:05 +03:00
|
|
|
(1ULL << 12) | (1ULL << 16)
|
|
|
|
| (1ULL << 21) | (1ULL << 24)),
|
2019-03-06 07:35:37 +03:00
|
|
|
DEFINE_PROP_UINT32("numa_node", SpaprPhbState, numa_node, -1),
|
|
|
|
DEFINE_PROP_BOOL("pre-2.8-migration", SpaprPhbState,
|
2016-11-23 02:26:38 +03:00
|
|
|
pre_2_8_migration, false),
|
2019-03-06 07:35:37 +03:00
|
|
|
DEFINE_PROP_BOOL("pcie-extended-configuration-space", SpaprPhbState,
|
2017-03-14 03:54:17 +03:00
|
|
|
pcie_ecs, true),
|
spapr: Support NVIDIA V100 GPU with NVLink2
NVIDIA V100 GPUs have on-board RAM which is mapped into the host memory
space and accessible as normal RAM via an NVLink bus. The VFIO-PCI driver
implements special regions for such GPUs and emulates an NVLink bridge.
NVLink2-enabled POWER9 CPUs also provide address translation services,
which include an ATS shootdown (ATSD) register exported via the NVLink
bridge device.
This adds a quirk to VFIO to map the GPU memory and create an MR;
the new MR is stored in a PCI device as a QOM link. The sPAPR PCI uses
this to get the MR and map it to the system address space.
Another quirk does the same for ATSD.
This adds additional steps to sPAPR PHB setup:
1. Search for specific GPUs and NPUs, collect findings in
sPAPRPHBState::nvgpus, manage system address space mappings;
2. Add device-specific properties such as "ibm,npu", "ibm,gpu",
"memory-block", "link-speed" to advertise the NVLink2 function to
the guest;
3. Add "mmio-atsd" to vPHB to advertise the ATSD capability;
4. Add new memory blocks (with extra "linux,memory-usable" to prevent
the guest OS from accessing the new memory until it is onlined) and
npuphb# nodes representing an NPU unit for every vPHB as the GPU driver
uses it for link discovery.
This allocates space for GPU RAM and ATSD like we do for MMIOs by
adding 2 new parameters to the phb_placement() hook. Older machine types
set these to zero.
This puts new memory nodes in a separate NUMA node, as the GPU RAM
needs to be configured equally distant from any other node in the system.
Unlike the host setup which assigns numa ids from 255 downwards, this
adds new NUMA nodes after the user configures nodes or from 1 if none
were configured.
This adds a requirement similar to EEH - one IOMMU group per vPHB.
The reason for this is that ATSD registers belong to a physical NPU
so they cannot invalidate translations on GPUs attached to another NPU.
This is guaranteed by the host platform as it does not mix NVLink bridges
or GPUs from different NPUs in the same IOMMU group. If more than one
IOMMU group is detected on a vPHB, this disables ATSD support for that
vPHB and prints a warning.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[aw: for vfio portions]
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Message-Id: <20190312082103.130561-1-aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 11:21:03 +03:00
|
|
|
DEFINE_PROP_UINT64("gpa", SpaprPhbState, nv2_gpa_win_addr, 0),
|
|
|
|
DEFINE_PROP_UINT64("atsd", SpaprPhbState, nv2_atsd_win_addr, 0),
|
2020-07-17 01:56:55 +03:00
|
|
|
DEFINE_PROP_BOOL("pre-5.1-associativity", SpaprPhbState,
|
|
|
|
pre_5_1_assoc, false),
|
2012-03-12 21:50:24 +04:00
|
|
|
DEFINE_PROP_END_OF_LIST(),
|
|
|
|
};
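The default "pgsz" value in the property table above is a bitmask with bits 12, 16, 21 and 24 set. Assuming each set bit n stands for a supported IOMMU page size of 2^n bytes (the usual reading of such a mask), it decodes to 4 KiB, 64 KiB, 2 MiB and 16 MiB. The following is only a minimal sketch of that decoding; `EXAMPLE_PGSZ_MASK` and `example_decode_page_sizes` are hypothetical names for illustration and are not part of QEMU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same bit pattern as the default "pgsz" property value shown above. */
#define EXAMPLE_PGSZ_MASK \
    ((1ULL << 12) | (1ULL << 16) | (1ULL << 21) | (1ULL << 24))

/*
 * Hypothetical helper (not a QEMU function): store the page sizes encoded
 * in mask into out[], smallest first, and return how many were stored.
 * At most max entries are written.
 */
static size_t example_decode_page_sizes(uint64_t mask, uint64_t *out,
                                        size_t max)
{
    size_t n = 0;

    assert(out != NULL);
    for (unsigned shift = 0; shift < 64 && n < max; shift++) {
        if (mask & (1ULL << shift)) {
            out[n++] = 1ULL << shift;   /* bit n set: 2^n byte pages */
        }
    }
    return n;
}
```

Called with `EXAMPLE_PGSZ_MASK` and an eight-entry buffer, the helper fills in four sizes in ascending order.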
|
|
|
|
|
2013-07-18 23:33:02 +04:00
|
|
|
static const VMStateDescription vmstate_spapr_pci_lsi = {
|
|
|
|
.name = "spapr_pci/lsi",
|
|
|
|
.version_id = 1,
|
|
|
|
.minimum_version_id = 1,
|
2014-04-16 17:24:04 +04:00
|
|
|
.fields = (VMStateField[]) {
|
2019-08-28 21:20:44 +03:00
|
|
|
VMSTATE_UINT32_EQUAL(irq, SpaprPciLsi, NULL),
|
2013-07-18 23:33:02 +04:00
|
|
|
|
|
|
|
VMSTATE_END_OF_LIST()
|
|
|
|
},
|
|
|
|
};
|
|
|
|
|
|
|
|
static const VMStateDescription vmstate_spapr_pci_msi = {
|
2014-05-30 13:34:20 +04:00
|
|
|
.name = "spapr_pci/msi",
|
2013-07-18 23:33:02 +04:00
|
|
|
.version_id = 1,
|
|
|
|
.minimum_version_id = 1,
|
2014-05-30 13:34:20 +04:00
|
|
|
.fields = (VMStateField []) {
|
2019-08-28 21:20:44 +03:00
|
|
|
VMSTATE_UINT32(key, SpaprPciMsiMig),
|
|
|
|
VMSTATE_UINT32(value.first_irq, SpaprPciMsiMig),
|
|
|
|
VMSTATE_UINT32(value.num, SpaprPciMsiMig),
|
2013-07-18 23:33:02 +04:00
|
|
|
VMSTATE_END_OF_LIST()
|
|
|
|
},
|
|
|
|
};
|
|
|
|
|
2017-09-25 14:29:12 +03:00
|
|
|
static int spapr_pci_pre_save(void *opaque)
|
2014-05-30 13:34:20 +04:00
|
|
|
{
|
2019-03-06 07:35:37 +03:00
|
|
|
SpaprPhbState *sphb = opaque;
|
2015-07-02 09:23:13 +03:00
|
|
|
GHashTableIter iter;
|
|
|
|
gpointer key, value;
|
|
|
|
int i;
|
2014-05-30 13:34:20 +04:00
|
|
|
|
2016-11-23 02:26:38 +03:00
|
|
|
if (sphb->pre_2_8_migration) {
|
|
|
|
sphb->mig_liobn = sphb->dma_liobn[0];
|
|
|
|
sphb->mig_mem_win_addr = sphb->mem_win_addr;
|
|
|
|
sphb->mig_mem_win_size = sphb->mem_win_size;
|
|
|
|
sphb->mig_io_win_addr = sphb->io_win_addr;
|
|
|
|
sphb->mig_io_win_size = sphb->io_win_size;
|
|
|
|
|
|
|
|
if ((sphb->mem64_win_size != 0)
|
|
|
|
&& (sphb->mem64_win_addr
|
|
|
|
== (sphb->mem_win_addr + sphb->mem_win_size))) {
|
|
|
|
sphb->mig_mem_win_size += sphb->mem64_win_size;
|
|
|
|
}
|
|
|
|
}
|
2017-06-28 17:09:19 +03:00
|
|
|
|
|
|
|
g_free(sphb->msi_devs);
|
|
|
|
sphb->msi_devs = NULL;
|
|
|
|
sphb->msi_devs_num = g_hash_table_size(sphb->msi);
|
|
|
|
if (!sphb->msi_devs_num) {
|
2017-09-25 14:29:12 +03:00
|
|
|
return 0;
|
2017-06-28 17:09:19 +03:00
|
|
|
}
|
2019-08-28 21:20:44 +03:00
|
|
|
sphb->msi_devs = g_new(SpaprPciMsiMig, sphb->msi_devs_num);
|
2017-06-28 17:09:19 +03:00
|
|
|
|
|
|
|
g_hash_table_iter_init(&iter, sphb->msi);
|
|
|
|
for (i = 0; g_hash_table_iter_next(&iter, &key, &value); ++i) {
|
|
|
|
sphb->msi_devs[i].key = *(uint32_t *) key;
|
2019-08-28 21:20:44 +03:00
|
|
|
sphb->msi_devs[i].value = *(SpaprPciMsi *) value;
|
2017-06-28 17:09:19 +03:00
|
|
|
}
|
2017-09-25 14:29:12 +03:00
|
|
|
|
|
|
|
return 0;
|
2014-05-30 13:34:20 +04:00
|
|
|
}
|
|
|
|
|
2020-12-31 09:10:18 +03:00
|
|
|
static int spapr_pci_post_save(void *opaque)
|
|
|
|
{
|
|
|
|
SpaprPhbState *sphb = opaque;
|
|
|
|
|
|
|
|
g_free(sphb->msi_devs);
|
|
|
|
sphb->msi_devs = NULL;
|
|
|
|
sphb->msi_devs_num = 0;
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2014-05-30 13:34:20 +04:00
|
|
|
static int spapr_pci_post_load(void *opaque, int version_id)
|
|
|
|
{
|
2019-03-06 07:35:37 +03:00
|
|
|
SpaprPhbState *sphb = opaque;
|
2014-05-30 13:34:20 +04:00
|
|
|
gpointer key, value;
|
|
|
|
int i;
|
|
|
|
|
|
|
|
for (i = 0; i < sphb->msi_devs_num; ++i) {
|
|
|
|
key = g_memdup(&sphb->msi_devs[i].key,
|
|
|
|
sizeof(sphb->msi_devs[i].key));
|
|
|
|
value = g_memdup(&sphb->msi_devs[i].value,
|
|
|
|
sizeof(sphb->msi_devs[i].value));
|
|
|
|
g_hash_table_insert(sphb->msi, key, value);
|
|
|
|
}
|
2015-08-26 15:02:53 +03:00
|
|
|
g_free(sphb->msi_devs);
|
|
|
|
sphb->msi_devs = NULL;
|
2014-05-30 13:34:20 +04:00
|
|
|
sphb->msi_devs_num = 0;
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2016-11-23 02:26:38 +03:00
|
|
|
static bool pre_2_8_migration(void *opaque, int version_id)
|
|
|
|
{
|
2019-03-06 07:35:37 +03:00
|
|
|
SpaprPhbState *sphb = opaque;
|
2016-11-23 02:26:38 +03:00
|
|
|
|
|
|
|
return sphb->pre_2_8_migration;
|
|
|
|
}
|
|
|
|
|
2013-07-18 23:33:02 +04:00
|
|
|
static const VMStateDescription vmstate_spapr_pci = {
|
|
|
|
.name = "spapr_pci",
|
2016-11-21 04:12:10 +03:00
|
|
|
.version_id = 2,
|
2014-05-30 13:34:20 +04:00
|
|
|
.minimum_version_id = 2,
|
|
|
|
.pre_save = spapr_pci_pre_save,
|
2020-12-31 09:10:18 +03:00
|
|
|
.post_save = spapr_pci_post_save,
|
2014-05-30 13:34:20 +04:00
|
|
|
.post_load = spapr_pci_post_load,
|
2014-04-16 17:24:04 +04:00
|
|
|
.fields = (VMStateField[]) {
|
2019-03-06 07:35:37 +03:00
|
|
|
VMSTATE_UINT64_EQUAL(buid, SpaprPhbState, NULL),
|
|
|
|
VMSTATE_UINT32_TEST(mig_liobn, SpaprPhbState, pre_2_8_migration),
|
|
|
|
VMSTATE_UINT64_TEST(mig_mem_win_addr, SpaprPhbState, pre_2_8_migration),
|
|
|
|
VMSTATE_UINT64_TEST(mig_mem_win_size, SpaprPhbState, pre_2_8_migration),
|
|
|
|
VMSTATE_UINT64_TEST(mig_io_win_addr, SpaprPhbState, pre_2_8_migration),
|
|
|
|
VMSTATE_UINT64_TEST(mig_io_win_size, SpaprPhbState, pre_2_8_migration),
|
|
|
|
VMSTATE_STRUCT_ARRAY(lsi_table, SpaprPhbState, PCI_NUM_PINS, 0,
|
2019-08-28 21:20:44 +03:00
|
|
|
vmstate_spapr_pci_lsi, SpaprPciLsi),
|
2019-03-06 07:35:37 +03:00
|
|
|
VMSTATE_INT32(msi_devs_num, SpaprPhbState),
|
|
|
|
VMSTATE_STRUCT_VARRAY_ALLOC(msi_devs, SpaprPhbState, msi_devs_num, 0,
|
2019-08-28 21:20:44 +03:00
|
|
|
vmstate_spapr_pci_msi, SpaprPciMsiMig),
|
2013-07-18 23:33:02 +04:00
|
|
|
VMSTATE_END_OF_LIST()
|
|
|
|
},
|
|
|
|
};
|
|
|
|
|
2013-06-06 12:48:49 +04:00
|
|
|
static const char *spapr_phb_root_bus_path(PCIHostState *host_bridge,
|
|
|
|
PCIBus *rootbus)
|
|
|
|
{
|
2019-03-06 07:35:37 +03:00
|
|
|
SpaprPhbState *sphb = SPAPR_PCI_HOST_BRIDGE(host_bridge);
|
2013-06-06 12:48:49 +04:00
|
|
|
|
|
|
|
return sphb->dtbusname;
|
|
|
|
}
|
|
|
|
|
2012-03-12 21:50:24 +04:00
|
|
|
static void spapr_phb_class_init(ObjectClass *klass, void *data)
|
|
|
|
{
|
2013-06-06 12:48:49 +04:00
|
|
|
PCIHostBridgeClass *hc = PCI_HOST_BRIDGE_CLASS(klass);
|
2012-03-12 21:50:24 +04:00
|
|
|
DeviceClass *dc = DEVICE_CLASS(klass);
|
2015-05-07 08:33:55 +03:00
|
|
|
HotplugHandlerClass *hp = HOTPLUG_HANDLER_CLASS(klass);
|
2012-03-12 21:50:24 +04:00
|
|
|
|
2013-06-06 12:48:49 +04:00
|
|
|
hc->root_bus_path = spapr_phb_root_bus_path;
|
2013-11-21 08:08:55 +04:00
|
|
|
dc->realize = spapr_phb_realize;
|
2019-02-19 20:18:18 +03:00
|
|
|
dc->unrealize = spapr_phb_unrealize;
|
2020-01-10 18:30:32 +03:00
|
|
|
device_class_set_props(dc, spapr_phb_properties);
|
2012-09-12 20:57:14 +04:00
|
|
|
dc->reset = spapr_phb_reset;
|
2013-07-18 23:33:02 +04:00
|
|
|
dc->vmsd = &vmstate_spapr_pci;
|
sysbus: Set user_creatable=false by default on TYPE_SYS_BUS_DEVICE
commit 33cd52b5d7b9adfd009e95f07e6c64dd88ae2a31 unset
cannot_instantiate_with_device_add_yet in TYPE_SYSBUS, making all
sysbus devices appear on "-device help" and lack the "no-user"
flag in "info qdm".
To fix this, we can set user_creatable=false by default on
TYPE_SYS_BUS_DEVICE, but this requires setting
user_creatable=true explicitly on the sysbus devices that
actually work with -device.
Fortunately today we have just a few has_dynamic_sysbus=1
machines: virt, pc-q35-*, ppce500, and spapr.
virt, ppce500, and spapr have extra checks to ensure just a few
device types can be instantiated:
* virt supports only TYPE_VFIO_CALXEDA_XGMAC, TYPE_VFIO_AMD_XGBE.
* ppce500 supports only TYPE_ETSEC_COMMON.
* spapr supports only TYPE_SPAPR_PCI_HOST_BRIDGE.
This patch sets user_creatable=true explicitly on those 4 device
classes.
Now, the more complex cases:
pc-q35-*: q35 has no sysbus device whitelist yet (which is a
separate bug). We are in the process of fixing it and building a
sysbus whitelist on q35, but in the meantime we can fix the
"-device help" and "info qdm" bugs mentioned above. Also, despite
not being strictly necessary for fixing the q35 bug, reducing the
list of user_creatable=true devices will help us be more
confident when building the q35 whitelist.
xen: We also have a hack at xen_set_dynamic_sysbus(), that sets
has_dynamic_sysbus=true at runtime when using the Xen
accelerator. This hack is only used to allow xen-backend devices
to be dynamically plugged/unplugged.
This means today we can use -device with the following 22 device
types, which are the ones compiled into the qemu-system-x86_64 and
qemu-system-i386 binaries:
* allwinner-ahci
* amd-iommu
* cfi.pflash01
* esp
* fw_cfg_io
* fw_cfg_mem
* generic-sdhci
* hpet
* intel-iommu
* ioapic
* isabus-bridge
* kvmclock
* kvm-ioapic
* kvmvapic
* SUNW,fdtwo
* sysbus-ahci
* sysbus-fdc
* sysbus-ohci
* unimplemented-device
* virtio-mmio
* xen-backend
* xen-sysdev
This patch adds user_creatable=true explicitly to those devices,
temporarily, just to keep 100% compatibility with existing
behavior of q35. Subsequent patches will remove
user_creatable=true from the devices that are really not meant to be
user-creatable on any machine, and remove the FIXME comment from
the ones that are really supposed to be user-creatable. This is
being done in separate patches because we still don't have an
obvious list of devices that will be whitelisted by q35, and I
would like to get each device reviewed individually.
Cc: Alexander Graf <agraf@suse.de>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Alistair Francis <alistair.francis@xilinx.com>
Cc: Beniamino Galvani <b.galvani@gmail.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: "Edgar E. Iglesias" <edgar.iglesias@gmail.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Frank Blaschka <frank.blaschka@de.ibm.com>
Cc: Gabriel L. Somlo <somlo@cmu.edu>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: John Snow <jsnow@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Marcel Apfelbaum <marcel@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Max Reitz <mreitz@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Pierre Morel <pmorel@linux.vnet.ibm.com>
Cc: Prasad J Pandit <pjp@fedoraproject.org>
Cc: qemu-arm@nongnu.org
Cc: qemu-block@nongnu.org
Cc: qemu-ppc@nongnu.org
Cc: Richard Henderson <rth@twiddle.net>
Cc: Rob Herring <robh@kernel.org>
Cc: Shannon Zhao <zhaoshenglong@huawei.com>
Cc: sstabellini@kernel.org
Cc: Thomas Huth <thuth@redhat.com>
Cc: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
Acked-by: John Snow <jsnow@redhat.com>
Acked-by: Juergen Gross <jgross@suse.com>
Acked-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170503203604.31462-3-ehabkost@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[ehabkost: Small changes at sysbus_device_class_init() comments]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-03 23:35:45 +03:00
|
|
|
/* Supported by TYPE_SPAPR_MACHINE */
|
|
|
|
dc->user_creatable = true;
|
2014-01-13 13:29:09 +04:00
|
|
|
set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
|
2020-11-21 02:42:00 +03:00
|
|
|
hp->pre_plug = spapr_pci_pre_plug;
|
2017-07-03 09:34:28 +03:00
|
|
|
hp->plug = spapr_pci_plug;
|
2018-12-12 12:16:23 +03:00
|
|
|
hp->unplug = spapr_pci_unplug;
|
2017-07-03 09:34:28 +03:00
|
|
|
hp->unplug_request = spapr_pci_unplug_request;
|
2012-03-12 21:50:24 +04:00
|
|
|
}
|
2011-10-30 21:16:46 +04:00
|
|
|
|
2012-08-20 21:07:56 +04:00
|
|
|
static const TypeInfo spapr_phb_info = {
|
2012-08-20 21:08:05 +04:00
|
|
|
.name = TYPE_SPAPR_PCI_HOST_BRIDGE,
|
2012-08-20 21:08:08 +04:00
|
|
|
.parent = TYPE_PCI_HOST_BRIDGE,
|
2019-03-06 07:35:37 +03:00
|
|
|
.instance_size = sizeof(SpaprPhbState),
|
2019-02-19 20:18:18 +03:00
|
|
|
.instance_finalize = spapr_phb_finalizefn,
|
2012-03-12 21:50:24 +04:00
|
|
|
.class_init = spapr_phb_class_init,
|
2015-05-07 08:33:55 +03:00
|
|
|
.interfaces = (InterfaceInfo[]) {
|
|
|
|
{ TYPE_HOTPLUG_HANDLER },
|
|
|
|
{ }
|
|
|
|
}
|
2012-03-12 21:50:24 +04:00
|
|
|
};
|
|
|
|
|
2015-07-02 09:23:21 +03:00
|
|
|
static void spapr_phb_pci_enumerate_bridge(PCIBus *bus, PCIDevice *pdev,
|
|
|
|
void *opaque)
|
|
|
|
{
|
|
|
|
unsigned int *bus_no = opaque;
|
|
|
|
PCIBus *sec_bus = NULL;
|
|
|
|
|
|
|
|
if ((pci_default_read_config(pdev, PCI_HEADER_TYPE, 1) !=
|
|
|
|
PCI_HEADER_TYPE_BRIDGE)) {
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
|
|
|
(*bus_no)++;
|
2019-01-23 11:24:25 +03:00
|
|
|
pci_default_write_config(pdev, PCI_PRIMARY_BUS, pci_dev_bus_num(pdev), 1);
|
2015-07-02 09:23:21 +03:00
|
|
|
pci_default_write_config(pdev, PCI_SECONDARY_BUS, *bus_no, 1);
|
|
|
|
pci_default_write_config(pdev, PCI_SUBORDINATE_BUS, *bus_no, 1);
|
|
|
|
|
|
|
|
sec_bus = pci_bridge_get_sec_bus(PCI_BRIDGE(pdev));
|
|
|
|
if (!sec_bus) {
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
2021-10-28 07:31:26 +03:00
|
|
|
pci_for_each_device_under_bus(sec_bus, spapr_phb_pci_enumerate_bridge,
|
|
|
|
bus_no);
|
2015-07-02 09:23:21 +03:00
|
|
|
pci_default_write_config(pdev, PCI_SUBORDINATE_BUS, *bus_no, 1);
|
|
|
|
}
|
|
|
|
|
2019-03-06 07:35:37 +03:00
|
|
|
static void spapr_phb_pci_enumerate(SpaprPhbState *phb)
|
2015-07-02 09:23:21 +03:00
|
|
|
{
|
|
|
|
PCIBus *bus = PCI_HOST_BRIDGE(phb)->bus;
|
|
|
|
unsigned int bus_no = 0;
|
|
|
|
|
2021-10-28 07:31:26 +03:00
|
|
|
pci_for_each_device_under_bus(bus, spapr_phb_pci_enumerate_bridge,
|
|
|
|
&bus_no);
|
2015-07-02 09:23:21 +03:00
|
|
|
|
|
|
|
}
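spapr_phb_pci_enumerate_bridge() above numbers buses depth-first. Each bridge takes the next free number as its secondary bus, provisionally sets its subordinate bus to the same value, recurses into the devices behind it, and finally raises the subordinate bus to the highest number handed out in its subtree. The sketch below shows the same traversal on a toy tree that contains bridges only. `ExampleBridge` and `example_enumerate` are hypothetical names and do not use QEMU's PCI API.

```c
#include <assert.h>
#include <stddef.h>

#define EXAMPLE_MAX_CHILDREN 4

/* Hypothetical model of a PCI-to-PCI bridge and the bridges behind it. */
typedef struct ExampleBridge {
    unsigned primary;       /* bus the bridge itself sits on */
    unsigned secondary;     /* bus directly behind the bridge */
    unsigned subordinate;   /* highest bus number reachable through it */
    struct ExampleBridge *children[EXAMPLE_MAX_CHILDREN];
    size_t num_children;
} ExampleBridge;

/* Depth-first numbering; *bus_no holds the last bus number handed out. */
static void example_enumerate(ExampleBridge *br, unsigned parent_bus,
                              unsigned *bus_no)
{
    assert(br != NULL && bus_no != NULL);

    (*bus_no)++;
    br->primary = parent_bus;
    br->secondary = *bus_no;
    br->subordinate = *bus_no;      /* provisional, fixed up below */

    for (size_t i = 0; i < br->num_children; i++) {
        example_enumerate(br->children[i], br->secondary, bus_no);
    }

    br->subordinate = *bus_no;      /* covers every bus in the subtree */
}
```

Starting from a counter of 0 on the root bus, a bridge with two child bridges, one of which has a further bridge behind it, ends up with secondary bus 1 and subordinate bus 4.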
|
|
|
|
|
2019-09-27 06:44:58 +03:00
|
|
|
int spapr_dt_phb(SpaprMachineState *spapr, SpaprPhbState *phb,
|
|
|
|
uint32_t intc_phandle, void *fdt, int *node_offset)
|
2011-10-30 21:16:46 +04:00
|
|
|
{
|
2015-05-07 08:33:53 +03:00
|
|
|
int bus_off, i, j, ret;
|
2011-10-30 21:16:46 +04:00
|
|
|
uint32_t bus_range[] = { cpu_to_be32(0), cpu_to_be32(0xff) };
|
|
|
|
struct {
|
|
|
|
uint32_t hi;
|
|
|
|
uint64_t child;
|
|
|
|
uint64_t parent;
|
|
|
|
uint64_t size;
|
2012-07-18 12:22:51 +04:00
|
|
|
} QEMU_PACKED ranges[] = {
|
2011-10-30 21:16:46 +04:00
|
|
|
{
|
|
|
|
cpu_to_be32(b_ss(1)), cpu_to_be64(0),
|
|
|
|
cpu_to_be64(phb->io_win_addr),
|
|
|
|
cpu_to_be64(memory_region_size(&phb->iospace)),
|
|
|
|
},
|
|
|
|
{
|
|
|
|
cpu_to_be32(b_ss(2)), cpu_to_be64(SPAPR_PCI_MEM_WIN_BUS_OFFSET),
|
|
|
|
cpu_to_be64(phb->mem_win_addr),
|
spapr_pci: Add a 64-bit MMIO window
On real hardware, and under pHyp, the PCI host bridges on Power machines
typically advertise two outbound MMIO windows from the guest's physical
memory space to PCI memory space:
- A 32-bit window which maps onto 2GiB..4GiB in the PCI address space
- A 64-bit window which maps onto a large region somewhere high in PCI
address space (traditionally this used an identity mapping from guest
physical address to PCI address, but that's not always the case)
The qemu implementation in spapr-pci-host-bridge, however, only supports a
single outbound MMIO window. At least some Linux versions expect
the two windows, so we arranged this window to map onto the PCI
memory space from 2 GiB..~64 GiB, then advertised it as two contiguous
windows, the "32-bit" window from 2G..4G and the "64-bit" window from
4G..~64G.
This approach means, however, that the 64G window is not naturally aligned.
In turn this limits the size of the largest BAR we can map (which does have
to be naturally aligned) to roughly half of the total window. With some
large nVidia GPGPU cards which have huge memory BARs, this is starting to
be a problem.
This patch adds true support for separate 32-bit and 64-bit outbound MMIO
windows to the spapr-pci-host-bridge implementation, each of which can
be independently configured. The 32-bit window always maps to 2G.. in PCI
space, but the PCI address of the 64-bit window can be configured (it
defaults to the same as the guest physical address).
So as not to break possible existing configurations, as long as a 64-bit
window is not specified, a large single window can be specified. This
will appear the same way to the guest as the old approach, although it's
now implemented by two contiguous memory regions rather than a single one.
For now, this only adds the possibility of 64-bit windows. The default
configuration still uses the legacy mode.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2016-10-11 06:23:33 +03:00
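The alignment limit described in the commit message above can be checked with a little arithmetic. The following is a minimal sketch, not code from QEMU: `largest_aligned_bar()` and the `GiB` macro are hypothetical names introduced for illustration, and the example assumes BAR sizes are powers of two that must start on a multiple of their own size.

```c
#include <assert.h>
#include <stdint.h>

#define GiB (1ULL << 30)   /* hypothetical helper macro for this sketch */

/*
 * Return the largest power-of-two size S for which an S-aligned block of
 * S bytes fits inside the window [base, base + size). Returns 0 if none fits.
 */
static uint64_t largest_aligned_bar(uint64_t base, uint64_t size)
{
    uint64_t end = base + size;

    for (uint64_t s = 1ULL << 62; s; s >>= 1) {
        /* Round the window base up to the next multiple of s. */
        uint64_t start = (base + s - 1) & ~(s - 1);

        if (start >= base && start < end && end - start >= s) {
            return s;
        }
    }
    return 0;
}
```

With the legacy layout, `largest_aligned_bar(2 * GiB, 62 * GiB)` yields 32 GiB, roughly half of the window. A separate 64 GiB window placed on a 64 GiB boundary (for example at 1024 GiB) can hold a full 64 GiB BAR.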
|
|
|
cpu_to_be64(phb->mem_win_size),
|
2015-01-30 04:53:19 +03:00
|
|
|
},
|
|
|
|
{
|
2016-10-11 06:23:33 +03:00
|
|
|
cpu_to_be32(b_ss(3)), cpu_to_be64(phb->mem64_win_pciaddr),
|
|
|
|
cpu_to_be64(phb->mem64_win_addr),
|
|
|
|
cpu_to_be64(phb->mem64_win_size),
|
2011-10-30 21:16:46 +04:00
|
|
|
},
|
|
|
|
};
|
2016-10-11 06:23:33 +03:00
|
|
|
const unsigned sizeof_ranges =
|
|
|
|
(phb->mem64_win_size ? 3 : 2) * sizeof(ranges[0]);
|
2011-10-30 21:16:46 +04:00
|
|
|
uint64_t bus_reg[] = { cpu_to_be64(phb->buid), 0 };
|
|
|
|
uint32_t interrupt_map_mask[] = {
|
2012-04-25 21:55:42 +04:00
|
|
|
cpu_to_be32(b_ddddd(-1)|b_fff(0)), 0x0, 0x0, cpu_to_be32(-1)};
|
|
|
|
uint32_t interrupt_map[PCI_SLOT_MAX * PCI_NUM_PINS][7];
|
2016-07-04 06:33:07 +03:00
|
|
|
uint32_t ddw_applicable[] = {
|
|
|
|
cpu_to_be32(RTAS_IBM_QUERY_PE_DMA_WINDOW),
|
|
|
|
cpu_to_be32(RTAS_IBM_CREATE_PE_DMA_WINDOW),
|
|
|
|
cpu_to_be32(RTAS_IBM_REMOVE_PE_DMA_WINDOW)
|
|
|
|
};
|
|
|
|
uint32_t ddw_extensions[] = {
|
|
|
|
cpu_to_be32(1),
|
|
|
|
cpu_to_be32(RTAS_IBM_RESET_PE_DMA_WINDOW)
|
|
|
|
};
|
spapr: Use CamelCase properly
The qemu coding standard is to use CamelCase for type and structure names,
and the pseries code follows that... sort of. There are quite a lot of
places where we bend the rules in order to preserve the capitalization of
internal acronyms like "PHB", "TCE", "DIMM" and most commonly "sPAPR".
That was a bad idea - it frequently leads to names ending up with hard to
read clusters of capital letters, and means they don't catch the eye as
type identifiers, which is kind of the point of the CamelCase convention in
the first place.
In short, keeping type identifiers looking like CamelCase is more important
than preserving standard capitalization of internal "words". So, this
patch renames a heap of spapr internal type names to a more standard
CamelCase.
In addition to case changes, we also make some other identifier renames:
VIOsPAPR* -> SpaprVio*
The reverse word ordering was only ever used to mitigate the capital
cluster, so revert to the natural ordering.
VIOsPAPRVTYDevice -> SpaprVioVty
VIOsPAPRVLANDevice -> SpaprVioVlan
Brevity, since the "Device" didn't add useful information
sPAPRDRConnector -> SpaprDrc
sPAPRDRConnectorClass -> SpaprDrcClass
Brevity, and makes it clearer this is the same thing as a "DRC"
mentioned in many other places in the code
This is 100% a mechanical search-and-replace patch. It will, however,
conflict with essentially any and all outstanding patches touching the
spapr code.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-06 07:35:37 +03:00
|
|
|
SpaprTceTable *tcet;
|
|
|
|
SpaprDrc *drc;
|
2019-12-04 12:36:23 +03:00
|
|
|
Error *err = NULL;
|
2011-10-30 21:16:46 +04:00
|
|
|
|
|
|
|
/* Start populating the FDT */
|
2019-04-05 19:30:43 +03:00
|
|
|
_FDT(bus_off = fdt_add_subnode(fdt, 0, phb->dtbusname));
|
2019-02-19 20:18:39 +03:00
|
|
|
if (node_offset) {
|
|
|
|
*node_offset = bus_off;
|
|
|
|
}
|
2011-10-30 21:16:46 +04:00
|
|
|
|
|
|
|
/* Write PHB properties */
|
|
|
|
_FDT(fdt_setprop_string(fdt, bus_off, "device_type", "pci"));
|
|
|
|
_FDT(fdt_setprop_string(fdt, bus_off, "compatible", "IBM,Logical_PHB"));
|
|
|
|
_FDT(fdt_setprop_cell(fdt, bus_off, "#interrupt-cells", 0x1));
|
|
|
|
_FDT(fdt_setprop(fdt, bus_off, "used-by-rtas", NULL, 0));
|
|
|
|
_FDT(fdt_setprop(fdt, bus_off, "bus-range", &bus_range, sizeof(bus_range)));
|
2015-01-30 04:53:19 +03:00
|
|
|
_FDT(fdt_setprop(fdt, bus_off, "ranges", &ranges, sizeof_ranges));
|
2011-10-30 21:16:46 +04:00
|
|
|
_FDT(fdt_setprop(fdt, bus_off, "reg", &bus_reg, sizeof(bus_reg)));
|
2012-01-11 23:46:25 +04:00
|
|
|
_FDT(fdt_setprop_cell(fdt, bus_off, "ibm,pci-config-space-type", 0x1));
|
2019-09-27 06:44:58 +03:00
|
|
|
_FDT(fdt_setprop_cell(fdt, bus_off, "ibm,pe-total-#msi",
|
|
|
|
spapr_irq_nr_msis(spapr)));
|
2011-10-30 21:16:46 +04:00
|
|
|
|
2016-07-04 06:33:07 +03:00
|
|
|
/* Dynamic DMA window */
|
|
|
|
if (phb->ddw_enabled) {
|
|
|
|
_FDT(fdt_setprop(fdt, bus_off, "ibm,ddw-applicable", &ddw_applicable,
|
|
|
|
sizeof(ddw_applicable)));
|
|
|
|
_FDT(fdt_setprop(fdt, bus_off, "ibm,ddw-extensions",
|
|
|
|
&ddw_extensions, sizeof(ddw_extensions)));
|
|
|
|
}
|
|
|
|
|
2016-07-27 11:03:38 +03:00
|
|
|
/* Advertise NUMA via ibm,associativity */
|
2016-10-18 23:50:23 +03:00
|
|
|
if (phb->numa_node != -1) {
|
spapr: introduce SpaprMachineState::numa_assoc_array
The next step to centralize all NUMA/associativity handling in
the spapr machine is to create a 'one stop place' for all
things ibm,associativity.
This patch introduces numa_assoc_array, a 2 dimensional array
that will store all ibm,associativity arrays of all NUMA nodes.
This array is initialized in a new spapr_numa_associativity_init()
function, called in spapr_machine_init(). It is being initialized
with the same values used in other ibm,associativity properties
around spapr files (i.e. all zeros, last value is node_id).
The idea is to remove all hardcoded definitions and FDT writes
of ibm,associativity arrays, doing instead a call to the new
spapr_numa_write_associativity_dt() helper, that will
be able to write the DT with the correct values.
We'll start small, handling the trivial cases first. The
remaining instances of ibm,associativity will be handled
next.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20200903220639.563090-2-danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-09-04 01:06:33 +03:00
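The "all zeros, last value is node_id" layout mentioned in the commit message above can be sketched as follows. This is an illustration only: `fill_associativity()` and `ASSOC_CELLS` are hypothetical names, the assumption that the first cell holds the count of the cells that follow is mine rather than something stated in the message, and the values are kept in host byte order (a real device tree property would be stored big-endian).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed layout: one count cell followed by four domain cells. */
#define ASSOC_CELLS 5

/* Fill one ibm,associativity-style array for the given NUMA node. */
static void fill_associativity(uint32_t assoc[ASSOC_CELLS], uint32_t node_id)
{
    memset(assoc, 0, ASSOC_CELLS * sizeof(assoc[0]));
    assoc[0] = ASSOC_CELLS - 1;        /* number of cells that follow */
    assoc[ASSOC_CELLS - 1] = node_id;  /* last cell carries the node id */
}
```

Keeping one such array per NUMA node in a single table is what lets every device tree writer share the same values instead of hardcoding them.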
|
|
|
spapr_numa_write_associativity_dt(spapr, fdt, bus_off, phb->numa_node);
|
2016-07-27 11:03:38 +03:00
|
|
|
}
|
|
|
|
|
2012-01-11 23:46:28 +04:00
|
|
|
/* Build the interrupt-map, this must match what is done
|
2019-04-05 19:30:48 +03:00
|
|
|
* in pci_swizzle_map_irq_fn
|
2012-01-11 23:46:28 +04:00
|
|
|
*/
|
|
|
|
_FDT(fdt_setprop(fdt, bus_off, "interrupt-map-mask",
|
|
|
|
&interrupt_map_mask, sizeof(interrupt_map_mask)));
|
2012-04-25 21:55:42 +04:00
|
|
|
for (i = 0; i < PCI_SLOT_MAX; i++) {
|
|
|
|
for (j = 0; j < PCI_NUM_PINS; j++) {
|
|
|
|
uint32_t *irqmap = interrupt_map[i*PCI_NUM_PINS + j];
|
2019-04-05 19:30:48 +03:00
|
|
|
int lsi_num = pci_swizzle(i, j);
|
2012-04-25 21:55:42 +04:00
|
|
|
|
|
|
|
irqmap[0] = cpu_to_be32(b_ddddd(i)|b_fff(0));
|
|
|
|
irqmap[1] = 0;
|
|
|
|
irqmap[2] = 0;
|
|
|
|
irqmap[3] = cpu_to_be32(j+1);
|
2019-01-17 20:14:39 +03:00
|
|
|
irqmap[4] = cpu_to_be32(intc_phandle);
|
|
|
|
spapr_dt_irq(&irqmap[5], phb->lsi_table[lsi_num].irq, true);
|
2012-04-25 21:55:42 +04:00
|
|
|
}
|
2011-10-30 21:16:46 +04:00
|
|
|
}
|
|
|
|
/* Write interrupt map */
|
|
|
|
_FDT(fdt_setprop(fdt, bus_off, "interrupt-map", &interrupt_map,
|
2012-04-25 21:55:42 +04:00
|
|
|
sizeof(interrupt_map)));
|
2011-10-30 21:16:46 +04:00
|
|
|
|
2016-07-04 06:33:07 +03:00
|
|
|
tcet = spapr_tce_find_by_liobn(phb->dma_liobn[0]);
|
2016-04-21 13:08:58 +03:00
|
|
|
if (!tcet) {
|
|
|
|
return -1;
|
|
|
|
}
|
2015-05-07 08:33:36 +03:00
|
|
|
spapr_dma_dt(fdt, bus_off, "ibm,dma-window",
|
|
|
|
tcet->liobn, tcet->bus_offset,
|
|
|
|
tcet->nb_table << tcet->page_shift);
|
2012-06-27 08:50:46 +04:00
|
|
|
|
2019-02-19 20:18:44 +03:00
|
|
|
drc = spapr_drc_by_id(TYPE_SPAPR_DRC_PHB, phb->index);
|
|
|
|
if (drc) {
|
|
|
|
uint32_t drc_index = cpu_to_be32(spapr_drc_index(drc));
|
|
|
|
|
|
|
|
_FDT(fdt_setprop(fdt, bus_off, "ibm,my-drc-index", &drc_index,
|
|
|
|
sizeof(drc_index)));
|
|
|
|
}
|
|
|
|
|
2015-07-02 09:23:21 +03:00
|
|
|
/* Walk the bridges and program the bus numbers */
|
|
|
|
spapr_phb_pci_enumerate(phb);
|
|
|
|
_FDT(fdt_setprop_cell(fdt, bus_off, "qemu,phb-enumerated", 0x1));
|
|
|
|
|
2019-04-05 05:31:48 +03:00
|
|
|
/* Walk the bridge and subordinate buses */
|
|
|
|
ret = spapr_dt_pci_bus(phb, PCI_HOST_BRIDGE(phb)->bus, fdt, bus_off);
|
|
|
|
if (ret < 0) {
|
2015-05-07 08:33:53 +03:00
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
2019-12-04 12:36:23 +03:00
|
|
|
spapr_phb_nvgpu_populate_dt(phb, fdt, bus_off, &err);
|
|
|
|
if (err) {
|
|
|
|
error_report_err(err);
|
spapr: Support NVIDIA V100 GPU with NVLink2
NVIDIA V100 GPUs have on-board RAM which is mapped into the host memory
space and accessible as normal RAM via an NVLink bus. The VFIO-PCI driver
implements special regions for such GPUs and emulates an NVLink bridge.
NVLink2-enabled POWER9 CPUs also provide address translation services
which include an ATS shootdown (ATSD) register exported via the NVLink
bridge device.
This adds a quirk to VFIO to map the GPU memory and create an MR;
the new MR is stored in a PCI device as a QOM link. The sPAPR PCI uses
this to get the MR and map it to the system address space.
Another quirk does the same for ATSD.
This adds additional steps to sPAPR PHB setup:
1. Search for specific GPUs and NPUs, collect findings in
sPAPRPHBState::nvgpus, manage system address space mappings;
2. Add device-specific properties such as "ibm,npu", "ibm,gpu",
"memory-block", "link-speed" to advertise the NVLink2 function to
the guest;
3. Add "mmio-atsd" to vPHB to advertise the ATSD capability;
4. Add new memory blocks (with extra "linux,memory-usable" to prevent
the guest OS from accessing the new memory until it is onlined) and
npuphb# nodes representing an NPU unit for every vPHB as the GPU driver
uses it for link discovery.
This allocates space for GPU RAM and ATSD like we do for MMIOs by
adding 2 new parameters to the phb_placement() hook. Older machine types
set these to zero.
This puts new memory nodes in a separate NUMA node as the GPU RAM
needs to be configured equally distant from any other node in the system.
Unlike the host setup which assigns numa ids from 255 downwards, this
adds new NUMA nodes after the user configures nodes or from 1 if none
were configured.
This adds requirement similar to EEH - one IOMMU group per vPHB.
The reason for this is that ATSD registers belong to a physical NPU
so they cannot invalidate translations on GPUs attached to another NPU.
It is guaranteed by the host platform as it does not mix NVLink bridges
or GPUs from different NPU in the same IOMMU group. If more than one
IOMMU group is detected on a vPHB, this disables ATSD support for that
vPHB and prints a warning.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[aw: for vfio portions]
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Message-Id: <20190312082103.130561-1-aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 11:21:03 +03:00
|
|
|
}
|
|
|
|
spapr_phb_nvgpu_ram_populate_dt(phb, fdt);
|
|
|
|
|
2011-10-30 21:16:46 +04:00
|
|
|
return 0;
|
|
|
|
}
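The interrupt-map loop in spapr_dt_phb() can be illustrated in isolation. This is a minimal sketch under stated assumptions, not the QEMU implementation: `phys_hi_devfn()`, `swizzle()` and `fill_child_cells()` are hypothetical stand-ins, the bit positions follow the usual PCI device tree binding (device number in bits 11..15, function in bits 8..10), the swizzle is assumed to be `(slot + pin) % 4`, and the cells are left in host byte order rather than converted to big-endian.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_PINS 4   /* INTA..INTD */

/* Assumed phys.hi layout: device in bits 11..15, function in bits 8..10. */
static uint32_t phys_hi_devfn(uint32_t dev, uint32_t fn)
{
    return ((dev & 0x1f) << 11) | ((fn & 0x7) << 8);
}

/* Assumed swizzle: rotate the pin number by the slot number. */
static int swizzle(int slot, int pin)
{
    return (slot + pin) % NUM_PINS;
}

/*
 * Fill the child half of one interrupt-map row (unit address plus
 * interrupt pin) and return the index of the LSI that the row maps to.
 */
static int fill_child_cells(uint32_t cells[4], int slot, int pin)
{
    cells[0] = phys_hi_devfn(slot, 0);
    cells[1] = 0;
    cells[2] = 0;
    cells[3] = pin + 1;   /* pins are 1-based in the device tree */
    return swizzle(slot, pin);
}
```

Because the mask written to interrupt-map-mask ignores the function bits, one row per (slot, pin) pair is enough to cover every function of a device.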
|
2012-03-12 21:50:24 +04:00
|
|
|
|
2012-08-07 20:10:33 +04:00
|
|
|
void spapr_pci_rtas_init(void)
|
|
|
|
{
|
2014-06-23 17:26:32 +04:00
|
|
|
spapr_rtas_register(RTAS_READ_PCI_CONFIG, "read-pci-config",
|
|
|
|
rtas_read_pci_config);
|
|
|
|
spapr_rtas_register(RTAS_WRITE_PCI_CONFIG, "write-pci-config",
|
|
|
|
rtas_write_pci_config);
|
|
|
|
spapr_rtas_register(RTAS_IBM_READ_PCI_CONFIG, "ibm,read-pci-config",
|
|
|
|
rtas_ibm_read_pci_config);
|
|
|
|
spapr_rtas_register(RTAS_IBM_WRITE_PCI_CONFIG, "ibm,write-pci-config",
|
|
|
|
rtas_ibm_write_pci_config);
|
2016-03-04 12:24:28 +03:00
|
|
|
if (msi_nonbroken) {
|
2014-06-23 17:26:32 +04:00
|
|
|
spapr_rtas_register(RTAS_IBM_QUERY_INTERRUPT_SOURCE_NUMBER,
|
|
|
|
"ibm,query-interrupt-source-number",
|
2012-08-07 20:10:37 +04:00
|
|
|
rtas_ibm_query_interrupt_source_number);
|
2014-06-23 17:26:32 +04:00
|
|
|
spapr_rtas_register(RTAS_IBM_CHANGE_MSI, "ibm,change-msi",
|
|
|
|
rtas_ibm_change_msi);
|
2012-08-07 20:10:37 +04:00
|
|
|
}
|
2015-02-20 07:58:52 +03:00
|
|
|
|
|
|
|
spapr_rtas_register(RTAS_IBM_SET_EEH_OPTION,
|
|
|
|
"ibm,set-eeh-option",
|
|
|
|
rtas_ibm_set_eeh_option);
|
|
|
|
spapr_rtas_register(RTAS_IBM_GET_CONFIG_ADDR_INFO2,
|
|
|
|
"ibm,get-config-addr-info2",
|
|
|
|
rtas_ibm_get_config_addr_info2);
|
|
|
|
spapr_rtas_register(RTAS_IBM_READ_SLOT_RESET_STATE2,
|
|
|
|
"ibm,read-slot-reset-state2",
|
|
|
|
rtas_ibm_read_slot_reset_state2);
|
|
|
|
spapr_rtas_register(RTAS_IBM_SET_SLOT_RESET,
|
|
|
|
"ibm,set-slot-reset",
|
|
|
|
rtas_ibm_set_slot_reset);
|
|
|
|
spapr_rtas_register(RTAS_IBM_CONFIGURE_PE,
|
|
|
|
"ibm,configure-pe",
|
|
|
|
rtas_ibm_configure_pe);
|
|
|
|
spapr_rtas_register(RTAS_IBM_SLOT_ERROR_DETAIL,
|
|
|
|
"ibm,slot-error-detail",
|
|
|
|
rtas_ibm_slot_error_detail);
|
2012-08-07 20:10:33 +04:00
|
|
|
}
|
|
|
|
|
2012-08-20 21:08:05 +04:00
|
|
|
static void spapr_pci_register_types(void)
|
2012-03-12 21:50:24 +04:00
|
|
|
{
|
|
|
|
type_register_static(&spapr_phb_info);
|
|
|
|
}
|
2012-08-20 21:08:05 +04:00
|
|
|
|
|
|
|
type_init(spapr_pci_register_types)
|
2015-02-10 07:36:16 +03:00
|
|
|
|
|
|
|
static int spapr_switch_one_vga(DeviceState *dev, void *opaque)
|
|
|
|
{
|
|
|
|
bool be = *(bool *)opaque;
|
|
|
|
|
|
|
|
if (object_dynamic_cast(OBJECT(dev), "VGA")
|
2020-09-28 11:53:35 +03:00
|
|
|
|| object_dynamic_cast(OBJECT(dev), "secondary-vga")
|
|
|
|
|| object_dynamic_cast(OBJECT(dev), "bochs-display")
|
|
|
|
|| object_dynamic_cast(OBJECT(dev), "virtio-vga")) {
|
qom: Put name parameter before value / visitor parameter
The object_property_set_FOO() setters take property name and value in
an unusual order:
void object_property_set_FOO(Object *obj, FOO_TYPE value,
const char *name, Error **errp)
Having to pass value before name feels grating. Swap them.
Same for object_property_set(), object_property_get(), and
object_property_parse().
Convert callers with this Coccinelle script:
@@
identifier fun = {
object_property_get, object_property_parse, object_property_set_str,
object_property_set_link, object_property_set_bool,
object_property_set_int, object_property_set_uint, object_property_set,
object_property_set_qobject
};
expression obj, v, name, errp;
@@
- fun(obj, v, name, errp)
+ fun(obj, name, v, errp)
Chokes on hw/arm/musicpal.c's lcd_refresh() with the unhelpful error
message "no position information". Convert that one manually.
Fails to convert hw/arm/armsse.c, because Coccinelle gets confused by
ARMSSE being used both as typedef and function-like macro there.
Convert manually.
Fails to convert hw/rx/rx-gdbsim.c, because Coccinelle gets confused
by RXCPU being used both as typedef and function-like macro there.
Convert manually. The other files using RXCPU that way don't need
conversion.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200707160613.848843-27-armbru@redhat.com>
[Straightforward conflict with commit 2336172d9b "audio: set default
value for pcspk.iobase property" resolved]
2020-07-07 19:05:54 +03:00
|
|
|
object_property_set_bool(OBJECT(dev), "big-endian-framebuffer", be,
|
2015-02-10 07:36:16 +03:00
|
|
|
&error_abort);
|
|
|
|
}
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2020-12-09 20:00:49 +03:00
|
|
|
void spapr_pci_switch_vga(SpaprMachineState *spapr, bool big_endian)
|
2015-02-10 07:36:16 +03:00
|
|
|
{
|
2019-03-06 07:35:37 +03:00
|
|
|
SpaprPhbState *sphb;
|
2015-02-10 07:36:16 +03:00
|
|
|
|
|
|
|
/*
|
|
|
|
* For backward compatibility with existing guests, we switch
|
|
|
|
* the endianness of the VGA controller when changing the guest
|
|
|
|
* interrupt mode
|
|
|
|
*/
|
|
|
|
QLIST_FOREACH(sphb, &spapr->phbs, list) {
|
|
|
|
BusState *bus = &PCI_HOST_BRIDGE(sphb)->bus->qbus;
|
|
|
|
qbus_walk_children(bus, spapr_switch_one_vga, NULL, NULL, NULL,
|
|
|
|
&big_endian);
|
|
|
|
}
|
|
|
|
}
|