Commit Graph

1261 Commits

Author SHA1 Message Date
Suraj Jitindar Singh ee4d9ecc36 target/ppc: Move setting of patb_entry on hash table init
The patb_entry is used to store the location of the process table in
guest memory. The msb is also used to indicate the mmu mode of the
guest, that is patb_entry & 1 << 63 ? radix_mode : hash_mode.

Currently we set this to zero in spapr_setup_hpt_and_vrma() since if
this function gets called then we know we're hash. However some code
paths, such as setting up the hpt on incoming migration of a hash guest,
call spapr_reallocate_hpt() directly bypassing this higher level
function. Since we assume radix if the host is capable this results in
the msb in patb_entry being left set so in spapr_post_load() we call
kvmppc_configure_v3_mmu() and tell the host we're radix which as
expected means addresses cannot be translated once we actually run the cpu.

To fix this move the zeroing of patb_entry into spapr_reallocate_hpt().

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-11-27 12:20:11 +11:00
Thomas Huth bac658d1a4 hw/ppc/spapr: Fix virtio-scsi bootindex handling for LUNs >= 256
LUNs >= 256 have to be encoded with the so-called "flat space
addressing method" for virtio-scsi, where an additional bit has to
be set. SLOF already took care of this with the following commit:

 https://git.qemu.org/?p=SLOF.git;a=commitdiff;h=f72a37713fea47da
 (see https://bugzilla.redhat.com/show_bug.cgi?id=1431584 for details)

But QEMU does not use this encoding yet for device tree paths
that have to be handed over to SLOF to deal with the "bootindex"
property, so SLOF currently fails to boot from virtio-scsi devices
with LUNs >= 256 in the right boot order. Fix it by using the bit
to indicate the "flat space addressing method" for LUNs >= 256.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-11-22 15:28:37 +11:00
Greg Kurz 8251248394 spapr: reset DRCs after devices
A DRC with a pending unplug request releases its associated device at
machine reset time.

In the case of LMB, when all DRCs for a DIMM device have been reset,
the DIMM gets unplugged, causing guest memory to disappear. This may
be very confusing for anything still using this memory.

This is exactly what happens with vhost backends, and QEMU aborts
with:

qemu-system-ppc64: used ring relocated for ring 2
qemu-system-ppc64: qemu/hw/virtio/vhost.c:649: vhost_commit: Assertion
 `r >= 0' failed.

The issue is that each DRC registers a QEMU reset handler, and we
don't control the order in which these handlers are called (ie,
a LMB DRC will unplug a DIMM before the virtio device using the
memory on this DIMM could stop its vhost backend).

To avoid such situations, let's reset DRCs after all devices
have been reset.

Reported-by: Mallesh N. Koti <mallesh@linux.vnet.ibm.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-11-20 10:10:56 +11:00
Suraj Jitindar Singh 7abd43baec target/ppc: Update setting of cpu features to account for compat modes
The device tree nodes ibm,arch-vec-5-platform-support and ibm,pa-features
are used to communicate features of the cpu to the guest operating
system. The properties of each of these are determined based on the
selected cpu model and the availability of hypervisor features.
Currently the compatibility mode of the cpu is not taken into account.

The ibm,arch-vec-5-platform-support node is used to communicate the
level of support for various ISAv3 processor features to the guest
before CAS to inform the guests' request. The available mmu mode should
only be hash unless the cpu is a POWER9 which is not in a prePOWER9
compat mode, in which case the available modes depend on the
accelerator and the hypervisor capabilities.

The ibm,pa-featues node is used to communicate the level of cpu support
for various features to the guest os. This should only contain features
relevant to the operating mode of the processor, that is the selected
cpu model taking into account any compat mode. This means that the
compat mode should be taken into account when choosing the properties of
ibm,pa-features and they should match the compat mode selected, or the
cpu model selected if no compat mode.

Update the setting of these cpu features in the device tree as described
above to properly take into account any compat mode. We use the
ppc_check_compat function which takes into account the current processor
model and the cpu compat mode.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-11-20 10:07:49 +11:00
Sam Bobroff e05fba5004 target/ppc: correct htab shift for hash on radix
KVM HV will soon support running a guest in hash mode on a POWER9 host
running in radix mode (see [1]), however the guest currently fails to
boot.

This is because the "htab_shift" value (the size of the MMU's hash
table) is added to the device tree before KVM has had a chance to
change it. If the host is in hash mode, KVM does not need to change it
and so the problem is not seen, but when the host is in radix mode a
change is required and we see a problem.

To fix this, move the call spapr_setup_hpt_and_vrma() (where
htab_shift could be changed) up a little so that it's called before
spapr_h_cas_compose_response() (where htab_shift is added to the
device tree).

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>

[1] See http://www.spinics.net/lists/kvm-ppc/msg13057.html
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-11-14 10:28:32 +11:00
Michael Davidsaver c91c187f71 e500: ppce500_init_mpic() return device instead of IRQ array
Actual number of interrupt pins isn't known
in ppce500_init_mpic() so a hardcoded number
was used, which causes a crash with older openpic.

Instead, return the DeviceState* and change ppce500_init()
to call qdev_get_gpio_in() to get only the irq pins
which are needed.

Signed-off-by: Michael Davidsaver <mdavidsaver@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-11-08 13:21:37 +11:00
Greg Kurz e7cca3e94f spapr_cpu_core: rewrite machine type sanity check
This makes the code easier to understand and it is consistent with what
we already do for PHBs.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-10-17 10:34:01 +11:00
Greg Kurz f7d6bfcdc0 spapr_pci: fail gracefully with non-pseries machine types
QEMU currently crashes when the user tries to add an spapr-pci-host-bridge
on a non-pseries machine:

$ qemu-system-ppc64 -M ppce500 -device spapr-pci-host-bridge,index=1
hw/ppc/spapr_pci.c:1535:spapr_phb_realize:
Object 0x1003dacae60 is not an instance of type spapr-machine
Aborted (core dumped)

The same thing happens with the deprecated but still available child type
spapr-pci-vfio-host-bridge.

Fix both by checking the machine type with object_dynamic_cast().

Reviewed-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-10-17 10:34:01 +11:00
David Gibson db50f280cf spapr: Correct RAM size calculation for HPT resizing
In order to prevent the guest from forcing the allocation of large amounts
of qemu memory (or host kernel memory, in the case of KVM HV), we limit
the size of Hashed Page Table (HPT) it is allowed to allocated, based on
its RAM size.

However, the current calculation is not correct: it only adds up the size
of plugged memory, ignoring the base memory size.  This patch corrects it.

While we're there, use get_plugged_memory_size() instead of directly
calling pc_existing_dimms_capacity().  The only difference is that it
will abort on failure, which is right: a failure here indicates something
wrong within qemu.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2017-10-17 10:34:01 +11:00
Igor Mammedov beba5c0fe4 ppc: pnv: consolidate type definitions and batch register them
Use a new DEFINE_TYPES() helper to simplify type registration

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-10-17 10:34:01 +11:00
Igor Mammedov 40abf43f72 ppc: pnv: drop PnvChipClass::cpu_model field
deduce core type directly from chip type instead of
maintaining type mapping in PnvChipClass::cpu_model.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-10-17 10:34:01 +11:00
Igor Mammedov 7383af1edc ppc: pnv: define core types statically
pnv core type definition doesn't have any fields that
require it to be defined at runtime. So replace code
that fills in TypeInfo at runtime with static TypeInfo
array that does the same at complie time.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-10-17 10:34:01 +11:00
Igor Mammedov 35bdb9def2 ppc: pnv: drop PnvCoreClass::cpu_oc field
deduce cpu type directly from core type instead of
maintaining type mapping in PnvCoreClass::cpu_oc and doing
extra cpu_model parsing in pnv_core_class_init()

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-10-17 10:34:01 +11:00
Igor Mammedov 7fd544d8a7 ppc: pnv: normalize core/chip type names
typically for cpus/core type names following convention is used

   new_type_prefix-superclass_typename

make PNV core/chip to follow common convention.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-10-17 10:34:01 +11:00
Igor Mammedov 4a12c699d3 ppc: pnv: use generic cpu_model parsing
use common cpu_model prasing in vl.c and set default cpu_model
using generic MachineClass::default_cpu_type.

Beside of switching to generic infrastructure it solves several
issues.

 * ppc_cpu_class_by_name() is used to deal with lower/upper case
   and alias translations into actual cpu type, which fixes
    '-M powernv -cpu power8' and '-M powernv -cpu power9_v1.0'
   usecases which error out with:
    'invalid CPU model 'FOO' for powernv machine'
 * allows to switch to lower-case typenames in pnv chip/core name
   (by convention typnames should be lower-case)
 * replace aliased names /power8, power9, .../ with exact cpu model
   names (i.e. typenames should be stable but aliases might decide to
   point to other cpu model withi family or changed by kvm). It will
   also help to simplify pnv_chip/core code and get rid of dependency
   on cpu_model parsing.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
[dwg: Updated to make DD2.0 as default POWER9 chip]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-10-17 10:34:01 +11:00
Igor Mammedov 2e9c10eba0 ppc: spapr: use generic cpu_model parsing
use generic cpu_model parsing introduced by
 (6063d4c0f vl.c: convert cpu_model to cpu type and set of global properties before machine_init())

it allows to:
  * replace sPAPRMachineClass::tcg_default_cpu with
    MachineClass::default_cpu_type
  * drop cpu_parse_cpu_model() from hw/ppc/spapr.c and reuse
    one in vl.c
  * simplify spapr_get_cpu_core_type() by removing
    not needed anymore recurrsion since alias look up
    happens earlier at vl.c and spapr_get_cpu_core_type()
    works only with resulted from that cpu type.
  * spapr no more needs to parse/depend on being phased out
    MachineState::cpu_model, all tha parsing done by generic
    code and target specific callback.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
[dwg: Correct minor compile error]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-10-17 10:34:01 +11:00
Igor Mammedov 17be88a713 ppc: spapr: use cpu model names as tcg defaults instead of aliases
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-10-17 10:34:01 +11:00
Igor Mammedov 5bbb264186 ppc: spapr: register 'host' core type along with the rest of core types
consolidate 'host' core type registration by moving it from
KVM specific code into spapr_cpu_core.c, similar like it's
done in x86 target.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-10-17 10:34:00 +11:00
Igor Mammedov b51d3c8818 ppc: spapr: use cpu type name directly
replace sPAPRCPUCoreClass::cpu_class with cpu type name
since it were needed just to get that at points it were
accessed.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-10-17 10:34:00 +11:00
Igor Mammedov 44cd95e31a ppc: spapr: define core types statically
spapr core type definition doesn't have any fields that
require it to be defined at runtime. So replace code
that fills in TypeInfo at runtime with static TypeInfo
array that does the same at complie time.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-10-17 10:34:00 +11:00
Igor Mammedov b8e999673b ppc: move '-cpu foo,compat=xxx' parsing into ppc_cpu_parse_featurestr()
there is a dedicated callback CPUClass::parse_features
which purpose is to convert -cpu features into a set of
global properties AND deal with compat/legacy features
that couldn't be directly translated into CPU's properties.

Create ppc variant of it (ppc_cpu_parse_featurestr) and
move 'compat=val' handling from spapr_cpu_core.c into it.
That removes a dependency of board/core code on cpu_model
parsing and would let to reuse common -cpu parsing
introduced by 6063d4c0

Set "max-cpu-compat" property only if it exists, in practice
it should limit 'compat' hack to spapr machine and allow
to avoid including machine/spapr headers in target/ppc/cpu.c

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-10-17 10:34:00 +11:00
Igor Mammedov a1063aa8a5 ppc: spapr: replace ppc_cpu_parse_features() with cpu_parse_cpu_model()
ppc_cpu_parse_features() is doing practically the same thing as
generic cpu_parse_cpu_model(). So remove duplicated impl. and
reuse generic one.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-10-17 10:34:00 +11:00
Igor Mammedov 23ec69ecf9 ppc: 40p/prep: replace cpu_model with cpu_type
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-10-17 10:34:00 +11:00
Igor Mammedov 6bab8eaa95 ppc: virtex-ml507: replace cpu_model with cpu_type
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-10-17 10:34:00 +11:00
Igor Mammedov 9391b8c563 ppc: replace cpu_model with cpu_type on ref405ep,taihu boards
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-10-17 10:34:00 +11:00
Igor Mammedov 376d7a2abb ppc: bamboo: use generic cpu_model parsing
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-10-17 10:34:00 +11:00
Igor Mammedov f4c6604e86 ppc: mac_oldworld: use generic cpu_model parsing
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-10-17 10:34:00 +11:00
Igor Mammedov 9dff4c07e1 ppc: mac_newworld: use generic cpu_model parsing
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-10-17 10:34:00 +11:00
Igor Mammedov 59e816fd3e ppc: mpc8544ds/e500plat: use generic cpu_model parsing
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-10-17 10:34:00 +11:00
Daniel Henrique Barboza 2a129767eb hw/ppc/spapr.c: abort unplug_request if previous unplug isn't done
LMB removal is completed only when the spapr_lmb_release callback
is called after all DRCs of the dimm are detached. During this
time, it is possible that a unplug request for the same dimm
arrives, trying to detach DRCs that were detached by the guest
in the first unplug_request.

BQL doesn't help in this case - the lock will prevent any concurrent
removal from happening until the end of spapr_memory_unplug_request
only. What happens is that the second unplug_request ends up calling
spapr_drc_detach in a DRC that were detached already, causing an
assert error in spapr_drc_detach (e.g
https://bugs.launchpad.net/qemu/+bug/1718118).

spapr_lmb_release uses a structure called sPAPRDIMMState, stored in the
spapr->pending_dimm_unplugs QTAIL, to track how many LMB DRCs are left
to be detached by the guest. When there are no more DRCs left, this
structure is deleted and the pc-dimm unplug handler is called to
finish the process.

This patch reuses the sPAPRDIMMState to allow unplug_request to know
if there is an ongoing unplug process for a given dimm, aborting the
unplug request in this case, by doing the following changes:

- in spapr_lmb_release callback, move the dimm state removal to the
end, after pc-dimm unplug handler. With this change we can check for
the existence of the dimm state to see if the unplug process is
done.

- use spapr_pending_dimm_unplugs_find in spapr_memory_unplug_request
to check if the dimm state exists. If positive, there is an unplug
operation already in progress for this dimm, meaning that we should
abort it and warn the user about it.

Fixes: https://bugs.launchpad.net/qemu/+bug/1718118
Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-10-17 10:34:00 +11:00
David Gibson 1ed9c8af50 target/ppc: Add POWER9 DD2.0 model information
At the moment the only POWER9 model which is listed in qemu is v1.0 (aka
"DD1").  This is a very early (read, buggy) version which will never be
released to the public - it was included in qemu only for the convenience
of those doing bringup on the early silicon.  For bonus points, we actually
had its PVR incorrect in the table (0x004e0000 instead of 0x004e0100).  We
also never actually implemented the differences in behaviour (read, bugs)
that marked DD1 in qemu.

Now that we know the PVR for the substantially better v2.0 (DD2) chip,
include it and make it the default POWER9 in qemu.  For the time being we
leave the DD1 definition in place for the poor souls (read, me) who still
need to work with DD1 hardware.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-10-17 10:34:00 +11:00
Greg Kurz 827b17c468 spapr: sanity check size of the CAS buffer
The CAS buffer is provided by SLOF. A broken SLOF could pass a silly
size: either smaller than the diff header, in which case the current
code will try to allocate 16 Exabytes of memory and g_malloc0() will
abort, or bigger than the maximum memory provisioned for SLOF (ie,
40 Megabytes), which doesn't make sense. Both cases indicate that
SLOF has a bug.

Let's print out an explicit error message and exit since rebooting as
we do with other errors would only result in a reset loop.

Signed-off-by: Greg Kurz <groug@kaod.org>
[dwg: Fix format specifier that broke 32-bit builds]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-10-17 10:34:00 +11:00
Greg Kurz dc1b5eee86 spapr: fix OF word name in comment
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-10-17 10:34:00 +11:00
Greg Kurz a4f3885c74 hw/ppc: use 0 instead of fdt_path_offset(fdt, "/")
The offset of the root node is guaranteed to be 0.

This doesn't fix anything, it's just trivial cleanup of the two
remaining places where this was done under hw/ppc.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-10-17 10:34:00 +11:00
Eduardo Habkost fd3b02c889 pci: Add INTERFACE_CONVENTIONAL_PCI_DEVICE to Conventional PCI devices
Add INTERFACE_CONVENTIONAL_PCI_DEVICE to all direct subtypes of
TYPE_PCI_DEVICE, except:

1) The ones that already have INTERFACE_PCIE_DEVICE set:

* base-xhci
* e1000e
* nvme
* pvscsi
* vfio-pci
* virtio-pci
* vmxnet3

2) base-pci-bridge

Not all PCI bridges are Conventional PCI devices, so
INTERFACE_CONVENTIONAL_PCI_DEVICE is added only to the subtypes
that are actually Conventional PCI:

* dec-21154-p2p-bridge
* i82801b11-bridge
* pbm-bridge
* pci-bridge

The direct subtypes of base-pci-bridge not touched by this patch
are:

* xilinx-pcie-root: Already marked as PCIe-only.
* pcie-pci-bridge: Already marked as PCIe-only.
* pcie-port: all non-abstract subtypes of pcie-port are already
  marked as PCIe-only devices.

3) megasas-base

Not all megasas devices are Conventional PCI devices, so the
interface names are added to the subclasses registered by
megasas_register_types(), according to information in the
megasas_devices[] array.

"megasas-gen2" already implements INTERFACE_PCIE_DEVICE, so add
INTERFACE_CONVENTIONAL_PCI_DEVICE only to "megasas".

Acked-by: Alberto Garcia <berto@igalia.com>
Acked-by: John Snow <jsnow@redhat.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-10-15 05:54:43 +03:00
Peter Maydell ab16152926 Migration pull 2017-09-27
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJZy64HAAoJEAUWMx68W/3nTqwP/A5Gx4Qwkv5KKdpM0YLq//d+
 OODmzl7Ni3a5Up1ETqGdLb84estrgY+5DISp73Rkt4a5tbT7+XKrhb4qD+93NnTe
 zynY9in4C1jGxYm7YzeOhwSeIiuLZMTCLQlGdYw7/nunIFwkItUEvAFx3AG1WCJe
 2Mk0lvmg4LikruDDMdzqZaJu7h5RU5sQjA7SsyrTBdsN7tNWl3rKLYGXwgzv0uz5
 n2xkUgzvvnj1Bk/Adojkn05yxA86xKD/4rhFED9fjNVSjAGHMrHIWOJ70V26Cg5w
 3gJ+5mesWsH+erf0JFYv0S38SyFbmIOE39Nn13D/d0o1x89P8B8cgqbi3ADTKM77
 875wuIVnZzi2vIwVdxXQ9GHQ79cpXwr2fOfQ2rjT6Ll95K+u/MQG86fQiO0eJW+0
 KwQVCwwh+HmCUcCogMuxAc9+F8C8qolwCi/9QXwS2yLBElHKaWDIMyTce36cW9d7
 cZaKIOeSJUGNFoaWZnXN88MRuOYbdywTl+GddVAW3+VJCTYV2oi0o5fsTfxXy5AV
 y7uYo/pcSj2gSZJ5GairMlB6p5iXnE8yusi1e4ZKA1x1TaSHSb6zR59lRUFr+j/L
 JhUCfA85v5/elGqgkYp6UhSzFDJ2ID2oSEMQTIzfVrinOXtnf2KEh33YMbUH5qyo
 yHVEu12uPe9rE6A0vWlu
 =/+LV
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/dgilbert/tags/pull-migration-20170927a' into staging

Migration pull 2017-09-27

# gpg: Signature made Wed 27 Sep 2017 14:56:23 BST
# gpg:                using RSA key 0x0516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A  9FA9 0516 331E BC5B FDE7

* remotes/dgilbert/tags/pull-migration-20170927a:
  migration: Route more error paths
  migration: Route errors up through vmstate_save
  migration: wire vmstate_save_state errors up to vmstate_subsection_save
  migration: Check field save returns
  migration: check pre_save return in vmstate_save_state
  migration: pre_save return int
  migration: disable auto-converge during bulk block migration

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-09-27 22:44:51 +01:00
Dr. David Alan Gilbert 44b1ff319c migration: pre_save return int
Modify the pre_save method on VMStateDescription to return an int
rather than void so that it potentially can fail.

Changed zillions of devices to make them return 0; the only
case I've made it return non-0 is hw/intc/s390_flic_kvm.c that already
had an error_report/return case.

Note: If you add an error exit in your pre_save you must emit
an error_report to say why.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20170925112917.21340-2-dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-09-27 11:35:59 +01:00
Mark Cave-Ayland e451b85f1b macio: use object link between MACIO_IDE and MAC_DBDMA object
Using a standard QOM object link we can pass a reference to the MAC_DBDMA
controller to the MACIO_IDE object which removes the last external parameter
to macio_ide_register_dma().

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-27 13:05:41 +10:00
Mark Cave-Ayland 0fc84331d6 macio: pass channel into MACIOIDEState via qdev property
One of the reasons macio_ide_register_dma() needs to exist is because the
channel id isn't passed into the MACIO_IDE object. Pass in the channel id
using a qdev property to remove this requirement.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-27 13:05:41 +10:00
Greg Kurz 1ec26c757d spapr: fix the value of SDR1 in kvmppc_put_books_sregs()
When running with KVM PR, if a new HPT is allocated we need to inform
KVM about the HPT address and size. This is currently done by hacking
the value of SDR1 and pushing it to KVM in several places.

Also, migration breaks the guest since it is very unlikely the HPT has
the same address in source and destination, but we push the incoming
value of SDR1 to KVM anyway.

This patch introduces a new virtual hypervisor hook so that the spapr
code can provide the correct value of SDR1 to be pushed to KVM each
time kvmppc_put_books_sregs() is called.

It allows to get rid of all the hacking in the spapr/kvmppc code and
it fixes migration of nested KVM PR.

Suggested-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-27 13:05:41 +10:00
Cédric Le Goater 15fcedb26f ppc/pnv: check for OPAL firmware file presence
and exit before uselessly trying to load it if the file does not
exists.

Issue discovered by Coverity Scan.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-27 13:05:41 +10:00
Greg Kurz 30b3bc5aa9 spapr_pci: make index property mandatory
PHBs can be created with an index property, in which case the machine
code automatically sets all the MMIO windows at addresses derived from
the index. Alternatively, they can be manually created without index,
but the user has to provide addresses for all MMIO windows.

The non-index way happens to be more trouble than it's worth: it's
difficult to use, keeps requiring (potentially incompatible) changes
when some new parameter needs adding, and is awkward to check for
collisions. It currently even has a bug that prevents to use two
non-index PHBs because their child DRCs are all derived from the
same index == -1 value, and, thus, collide.

This patch hence makes the index property mandatory. As a consequence,
the PHB's memory regions and BUID are now always configured according
to the index, and it is no longer possible to set them from the command
line.

This DOES BREAK backwards compat, but we don't think the non-index
PHB feature was used in practice (at least libvirt doesn't) and the
simplification is worth it.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-27 13:05:41 +10:00
Greg Kurz 332f7721cb spapr: introduce helpers to migrate HPT chunks and the end marker
This consolidates some duplicated code in a dedicated helpers.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-27 13:05:41 +10:00
Greg Kurz 14b0d74887 ppc/kvm: generalize the use of kvmppc_get_htab_fd()
The use of KVM_PPC_GET_HTAB_FD is open-coded in kvmppc_read_hptes()
and kvmppc_write_hpte().

This patch modifies kvmppc_get_htab_fd() so that it can be used
everywhere we need to access the in-kernel htab:
- add an index argument
  => only kvmppc_read_hptes() passes an actual index, all other users
     pass 0
- add an errp argument to propagate error messages to the caller.
  => spapr migration code prints the error
  => hpte helpers pass &error_abort to keep the current behavior
     of hw_error()

While here, this also fixes a bug in kvmppc_write_hpte() so that it
opens the htab fd for writing instead of reading as it currently does.
This never broke anything because we currently never call this code,
as explained in the changelog of commit c1385933804bb:

"This support updating htab managed by the hypervisor. Currently
 we don't have any user for this feature. This actually bring the
 store_hpte interface in-line with the load_hpte one. We may want
 to use this when we want to emulate henter hcall in qemu for HV
 kvm."

The above is still true today.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-27 13:05:41 +10:00
Greg Kurz 82be8e7394 ppc/kvm: change kvmppc_get_htab_fd() to return -errno on error
When kvmppc_get_htab_fd() fails, its return value is propagated up to
qemu_savevm_state_iterate() or to qemu_savevm_state_complete_precopy().
All savevm handlers expect to receive a negative errno on error.

Let's patch kvmppc_get_htab_fd() accordingly.

While here, let's change htab_load() in the spapr code to also
propagate the error, since it doesn't make sense to abort() if
we couldn't get the htab fd from KVM.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-27 13:05:41 +10:00
Benjamin Herrenschmidt 58b6283586 ppc: Fix OpenPIC model
Apple uses an IBM MPIC2A without timers, it has 64 sources.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-27 13:05:41 +10:00
Benjamin Herrenschmidt 4f7265ff17 ppc/ide/macio: Add missing registers
The timing register exists on all variants of MacIO IDE, we just
store and return its value.

The interrupts register only exists on KeyLargo but it doesn't
hurt to have it. The lack of this register causes MacOS X to
hangs under some circumstances.

Both are 32-bit only. The HW might support smaller access sizes
but no known OS uses them.

Because the core IDE subsystem doesn't provide us with a way
to query the main (level) interrupt state, nor do we have a way
to know that DBDMA issued a (edge) interrupt, we reflect both
through a private pair of qirq's in order to maintain the
register state.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-27 13:05:41 +10:00
Benjamin Herrenschmidt 3c0622897e ppc/mac: Advertise a high clock frequency for NewWorld Macs
We use 900Mhz, otherwise MacOS X 10.5 refuses to install.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-27 13:05:41 +10:00
Mark Cave-Ayland c8bd35260d ppc: QOMify g3beige machine
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-27 13:05:41 +10:00
BALATON Zoltan 4c46f372b0 ppc4xx: Add more PLB registers
These registers are present in 440 SoCs (and maybe in others too) and
U-Boot accesses them when printing register info. We don't emulate
these but add them to avoid crashing when they are read or written.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-27 13:05:41 +10:00
Kamil Rytarowski 39d96847c9 Replace round_page() with TARGET_PAGE_ALIGN()
This change fixes conflict with the DragonFly BSD headers.

Signed-off-by: Kamil Rytarowski <n54@gmx.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-09-26 09:06:02 +03:00
Igor Mammedov 79e0793614 numa: cpu: calculate/set default node-ids after all -numa CLI options are parsed
Calculating default node-ids for CPUs in possible_cpu_arch_ids()
is rather fragile since defaults calculation uses nb_numa_nodes but
callback might be potentially called early before all -numa CLI
options are parsed, which would lead to cpus assigned only upto
nb_numa_nodes at the time possible_cpu_arch_ids() is called.

Issue was introduced by
(7c88e65 numa: mirror cpu to node mapping in MachineState::possible_cpus)
and for example CLI:
  -smp 4 -numa node,cpus=0 -numa node
would set props.node-id in possible_cpus array for every non
explicitly mapped CPU to the first node.

Issue is not visible to guest nor to mgmt interface due to
  1) implictly mapped cpus are forced to the first node in
     case of partial mapping
  2) in case of default mapping possible_cpu_arch_ids() is
     called after all -numa options are parsed (resulting
     in correct mapping).

However it's fragile to rely on late execution of
possible_cpu_arch_ids(), therefore add machine specific
callback that returns node-id for CPU and use it to calculate/
set defaults at machine_numa_finish_init() time when all -numa
options are parsed.

Reported-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1496314408-163972-1-git-send-email-imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-09-19 16:51:33 -03:00
Igor Mammedov 4482e05cbb cpu: make cpu_generic_init() abort QEMU on error
Almost every user of cpu_generic_init() checks for
returned NULL and then reports failure in a custom way
and aborts process.
Some users assume that call can't fail and don't check
for failure, though they should have checked for it.

In either cases cpu_generic_init() failure is fatal,
so instead of checking for failure and reporting
it various ways, make cpu_generic_init() report
errors in consistent way and terminate QEMU on failure.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <1505318697-77161-3-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-09-19 09:09:32 -03:00
Greg Kurz d492a75cfe spapr_events: use QTAILQ_FOREACH_SAFE() in spapr_clear_pending_events()
QTAILQ_FOREACH_SAFE() must be used when removing the current element
inside the loop block.

This fixes a user-after-free error introduced by commit 5625817423
and reported by Coverity (CID 1381017).

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-15 10:29:48 +10:00
Greg Kurz 3b2fcedd52 spapr_cpu_core: cleaning up qdev_get_machine() calls
This patch removes the qdev_get_machine() calls that are made
in spapr_cpu_core.c in situations where we can get an existing
pointer for the MachineState by either passing it as an argument
to the function or by using other already available pointers.

Credits to Daniel Henrique Barboza for the idea and the changelog
text.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-15 10:29:48 +10:00
Greg Kurz 96dbc9af35 spapr_pci: don't create 64-bit MMIO window if we don't need to
When running a pseries-2.2 or older machine type, we get the following
lines in info mtree:

address-space: memory
...
ffffffffffffffff-ffffffffffffffff (prio 0, i/o): alias
 pci@800000020000000.mmio64-alias @pci@800000020000000.mmio
  ffffffffffffffff-ffffffffffffffff

address-space: cpu-memory
...
ffffffffffffffff-ffffffffffffffff (prio 0, i/o): alias
 pci@800000020000000.mmio64-alias @pci@800000020000000.mmio
  ffffffffffffffff-ffffffffffffffff

The same thing occurs when running a pseries-2.7 with

    -global spapr-pci-host-bridge.mem_win_size=2147483648

This happens because we always create a 64-bit MMIO window, even if
we didn't explicitely requested it (ie, mem64_win_size == 0) and the
32-bit window is below 2GiB. It doesn't seem to have an impact on the
guest though because spapr_populate_pci_dt() doesn't advertise the
bogus windows when mem64_win_size == 0.

Since these memory regions don't induce any state, we can safely
choose to not create them when their address is equal to -1,
without breaking migration from existing setups.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-15 10:29:48 +10:00
Greg Kurz 1d36da769a spapr_pci: convert sprintf() to g_strdup_printf()
In order to follow a QEMU common practice.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-15 10:29:48 +10:00
Greg Kurz 1bbadc759e spapr_cpu_core: fail gracefully with non-pseries machine types
Since commit 7cca3e466e ("ppc: spapr: Move VCPU ID calculation into
sPAPR"), QEMU aborts when started with a *-spapr-cpu-core device and
a non-pseries machine.

Let's rely on the already existing call to object_dynamic_cast() instead
of using the SPAPR_MACHINE() macro.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-15 10:29:48 +10:00
Greg Kurz 9ba255365e spapr_pci: handle FDT creation errors with _FDT()
libfdt failures when creating the FDT should cause QEMU to terminate.

Let's use the _FDT() macro which does just that instead of propagating
the error to the caller. spapr_populate_pci_child_dt() no longer needs
to return a value in this case.

Note that, on the way, this get rids of the following nonsensical lines:

    g_assert(!ret);
    if (ret) {

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-15 10:29:48 +10:00
Greg Kurz 99372e785e spapr_pci: use the common _FDT() helper
All other users in hw/ppc already consider an error when building
the FDT to be fatal, even on hotplug paths. There's no valid reason
for spapr_pci to behave differently. So let's used the common _FDT()
helper which terminates QEMU when libfdt fails.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-15 10:29:48 +10:00
Cédric Le Goater 30bf9ed168 spapr: fix CAS-generated reset
The OV5_MMU_RADIX_300 requires special handling in the CAS negotiation
process. It is cleared from the option vector of the guest before
evaluating the changes and re-added later. But, when testing for a
possible CAS reset :

    spapr->cas_reboot = spapr_ovec_diff(ov5_updates,
                                        ov5_cas_old, spapr->ov5_cas);

the bit OV5_MMU_RADIX_300 will each time be seen as removed from the
previous OV5 set, hence generating a reset loop.

Fix this problem by also clearing the same bit in the ov5_cas_old set.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-15 10:29:48 +10:00
Cédric Le Goater 21f3f8db0e ppc/xive: fix OV5_XIVE_EXPLOIT bits
On POWER9, the Client Architecture Support (CAS) negotiation process
determines whether the guest operates in XIVE Legacy compatibility or
in XIVE exploitation mode. Now that we have initial guest support for
the XIVE interrupt controller, let's fix the bits definition which have
evolved in the latest specs.

The platform advertises the XIVE Exploitation Mode support using the
property "ibm,arch-vec-5-platform-support-vec-5", byte 23 bits 0-1 :

 - 0b00 XIVE legacy mode Only
 - 0b01 XIVE exploitation mode Only
 - 0b10 XIVE legacy or exploitation mode

The OS asks for XIVE Exploitation Mode support using the property
"ibm,architecture-vec-5", byte 23 bits 0-1:

 - 0b00 XIVE legacy mode Only
 - 0b01 XIVE exploitation mode Only

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-15 10:29:48 +10:00
Greg Kurz 4c563d9df5 spapr: only update SDR1 once per-cpu during CAS
Commit b55d295e3e added the possibility to support HPT resizing with KVM.
In the case of PR, we need to pass the userspace address of the HPT to KVM
using the SDR1 slot.
This is handled by kvmppc_update_sdr1() which uses CPU_FOREACH() to update
all CPUs. It is hence not needed to call kvmppc_update_sdr1() for each CPU.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-15 10:29:48 +10:00
Greg Kurz 549ce59e2b spapr_pci: use g_strdup_printf()
Building strings with g_strdup_printf() instead of snprintf() is
a QEMU common practice.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-15 10:29:48 +10:00
Greg Kurz d049bde69d spapr_pci: drop useless check in spapr_populate_pci_child_dt()
spapr_phb_get_loc_code() either returns a non-null pointer, or aborts
if g_strdup_printf() failed to allocate memory.

Signed-off-by: Greg Kurz <groug@kaod.org>
[dwg: Grammatical fix to commit message]
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-15 10:29:48 +10:00
Greg Kurz 8f68760561 spapr_pci: drop useless check in spapr_phb_vfio_get_loc_code()
g_strdup_printf() either returns a non-null pointer, or aborts if it
failed to allocate memory.

Signed-off-by: Greg Kurz <groug@kaod.org>
[dwg: Grammatical fix to commit message]
Acked-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-15 10:29:48 +10:00
Daniel Henrique Barboza c86c1affae hw/ppc/spapr.c: cleaning up qdev_get_machine() calls
This patch removes the qdev_get_machine() calls that are made in
spapr.c in situations where we can get an existing pointer for
the MachineState by either passing it as an argument to the function
or by using other already available pointers.

The following changes were made:

- spapr_node0_size: static function that is called two times:
at spapr_setup_hpt_and_vrma and ppc_spapr_init. In both cases we can
pass an existing MachineState pointer to it.

- spapr_build_fdt: MachineState pointer can be retrieved from
the existing sPAPRMachineState pointer.

- spapr_boot_set: the opaque in the first arg is a sPAPRMachineState
pointer as we can see inside ppc_spapr_init:

    qemu_register_boot_set(spapr_boot_set, spapr);

We can get a MachineState pointer from it.

- spapr_machine_device_plug and spapr_machine_device_unplug_request: the
MachineState, sPAPRMachineState, MachineClass and sPAPRMachineClass pointers
can all be retrieved from the HotplugHandler pointer.

Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-15 10:29:48 +10:00
Sam Bobroff 7cca3e466e ppc: spapr: Move VCPU ID calculation into sPAPR
Move the calculation of a CPU's VCPU ID out of the generic PPC code
(ppc_cpu_realizefn()) and into sPAPR specific code
(spapr_cpu_core_realize()) where it belongs.

Unfortunately, due to the way things are ordered, we still need to
default the VCPU ID in ppc_cpu_realizfn() but at least doing that
doesn't require any interaction with sPAPR.

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-08 09:30:55 +10:00
Igor Mammedov c5354f54aa ppc: make cpu_model translation to type consistent
PPC handles -cpu FOO rather incosistently,
i.e. it does case-insensitive matching of FOO to
a CPU type (see: ppc_cpu_compare_class_name) but
handles alias names as case-sensitive, as result:

 # qemu-system-ppc64 -M mac99 -cpu g3
 qemu-system-ppc64: unable to find CPU model ' kN�U'

 # qemu-system-ppc64 -cpu 970MP_V1.1
 qemu-system-ppc64: Unable to find sPAPR CPU Core definition

while

 # qemu-system-ppc64 -M mac99 -cpu G3
 # qemu-system-ppc64 -cpu 970MP_v1.1

start up just fine.

Considering we can't take case-insensitive matching away,
make it case-insensitive for  all alias/type/core_type
lookups.

As side effect it allows to remove duplicate core types
which are the same except of using different cased letters in name.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-08 09:30:55 +10:00
Sam Bobroff fa98fbfcdf PPC: KVM: Support machine option to set VSMT mode
KVM now allows writing to KVM_CAP_PPC_SMT which has previously been
read only. Doing so causes KVM to act, for that VM, as if the host's
SMT mode was the given value. This is particularly important on Power
9 systems because their default value is 1, but they are able to
support values up to 8.

This patch introduces a way to control this capability via a new
machine property called VSMT ("Virtual SMT"). If the value is not set
on the command line a default is chosen that is, when possible,
compatible with legacy systems.

Note that the intialization of KVM_CAP_PPC_SMT has changed slightly
because it has changed (in KVM) from a global capability to a
VM-specific one. This won't cause a problem on older KVMs because VM
capabilities fall back to global ones.

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-08 09:30:55 +10:00
Greg Kurz cc7b35b169 spapr: fallback to raw mode if best compat mode cannot be set during CAS
KVM PR doesn't allow to set a compat mode. This causes ppc_set_compat_all()
to fail and we return H_HARDWARE to the guest right away.

This is excessive: even if we favor compat mode since commit 152ef803ce,
we should at least fallback to raw mode if the guest supports it.

This patch modifies cas_check_pvr() so that it also reports that the real
PVR was found in the table supplied by the guest. Note that this is only
makes sense if raw mode isn't explicitely disabled (ie, the user didn't
set the machine "max-cpu-compat" property). If this is the case, we can
simply ignore ppc_set_compat_all() failures, and let the guest run in raw
mode.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-08 09:30:55 +10:00
Thomas Huth 2363d5ee23 hw/ppc/spapr_cpu_core: Add a proper check for spapr machine
QEMU currently crashes when the user tries to add a spapr-cpu-core
on a non-pseries machine:

$ qemu-system-ppc64 -S -machine ppce500,accel=tcg \
                    -device POWER5+_v2.1-spapr-cpu-core
hw/ppc/spapr_cpu_core.c:178:spapr_cpu_core_realize_child:
Object 0x55cee1f55160 is not an instance of type spapr-machine
Aborted (core dumped)

So let's add a proper check for the correct machine time with
a more friendly error message here.

Reported-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-08 09:30:55 +10:00
BALATON Zoltan 97c2acb550 ppc4xx: Export ECB and PLB emulation
Make these device models available outside ppc405_uc.c for reuse in
460EX emulation. They are left in their current place for now because
they are used mostly unchanged and I'm not sure these correctly model
the components in 440 SoCs (but they seem to be good enough). These
functions could be moved in a subsequent clean up series when this is
confirmed.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-08 09:30:55 +10:00
BALATON Zoltan 44a935d8c3 ppc4xx_i2c: Move to hw/i2c
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-08 09:30:55 +10:00
BALATON Zoltan 3b09bb0fb9 ppc4xx_i2c: QOMify
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-08 09:30:55 +10:00
BALATON Zoltan 65ca801bf4 ppc4xx: Split off 4xx I2C emulation from ppc405_uc to its own file
This device appears in other SoCs as well not just in 405 ones and
subsequent patches will modify it, so move it out of ppc405_uc.c in
preparation

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-08 09:30:55 +10:00
BALATON Zoltan 0453428047 ppc4xx: Make MAL emulation more generic
Allow MAL with more RX and TX channels as found in newer versions.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-08 09:30:55 +10:00
BALATON Zoltan 517284a771 ppc4xx: Move MAL from ppc405_uc to ppc4xx_devs
This device appears in other SoCs as well not just in 405 ones

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-08 09:30:55 +10:00
Alexey Kardashevskiy f5509b6beb spapr_iommu: Realloc guest visible TCE table when hot(un)plugging vfio-pci
This replaces g_malloc() with spapr_tce_alloc_table() as this is
the standard way of allocating tables and this allows moving the table
back to KVM when unplugging a VFIO PCI device and VFIO TCE acceleration
support is not present in the KVM.

Although spapr_tce_alloc_table() is expected to fail with EBUSY
if called when previous fd is not closed yet, in practice we will not
see it because cap_spapr_vfio is false at the moment.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-08 09:30:55 +10:00
Sam Bobroff 2e886fb391 ppc: spapr: Make VCPU ID handling private to SPAPR
The concept of a VCPU ID that differs from the CPU's index
(cpu->cpu_index) exists only within SPAPR machines so, move the
functions ppc_get_vcpu_id() and ppc_get_cpu_by_vcpu_id() into spapr.c
and rename them appropriately.

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-08 09:30:55 +10:00
Sam Bobroff 81210c2009 ppc: spapr: Rename cpu_dt_id to vcpu_id
This field actually records the VCPU ID used by KVM and, although the
value is also used in the device tree it is primarily the VCPU ID so
rename it as such.

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
[dwg: Updated comment missed in cpu.h]
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-08 09:30:55 +10:00
Sam Bobroff 6d53657019 e500: Use cpu_index instead of vcpu_dt_id
The e500 platform code uses the function ppc_get_vcpu_dt_id() to get
an id to put in its device tree.  Which seems like it makes sense, but
ppc_get_vcpu_dt_id() is actually badly named - it only differs from
cpu_index in cases where you're running on KVM HV and the host's
number of threads differs from the guests.  Since KVM HV only supports
PAPR, not e500, it doesn't make sense to use it here.

Simply use the cpu_index instead (which is 'i' in this context
because qemu_get_cpu(i) returns the cpu with cpu_index == i).

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
[dwg: Rewrote commit message]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-08 09:30:55 +10:00
Greg Kurz e2676b1697 spapr: add pseries-2.11 machine type
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-08 09:30:55 +10:00
Greg Kurz ea359d20e6 spapr_iommu: unregister vmstate at unrealize time
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-08 09:30:54 +10:00
Michael Roth 8dc9785ca0 spapr_iommu: pass object ownership to parent/owner
TCE table objects attach themselves to an owner as a child
property. unref afterward to allow them to be finalized
when their owner is finalized.

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-08 09:30:54 +10:00
Michael Roth f3f4103034 spapr_drc: pass object ownership to parent/owner
DRC objects attach themselves to an owner as a child
property. unref afterward to allow them to be finalized
when their owner is finalized.

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-08 09:30:54 +10:00
Greg Kurz 379ae096f7 spapr_drc: add unrealize method to physical DRC class
When hot-unplugging a PHB, all its PCI DRC connectors get unrealized. This
patch adds an unrealize method to the physical DRC class, in order to undo
registrations performed in realize_physical().

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-08 09:30:54 +10:00
Greg Kurz dba95ebbf8 spapr_pci: parent the MSI memory region to the PHB
This memory region should be owned by the PHB. This ensures the PHB
cannot be finalized as long as the the region is guest visible, or
used by a CPU or a device.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-08 09:30:54 +10:00
Greg Kurz a931ad137a spapr_iommu: convert TCE table object to realize()
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-08 09:30:54 +10:00
Greg Kurz f5babeacc4 spapr_drc: use g_strdup_printf() instead of snprintf()
Passing a stack allocated buffer of arbitrary length to snprintf()
without checking the return value can cause the resultant strings
to be silently truncated.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-08 09:30:54 +10:00
Greg Kurz a205a053dc spapr_iommu: use g_strdup_printf() instead of snprintf()
Passing a stack allocated buffer of arbitrary length to snprintf()
without checking the return value can cause the resultant strings
to be silently truncated.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-08 09:30:54 +10:00
Greg Kurz 5c3d70e970 spapr_pci: use memory_region_add_subregion() with DMA windows
Passing a null priority to memory_region_add_subregion_overlap() is
strictly equivalent to calling memory_region_add_subregion().

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-08 09:30:54 +10:00
Daniel Henrique Barboza 10f12e6450 hw/ppc: CAS reset on early device hotplug
This patch is a follow up on the discussions made in patch
"hw/ppc: disable hotplug before CAS is completed" that can be
found at [1].

At this moment, we do not support CPU/memory hotplug in early
boot stages, before CAS. When a hotplug occurs, the event is logged
in an internal RTAS event log queue and an IRQ pulse is fired. In
regular conditions, the guest handles the interrupt by executing
check_exception, fetching the generated hotplug event and enabling
the device for use.

In early boot, this IRQ isn't caught (SLOF does not handle hotplug
events), leaving the event in the rtas event log queue. If the guest
executes check_exception due to another hotplug event, the re-assertion
of the IRQ ends up de-queuing the first hotplug event as well. In short,
a device hotplugged before CAS is considered coldplugged by SLOF.
This leads to device misbehavior and, in some cases, guest kernel
Ooops when trying to unplug the device.

A proper fix would be to turn every device hotplugged before CAS
as a colplugged device. This is not trivial to do with the current
code base though - the FDT is written in the guest memory at
ppc_spapr_reset and can't be retrieved without adding extra state
(fdt_size for example) that will need to managed and migrated. Adding
the hotplugged DT in the middle of CAS negotiation via the updated DT
tree works with CPU devs, but panics the guest kernel at boot. Additional
analysis would be necessary for LMBs and PCI devices. There are
questions to be made in QEMU/SLOF/kernel level about how we can make
this change in a sustainable way.

With Linux guests, a fix would be the kernel executing check_exception
at boot time, de-queueing the events that happened in early boot and
processing them. However, even if/when the newer kernels start
fetching these events at boot time, we need to take care of older
kernels that won't be doing that.

This patch works around the situation by issuing a CAS reset if a hotplugged
device is detected during CAS:

- the DRC conditions that warrant a CAS reset is the same as those that
triggers a DRC migration - the DRC must have a device attached and
the DRC state is not equal to its ready_state. With that in mind, this
patch makes use of 'spapr_drc_needed' to determine if a CAS reset
is needed.

- In the middle of CAS negotiations, the function
'spapr_hotplugged_dev_before_cas' goes through all the DRCs to see
if there are any DRC that requires a reset, using spapr_drc_needed. If
that happens, returns '1' in 'spapr_h_cas_compose_response' which will set
spapr->cas_reboot to true, causing the machine to reboot.

No changes are made for coldplug devices.

[1] http://lists.nongnu.org/archive/html/qemu-devel/2017-08/msg02855.html

Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-08 09:30:54 +10:00
Daniel Henrique Barboza 5625817423 hw/ppc: clear pending_events on machine reset
The sPAPR machine isn't clearing up the pending events QTAILQ on
machine reboot. This allows for unprocessed hotplug/epow events
to persist in the queue after reset and, when reasserting the IRQs in
check_exception later on, these will be being processed by the OS.

This patch implements a new function called 'spapr_clear_pending_events'
that clears up the pending_events QTAILQ. This helper is then called
inside ppc_spapr_reset to clear up the events queue, preventing
old/deprecated events from persisting after a reset.

Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-08 09:30:54 +10:00
Daniel Henrique Barboza c618e300eb hw/ppc/spapr_drc.c: change spapr_drc_needed to use drc->dev
This patch makes a small fix in 'spapr_drc_needed' to change how we detect
if a DRC has a device attached. Previously it used dr_entity_sense for this,
which  works for physical DRCs.

However, for logical DRCs, it didn't cover the case where a logical DRC has
a drc->dev but the state is LOGICAL_UNUSABLE (e.g. a hotplugged CPU before
CAS). In this case, the dr_entity_sense of this DRC returns UNUSABLE and the
code was considering that there were no dev attached, making spapr_drc_needed
return 'false' when in fact we would like to migrate the DRC.

Changing it to check for drc->dev instead works for all DRC types.

Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-08 09:30:54 +10:00
Igor Mammedov 84efa64c60 ppc: replace cpu_ppc_init() with cpu_generic_init()
it's just a wrapper, drop it and use cpu_generic_init() directly

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Hervé Poussineau <hpoussin@reactos.org>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1503592308-93913-26-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-09-01 11:54:25 -03:00
Thomas Huth 1f98e55385 hw/ppc/spapr_iommu: Fix crash when removing the "spapr-tce-table" device
QEMU currently aborts unexpectedly when the user tries to add and
remove a "spapr-tce-table" device:

$ qemu-system-ppc64 -nographic -S -nodefaults -monitor stdio
QEMU 2.9.92 monitor - type 'help' for more information
(qemu) device_add spapr-tce-table,id=x
(qemu) device_del x
**
ERROR:qemu/qdev-monitor.c:872:qdev_unplug: assertion failed: (hotplug_ctrl)
Aborted (core dumped)

The device should not be accessable for the users at all, it's just
used internally, so mark it with user_creatable = false.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-08-22 21:26:46 +10:00
Thomas Huth 8ccccff9dd hw/ppc/spapr_rtc: Mark the RTC device with user_creatable = false
QEMU currently aborts unexpectedly when a user tries to do something
like this:

$ qemu-system-ppc64 -nographic -S -nodefaults -monitor stdio
QEMU 2.9.92 monitor - type 'help' for more information
(qemu) device_add spapr-rtc,id=spapr-rtc
(qemu) device_del spapr-rtc
**
ERROR:qemu/qdev-monitor.c:872:qdev_unplug: assertion failed: (hotplug_ctrl)
Aborted (core dumped)

The RTC device is not meant to be hot-pluggable - it's an internal
device only and it even should not be possible to create it a
second time with the "-device" parameter, so let's mark this
with "user_creatable = false".

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-08-22 21:26:46 +10:00
Thomas Huth 0479097859 hw/ppc/spapr: Fix segfault when instantiating a 'pc-dimm' without 'memdev'
QEMU currently crashes when trying to use a 'pc-dimm' on the pseries
machine without specifying its 'memdev' property. This happens because
pc_dimm_get_memory_region() does not check whether the 'memdev' property
has properly been set by the user. Looking closer at this function, it's
also obvious that it is using &error_abort to call another function - and
this is bad in a function that is used in the hot-plugging calling chain
since this can also cause QEMU to exit unexpectedly.

So let's fix these issues in a proper way now: Add a "Error **errp"
parameter to pc_dimm_get_memory_region() which we use in case the 'memdev'
property has not been set by the user, and which we can use instead of
the &error_abort, and change the callers of get_memory_region() to make
use of this "errp" parameter for proper error checking.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-08-22 21:26:46 +10:00
Bharata B Rao 188bfe1b00 spapr: Allow configure-connector to be called multiple times
In case of in-kernel memory hot unplug, when the guest is not able
to remove all the LMBs that are requested for removal, it will add back
any LMBs that have been successfully removed. The DR Connectors of
these LMBs wouldn't have been unconfigured and hence the addition of
these LMBs will result in configure-connector call being issued on
LMB DR connectors that are already in configured state. Such
configure-connector calls will fail resulting in a DIMM which is
partially unplugged.

This however worked till recently before we overhauled the DRC
implementation in QEMU. Commit 9d4c0f4f0a71e: "spapr: Consolidate
DRC state variables" is the first commit where this problem shows up
as per git bisect.

Ideally guest shouldn't be issuing configure-connector call on an
already configured DR connector. However for now, work around this in
QEMU by allowing configure-connector to be called multiple times for
all types of DR connectors.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
[dwg: Corrected buglet that would have initialized fdt pointers ready
 for reading on a device not present at reset]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-08-22 21:26:46 +10:00