Commit Graph

1982 Commits

Author SHA1 Message Date
Philippe Mathieu-Daudé
54db89f5bb spapr_events: Rewrite a fall through comment
GCC9 is confused by this comment when building with CFLAG
-Wimplicit-fallthrough=2:

    CC      ppc64-softmmu/hw/ppc/spapr_rtc.o
  hw/ppc/spapr_events.c: In function ‘rtas_event_log_to_source’:
  hw/ppc/spapr_events.c:312:12: error: this statement may fall through [-Werror=implicit-fallthrough=]
    312 |         if (spapr_ovec_test(spapr->ov5_cas, OV5_HP_EVT)) {
        |            ^
  hw/ppc/spapr_events.c:317:5: note: here
    317 |     case RTAS_LOG_TYPE_EPOW:
        |     ^~~~
  cc1: all warnings being treated as errors

Rewrite the comment using 'fall through' which is recognized by
GCC and static analyzers.

Reported-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190719131425.10835-8-philmd@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2019-08-21 10:59:10 +02:00
Greg Kurz
e1588bcdd2 spapr/irq: Drop spapr_irq_msi_reset()
PHBs already take care of clearing the MSIs from the bitmap during reset
or unplug. No need to do this globally from the machine code. Rather add
an assert to ensure that PHBs have acted as expected.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <156415228966.1064338.190189424190233355.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
[dwg: Fix crash in qtest case where spapr->irq_map can be NULL at the
 new assert()]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-21 17:17:39 +10:00
Greg Kurz
ea52074d3a spapr/pci: Free MSIs during reset
When the machine is reset, the MSI bitmap is cleared but the allocated
MSIs are not freed. Some operating systems, such as AIX, can detect the
previous configuration and assert.

Empty the MSI cache, this performs the needed cleanup.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <156415228410.1064338.4486161194061636096.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-21 17:17:39 +10:00
Greg Kurz
078eb6b05b spapr/pci: Consolidate de-allocation of MSIs
When freeing MSIs, we need to:
- remove them from the machine's MSI bitmap
- remove them from the IC backend
- remove them from the PHB's MSI cache

This is currently open coded in two places in rtas_ibm_change_msi(),
and we're about to need this in spapr_phb_reset() as well. Instead of
duplicating this code again, make it a destroy function for the PHB's
MSI cache. Removing an MSI device from the cache will call the destroy
function internally.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <156415227855.1064338.5657793835271464648.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-21 17:17:39 +10:00
Nicholas Piggin
93eac7b8f4 spapr: Implement ibm,suspend-me
This has been useful to modify and test the Linux pseries suspend
code but it requires modification to the guest to call it (due to
being gated by other unimplemented features). It is not otherwise
used by Linux yet, but work is slowly progressing there.

This allows a (lightly modified) guest kernel to suspend with
`echo mem > /sys/power/state` and be resumed with system_wakeup
monitor command.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20190722061752.22114-2-npiggin@gmail.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-21 17:17:39 +10:00
Michael Roth
0fb6bd0732 spapr: initial implementation for H_TPM_COMM/spapr-tpm-proxy
This implements the H_TPM_COMM hypercall, which is used by an
Ultravisor to pass TPM commands directly to the host's TPM device, or
a TPM Resource Manager associated with the device.

This also introduces a new virtual device, spapr-tpm-proxy, which
is used to configure the host TPM path to be used to service
requests sent by H_TPM_COMM hcalls, for example:

  -device spapr-tpm-proxy,id=tpmp0,host-path=/dev/tpmrm0

By default, no spapr-tpm-proxy will be created, and hcalls will return
H_FUNCTION.

The full specification for this hypercall can be found in
docs/specs/ppc-spapr-uv-hcalls.txt

Since SVM-related hcalls like H_TPM_COMM use a reserved range of
0xEF00-0xEF80, we introduce a separate hcall table here to handle
them.

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com
Message-Id: <20190717205842.17827-3-mdroth@linux.vnet.ibm.com>
[dwg: Corrected #include for upstream change]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-21 17:17:12 +10:00
Nicholas Piggin
107413142b spapr: Implement H_JOIN
This has been useful to modify and test the Linux pseries suspend
code but it requires modification to the guest to call it (due to
being gated by other unimplemented features). It is not otherwise
used by Linux yet, but work is slowly progressing there.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20190718034214.14948-5-npiggin@gmail.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-21 17:17:12 +10:00
Nicholas Piggin
e8ce0e40ee spapr: Implement H_CONFER
This does not do directed yielding and is not quite as strict as PAPR
specifies in terms of precise dispatch behaviour. This generally will
mean suboptimal performance, rather than guest misbehaviour. Linux
does not rely on exact dispatch behaviour.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20190718034214.14948-4-npiggin@gmail.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-21 17:17:12 +10:00
Nicholas Piggin
3a6e6224a9 spapr: Implement H_PROD
H_PROD is added, and H_CEDE is modified to test the prod bit
according to PAPR.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20190718034214.14948-3-npiggin@gmail.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-21 17:17:12 +10:00
Nicholas Piggin
03ef074c04 spapr: Implement dispatch tracking for tcg
Implement cpu_exec_enter/exit on ppc which calls into new methods of
the same name in PPCVirtualHypervisorClass. These are used by spapr
to implement the splpar VPA dispatch counter initially.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20190718034214.14948-2-npiggin@gmail.com>
[dwg: Removed unnecessary CONFIG_USER_ONLY checks as suggested by gkurz]
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-21 17:17:11 +10:00
Shivaprasad G Bhat
00005f2229 ppc: fix leak in h_client_architecture_support
Free all SpaprOptionVector local pointers after use.

Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Message-Id: <156335160761.82682.11912058325777251614.stgit@lep8c.aus.stglabs.ibm.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-21 17:17:11 +10:00
Shivaprasad G Bhat
dbd26f2f7f ppc: fix memory leak in spapr_dt_drc()
Leaking the drc_name while preparing the DT properties.
Fixing that.

Also, remove the const qualifier from spapr_drc_name().

Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Message-Id: <156335159028.82682.5404622104535818162.stgit@lep8c.aus.stglabs.ibm.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-21 17:17:11 +10:00
Shivaprasad G Bhat
d758880586 ppc: fix memory leak in spapr_caps_add_properties
Free the capability name string after setting
the capability.

Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Message-Id: <156335156198.82682.8756968724044750843.stgit@lep8c.aus.stglabs.ibm.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-21 17:17:11 +10:00
Maxiwell S. Garcia
d14f339762 migration: Do not re-read the clock on pre_save in case of paused guest
Re-read the timebase before migrate was ported from x86 commit:
   6053a86fe7: kvmclock: reduce kvmclock difference on migration

The clock move makes the guest knows about the paused time between
the stop and migrate commands. This is an issue in an already-paused
VM because some side effects, like process stalls, could happen
after migration.

So, this patch checks the runstate of guest in the pre_save handler and
do not re-reads the timebase in case of paused state (cold migration).

Signed-off-by: Maxiwell S. Garcia <maxiwell@linux.ibm.com>
Message-Id: <20190711194702.26598-1-maxiwell@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-21 17:17:11 +10:00
David Gibson
d15d4ad64f spapr_pci: Allow 2MiB and 16MiB IOMMU pagesizes by default
We've had the qemu and kernel KVM infrastructure to handle larger TCE
page sizes for a while, but forgot to update the defaults to actually
allow them.  This turns that change on.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-21 17:16:22 +10:00
Cornelia Huck
9aec2e52ce hw: add compat machines for 4.2
Add 4.2 machine types for arm/i440fx/q35/s390x/spapr.

For i440fx and q35, unversioned cpu models are still translated
to -v1, as 0788a56bd1 ("i386: Make unversioned CPU models be
aliases") states this should only transition to the latest cpu
model version in 4.3 (or later).

Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20190724103524.20916-1-cohuck@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-21 11:32:11 +10:00
Alexey Kardashevskiy
a14f04ebba spapr_iommu: Fix xlate trace to print translated address
Currently we basically print IO address twice, fix this.

Fixes: 7e472264e9 ("PPC: spapr: iommu: rework traces")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20190812054202.125492-1-aik@ozlabs.ru>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-21 11:32:11 +10:00
Daniel Black
f92be77fea spapr: quantify error messages regarding capability settings
Its not immediately obvious how cap-X=Y setting need to be applied
to the command line so, for spapr capability error messages, this
has been clarified to:

 appending -machine cap-X=Y

The wrong value messages have been left as is, as the user has found
the right location.

Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Daniel Black <daniel@linux.ibm.com>
Message-Id: <20190812071044.30806-1-daniel@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-21 11:32:11 +10:00
Markus Armbruster
54d31236b9 sysemu: Split sysemu/runstate.h off sysemu/sysemu.h
sysemu/sysemu.h is a rather unfocused dumping ground for stuff related
to the system-emulator.  Evidence:

* It's included widely: in my "build everything" tree, changing
  sysemu/sysemu.h still triggers a recompile of some 1100 out of 6600
  objects (not counting tests and objects that don't depend on
  qemu/osdep.h, down from 5400 due to the previous two commits).

* It pulls in more than a dozen additional headers.

Split stuff related to run state management into its own header
sysemu/runstate.h.

Touching sysemu/sysemu.h now recompiles some 850 objects.  qemu/uuid.h
also drops from 1100 to 850, and qapi/qapi-types-run-state.h from 4400
to 4200.  Touching new sysemu/runstate.h recompiles some 500 objects.

Since I'm touching MAINTAINERS to add sysemu/runstate.h anyway, also
add qemu/main-loop.h.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190812052359.30071-30-armbru@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
[Unbreak OS-X build]
2019-08-16 13:37:36 +02:00
Markus Armbruster
d5938f29fe Clean up inclusion of sysemu/sysemu.h
In my "build everything" tree, changing sysemu/sysemu.h triggers a
recompile of some 5400 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h).

Almost a third of its inclusions are actually superfluous.  Delete
them.  Downgrade two more to qapi/qapi-types-run-state.h, and move one
from char/serial.h to char/serial.c.

hw/semihosting/config.c, monitor/monitor.c, qdev-monitor.c, and
stubs/semihost.c define variables declared in sysemu/sysemu.h without
including it.  The compiler is cool with that, but include it anyway.

This doesn't reduce actual use much, as it's still included into
widely included headers.  The next commit will tackle that.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190812052359.30071-27-armbru@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
2019-08-16 13:31:53 +02:00
Markus Armbruster
b58c5c2dd2 numa: Move remaining NUMA declarations from sysemu.h to numa.h
Commit e35704ba9c "numa: Move NUMA declarations from sysemu.h to
numa.h" left a few NUMA-related macros behind.  Move them now.

Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190812052359.30071-26-armbru@redhat.com>
2019-08-16 13:31:53 +02:00
Markus Armbruster
12e9493df9 Include hw/boards.h a bit less
hw/boards.h pulls in almost 60 headers.  The less we include it into
headers, the better.  As a first step, drop superfluous inclusions,
and downgrade some more to what's actually needed.  Gets rid of just
one inclusion into a header.

Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190812052359.30071-23-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
2019-08-16 13:31:53 +02:00
Markus Armbruster
a27bd6c779 Include hw/qdev-properties.h less
In my "build everything" tree, changing hw/qdev-properties.h triggers
a recompile of some 2700 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h).

Many places including hw/qdev-properties.h (directly or via hw/qdev.h)
actually need only hw/qdev-core.h.  Include hw/qdev-core.h there
instead.

hw/qdev.h is actually pointless: all it does is include hw/qdev-core.h
and hw/qdev-properties.h, which in turn includes hw/qdev-core.h.
Replace the remaining uses of hw/qdev.h by hw/qdev-properties.h.

While there, delete a few superfluous inclusions of hw/qdev-core.h.

Touching hw/qdev-properties.h now recompiles some 1200 objects.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Daniel P. Berrangé" <berrange@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190812052359.30071-22-armbru@redhat.com>
2019-08-16 13:31:53 +02:00
Markus Armbruster
db72581598 Include qemu/main-loop.h less
In my "build everything" tree, changing qemu/main-loop.h triggers a
recompile of some 5600 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h).  It includes block/aio.h,
which in turn includes qemu/event_notifier.h, qemu/notify.h,
qemu/processor.h, qemu/qsp.h, qemu/queue.h, qemu/thread-posix.h,
qemu/thread.h, qemu/timer.h, and a few more.

Include qemu/main-loop.h only where it's needed.  Touching it now
recompiles only some 1700 objects.  For block/aio.h and
qemu/event_notifier.h, these numbers drop from 5600 to 2800.  For the
others, they shrink only slightly.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190812052359.30071-21-armbru@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-08-16 13:31:52 +02:00
Markus Armbruster
650d103d3e Include hw/hw.h exactly where needed
In my "build everything" tree, changing hw/hw.h triggers a recompile
of some 2600 out of 6600 objects (not counting tests and objects that
don't depend on qemu/osdep.h).

The previous commits have left only the declaration of hw_error() in
hw/hw.h.  This permits dropping most of its inclusions.  Touching it
now recompiles less than 200 objects.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190812052359.30071-19-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-08-16 13:31:52 +02:00
Markus Armbruster
d645427057 Include migration/vmstate.h less
In my "build everything" tree, changing migration/vmstate.h triggers a
recompile of some 2700 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h).

hw/hw.h supposedly includes it for convenience.  Several other headers
include it just to get VMStateDescription.  The previous commit made
that unnecessary.

Include migration/vmstate.h only where it's still needed.  Touching it
now recompiles only some 1600 objects.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190812052359.30071-16-armbru@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-08-16 13:31:52 +02:00
Markus Armbruster
64552b6be4 Include hw/irq.h a lot less
In my "build everything" tree, changing hw/irq.h triggers a recompile
of some 5400 out of 6600 objects (not counting tests and objects that
don't depend on qemu/osdep.h).

hw/hw.h supposedly includes it for convenience.  Several other headers
include it just to get qemu_irq and.or qemu_irq_handler.

Move the qemu_irq and qemu_irq_handler typedefs from hw/irq.h to
qemu/typedefs.h, and then include hw/irq.h only where it's still
needed.  Touching it now recompiles only some 500 objects.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190812052359.30071-13-armbru@redhat.com>
2019-08-16 13:31:52 +02:00
Markus Armbruster
5a720b1ed5 ide: Include hw/ide/internal a bit less outside hw/ide/
According to hw/ide/internal's file comment, only files in hw/ide/ are
supposed to include it.  Drag reality slightly closer to supposition.

Three includes outside hw/ide remain: hw/arm/sbsa-ref.c,
include/hw/ide/pci.h, and include/hw/misc/macio/macio.h.  Turns out
board code needs ide-internal.h to wire up IDE stuff.  More cleanup is
needed.  Left for another day.

Cc: John Snow <jsnow@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190812052359.30071-11-armbru@redhat.com>
2019-08-16 13:31:52 +02:00
Markus Armbruster
ca77ee28e0 Include migration/qemu-file-types.h a lot less
In my "build everything" tree, changing migration/qemu-file-types.h
triggers a recompile of some 2600 out of 6600 objects (not counting
tests and objects that don't depend on qemu/osdep.h).

The culprit is again hw/hw.h, which supposedly includes it for
convenience.

Include migration/qemu-file-types.h only where it's needed.  Touching
it now recompiles less than 200 objects.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190812052359.30071-10-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-08-16 13:31:52 +02:00
Markus Armbruster
71e8a91585 Include sysemu/reset.h a lot less
In my "build everything" tree, changing sysemu/reset.h triggers a
recompile of some 2600 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h).

The main culprit is hw/hw.h, which supposedly includes it for
convenience.

Include sysemu/reset.h only where it's needed.  Touching it now
recompiles less than 200 objects.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190812052359.30071-9-armbru@redhat.com>
2019-08-16 13:31:52 +02:00
David Gibson
25c9780d38 spapr: Reset CAS & IRQ subsystem after devices
This fixes a nasty regression in qemu-4.1 for the 'pseries' machine,
caused by the new "dual" interrupt controller model.  Specifically,
qemu can crash when used with KVM if a 'system_reset' is requested
while there's active I/O in the guest.

The problem is that in spapr_machine_reset() we:

1. Reset the CAS vector state
	spapr_ovec_cleanup(spapr->ov5_cas);

2. Reset all devices
	qemu_devices_reset()

3. Reset the irq subsystem
	spapr_irq_reset();

However (1) implicitly changes the interrupt delivery mode, because
whether we're using XICS or XIVE depends on the CAS state.  We don't
properly initialize the new irq mode until (3) though - in particular
setting up the KVM devices.

During (2), we can temporarily drop the BQL allowing some irqs to be
delivered which will go to an irq system that's not properly set up.

Specifically, if the previous guest was in (KVM) XIVE mode, the CAS
reset will put us back in XICS mode.  kvm_kernel_irqchip() still
returns true, because XIVE was using KVM, however XICs doesn't have
its KVM components intialized and kernel_xics_fd == -1.  When the irq
is delivered it goes via ics_kvm_set_irq() which assert()s that
kernel_xics_fd != -1.

This change addresses the problem by delaying the CAS reset until
after the devices reset.  The device reset should quiesce all the
devices so we won't get irqs delivered while we mess around with the
IRQ.  The CAS reset and irq re-initialize should also now be under the
same BQL critical section so nothing else should be able to interrupt
it either.

We also move the spapr_irq_msi_reset() used in one of the legacy irq
modes, since it logically makes sense at the same point as the
spapr_irq_reset() (it's essentially an equivalent operation for older
machine types).  Since we don't need to switch between different
interrupt controllers for those old machine types it shouldn't
actually be broken in those cases though.

Cc: Cédric Le Goater <clg@kaod.org>

Fixes: b2e22477 "spapr: add a 'reset' method to the sPAPR IRQ backend"
Fixes: 13db0cd9 "spapr: introduce a new sPAPR IRQ backend supporting
                 XIVE and XICS"
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-13 15:59:21 +10:00
Greg Kurz
f5bda01066 spapr/irq: Inform the user when falling back to emulated IC
Just to give an indication to the user that the error condition is
handled and how.

Reported-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <156398743479.546975.14566809803480887488.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-28 11:50:26 +10:00
Igor Mammedov
cd5ff8333a machine: show if CLI option '-numa node,mem' is supported in QAPI schema
Legacy '-numa node,mem' option has a number of issues and mgmt often
defaults to it. Unfortunately it's no possible to replace it with
an alternative '-numa memdev' without breaking migration compatibility.
What's possible though is to deprecate it, keeping option working with
old machine types only.

In order to help users to find out if being deprecated CLI option
'-numa node,mem' is still supported by particular machine type, add new
"numa-mem-supported" property to output of query-machines.

"numa-mem-supported" is set to 'true' for machines that currently support
NUMA, but it will be flipped to 'false' later on, once deprecation period
expires and kept 'true' only for old machine types that used to support
the legacy option so it won't break existing configuration that are using
it.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1560172207-378962-1-git-send-email-imammedo@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-07-05 17:08:03 -03:00
Like Xu
fe6b6346e9 hw/ppc: Replace global smp variables with machine smp properties
The global smp variables in ppc are replaced with smp machine properties.

A local variable of the same name would be introduced in the declaration
phase if it's used widely in the context OR replace it on the spot if it's
only used once. No semantic changes.

Signed-off-by: Like Xu <like.xu@linux.intel.com>
Message-Id: <20190518205428.90532-5-like.xu@linux.intel.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-07-05 17:07:36 -03:00
Like Xu
a0628599fa machine: Refactor smp-related call chains to pass MachineState
To get rid of the global smp_* variables we're currently using, it's recommended
to pass MachineState in the list of incoming parameters for functions that use
global smp variables, thus some redundant parameters are dropped. It's applied
for legacy smbios_*(), *_machine_reset(), hot_add_cpu() and mips *_create_cpu().

Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190518205428.90532-3-like.xu@linux.intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-07-05 17:07:36 -03:00
Peter Maydell
374f63f681 Monitor patches for 2019-07-02
-----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEENUvIs9frKmtoZ05fOHC0AOuRhlMFAl0bQhUSHGFybWJydUBy
 ZWRoYXQuY29tAAoJEDhwtADrkYZTOgoP/3v1ZAg4ulTrUx/KO8C80sA3qqaPgkbP
 s8DFUwtjWcDrefGejIep4C0dxhY5vD1oNg9JeK+6O4IInijlg30kildBj85nPa5J
 Z55hZFIGWW1CSMzeSlOMWH1QdYdGPXkGRe8ApXPqRH4VpsdulC+vErQl1YrleNtv
 B8K8402hMOKL+TsheBpdnbM+1hXRj8zBGfobiY/9eLex30uaNDVOd3bIpx0M63fr
 kcwOOPKQeUTLPbUvI6mVQtTkNFCzk6Wmi5vMyT3bSe2ZMLNnEFQZXabcgSBverTK
 9ar5MxTMHIplstWVQEceXN3BLVlIsmunUsuCSHqmX6tdX37EKiJXZImiz0i98bnq
 5SFNAHntr3JDMdqqZJX+v1DvmGbPfv/H5poWk+wQfFBkjDykExEd77v9kuOc4aVZ
 HkEYNbAPVTjfm5xXxn8yXdY++tVsQKV4q2T4OX9WacMu5sJujDm9qIlVaE2A3Cdc
 ePM4tNrHJ0MNDHn2CG/wPEaLelfylLlL/Aai/WQe/YPVrVOHroT4zvwVv6+QJB2k
 MWqmRzGEOYDevPs8PizPetEHiirTHyrIufuleFJglBVSNi5V2LpG5d97Pal0Dn2k
 1ZzSnonXGnhx7VoaqbxhAEj8vAI42gJJ3Q/f6VW2q2rBAv4/oc9jeQVx1SYGFKCu
 QbNALrVpyv+5
 =jtpi
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/armbru/tags/pull-monitor-2019-07-02-v2' into staging

Monitor patches for 2019-07-02

# gpg: Signature made Tue 02 Jul 2019 12:37:57 BST
# gpg:                using RSA key 354BC8B3D7EB2A6B68674E5F3870B400EB918653
# gpg:                issuer "armbru@redhat.com"
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [full]
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>" [full]
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* remotes/armbru/tags/pull-monitor-2019-07-02-v2:
  dump: Move HMP command handlers to dump/
  MAINTAINERS: Add Windows dump to section "Dump"
  dump: Move the code to dump/
  qapi: Split dump.json off misc.json
  qapi: Rename target.json to misc-target.json
  qapi: Split machine-target.json off target.json and misc.json
  hw/core: Collect HMP command handlers in hw/core/
  hw/core: Collect QMP command handlers in hw/core/
  hw/core: Move numa.c to hw/core/
  qapi: Split machine.json off misc.json
  MAINTAINERS: Merge sections CPU, NUMA into Machine core
  qom: Move HMP command handlers to qom/
  qom: Move QMP command handlers to qom/
  qapi: Split qom.json and qdev.json off misc.json
  hmp: Move hmp.h to include/monitor/
  Makefile: Don't add monitor/ twice to common-obj-y
  MAINTAINERS: Make section "QOM" cover qdev as well
  MAINTAINERS: new maintainers for QOM

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-07-03 00:16:43 +01:00
Markus Armbruster
b0227cdb00 qapi: Rename target.json to misc-target.json
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190619201050.19040-14-armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2019-07-02 13:37:00 +02:00
Cédric Le Goater
d0e9bc0407 spapr/xive: simplify spapr_irq_init_device() to remove the emulated init
The init_emu() handles are now empty. Remove them and rename
spapr_irq_init_device() to spapr_irq_init_kvm().

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190614165920.12670-3-clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Cédric Le Goater
981b1c6266 spapr/xive: rework the mapping the KVM memory regions
Today, the interrupt device is fully initialized at reset when the CAS
negotiation process has completed. Depending on the KVM capabilities,
the SpaprXive memory regions (ESB, TIMA) are initialized with a host
MMIO backend or a QEMU emulated backend. This results in a complex
initialization sequence partially done at realize and later at reset,
and some memory region leaks.

To simplify this sequence and to remove of the late initialization of
the emulated device which is required to be done only once, we
introduce new memory regions specific for KVM. These regions are
mapped as overlaps on top of the emulated device to make use of the
host MMIOs. Also provide proper cleanups of these regions when the
XIVE KVM device is destroyed to fix the leaks.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190614165920.12670-2-clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Greg Kurz
a2166410ad spapr_pci: Unregister listeners before destroying the IOMMU address space
Hot-unplugging a PHB with a VFIO device connected to it crashes QEMU:

-device spapr-pci-host-bridge,index=1,id=phb1 \
-device vfio-pci,host=0034:01:00.3,id=vfio0

(qemu) device_del phb1
[  357.207183] iommu: Removing device 0001:00:00.0 from group 1
[  360.375523] rpadlpar_io: slot PHB 1 removed
qemu-system-ppc64: memory.c:2742:
 do_address_space_destroy: Assertion `QTAILQ_EMPTY(&as->listeners)' failed.

'as' is the IOMMU address space, which indeed has a listener registered
to by vfio_connect_container() when the VFIO device is realized. This
listener is supposed to be unregistered by vfio_disconnect_container()
when the VFIO device is finalized. Unfortunately, the VFIO device hasn't
reached finalize yet at the time the PHB unrealize function is called,
and address_space_destroy() gets called with the VFIO listener still
being registered.

All regions have just been unmapped from the address space. Listeners
aren't needed anymore at this point. Remove them before destroying the
address space.

The VFIO code will try to remove them _again_ at device finalize,
but it is okay since memory_listener_unregister() is idempotent.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <156110925375.92514.11649846071216864570.stgit@bahia.lan>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[dwg: Correct spelling error pointed out by aik]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Greg Kurz
9723295a72 ppc: Introduce kvmppc_set_reg_tb_offset() helper
Introduce a KVM helper and its stub instead of guarding the code with
CONFIG_KVM.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <156051055736.224162.11641594431517798715.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Greg Kurz
eab9f191a0 xics/spapr: Rename xics_kvm_init()
Switch to using the connect/disconnect terminology like we already do for
XIVE.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <156077920102.433243.6605099291134598170.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Greg Kurz
363ce377da hw/ppc: Drop useless CONFIG_KVM ifdefery
kvmppc_set_interrupt() has a stub that does nothing when CONFIG_KVM is
not defined.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <156051055182.224162.15842560287892241124.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Greg Kurz
6d893a4d70 hw/ppc/prep: Drop useless CONFIG_KVM ifdefery
kvm_enabled() expands to (0) when CONFIG_KVM is not defined. It is
likely that the compiler will optimize the code out. And even if
it doesn't, we have a stub for kvmppc_get_hypercall().

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <156051054630.224162.6140707722034383410.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Greg Kurz
da6e10177a hw/ppc/mac_newworld: Drop useless CONFIG_KVM ifdefery
kvm_enabled() expands to (0) when CONFIG_KVM is not defined. The first
CONFIG_KVM guard is thus useless and it is likely that the compiler
will optimize the code out in the case of the second guard. And even
if it doesn't, we have a stub for kvmppc_get_hypercall().

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <156051054077.224162.9332715375637801197.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Greg Kurz
7a660e776e hw/ppc/mac_oldworld: Drop useless CONFIG_KVM ifdefery
kvm_enabled() expands to (0) when CONFIG_KVM is not defined. It is
likely that the compiler will optimize the code out. And even if
it doesn't, we have a stub for kvmppc_get_hypercall().

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <156051053529.224162.3489943067148134636.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Greg Kurz
8d08fa93bb spapr_pci: Drop useless CONFIG_KVM ifdefery
kvm_enabled() expands to (0) when CONFIG_KVM is not defined.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <156051052977.224162.17306829691809502082.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Greg Kurz
7e10b57dd9 spapr_pci: Fix DRC owner in spapr_dt_pci_bus()
spapr_dt_drc() scans the aliases of all DRConnector objects and filters
the ones that it will use to generate OF properties according to their
owner and type.

Passing bus->parent_dev _works_ if bus belongs to a PCI bridge, but it is
NULL if it is the PHB's root bus. This causes all allocated PCI DRCs to
be associated to all PHBs (visible in their "ibm,drc-types" properties).
As a consequence, hot unplugging a PHB results in PCI devices from the
other PHBs to be unplugged as well, and likely confuses the guest.

Use the same logic as in add_drcs() to ensure the correct owner is passed
to spapr_dt_drc().

Fixes: 14e714900f "spapr: Allow hot plug/unplug of PCI bridges and devices under PCI bridges"
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <156084737348.512412.3552825999605902691.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Greg Kurz
7abc0c6d35 xics/spapr: Detect old KVM XICS on POWER9 hosts
Older KVMs on POWER9 don't support destroying/recreating a KVM XICS
device, which is required by 'dual' interrupt controller mode. This
causes QEMU to emit a warning when the guest is rebooted and to fall
back on XICS emulation:

qemu-system-ppc64: warning: kernel_irqchip allowed but unavailable:
 Error on KVM_CREATE_DEVICE for XICS: File exists

If kernel irqchip is required, QEMU will thus exit when the guest is
first rebooted. Failing QEMU this late may be a painful experience
for the user.

Detect that and exit at machine init instead.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <156044430517.125694.6207865998817342638.stgit@bahia.lab.toulouse-stg.fr.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Greg Kurz
d9293c4843 xics/spapr: Register RTAS/hypercalls once at machine init
QEMU may crash when running a spapr machine in 'dual' interrupt controller
mode on some older (but not that old, eg. ubuntu 18.04.2) KVMs with partial
XIVE support:

qemu-system-ppc64: hw/ppc/spapr_rtas.c:411: spapr_rtas_register:
 Assertion `!name || !rtas_table[token].name' failed.

XICS is controlled by the guest thanks to a set of RTAS calls. Depending
on whether KVM XICS is used or not, the RTAS calls are handled by KVM or
QEMU. In both cases, QEMU needs to expose the RTAS calls to the guest
through the "rtas" node of the device tree.

The spapr_rtas_register() helper takes care of all of that: it adds the
RTAS call token to the "rtas" node and registers a QEMU callback to be
invoked when the guest issues the RTAS call. In the KVM XICS case, QEMU
registers a dummy callback that just prints an error since it isn't
supposed to be invoked, ever.

Historically, the XICS controller was setup during machine init and
released during final teardown. This changed when the 'dual' interrupt
controller mode was added to the spapr machine: in this case we need
to tear the XICS down and set it up again during machine reset. The
crash happens because we indeed have an incompatibility with older
KVMs that forces QEMU to fallback on emulated XICS, which tries to
re-registers the same RTAS calls.

This could be fixed by adding proper rollback that would unregister
RTAS calls on error. But since the emulated RTAS calls in QEMU can
now detect when they are mistakenly called while KVM XICS is in
use, it seems simpler to register them once and for all at machine
init. This fixes the crash and allows to remove some now useless
lines of code.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <156044429963.125694.13710679451927268758.stgit@bahia.lab.toulouse-stg.fr.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Philippe Mathieu-Daudé
740a19313b spapr_pci: Fix potential NULL pointer dereference in spapr_dt_pci_bus()
Commit 14e714900f refactored the call to spapr_dt_drc(),
introducing a potential NULL pointer dereference while
accessing bus->parent_dev.
A trivial audit show 'bus' is not null in the two places
the static function spapr_dt_drc() is called.

Since the 'bus' parameter is not NULL in both callers, remove
remove the test on if (bus), and add an assert() to silent
static analyzers.

This fixes:

  /hw/ppc/spapr_pci.c: 1367 in spapr_dt_pci_bus()
  >>>     CID 1401933:  Null pointer dereferences  (FORWARD_NULL)
  >>>     Dereferencing null pointer "bus".
  1367         ret = spapr_dt_drc(fdt, offset, OBJECT(bus->parent_dev),
  1368                            SPAPR_DR_CONNECTOR_TYPE_PCI);

Fixes: 14e714900f
Reported-by: Coverity (CID 1401933)
Suggested-by: Greg Kurz <groug@kaod.org>
Suggested-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190613213406.22053-1-philmd@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Cédric Le Goater
c29a0b0fb3 ppc/pnv: remove xscom_base field from PnvChip
It has now became useless with the previous patch.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190612174345.9799-3-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Cédric Le Goater
709044fd2d ppc/pnv: fix XSCOM MMIO base address for P9 machines with multiple chips
The PNV_XSCOM_BASE and PNV_XSCOM_SIZE macros are specific to POWER8
and they are used when the device tree is populated and the MMIO
region created, even for POWER9 chips. This is not too much of a
problem today because we don't have important devices on the second
chip, but we might have oneday (PHBs).

Fix by using the appropriate macros in case of P9.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190612174345.9799-2-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Peter Maydell
a050901d4b ppc patch queue 2019-06-12
Next pull request against qemu-4.1.  The big thing here is adding
 support for hot plug of P2P bridges, and PCI devices under P2P bridges
 on the "pseries" machine (which doesn't use SHPC).  Other than that
 there's just a handful of fixes and small enhancements.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEdfRlhq5hpmzETofcbDjKyiDZs5IFAl0AkgwACgkQbDjKyiDZ
 s5Jyug//cwxP+t1t2CNHtffKwiXFzuEKx9YSNE1V0wog6aB40EbPKU72FzCq6FfA
 lev+pZWV9AwVMzFYe4VM/7Lqh7WFMYDT3DOXaZwfANs4471vYtgvPi21L2TBj80d
 hMszlyLWMLY9ByOzCxIq3xnbivGpA94G2q9rKbwXdK4T/5i62Pe3SIfgG+gXiiwW
 +YlHWCPX0I1cJz2bBs9ElXdl7ONWnn+7uDf7gNfWkTKuiUq6Ps7mxzy3GhJ1T7nz
 OFKmQ5dKzLJsgOULSSun8kWpXBmnPffkM3+fCE07edrWZVor09fMCk4HvtfaRy2K
 FFa2Kvzn/V/70TL+44dsSX4QcwdcHQztiaMO7UGPq9CMswx5L7gsNmfX6zvK1Nrb
 1t7ORZKNJ72hMyvDPSMiGU2DpVjO3ZbBlSL4/xG8Qeal4An0kgkN5NcFlB/XEfnz
 dsKu9XzuGSeD1bWz1Mgcf1x7lPDBoHIKLcX6notZ8epP/otu4ywNFvAkPu4fk8s0
 4jQGajIT7328SmzpjXClsmiEskpKsEr7hQjPRhu0hFGrhVc+i9PjkmbDl0TYRAf6
 N6k6gJQAi+StJde2rcua1iS7Ra+Tka6QRKy+EctLqfqOKPb2VmkZ6fswQ3nfRRlT
 LgcTHt2iJcLeud2klVXs1e4pKXzXchkVyFL4ucvmyYG5VeimMzU=
 =ERgu
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-4.1-20190612' into staging

ppc patch queue 2019-06-12

Next pull request against qemu-4.1.  The big thing here is adding
support for hot plug of P2P bridges, and PCI devices under P2P bridges
on the "pseries" machine (which doesn't use SHPC).  Other than that
there's just a handful of fixes and small enhancements.

# gpg: Signature made Wed 12 Jun 2019 06:47:56 BST
# gpg:                using RSA key 75F46586AE61A66CC44E87DC6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" [full]
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>" [full]
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" [full]
# gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>" [unknown]
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-4.1-20190612:
  ppc/xive: Make XIVE generate the proper interrupt types
  ppc/pnv: activate the "dumpdtb" option on the powernv machine
  target/ppc: Use tcg_gen_gvec_bitsel
  spapr: Allow hot plug/unplug of PCI bridges and devices under PCI bridges
  spapr: Direct all PCI hotplug to host bridge, rather than P2P bridge
  spapr: Don't use bus number for building DRC ids
  spapr: Clean up DRC index construction
  spapr: Clean up spapr_drc_populate_dt()
  spapr: Clean up dt creation for PCI buses
  spapr: Clean up device tree construction for PCI devices
  spapr: Clean up device node name generation for PCI devices
  target/ppc: Fix lxvw4x, lxvh8x and lxvb16x
  spapr_pci: Improve error message

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-06-12 14:43:47 +01:00
Markus Armbruster
a8d2532645 Include qemu-common.h exactly where needed
No header includes qemu-common.h after this commit, as prescribed by
qemu-common.h's file comment.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190523143508.25387-5-armbru@redhat.com>
[Rebased with conflicts resolved automatically, except for
include/hw/arm/xlnx-zynqmp.h hw/arm/nrf51_soc.c hw/arm/msf2-soc.c
block/qcow2-refcount.c block/qcow2-cluster.c block/qcow2-cache.c
target/arm/cpu.h target/lm32/cpu.h target/m68k/cpu.h target/mips/cpu.h
target/moxie/cpu.h target/nios2/cpu.h target/openrisc/cpu.h
target/riscv/cpu.h target/tilegx/cpu.h target/tricore/cpu.h
target/unicore32/cpu.h target/xtensa/cpu.h; bsd-user/main.c and
net/tap-bsd.c fixed up]
2019-06-12 13:20:20 +02:00
Markus Armbruster
0b8fa32f55 Include qemu/module.h where needed, drop it from qemu-common.h
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190523143508.25387-4-armbru@redhat.com>
[Rebased with conflicts resolved automatically, except for
hw/usb/dev-hub.c hw/misc/exynos4210_rng.c hw/misc/bcm2835_rng.c
hw/misc/aspeed_scu.c hw/display/virtio-vga.c hw/arm/stm32f205_soc.c;
ui/cocoa.m fixed up]
2019-06-12 13:18:33 +02:00
Cédric Le Goater
8d40926141 ppc/pnv: activate the "dumpdtb" option on the powernv machine
This is a good way to debug the DT creation for current PowerNV
machines and new ones to come.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190606174732.13051-1-clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-06-12 10:41:50 +10:00
David Gibson
14e714900f spapr: Allow hot plug/unplug of PCI bridges and devices under PCI bridges
The pseries machine type already allows PCI hotplug and unplug via the
PAPR mechanism, but only on the root bus of each PHB.  This patch extends
this to allow PCI to PCI bridges to be hotplugged, and devices to be
hotplugged or unplugged under P2P bridges.

For now we disallow hot unplugging P2P bridges.  I tried doing that, but
haven't managed to get it working, I think due to some guest side problems
that need further investigation.

To do this we dynamically construct DRCs when bridges are hot (or cold)
added, which can in turn be used to hotplug devices under the bridge.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
2019-06-12 10:41:49 +10:00
David Gibson
cb60008706 spapr: Direct all PCI hotplug to host bridge, rather than P2P bridge
A P2P bridge will attempt to handle the hotplug with SHPC, which doesn't
work in the PAPR environment.  Instead we want to direct all PCI hotplug
actions to the PAPR specific host bridge which will use the PAPR hotplug
mechanism.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
2019-06-12 10:41:49 +10:00
David Gibson
05929a6c5d spapr: Don't use bus number for building DRC ids
DRC ids are more or less arbitrary, as long as they're consistent.  For
PCI, we notionally build them from the phb's index along with PCI bus
number, slot and function number.

Using bus number is broken, however, because it can change if the guest
re-enumerates the PCI topology for whatever reason (e.g. due to hotplug
of a bridge, which we don't support yet but want to).

Fortunately, there's an alternative.  Bridges are required to have a unique
non-zero "chassis number" that we can use instead.  Adjust the code to
use that instead.

This looks like it would introduce a guest visible breaking change, but
in fact it does not because we don't yet ever use non-zero bus numbers.
Both chassis and bus number are always 0 for the root bus, so there's no
change for the existing cases.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
2019-06-12 10:41:49 +10:00
David Gibson
a1ec25b287 spapr: Clean up DRC index construction
spapr_pci.c currently has several confusingly similarly named functions for
various conversions between representations of DRCs.  Make things clearer
by renaming things in a more consistent XXX_from_YYY() manner and remove
some called-only-once variants in favour of open coding.

While we're at it, move this code together in the file to avoid some extra
forward references, and split out construction and removal of DRCs for the
host bridge into helper functions.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
2019-06-12 10:41:49 +10:00
David Gibson
9e7d38e8a3 spapr: Clean up spapr_drc_populate_dt()
This makes some minor cleanups to spapr_drc_populate_dt(), renaming it to
the shorter and more idiomatic spapr_dt_drc() along the way.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
2019-06-12 10:41:49 +10:00
David Gibson
466e883185 spapr: Clean up dt creation for PCI buses
Device nodes for PCI bridges (both host and P2P) describe both the bridge
device itself and the bus hanging off it, handling of this is a bit of a
mess.

spapr_dt_pci_device() has a few things it only adds for non-bridges, but
always adds #address-cells and #size-cells which should only appear for
bridges.  But the walking down the subordinate PCI bus is done in one of
its callers spapr_populate_pci_devices_dt().  The PHB dt creation in
spapr_populate_pci_dt() open codes some similar logic to the bridge case.

This patch consolidates things in a bunch of ways:
 * Bus specific dt info is now created in spapr_dt_pci_bus() used for both
   P2P bridges and the host bridge.  This includes walking subordinate
   devices
 * spapr_dt_pci_device() now calls spapr_dt_pci_bus() when called on a
   P2P bridge
 * We do detection of bridges with the is_bridge field of the device class,
   rather than checking PCI config space directly, for consistency with
   qemu's core PCI code.
 * Several things are renamed for brevity and clarity

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
2019-06-12 10:41:49 +10:00
David Gibson
9d2134d81d spapr: Clean up device tree construction for PCI devices
spapr_create_pci_child_dt() is a trivial wrapper around
spapr_populate_pci_child_dt(), but is the latter's only caller.  So fold
them together into spapr_dt_pci_device(), which closer matches our modern
naming convention.

While there, make a number of cleanups to the function itself.  This is
mostly using more temporary locals to avoid awkwardly long lines, and in
some cases avoiding double reads of PCI config space variables.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
2019-06-12 10:41:49 +10:00
David Gibson
4782a8bb81 spapr: Clean up device node name generation for PCI devices
spapr_populate_pci_child_dt() adds a 'name' property to the device tree
node for PCI devices.  This is never necessary for a flattened device tree,
it is implicit in the name added when the node is constructed.  In fact
anything we do add to a 'name' property will be overwritten with something
derived from the structural name in the guest firmware (but in fact it is
exactly the same bytes).

So, remove that.  In addition, pci_get_node_name() is very simple, so fold
it into its (also simple) sole caller spapr_create_pci_child_dt().

While we're there rename pci_find_device_name() to the shorter and more
accurate dt_name_from_class().

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
2019-06-12 10:41:49 +10:00
Greg Kurz
7028293017 spapr_pci: Improve error message
Every PHB must have a unique index. This is checked at realize but when
a duplicate index is detected, an error message mentioning BUIDs is
printed. This doesn't help much, especially since BUID is an internal
concept that is no longer exposed to the user.

Fix the message to mention the index property instead of BUID. As a bonus
print a list of indexes already in use.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155915010892.2061314.10485622810149098411.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-06-12 10:41:49 +10:00
Markus Armbruster
14a48c1d0d qemu-common: Move tcg_enabled() etc. to sysemu/tcg.h
Other accelerators have their own headers: sysemu/hax.h, sysemu/hvf.h,
sysemu/kvm.h, sysemu/whpx.h.  Only tcg_enabled() & friends sit in
qemu-common.h.  This necessitates inclusion of qemu-common.h into
headers, which is against the rules spelled out in qemu-common.h's
file comment.

Move tcg_enabled() & friends into their own header sysemu/tcg.h, and
adjust #include directives.

Cc: Richard Henderson <rth@twiddle.net>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190523143508.25387-2-armbru@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
[Rebased with conflicts resolved automatically, except for
accel/tcg/tcg-all.c]
2019-06-11 20:22:09 +02:00
Richard Henderson
db70b31144 target/ppc: Use env_cpu, env_archcpu
Cleanup in the boilerplate that each target must define.
Replace ppc_env_get_cpu with env_archcpu.  The combination
CPU(ppc_env_get_cpu) should have used ENV_GET_CPU to begin;
use env_cpu now.

Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-10 07:03:42 -07:00
Peter Maydell
347a6f44e9 virtio, pci, pc: cleanups, features
stricter rules for acpi tables: we now fail
 on any difference that isn't whitelisted.
 
 vhost-scsi migration.
 
 some cleanups all over the place
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJc+B4YAAoJECgfDbjSjVRpq1EIAJR7tCxcpu9GggVlinmUA8G4
 tmSAe06IryH7+nF3RsnINuGu7ius9qC2/E2y0uJUHhTqiU/RWOfWZ7PPM0EcYZaA
 TLPaCe2NUF6/8afeqmvE9Usk7VspI5TDZRms+bonmZz2xP1lHIMN0qW4s7HHLWr8
 sZKDtCJ+9cYII93VQwtlR0qiHgv5f0kzcuZeJaZHsAHH6XZGqRuQjI6txcFa4o53
 lkdLCEwTnRuwu2wyL84eL5p+E8SzOgR/x1QI+nffrJfsvnmiT7lnOrkjnQlWAp5G
 xqwqsUrUxUCuQ+zitwJqmv+H6nx79MwAM7fTHAETCWX703N5o9tZxAnHHqLoa8I=
 =cQNg
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging

virtio, pci, pc: cleanups, features

stricter rules for acpi tables: we now fail
on any difference that isn't whitelisted.

vhost-scsi migration.

some cleanups all over the place

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Wed 05 Jun 2019 20:55:04 BST
# gpg:                using RSA key 281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream:
  bios-tables-test: ignore identical binaries
  tests: acpi: add simple arm/virt testcase
  tests: add expected ACPI tables for arm/virt board
  bios-tables-test: list all tables that differ
  vhost-scsi: Allow user to enable migration
  vhost-scsi: Add VMState descriptor
  vhost-scsi: The vhost backend should be stopped when the VM is not running
  bios-tables-test: add diff allowed list
  vhost: fix memory leak in vhost_user_scsi_realize
  vhost: fix incorrect print type
  vhost: remove the dead code
  docs: smbios: remove family=x from type2 entry description
  pci: Fold pci_get_bus_devfn() into its sole caller
  pci: Make is_bridge a bool
  pcie: Simplify pci_adjust_config_limit()
  acpi: pci: use build_append_foo() API to construct MCFG
  hw/acpi: Consolidate build_mcfg to pci.c

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-06-06 12:52:31 +01:00
David Gibson
2f57db8a27 pcie: Simplify pci_adjust_config_limit()
Since c2077e2c "pci: Adjust PCI config limit based on bus topology",
pci_adjust_config_limit() has been used in the config space read and write
paths to only permit access to extended config space on buses which permit
it.  Specifically it prevents access on devices below a vanilla-PCI bus via
some combination of bridges, even if both the host bridge and the device
itself are PCI-E.

It accomplishes this with a somewhat complex call up the chain of bridges
to see if any of them prohibit extended config space access.  This is
overly complex, since we can always know if the bus will support such
access at the point it is constructed.

This patch simplifies the test by using a flag in the PCIBus instance
indicating whether extended configuration space is accessible.  It is
false for vanilla PCI buses.  For PCI-E buses, it is true for root
buses and equal to the parent bus's's capability otherwise.

For the special case of sPAPR's paravirtualized PCI root bus, which
acts mostly like vanilla PCI, but does allow extended config space
access, we override the default value of the flag from the host bridge
code.

This should cause no behavioural change.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <20190513061939.3464-4-david@gibson.dropbear.id.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-05-29 18:00:57 -04:00
Cédric Le Goater
ce4b1b5685 ppc/pnv: add dummy XSCOM registers for PRD initialization
PRD (Processor recovery diagnostics) is a service available on
OpenPower systems. The opal-prd daemon initializes the PowerPC
Processor through the XSCOM bus and then waits for hardware diagnostic
events.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190527071722.31424-1-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29 11:39:47 +10:00
Cédric Le Goater
83b90bf026 ppc/pnv: introduce new skiboot platform properties
Newer skiboots (after 6.3) support QEMU platforms that have
characteristics closer to real OpenPOWER systems. The CPU type is used
to define the BMC drivers: Aspeed AST2400 for POWER8 processors and
AST2500 for POWER9s.

Advertise the new platform property names, "qemu,powernv8" and
"qemu,powernv9", using the CPU type chosen for the QEMU PowerNV
machine. Also, advertise the original platform name "qemu,powernv" in
case of POWER8 processors for compatibility with older skiboots.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190527071749.31499-1-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29 11:39:47 +10:00
Greg Kurz
3725ef1a94 spapr: Don't migrate the hpt_maxpagesize cap to older machine types
Commit 0b8c89be7f7b added the hpt_maxpagesize capability to the migration
stream. This is okay for new machine types but it breaks backward migration
to older QEMUs, which don't expect the extra subsection.

Add a compatibility boolean flag to the sPAPR machine class and use it to
skip migration of the capability for machine types 4.0 and older. This
fixes migration to an older QEMU. Note that the destination will emit a
warning:

qemu-system-ppc64: warning: cap-hpt-max-page-size lower level (16) in incoming stream than on destination (24)

This is expected and harmless though. It is okay to migrate from a lower
HPT maximum page size (64k) to a greater one (16M).

Fixes: 0b8c89be7f7b "spapr: Add forgotten capability to migration stream"
Based-on: <20190522074016.10521-3-clg@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155853262675.1158324.17301777846476373459.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29 11:39:47 +10:00
Cédric Le Goater
bd94bc0647 spapr: change default interrupt mode to 'dual'
Now that XIVE support is complete (QEMU emulated and KVM devices),
change the pseries machine to advertise both interrupt modes: XICS
(P7/P8) and XIVE (P9).

The machine default interrupt modes depends on the version. Current
settings are:

    pseries   default interrupt mode

    4.1       dual
    4.0       xics
    3.1       xics
    3.0       legacy xics (different IRQ number space layout)

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190522074016.10521-3-clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29 11:39:47 +10:00
Cédric Le Goater
3f777abc71 spapr/irq: add KVM support to the 'dual' machine
The interrupt mode is chosen by the CAS negotiation process and
activated after a reset to take into account the required changes in
the machine. This brings new constraints on how the associated KVM IRQ
device is initialized.

Currently, each model takes care of the initialization of the KVM
device in their realize method but this is not possible anymore as the
initialization needs to be done globaly when the interrupt mode is
known, i.e. when machine is reseted. It also means that we need a way
to delete a KVM device when another mode is chosen.

Also, to support migration, the QEMU objects holding the state to
transfer should always be available but not necessarily activated.

The overall approach of this proposal is to initialize both interrupt
mode at the QEMU level to keep the IRQ number space in sync and to
allow switching from one mode to another. For the KVM side of things,
the whole initialization of the KVM device, sources and presenters, is
grouped in a single routine. The XICS and XIVE sPAPR IRQ reset
handlers are modified accordingly to handle the init and the delete
sequences of the KVM device.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190513084245.25755-15-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29 11:39:46 +10:00
Cédric Le Goater
ae805ea907 spapr/irq: introduce a spapr_irq_init_device() helper
The way the XICS and the XIVE devices are initialized follows the same
pattern. First, try to connect to the KVM device and if not possible
fallback on the emulated device, unless a kernel_irqchip is required.
The spapr_irq_init_device() routine implements this sequence in
generic way using new sPAPR IRQ handlers ->init_emu() and ->init_kvm().

The XIVE init sequence is moved under the associated sPAPR IRQ
->init() handler. This will change again when KVM support is added for
the dual interrupt mode.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190513084245.25755-12-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29 11:39:46 +10:00
Cédric Le Goater
0dc9f5f849 spapr/xive: activate KVM support
All is in place for KVM now. State synchronization and migration will
come next.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190513084245.25755-8-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29 11:39:46 +10:00
Cédric Le Goater
277dd3d771 spapr/xive: add migration support for KVM
When the VM is stopped, the VM state handler stabilizes the XIVE IC
and marks the EQ pages dirty. These are then transferred to destination
before the transfer of the device vmstates starts.

The SpaprXive interrupt controller model captures the XIVE internal
tables, EAT and ENDT and the XiveTCTX model does the same for the
thread interrupt context registers.

At restart, the SpaprXive 'post_load' method restores all the XIVE
states. It is called by the sPAPR machine 'post_load' method, when all
XIVE states have been transferred and loaded.

Finally, the source states are restored in the VM change state handler
when the machine reaches the running state.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190513084245.25755-7-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29 11:39:46 +10:00
Cédric Le Goater
38afd772f8 spapr/xive: add KVM support
This introduces a set of helpers when KVM is in use, which create the
KVM XIVE device, initialize the interrupt sources at a KVM level and
connect the interrupt presenters to the vCPU.

They also handle the initialization of the TIMA and the source ESB
memory regions of the controller. These have a different type under
KVM. They are 'ram device' memory mappings, similarly to VFIO, exposed
to the guest and the associated VMAs on the host are populated
dynamically with the appropriate pages using a fault handler.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190513084245.25755-3-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29 11:39:45 +10:00
Greg Kurz
75de59416d spapr: Print out extra hints when CAS negotiation of interrupt mode fails
Let's suggest to the user how the machine should be configured to allow
the guest to boot successfully.

Suggested-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155799221739.527449.14907564571096243745.stgit@bahia.lan>
Reviewed-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Tested-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
[dwg: Adjusted for style error]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29 11:39:45 +10:00
David Gibson
eb3cba8272 spapr: Fix phb_placement backwards compatibility
When we added support for NVLink2 passthrough devices, we changed the
phb_placement hook to handle the placement of NVLink2 bridges' specific
resources.  For compatibility we use a version that doesn't do this
allocation  for old machine types.

However, because of the delay between when the patch was posted and when
it was merged, we ended up with that compatibility hook applying for
machine versions 3.1 and earlier whereas it should apply for 4.0 and
earlier (since the patch was applied early in the 4.1 tree).

Fixes: ec132efaa8 "spapr: Support NVIDIA V100 GPU with NVLink2"

Reported-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2019-05-29 11:39:45 +10:00
David Gibson
64d4a53431 spapr: Add forgotten capability to migration stream
spapr machine capabilities are supposed to be sent in the migration stream
so that we can sanity check the source and destination have compatible
configuration.  Unfortunately, when we added the hpt-max-page-size
capability, we forgot to add it to the migration state.  This means that we
can generate spurious warnings when both ends are configured for large
pages, or potentially fail to warn if the source is configured for huge
pages, but the destination is not.

Fixes: 2309832afd "spapr: Maximum (HPT) pagesize property"

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
2019-05-29 11:39:45 +10:00
Suraj Jitindar Singh
70de096748 target/ppc: Set PSSCR_EC on cpu halt to prevent spurious wakeup
The processor stop status and control register (PSSCR) is used to
control the power saving facilities of the thread. The exit criterion
bit (EC) is used to specify whether the thread should be woken by any
interrupt (EC == 0) or only an interrupt enabled in the LPCR to wake the
thread (EC == 1).

The rtas facilities start-cpu and self-stop are used to transition a
vcpu between the stopped and running states. When a vcpu is stopped it
may only be started again by the start-cpu rtas call.

Currently a vcpu in the stopped state will start again whenever an
interrupt comes along due to PSSCR_EC being cleared, and while this is
architecturally correct for a hardware thread, a vcpu is expected to
only be woken by calling start-cpu. This means when performing a reboot
on a tcg machine that the secondary threads will restart while the
primary is still in slof, this is unsupported and causes call traces
like:

SLOF **********************************************************************
QEMU Starting
 Build Date = Jan 14 2019 18:00:39
 FW Version = git-a5b428e1c1eae703
 Press "s" to enter Open Firmware.

qemu: fatal: Trying to deliver HV exception (MSR) 70 with no HV support

NIP 6d61676963313230   LR 000000003dbe0308 CTR 6d61676963313233 XER 0000000000000000 CPU#1
MSR 0000000000000000 HID0 0000000000000000  HF 0000000000000000 iidx 3 didx 3
TB 00000026 115746031956 DECR 18446744073326238463
GPR00 000000003dbe0308 000000003e669fe0 000000003dc10700 0000000000000003
GPR04 000000003dc62198 000000003dc62178 000000003dc0ea48 0000000000000030
GPR08 000000003dc621a8 0000000000000018 000000003e466008 000000003dc50700
GPR12 c00000000093a4e0 c00000003ffff300 c00000003e533f90 0000000000000000
GPR16 0000000000000000 0000000000000000 000000003e466010 000000003dc0b040
GPR20 0000000000008000 000000000000f003 0000000000000006 000000003e66a050
GPR24 000000003dc06400 000000003dc0ae70 0000000000000003 000000000000f001
GPR28 000000003e66a060 ffffffffffffffff 6d61676963313233 0000000000000028
CR 28000222  [ E  L  -  -  -  E  E  E  ]             RES ffffffffffffffff
FPR00 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR04 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR08 0000000000000000 0000000000000000 0000000000000000 00000000311825e0
FPR12 00000000311825e0 0000000000000000 0000000000000000 0000000000000000
FPR16 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR20 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR24 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR28 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPSCR 0000000000000000
 SRR0 000000003dbe06b0  SRR1 0000000000080000    PVR 00000000004e1200 VRSAVE 0000000000000000
SPRG0 000000003dbe0308 SPRG1 000000003e669fe0  SPRG2 00000000000000d8  SPRG3 000000003dbe0308
SPRG4 0000000000000000 SPRG5 0000000000000000  SPRG6 0000000000000000  SPRG7 0000000000000000
HSRR0 6d61676963313230 HSRR1 0000000000000000
 CFAR 000000003dbe3e64
 LPCR 0000000004020008
 PTCR 0000000000000000   DAR 0000000000000000  DSISR 0000000000000000
Aborted (core dumped)

To fix this, set the PSSCR_EC bit when a vcpu is stopped to disable it
from coming back online until the start-cpu rtas call is made.

Fixes: 21c0d66a9c ("target/ppc: Fix support for "STOP light" states on POWER9")

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Message-Id: <20190516005744.24366-1-sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29 11:39:45 +10:00
Greg Kurz
e7f78db9fb spapr/xive: Sanity checks of OV5 during CAS
If a machine is started with ic-mode=xive but the guest only knows
about XICS, eg. an RHEL 7.6 guest, the kernel panics. This is
expected but a bit unfortunate since the crash doesn't provide
much information for the end user to guess what's happening.

Detect that during CAS and exit QEMU with a proper error message
instead, like it is already done for the MMU.

Even if this is less likely to happen, the opposite case of a guest
that only knows about XIVE would certainly fail all the same if the
machine is started with ic-mode=xics.

Also, the only valid values a guest can pass in byte 23 of OV5 during
CAS are 0b00 (XIVE legacy mode) and 0b01 (XIVE exploitation mode). Any
other value is a bug, at least with the current spec. Again, it does
not seem right to let the guest go on without a precise idea of the
interrupt mode it asked for.

Handle these cases as well.

Reported-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155793986451.464434.12887933000007255549.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29 11:39:45 +10:00
Suraj Jitindar Singh
83f192d34d target/ppc: Add ibm,purr and ibm,spurr device-tree properties
The ibm,purr and ibm,spurr device tree properties are used to indicate
that the processor implements the Processor Utilisation of Resources
Register (PURR) and Scaled Processor Utilisation of Resources Registers
(SPURR), respectively. Each property has a single value which represents
the level of architecture supported. A value of 1 for ibm,purr means
support for the version of the PURR defined in book 3 in version 2.02 of
the architecture. A value of 1 for ibm,spurr means support for the
version of the SPURR defined in version 2.05 of the architecture.

Add these properties for all processors for which the PURR and SPURR
registers are generated.

Fixes: 0da6f3fef9 "spapr: Reorganize CPU dt generation code"
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Message-Id: <20190506014803.21299-1-sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29 11:39:44 +10:00
Artyom Tarasenko
1dbe3d196d hw/ppc/40p: use 1900 as a base year
AIX 5.1 expects the base year to be 1900. Adjust accordingly.

Signed-off-by: Artyom Tarasenko <atar4qemu@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190505152839.18650-4-philmd@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29 11:39:44 +10:00
Philippe Mathieu-Daudé
2e8f85189d hw/ppc/40p: Move the MC146818 RTC to the board where it belongs
The MC146818 RTC was incorrectly added to the i82378 chipset in
commit a04ff94097. In the next commit (506b7ddf88) the PReP
machine use the i82378.
Since the MC146818 is specific to the PReP machine, move its use
there.

Fixes: a04ff94097
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190505152839.18650-3-philmd@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29 11:39:44 +10:00
Philippe Mathieu-Daudé
c50be9e1ec hw/ppc/prep: use TYPE_MC146818_RTC instead of a hardcoded string
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190505152839.18650-2-philmd@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29 11:39:44 +10:00
Philippe Mathieu-Daudé
bc4c406c3e hw/ppc/pnv: Use object_initialize_child for correct reference counting
As explained in commit aff39be0ed:

  Both functions, object_initialize() and object_property_add_child()
  increase the reference counter of the new object, so one of the
  references has to be dropped afterwards to get the reference
  counting right. Otherwise the child object will not be properly
  cleaned up when the parent gets destroyed.
  Thus let's use now object_initialize_child() instead to get the
  reference counting here right.

This patch was generated using the following Coccinelle script
(with a bit of manual fix-up for overly long lines):

 @use_object_initialize_child@
 expression parent_obj;
 expression child_ptr;
 expression child_name;
 expression child_type;
 expression child_size;
 expression errp;
 @@
 (
 -   object_initialize(child_ptr, child_size, child_type);
 +   object_initialize_child(parent_obj, child_name,  child_ptr, child_size,
 +                           child_type, &error_abort, NULL);
     ... when != parent_obj
 -   object_property_add_child(parent_obj, child_name, OBJECT(child_ptr), NULL);
     ...
?-   object_unref(OBJECT(child_ptr));
 |
 -   object_initialize(child_ptr, child_size, child_type);
 +   object_initialize_child(parent_obj, child_name,  child_ptr, child_size,
 +                            child_type, errp, NULL);
     ... when != parent_obj
 -   object_property_add_child(parent_obj, child_name, OBJECT(child_ptr), errp);
     ...
?-   object_unref(OBJECT(child_ptr));
 )

While the object_initialize() function doesn't take an
'Error *errp' argument, the object_initialize_child() does.
Since this code is used when a machine is created (and is not
yet running), we deliberately choose to use the &error_abort
argument instead of ignoring errors if an object creation failed.

Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Inspired-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190507163416.24647-2-philmd@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-05-24 15:29:02 -03:00
Philippe Mathieu-Daudé
d632b9de78 hw/ppc: Implement fw_cfg_arch_key_name()
Implement fw_cfg_arch_key_name(), which returns the name of a
ppc-specific key.

The fw_cfg device is used by the machine using OpenBIOS:
- 40p
- mac99 (oldworld)
- g3beige (newworld)

Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Message-Id: <20190422195020.1494-6-philmd@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-05-23 14:10:31 +02:00
Peter Maydell
9ec34ecc97 ppc patch queue 2019-04-26
Here's the first ppc target pull request for qemu-4.1.  This has a
 number of things that have accumulated while qemu-4.0 was frozen.
 
  * A number of emulated MMU improvements from Ben Herrenschmidt
 
  * Assorted cleanups fro Greg Kurz
 
  * A large set of mostly mechanical cleanups from me to make target/ppc
    much closer to compliant with the modern coding style
 
  * Support for passthrough of NVIDIA GPUs using NVLink2
 
 As well as some other assorted fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEdfRlhq5hpmzETofcbDjKyiDZs5IFAlzCnusACgkQbDjKyiDZ
 s5LfhhAAuem5UBGKPKPj33c87HC+GGG+S4y89ic3ebyKplWulGgouHCa4Dnc7Y5m
 9MfIEcljRDpuRJCEONo6yg9aaRb3cW2Go9TpTwxmF8o1suG/v5bIQIdiRbBuMa2t
 yhNujVg5kkWSU1G4mCZjL9FS2ADPsxsKZVd73DPEqjlNJg981+2qtSnfR8SXhfnk
 dSSKxyfC6Hq1+uhGkLI+xtft+BCTWOstjz+efHpZ5l2mbiaMeh7zMKrIXXy/FtKA
 ufIyxbZznMS5MAZk7t90YldznfwOCqfh3di1kx8GTZ40LkBKbuI5LLHTG0sT75z5
 LHwFuLkBgWmS8RyIRRh9opr7ifrayHx8bQFpW368Qu+PbPzUCcTVIrWUfPmaNR74
 CkYJvhiYZfTwKtUeP7b2wUkHpZF4KINI4TKNaS4QAlm3DNbO67DFYkBrytpXsSzv
 smEpe+sqlbY40olw9q4ESP80r+kGdEPLkRjfdj0R7qS4fsqAH1bjuSkNqlPaCTJQ
 hNsoz2D+f56z0bBq4x8FRzDpqnBkdy4x6PlLxkJuAaV7WAtvq7n7tiMA3TRr/rIB
 OYFP2xPNajjP8MfyOB94+S4WDltmsgXoM7HyyvrKp2JBpe7mFjpep5fMp5GUpweV
 OOYrTsN1Nuu3kFpeimEc+IOyp1BWXnJF4vHhKTOqHeqZEs5Fgus=
 =RpAK
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-4.1-20190426' into staging

ppc patch queue 2019-04-26

Here's the first ppc target pull request for qemu-4.1.  This has a
number of things that have accumulated while qemu-4.0 was frozen.

 * A number of emulated MMU improvements from Ben Herrenschmidt

 * Assorted cleanups fro Greg Kurz

 * A large set of mostly mechanical cleanups from me to make target/ppc
   much closer to compliant with the modern coding style

 * Support for passthrough of NVIDIA GPUs using NVLink2

As well as some other assorted fixes.

# gpg: Signature made Fri 26 Apr 2019 07:02:19 BST
# gpg:                using RSA key 75F46586AE61A66CC44E87DC6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" [full]
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>" [full]
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" [full]
# gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>" [unknown]
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-4.1-20190426: (36 commits)
  target/ppc: improve performance of large BAT invalidations
  ppc/hash32: Rework R and C bit updates
  ppc/hash64: Rework R and C bit updates
  ppc/spapr: Use proper HPTE accessors for H_READ
  target/ppc: Don't check UPRT in radix mode when in HV real mode
  target/ppc/kvm: Convert DPRINTF to traces
  target/ppc/trace-events: Fix trivial typo
  spapr: Drop duplicate PCI swizzle code
  spapr_pci: Get rid of duplicate code for node name creation
  target/ppc: Style fixes for translate/spe-impl.inc.c
  target/ppc: Style fixes for translate/vmx-impl.inc.c
  target/ppc: Style fixes for translate/vsx-impl.inc.c
  target/ppc: Style fixes for translate/fp-impl.inc.c
  target/ppc: Style fixes for translate.c
  target/ppc: Style fixes for translate_init.inc.c
  target/ppc: Style fixes for monitor.c
  target/ppc: Style fixes for mmu_helper.c
  target/ppc: Style fixes for mmu-hash64.[ch]
  target/ppc: Style fixes for mmu-hash32.[ch]
  target/ppc: Style fixes for misc_helper.c
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-04-27 21:34:46 +01:00
Peter Maydell
06e6433955 Machine queue, 2019-04-25
* 4.1 machine-types (Cornelia Huck)
 * Support MAP_SYNC on pmem memory backends (Zhang Yi)
 * -cpu parsing fixes and cleanups (Eduardo Habkost)
 * machine initialization cleanups (Wei Yang, Markus Armbruster)
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJcwfRxAAoJECgHk2+YTcWmBegP/1alp8qiO/JdSkI/+jw9iUBC
 SviMwFrQVdKWT5ou/aYTM3apqrwC9XLUQ2vuNzLQDURG+SbcCf5BLvSrcvg9iR6z
 ASUot7ta1QtkR361dL0akhvqH8pNXpGolq5VleQqBOWAGUVjgrbWuwPlFVz9TZ8R
 LaVwDITv0fpQwtq+hB4b9hiDkebZFE4/xkNyxpaoJGzaePe1sCqACzNe1/PQ15ni
 gmd+VQ1qX3frUTSZcaWTrJIdQvZlkaD+pmEiwo969EE4U9ZGwwPRpShmeHnjuKDQ
 ufTGo05+/ikqp8refxA/XqyveHeJ69JSFNLCz2QwAgdwN/OXRG306Ln69vFNuX0D
 rfMJBvKZotc7enN08aQN1m1Sm0Y+2xo9RQgFUynZnzauQXKiEndLPHyjbbQ+pAPQ
 TmHrUQnmYSvoELewrCaq4XloXrd3X57U3K19ksqF+3meApQ7fuY9dQF2A2bE+aB7
 OhiMqdw9HVAjSzplKa5jPniSc5vgRCdr9AtX5B2RJdsQEv72JfwsOYB0DnrF4hyo
 NJz7HyS28xkbKrfbhztr8WoV8nPYvdS+xjSfim8YS6lFaNDnWZl2ybp/Trr1HItv
 TbDtPSx/IePHhIXd63aXkDt7FSoUib6+fCi8Wssuuo+MJMZfHacpWHkx2bVwSuf6
 doOaY/KY8mAq5DiM09zz
 =MNVq
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/ehabkost/tags/machine-next-pull-request' into staging

Machine queue, 2019-04-25

* 4.1 machine-types (Cornelia Huck)
* Support MAP_SYNC on pmem memory backends (Zhang Yi)
* -cpu parsing fixes and cleanups (Eduardo Habkost)
* machine initialization cleanups (Wei Yang, Markus Armbruster)

# gpg: Signature made Thu 25 Apr 2019 18:54:57 BST
# gpg:                using RSA key 2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>" [full]
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF  D1AA 2807 936F 984D C5A6

* remotes/ehabkost/tags/machine-next-pull-request:
  util/mmap-alloc: support MAP_SYNC in qemu_ram_mmap()
  linux-headers: add linux/mman.h.
  scripts/update-linux-headers: add linux/mman.h
  util/mmap-alloc: Add a 'is_pmem' parameter to qemu_ram_mmap
  cpu: Fix crash with empty -cpu option
  cpu: Rename parse_cpu_model() to parse_cpu_option()
  vl: Simplify machine_parse()
  vl: Clean up after previous commit
  vl.c: allocate TYPE_MACHINE list once during bootup
  vl.c: make find_default_machine() local
  hw: add compat machines for 4.1

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-04-26 14:30:18 +01:00
Benjamin Herrenschmidt
a2dd4e83e7 ppc/hash64: Rework R and C bit updates
With MT-TCG, we are now running translation in a racy way, thus
we need to mimic hardware when it comes to updating the R and
C bits, by doing byte stores.

The current "store_hpte" abstraction is ill suited for this, we
replace it with two separate callbacks for setting R and C.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190411080004.8690-4-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-04-26 11:37:57 +10:00
Benjamin Herrenschmidt
993aaf0c00 ppc/spapr: Use proper HPTE accessors for H_READ
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190411080004.8690-3-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-04-26 11:37:57 +10:00
Greg Kurz
e8ec4adfe2 spapr: Drop duplicate PCI swizzle code
LSI mapping in spapr currently open-codes standard PCI swizzling. It thus
duplicates the code of pci_swizzle_map_irq_fn().

Expose the swizzling formula so that it can be used with a slot number
when building the device tree. Simply drop pci_spapr_map_irq() and call
pci_swizzle_map_irq_fn() instead.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155448184841.8446.13959787238854054119.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-04-26 11:37:57 +10:00
Greg Kurz
c413605ba6 spapr_pci: Get rid of duplicate code for node name creation
According to the changelog of 298a971024, SpaprPhbState::dtbusname was
introduced to "make it easier to relate the guest and qemu views of memory
to each other", hence its name.

Use it when creating the PHB node to avoid code duplication.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155448184292.8446.8225650773162648595.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-04-26 11:37:57 +10:00
Cédric Le Goater
f56275a2fc spapr/irq: remove spapr_ics_create()
spapr_ics_create() is only called once. Merge it in spapr_irq_init_xics()
and simplify a bit the error handling by using 'error_fatal' .

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190321144914.19934-13-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-04-26 10:41:23 +10:00
Cédric Le Goater
64db6c70dc spapr/rtas: modify spapr_rtas_register() to remove RTAS handlers
Removing RTAS handlers will become necessary when the new pseries
machine supporting multiple interrupt mode is introduced.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190321144914.19934-9-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-04-26 10:41:23 +10:00
Philippe Mathieu-Daudé
7cbf3f113a hw/ppc/prep: Drop useless inclusion of "hw/input/i8042.h"
In commit 47973a2dbf we split the last generic chipset out of
the PC board, but missed to remove the i8042 keyboard controller.
This omission was later fixed in commit 7cb00357c1, but here we
forgot to remove the "i8042.h" include. Do it now.

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190316201528.9140-1-philmd@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-04-26 10:41:23 +10:00
Alexey Kardashevskiy
ec132efaa8 spapr: Support NVIDIA V100 GPU with NVLink2
NVIDIA V100 GPUs have on-board RAM which is mapped into the host memory
space and accessible as normal RAM via an NVLink bus. The VFIO-PCI driver
implements special regions for such GPUs and emulates an NVLink bridge.
NVLink2-enabled POWER9 CPUs also provide address translation services
which includes an ATS shootdown (ATSD) register exported via the NVLink
bridge device.

This adds a quirk to VFIO to map the GPU memory and create an MR;
the new MR is stored in a PCI device as a QOM link. The sPAPR PCI uses
this to get the MR and map it to the system address space.
Another quirk does the same for ATSD.

This adds additional steps to sPAPR PHB setup:

1. Search for specific GPUs and NPUs, collect findings in
sPAPRPHBState::nvgpus, manage system address space mappings;

2. Add device-specific properties such as "ibm,npu", "ibm,gpu",
"memory-block", "link-speed" to advertise the NVLink2 function to
the guest;

3. Add "mmio-atsd" to vPHB to advertise the ATSD capability;

4. Add new memory blocks (with extra "linux,memory-usable" to prevent
the guest OS from accessing the new memory until it is onlined) and
npuphb# nodes representing an NPU unit for every vPHB as the GPU driver
uses it for link discovery.

This allocates space for GPU RAM and ATSD like we do for MMIOs by
adding 2 new parameters to the phb_placement() hook. Older machine types
set these to zero.

This puts new memory nodes in a separate NUMA node to as the GPU RAM
needs to be configured equally distant from any other node in the system.
Unlike the host setup which assigns numa ids from 255 downwards, this
adds new NUMA nodes after the user configures nodes or from 1 if none
were configured.

This adds requirement similar to EEH - one IOMMU group per vPHB.
The reason for this is that ATSD registers belong to a physical NPU
so they cannot invalidate translations on GPUs attached to another NPU.
It is guaranteed by the host platform as it does not mix NVLink bridges
or GPUs from different NPU in the same IOMMU group. If more than one
IOMMU group is detected on a vPHB, this disables ATSD support for that
vPHB and prints a warning.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[aw: for vfio portions]
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Message-Id: <20190312082103.130561-1-aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-04-26 10:41:23 +10:00
Cornelia Huck
9bf2650bc3 hw: add compat machines for 4.1
Add 4.1 machine types for arm/i440fx/q35/s390x/spapr.

Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20190411102025.22559-1-cohuck@redhat.com>
Acked-by: Greg Kurz <groug@kaod.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-04-25 14:16:41 -03:00
David Hildenbrand
905b7ee4d6 exec: Introduce qemu_maxrampagesize() and rename qemu_getrampagesize()
Rename qemu_getrampagesize() to qemu_minrampagesize(). While at it,
properly rename find_max_supported_pagesize() to
find_min_backend_pagesize().

s390x is actually interested into the maximum ram pagesize, so
introduce and use qemu_maxrampagesize().

Add a TODO, indicating that looking at any mapped memory backends is not
100% correct in some cases.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190417113143.5551-3-david@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2019-04-25 13:47:27 +02:00
Greg Kurz
4560116e42 spapr_pci: Fix broken naming of PCI bus
Recent commit 5cf0d326a0 fixed a regression which was preventing the
guest to access the extended config space of a PCIe device. This was
done by introducing a new PCI bus subtype for PAPR. The original fix
was causing PCI busses to be named "spapr-pci-host-bridge-root-bus.N"
instead of "pci.N", which was making upper layers unhappy of course.
This got worked around by hardcoding the PCI bus name to "pci.0", but
this only works for the default PHB. And we're now hitting:

# qemu-system-ppc64 \
             -device spapr-pci-host-bridge,index=1 \
             -device e1000e,bus=pci.0 \
             -device e1000e,bus=pci.1
qemu-system-ppc64: -device e1000e,bus=pci.1: Bus 'pci.1' not found

David already posted some patches [1] to control PCI extended config
space accesses with a new flag in the base PCI bus class instead of
subtyping. These patches are a bit more intrusive though, and
are targetted for 4.1.

When no name is passed to pci_register_bus(), the core device code
generates a lowercase name based on the QOM typename. The typename
for the base PCI bus class is "PCI", hence the "pci.0", "pci.1"
bus names. Rename the type of the PAPR PCI bus to "pci", so that
the QOM code can generate proper names. This is a hack but it is
enough to fix the regression. And all this will be reworked properly
in 4.1.

[1] https://patchwork.ozlabs.org/project/qemu-devel/list/?series=100486

Fixes: 5cf0d326a0
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155500034416.646888.1307366522340665522.stgit@bahia.lab.toulouse-stg.fr.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-04-12 12:23:02 +10:00
Greg Kurz
5cf0d326a0 spapr_pci: Fix extended config space accesses
The PAPR PHB acts as a legacy PCI bus but it allows PCIe extended
config space accesses anyway (for pseries-2.9 and newer machine
types).

Introduce a specific PCI bus subtype to inform the common PCI code
about that.

Fixes: c2077e2ca0
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155414130834.574858.16502276132110219890.stgit@bahia.lan>
[dwg: Apply fix so we don't rename the default pci bus, breaking everything]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-04-09 15:03:10 +10:00
Cédric Le Goater
273fef83f6 spapr/irq: Add XIVE sanity checks on non-P9 machines
On non-P9 machines, the XIVE interrupt mode is not advertised, see
spapr_dt_ov5_platform_support(). Add a couple of checks on the machine
configuration to filter bogus setups and prevent OS failures :

                     Interrupt modes

  CPU/Compat      XICS    XIVE                dual

   P8/P8          OK      QEMU failure (1)    OK (3)
   P9/P8          OK      QEMU failure (2)    OK (3)
   P9/P9          OK      OK                  OK

  (1) CPU exception model is incompatible with XIVE and the presenters
      will fail to realize.

  (2) CPU exception model is compatible with XIVE, but the XIVE CAS
      advertisement is dropped when in POWER8 mode. So we could ended up
      booting with the XIVE DT properties but without the HCALLs. Avoid
      confusing Linux with such settings and fail under QEMU.

  (3) force XICS in machine init

Remove the check on XIVE-only machines in spapr_machine_init(), which
has now become redundant.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190328100044.11408-1-clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-29 10:38:20 +11:00
David Gibson
0a794529bd spapr: Simplify handling of host-serial and host-model values
27461d69a0 "ppc: add host-serial and host-model machine attributes
(CVE-2019-8934)" introduced 'host-serial' and 'host-model' machine
properties for spapr to explicitly control the values advertised to the
guest in device tree properties with the same names.

The previous behaviour on KVM was to unconditionally populate the device
tree with the real host serial number and model, which leaks possibly
sensitive information about the host to the guest.

To maintain compatibility for old machine types, we allowed those props
to be set to "passthrough" to take the value from the host as before.  Or
they could be set to "none" to explicitly omit the device tree items.

Special casing specific values on what's otherwise a user supplied string
is very ugly.  So, this patch simplifies things by implementing the
backwards compatibility in a different way: we have a machine class flag
set for the older machines, and we only load the host values into the
device tree if A) they're not set by the user and B) we have that flag set.

This does mean that the "passthrough" functionality is no longer available
with the current machine type.  That's ok though: if a user or management
layer really wants the information passed through they can read it
themselves (OpenStack Nova already does something similar for x86).

It also means the user can't explicitly ask for the values to be omitted
on the old machine types.  I think that's an acceptable trade-off: if you
care enough about not leaking the host information you can either move to
the new machine type, or use a dummy value for the properties.

For the new machine type, this also removes an odd inconsistency
between running on a POWER and non-POWER (or non-Linux) hosts: if the
host information couldn't be read from where we expect (in the host's
device tree as exposed by Linux), we'd fallback to omitting the guest
device tree items.

While we're there, improve some poorly worded comments, and the help text
for the properties.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Greg Kurz <groug@kaod.org>
2019-03-29 10:25:50 +11:00
Greg Kurz
d0db7caddb target/ppc: Consolidate 64-bit server processor detection in a helper
We use PPC_SEGMENT_64B in various places to guard code that is specific
to 64-bit server processors compliant with arch 2.x. Consolidate the
logic in a helper macro with an explicit name.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155327783157.1283071.3747129891004927299.stgit@bahia.lan>
Tested-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-29 10:22:22 +11:00
Peter Maydell
84bdc58c06 * Kconfig improvements (msi_nonbroken, imply for default PCI devices)
* intel-iommu: sharing passthrough FlatViews (Peter)
 * Fix for SEV with VFIO (Brijesh)
 * Allow compilation without CONFIG_PARALLEL (Thomas)
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAlyTvvAUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNNwwf/RrtjBoqu8Ulu6k+HJczdpkhO44c5
 R7sidGaOBHVjT+EsaYZxanXQlsbpDPiXCRoMRMln+O3Kgso/UlVTLBfctIjuf5kp
 P8Amp8rw843yl3TQ+Xaqat1qtfVVN2xjRDoyRwWrTU5w52MVVsan2j1/XzGX/7Bb
 Y3gXRxsN7MyjDCXxhxVwQCxKU2ue3ytvnfdCnu1SNZxZEaFAyGprTNCCTXYugehl
 bVauAs/0qOZWEyvElinNEz+zbqMTm07ULAWBRXgCDcOudsidZFtu0Xl62dXlp1Ou
 0zkaoGiOdMM6OXZkLd6vOK8mY9XDuqaUZE3zAeFMJsK1wSnZdGUVCJO1Hw==
 =Pkcj
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* Kconfig improvements (msi_nonbroken, imply for default PCI devices)
* intel-iommu: sharing passthrough FlatViews (Peter)
* Fix for SEV with VFIO (Brijesh)
* Allow compilation without CONFIG_PARALLEL (Thomas)

# gpg: Signature made Thu 21 Mar 2019 16:42:24 GMT
# gpg:                using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg:                issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (23 commits)
  virtio-vga: only enable for specific boards
  config-all-devices.mak: rebuild on reconfigure
  minikconf: fix parser typo
  intel-iommu: optimize nodmar memory regions
  test-announce-self: convert to qgraph
  hw/alpha/Kconfig: DP264 hardware requires e1000 network card
  hw/hppa/Kconfig: Dino board requires e1000 network card
  hw/sh4/Kconfig: r2d machine requires the rtl8139 network card
  hw/ppc/Kconfig: e500 based machines require virtio-net-pci device
  hw/ppc/Kconfig: Bamboo machine requires e1000 network card
  hw/mips/Kconfig: Fulong 2e board requires ati-vga/rtl8139 PCI devices
  hw/mips/Kconfig: Malta machine requires the pcnet network card
  hw/i386/Kconfig: enable devices that can be created by default
  hw/isa/Kconfig: PIIX4 southbridge requires USB UHCI
  hw/isa/Kconfig: i82378 SuperIO requires PC speaker device
  prep: do not select I82374
  hw/i386/Kconfig: PC uses I8257, not I82374
  hw/char/parallel: Make it possible to compile also without CONFIG_PARALLEL
  target/i386: sev: Do not pin the ram device memory region
  memory: Fix the memory region type assignment order
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

# Conflicts:
#	hw/rdma/Makefile.objs
#	hw/riscv/sifive_plic.c
2019-03-28 09:18:53 +00:00
Markus Armbruster
dec9776049 trace-events: Fix attribution of trace points to source
Some trace points are attributed to the wrong source file.  Happens
when we neglect to update trace-events for code motion, or add events
in the wrong place, or misspell the file name.

Clean up with help of cleanup-trace-events.pl.  Same funnies as in the
previous commit, of course.  Manually shorten its change to
linux-user/trace-events to */signal.c.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-id: 20190314180929.27722-6-armbru@redhat.com
Message-Id: <20190314180929.27722-6-armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-03-22 16:18:07 +00:00
Markus Armbruster
a9779a3ab0 trace-events: Delete unused trace points
Tracked down with cleanup-trace-events.pl.  Funnies requiring manual
post-processing:

* block.c and blockdev.c trace points are in block/trace-events.

* hw/block/nvme.c uses the preprocessor to hide its trace point use
  from cleanup-trace-events.pl.

* include/hw/xen/xen_common.h trace points are in hw/xen/trace-events.

* net/colo-compare and net/filter-rewriter.c use pseudo trace points
  colo_compare_udp_miscompare and colo_filter_rewriter_debug to guard
  debug code.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-id: 20190314180929.27722-5-armbru@redhat.com
Message-Id: <20190314180929.27722-5-armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-03-22 16:18:07 +00:00
Markus Armbruster
500016e5db trace-events: Shorten file names in comments
We spell out sub/dir/ in sub/dir/trace-events' comments pointing to
source files.  That's because when trace-events got split up, the
comments were moved verbatim.

Delete the sub/dir/ part from these comments.  Gets rid of several
misspellings.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190314180929.27722-3-armbru@redhat.com
Message-Id: <20190314180929.27722-3-armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-03-22 16:18:07 +00:00
Paolo Bonzini
938912a866 virtio-vga: only enable for specific boards
When virtio-vga was added, the intention was to only support it for
those machines where the firmware does not know about virtio-gpu,
and supported VGA legacy hardware before virtio-{gpu,vga} were
introduced.

The Kconfig switch however enabled virtio-vga for all machines with
a PCI bus, and libvirt then prefers it even on hardware where
virtio-gpu would be preferrable.  At least for now, only enable
virtio-vga for PC, hppa and pSeries machines, as was the case
before Kconfig dependencies were introduced.

Reported-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-03-21 17:42:18 +01:00
Philippe Mathieu-Daudé
bcb7ef9d1b hw/ppc/Kconfig: e500 based machines require virtio-net-pci device
This fixes when configuring with CONFIG_PCI_DEVICES=n:

  $ qemu-system-ppc64 -bios /dev/null -M ppce500
  qemu-system-ppc64: Unsupported NIC model: virtio-net-pci

And:

  $ qemu-system-ppc64 -bios /dev/null -M mpc8544ds
  qemu-system-ppc64: Unsupported NIC model: virtio-net-pci

Fixes: 98bd1db99f
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190316200818.8265-10-philmd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-03-20 11:44:13 +01:00
Philippe Mathieu-Daudé
f7b5cdcbf2 hw/ppc/Kconfig: Bamboo machine requires e1000 network card
This fixes when configuring with CONFIG_PCI_DEVICES=n:

  $ qemu-system-ppc64 -bios /dev/null -M bamboo
  qemu-system-ppc64: Unsupported NIC model: e1000

Fixes: 7c28b925b7
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190316200818.8265-9-philmd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-03-20 11:44:13 +01:00
Paolo Bonzini
b4f15fc4c1 prep: do not select I82374
It is only needed through I82378, which also selects it.

Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-03-20 11:44:11 +01:00
Markus Armbruster
e366d181ce spapr: Remove NULL checks on error_propagate() calls
Patch created mechanically by rerunning:

  $  spatch --sp-file scripts/coccinelle/error_propagate_null.cocci \
	    --macro-file scripts/cocci-macro-file.h \
	    --dir . --in-place

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190318190148.18283-1-armbru@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-19 15:24:15 +11:00
Greg Kurz
f3e971ac9b ppc/pnv: Fix variable size in pnv_psi_power9_irq_set()
PSI registers are 64-bit.

Spotted by Coverity: CID 1399704

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155248884690.893204.5428179144527749023.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-19 15:20:14 +11:00
Greg Kurz
26aa5b1eeb ppc/pnv: Use local_err variable in pnv_chip_power9_intc_create()
Detected by Coverity: CID 1399702

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155248884129.893204.2293309859485638162.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-19 15:20:14 +11:00
David Gibson
49e9fdd741 spapr: Correctly set LPCR[GTSE] in H_REGISTER_PROCESS_TABLE
176dccee "target/ppc/spapr: Clear partition table entry when allocating
hash table" reworked the H_REGISTER_PROCESS_TABLE hypercall, but
unfortunately due to a small error no longer correctly sets the LPCR[GTSE]
bit which allows the guest to directly execute (some types of) tlbie (TLB
flush) instructions without involving the hypervisor.

We got away with this, initially, because POWER9 did not have hypervisor
mode enabled in its msr_mask, which meant we didn't actually run hypervisor
privilege checks in TCG at all.  However, da874d90 "target/ppc: add HV
support for POWER9" turned on HV support on POWER9 for the benefit of the
powernv machine type.

This exposed the earlier bug in H_REGISTER_PROCESS_TABLE, and causes guests
which rely on LPCR[GTSE] (i.e. basically all of them) to crash during early
boot when their first tlbie instruction causes an unexpected trap.

Fixes: 176dccee target/ppc/spapr: Clear partition table entry when allocating hash table
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Cleber Rosa <crosa@redhat.com>
2019-03-19 15:20:14 +11:00
Paolo Bonzini
ca9b7e29de kconfig: add CONFIG_MSI_NONBROKEN
Not all interrupt controllers have a working implementation of
message-signalled interrupts; in some cases, the guest may expect
MSI to work but it won't due to the buggy or lacking emulation.

In QEMU this is represented by the "msi_nonbroken" variable.  This
patch adds a new configuration symbol enabled whenever the binary
contains an interrupt controller that will set "msi_nonbroken".  We
can then use it to remove devices that cannot be possibly added
to the machine, because they require MSI.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-03-18 09:39:57 +01:00
Peter Maydell
eda1df0345 Pflash and firmware configuration patches for 2019-03-11
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJchtowAAoJEDhwtADrkYZTbmwP/i3N1SjDjg6j5ymzjl4YtaBP
 k61RoZ4Z/FPRuPGov1/WUrreqS7vqPLyCz4UpwgnAc3gslGGhYMAosU3EDtUYlS4
 hzI2lfAGoUQwAYvB6nLYQI81gKDf4HY/hMzzC38OrH89XRr2GgBFDJmz9WURlof/
 4ZHLkEQLasq93bEAItNZ/bAiEEwiidE13JTuFZ6PPzoMQYZlD2irjtPefFITGeV8
 rz0qRMuPSoOEm5dx4YoLnhyrGQP9DUKmhWKsiZqEVXnNhUtaki0g4wt9/dLsnvzS
 XnQINyTsGnqyqLaam8MT6hPMFZZexVd0h6JhIFVOxKbpF82/wLgWiWgPiiyZQVaF
 O10bcz3M2liCC7ttU+LGaoZLch+ua9k0PqqfeCxC8VbpTOBUJc75QJWOOu1snhnA
 iZB20oG61pEk9GTV8n44uARRdZ9vYAN2C2kKYuRFxTBjp9epKAa7zJGJQcj88l3y
 AXm+XhZEddFU4eI5wMlRvjVDSLb6CJ1bukps9gKEDBJoiUbLTLQbEtv82PmwRFLk
 ZkyHhFrox02tblh4bTjE81gTd8yVG2dzTuvykX14EXbeqWcGeR9EGmqOZ1mJv1jq
 kfKvydh4VEAakhJAdNhypWt9+sjko6jSpHlejRFzgQWFXPiR4Kh72+QWWTFipUXM
 x8609BVHji8Sg9dWMT/Y
 =k9u2
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/armbru/tags/pull-pflash-2019-03-11' into staging

Pflash and firmware configuration patches for 2019-03-11

# gpg: Signature made Mon 11 Mar 2019 21:59:12 GMT
# gpg:                using RSA key 3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [full]
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>" [full]
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* remotes/armbru/tags/pull-pflash-2019-03-11: (27 commits)
  docs/interop/firmware.json: Prefer -machine to if=pflash
  pc: Support firmware configuration with -blockdev
  pc_sysfw: Pass PCMachineState to pc_system_firmware_init()
  pc_sysfw: Remove unused PcSysFwDevice
  pflash_cfi01: Add pflash_cfi01_get_blk() helper
  vl: Create block backends before setting machine properties
  vl: Factor configure_blockdev() out of main()
  vl: Improve legibility of BlockdevOptions queue
  sysbus: Fix latent bug with onboard devices
  vl: Fix latent bug with -global and onboard devices
  qom: Move compat_props machinery from qdev to QOM
  qdev: Fix latent bug with compat_props and onboard devices
  pflash: Clean up after commit 368a354f02, part 2
  pflash: Clean up after commit 368a354f02, part 1
  mips_malta: Clean up definition of flash memory size somewhat
  hw/mips/malta: Restrict 'bios_size' variable scope
  hw/mips/malta: Remove fl_sectors variable
  mips_malta: Delete disabled, broken DEBUG_BOARD_INIT code
  r2d: Fix flash memory size, sector size, width, device ID
  ppc405_boards: Don't size flash memory to match backing image
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-03-12 11:12:36 +00:00
David Gibson
ce2918cbc3 spapr: Use CamelCase properly
The qemu coding standard is to use CamelCase for type and structure names,
and the pseries code follows that... sort of.  There are quite a lot of
places where we bend the rules in order to preserve the capitalization of
internal acronyms like "PHB", "TCE", "DIMM" and most commonly "sPAPR".

That was a bad idea - it frequently leads to names ending up with hard to
read clusters of capital letters, and means they don't catch the eye as
type identifiers, which is kind of the point of the CamelCase convention in
the first place.

In short, keeping type identifiers look like CamelCase is more important
than preserving standard capitalization of internal "words".  So, this
patch renames a heap of spapr internal type names to a more standard
CamelCase.

In addition to case changes, we also make some other identifier renames:
  VIOsPAPR* -> SpaprVio*
    The reverse word ordering was only ever used to mitigate the capital
    cluster, so revert to the natural ordering.
  VIOsPAPRVTYDevice -> SpaprVioVty
  VIOsPAPRVLANDevice -> SpaprVioVlan
    Brevity, since the "Device" didn't add useful information
  sPAPRDRConnector -> SpaprDrc
  sPAPRDRConnectorClass -> SpaprDrcClass
    Brevity, and makes it clearer this is the same thing as a "DRC"
    mentioned in many other places in the code

This is 100% a mechanical search-and-replace patch.  It will, however,
conflict with essentially any and all outstanding patches touching the
spapr code.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 14:33:05 +11:00
Cédric Le Goater
e5694793ee ppc/pnv: add a "ibm,opal/power-mgt" device tree node on POWER9
Activate only stop0 and stop1 levels. We should not need more levels
when under QEMU.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190307223548.20516-15-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 14:33:05 +11:00
Cédric Le Goater
bc56511668 ppc/pnv: add more dummy XSCOM addresses
To improve OPAL/skiboot support. We don't need to strictly model these
XSCOM accesses.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190307223548.20516-14-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 14:33:05 +11:00
Cédric Le Goater
5dad902ce0 ppc/pnv: POWER9 XSCOM quad support
The POWER9 processor does not support per-core frequency control. The
cores are arranged in groups of four, along with their respective L2
and L3 caches, into a structure known as a Quad. The frequency must be
managed at the Quad level.

Provide a basic Quad model to fake the settings done by the firmware
on the Non-Cacheable Unit (NCU). Each core pair (EX) needs a special
BAR setting for the TIMA area of XIVE because it resides on the same
address on all chips.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190307223548.20516-12-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 14:33:04 +11:00
Cédric Le Goater
90ef386c74 ppc/pnv: extend XSCOM core support for POWER9
Provide a new class attribute to define XSCOM operations per CPU
family and add a couple of XSCOM addresses controlling the power
management states of the core on POWER9.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190307223548.20516-11-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 14:33:04 +11:00
Cédric Le Goater
6598a70d00 ppc/pnv: add a OCC model for POWER9
The OCC on POWER9 is very similar to the one found on POWER8. Provide
the same routines with P9 values for the registers and IRQ number.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190307223548.20516-10-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 14:33:04 +11:00
Cédric Le Goater
3233838cd1 ppc/pnv: add a OCC model class
To ease the introduction of the OCC model for POWER9, provide a new
class attributes to define XSCOM operations per CPU family and a PSI
IRQ number.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190307223548.20516-9-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 14:33:04 +11:00
Cédric Le Goater
8207b90604 ppc/pnv: add SerIRQ routing registers
This is just a simple reminder that SerIRQ routing should be
addressed.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190307223548.20516-8-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 14:33:04 +11:00
Cédric Le Goater
15376c66fa ppc/pnv: add a LPC Controller model for POWER9
The LPC Controller on POWER9 is very similar to the one found on
POWER8 but accesses are now done via on MMIOs, without the XSCOM and
ECCB logic. The device tree is populated differently so we add a
specific POWER9 routine for the purpose.

SerIRQ routing is yet to be done.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190307223548.20516-7-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 14:33:04 +11:00
Cédric Le Goater
64d011d56e ppc/pnv: add a 'dt_isa_nodename' to the chip
The ISA bus has a different DT nodename on POWER9. Compute the name
when the PnvChip is realized, that is before it is used by the machine
to populate the device tree with the ISA devices.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190307223548.20516-6-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 14:33:04 +11:00
Cédric Le Goater
82514be28b ppc/pnv: add a LPC Controller class model
It will ease the introduction of the LPC Controller model for POWER9.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190307223548.20516-5-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 14:33:04 +11:00
Cédric Le Goater
6f89f48e56 ppc/pnv: lpc: fix OPB address ranges
The PowerNV LPC Controller exposes different sets of registers for
each of the functional units it encompasses, among which the OPB
(On-Chip Peripheral Bus) Master and Arbitrer and the LPC HOST
Controller.

The mapping addresses of each register range are correct but the sizes
are too large. Fix the sizes and define the OPB Arbitrer range to fill
the gap between the OPB Master registers and the LPC HOST Controller
registers.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190307223548.20516-4-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 14:33:04 +11:00
Cédric Le Goater
c38536bc80 ppc/pnv: add a PSI bridge model for POWER9
The PSI bridge on POWER9 is very similar to POWER8. The BAR is still
set through XSCOM but the controls are now entirely done with MMIOs.
More interrupts are defined and the interrupt controller interface has
changed to XIVE. The POWER9 model is a first example of the usage of
the notify() handler of the XiveNotifier interface, linking the PSI
XiveSource to its owning device model.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190307223548.20516-3-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 14:33:04 +11:00
Cédric Le Goater
ae85605531 ppc/pnv: add a PSI bridge class model
To ease the introduction of the PSI bridge model for POWER9, abstract
the POWER chip differences in a PnvPsi class model and introduce a
specific Pnv8Psi type for POWER8. POWER8 interface to the interrupt
controller is still XICS whereas POWER9 uses the new XIVE model.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190307223548.20516-2-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 14:33:04 +11:00
Mark Cave-Ayland
31bc6fa7fa mac_newworld: use node name instead of alias name for hd device in FWPathProvider
When using -drive to configure the hd drive for the New World machine, the node
name "disk" should be used instead of the "hd" alias.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20190307212058.4890-3-mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 14:33:04 +11:00
Mark Cave-Ayland
484d366e02 mac_oldworld: use node name instead of alias name for hd device in FWPathProvider
When using -drive to configure the hd drive for the Old World machine, the node
name "disk" should be used instead of the "hd" alias.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20190307212058.4890-2-mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 14:33:04 +11:00
Alexey Kardashevskiy
5f36666722 spapr_iommu: Do not replay mappings from just created DMA window
On sPAPR vfio_listener_region_add() is called in 2 situations:
1. a new listener is registered from vfio_connect_container();
2. a new IOMMU Memory Region is added from rtas_ibm_create_pe_dma_window().

In both cases vfio_listener_region_add() calls
memory_region_iommu_replay() to notify newly registered IOMMU notifiers
about existing mappings which is totally desirable for case 1.

However for case 2 it is nothing but noop as the window has just been
created and has no valid mappings so replaying those does not do anything.
It is barely noticeable with usual guests but if the window happens to be
really big, such no-op replay might take minutes and trigger RCU stall
warnings in the guest.

For example, a upcoming GPU RAM memory region mapped at 64TiB (right
after SPAPR_PCI_LIMIT) causes a 64bit DMA window to be at least 128TiB
which is (128<<40)/0x10000=2.147.483.648 TCEs to replay.

This mitigates the problem by adding an "skipping_replay" flag to
sPAPRTCETable and defining sPAPR own IOMMU MR replay() hook which does
exactly the same thing as the generic one except it returns early if
@skipping_replay==true.

Another way of fixing this would be delaying replay till the very first
H_PUT_TCE but this does not work if in-kernel H_PUT_TCE handler is
enabled (a likely case).

When "ibm,create-pe-dma-window" is complete, the guest will map only
required regions of the huge DMA window.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20190307050518.64968-2-aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 14:33:04 +11:00
Cédric Le Goater
f7eb6a0a9b ppc/pnv: psi: add a reset handler
Reset all regs but keep the MMIO BAR enabled as it is at realize time.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190306085032.15744-14-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 14:33:04 +11:00
Cédric Le Goater
029699aa04 ppc/pnv: psi: add a PSIHB_REG macro
This is a simple helper to translate XSCOM addresses to MMIO addresses

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190306085032.15744-13-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 14:33:04 +11:00
Cédric Le Goater
cdbaf8cd9a ppc/pnv: fix logging primitives using Ox
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190306085032.15744-12-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 14:33:04 +11:00
Cédric Le Goater
d8e4aad533 ppc/pnv: introduce a new pic_print_info() operation to the chip model
The POWER9 and POWER8 processors have different interrupt controllers,
and reporting their state requires calling different helper routines.

However, the interrupt presenters are still handled in the higher
level pic_print_info() routine because they are not related to the
chip.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190306085032.15744-9-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 14:33:04 +11:00
Cédric Le Goater
eb859a27e1 ppc/pnv: introduce a new dt_populate() operation to the chip model
The POWER9 and POWER8 processors have a different set of devices and a
different device tree layout.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190306085032.15744-8-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 14:33:04 +11:00
Cédric Le Goater
2dfa91a2aa ppc/pnv: add a XIVE interrupt controller model for POWER9
This is a simple model of the POWER9 XIVE interrupt controller for the
PowerNV machine which only addresses the needs of the skiboot
firmware. The PowerNV model reuses the common XIVE framework developed
for sPAPR as the fundamentals aspects are quite the same. The
difference are outlined below.

The controller initial BAR configuration is performed using the XSCOM
bus from there, MMIO are used for further configuration.

The MMIO regions exposed are :

 - Interrupt controller registers
 - ESB pages for IPIs and ENDs
 - Presenter MMIO (Not used)
 - Thread Interrupt Management Area MMIO, direct and indirect

The virtualization controller MMIO region containing the IPI ESB pages
and END ESB pages is sub-divided into "sets" which map portions of the
VC region to the different ESB pages. These are modeled with custom
address spaces and the XiveSource and XiveENDSource objects are sized
to the maximum allowed by HW. The memory regions are resized at
run-time using the configuration of EDT set translation table provided
by the firmware.

The XIVE virtualization structure tables (EAT, ENDT, NVTT) are now in
the machine RAM and not in the hypervisor anymore. The firmware
(skiboot) configures these tables using Virtual Structure Descriptor
defining the characteristics of each table : SBE, EAS, END and
NVT. These are later used to access the virtual interrupt entries. The
internal cache of these tables in the interrupt controller is updated
and invalidated using a set of registers.

Still to address to complete the model but not fully required is the
support for block grouping. Escalation support will be necessary for
KVM guests.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190306085032.15744-7-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 14:33:04 +11:00
Cédric Le Goater
956b8f468d ppc/pnv: change the CPU machine_data presenter type to Object *
The POWER9 PowerNV machine will use a XIVE interrupt presenter type.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190306085032.15744-6-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 14:33:04 +11:00
Cédric Le Goater
051e2973bf ppc: externalize ppc_get_vcpu_by_pir()
We will use it to get the CPU interrupt presenter in XIVE when the
TIMA is accessed from the indirect page.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190306085032.15744-3-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 14:33:04 +11:00
Andrew Randrianasulu
7abb479c7a PPC: E500: Add FSL I2C controller and integrate RTC with it
Original commit message:
This patch adds an emulation model for i2c controller found on most of the FSL SoCs.
It also integrates the RTC (ds1338) that sits on the i2c Bus with e500 machine model.

Patch was originally written by Amit Singh Tomar <amit.tomar@freescale.com>
see http://patchwork.ozlabs.org/patch/431475/
I only fixed it enough for application on top of current qemu master
20b084c4b1, and hopefully fixed checkpatch errors

Tested by booting Linux kernel 4.20.12. Now e500 machine doesn't need
network time protocol daemon because it will have working RTC
(before all timestamps on files were from 2016)

Signed-off-by: Amit Singh Tomar <amit.tomar@freescale.com>
Signed-off-by: Andrew Randrianasulu <randrianasulu@gmail.com>
Message-Id: <20190306102812.28972-1-randrianasulu@gmail.com>
[dwg: Add Kconfig stanza to define the new symbol, update MAINTAINERS]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 14:33:04 +11:00
Suraj Jitindar Singh
68f9f70841 target/ppc/spapr: Enable H_PAGE_INIT in-kernel handling
The H_CALL H_PAGE_INIT can be used to zero or copy a page of guest
memory. Enable the in-kernel H_PAGE_INIT handler.

The in-kernel handler takes half the time to complete compared to
handling the H_CALL in userspace.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Message-Id: <20190306060608.19935-1-sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 14:33:04 +11:00
Suraj Jitindar Singh
176dcceedd target/ppc/spapr: Clear partition table entry when allocating hash table
If we allocate a hash page table then we know that the guest won't be
using process tables, so set the partition table entry maintained for
the guest to zero. If this isn't done, then the guest radix bit will
remain set in the entry. This means that when the guest calls
H_REGISTER_PROCESS_TABLE there will be a mismatch between then flags
and the value in spapr->patb_entry, and the call will fail. The guest
will then panic:

Failed to register process table (rc=-4)
kernel BUG at arch/powerpc/platforms/pseries/lpar.c:959

The result being that it isn't possible to boot a hash guest on a P9
system.

Also fix a bug in the flags parsing in h_register_process_table() which
was introduced by the same patch, and simplify the handling to make it
less likely that errors will be introduced in the future. The effect
would have been setting the host radix bit LPCR_HR for a hash guest
using process tables, which currently isn't supported and so couldn't
have been triggered.

Fixes: 00fd075e18 "target/ppc/spapr: Set LPCR:HR when using Radix mode"

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Message-Id: <20190305022102.17610-1-sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 14:33:04 +11:00
Suraj Jitindar Singh
2782ad4c41 target/ppc/spapr: Enable mitigations by default for pseries-4.0 machine type
There are currently 3 mitigations the availability of which is controlled
by the spapr-caps mechanism, cap-cfpc, cap-sbbc, and cap-ibs. Enable these
mitigations by default for the pseries-4.0 machine type.

By now machine firmware should have been upgraded to allow these
settings.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Message-Id: <20190301044609.9626-3-sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 14:33:04 +11:00
Suraj Jitindar Singh
006e9d3618 target/ppc/tcg: make spapr_caps apply cap-[cfpc/sbbc/ibs] non-fatal for tcg
The spapr_caps cap-cfpc, cap-sbbc and cap-ibs are used to control the
availability of certain mitigations to the guest. These haven't been
implemented under TCG, it is unlikely they ever will be, and it is unclear
as to whether they even need to be.

As such, make failure to apply these capabilities under TCG non-fatal.
Instead we print a warning message to the user but still allow the guest
to continue.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Message-Id: <20190301044609.9626-2-sjitindarsingh@gmail.com>
[dwg: Small style fix]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 14:32:54 +11:00
Suraj Jitindar Singh
8ff43ee404 target/ppc/spapr: Add SPAPR_CAP_CCF_ASSIST
Introduce a new spapr_cap SPAPR_CAP_CCF_ASSIST to be used to indicate
the requirement for a hw-assisted version of the count cache flush
workaround.

The count cache flush workaround is a software workaround which can be
used to flush the count cache on context switch. Some revisions of
hardware may have a hardware accelerated flush, in which case the
software flush can be shortened. This cap is used to set the
availability of such hardware acceleration for the count cache flush
routine.

The availability of such hardware acceleration is indicated by the
H_CPU_CHAR_BCCTR_FLUSH_ASSIST flag being set in the characteristics
returned from the KVM_PPC_GET_CPU_CHAR ioctl.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Message-Id: <20190301031912.28809-2-sjitindarsingh@gmail.com>
[dwg: Small style fixes]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 12:07:49 +11:00
Suraj Jitindar Singh
399b2896d4 target/ppc/spapr: Add workaround option to SPAPR_CAP_IBS
The spapr_cap SPAPR_CAP_IBS is used to indicate the level of capability
for mitigations for indirect branch speculation. Currently the available
values are broken (default), fixed-ibs (fixed by serialising indirect
branches) and fixed-ccd (fixed by diabling the count cache).

Introduce a new value for this capability denoted workaround, meaning that
software can work around the issue by flushing the count cache on
context switch. This option is available if the hypervisor sets the
H_CPU_BEHAV_FLUSH_COUNT_CACHE flag in the cpu behaviours returned from
the KVM_PPC_GET_CPU_CHAR ioctl.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Message-Id: <20190301031912.28809-1-sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 12:07:49 +11:00
Suraj Jitindar Singh
edaa799559 target/ppc/spapr: Enable the large decrementer for pseries-4.0
Enable the large decrementer by default for the pseries-4.0 machine type.
It is disabled again by default_caps_with_cpu() for pre-POWER9 cpus
since they don't support the large decrementer.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Message-Id: <20190301024317.22137-4-sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 12:07:49 +11:00
Suraj Jitindar Singh
7d050527e3 target/ppc: Implement large decrementer support for KVM
Implement support to allow KVM guests to take advantage of the large
decrementer introduced on POWER9 cpus.

To determine if the host can support the requested large decrementer
size, we check it matches that specified in the ibm,dec-bits device-tree
property. We also need to enable it in KVM by setting the LPCR_LD bit in
the LPCR. Note that to do this we need to try and set the bit, then read
it back to check the host allowed us to set it, if so we can use it but
if we were unable to set it the host cannot support it and we must not
use the large decrementer.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190301024317.22137-3-sjitindarsingh@gmail.com>
[dwg: Small style fixes]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 12:07:49 +11:00
Suraj Jitindar Singh
a8dafa5251 target/ppc: Implement large decrementer support for TCG
Prior to POWER9 the decrementer was a 32-bit register which decremented
with each tick of the timebase. From POWER9 onwards the decrementer can
be set to operate in a mode called large decrementer where it acts as a
n-bit decrementing register which is visible as a 64-bit register, that
is the value of the decrementer is sign extended to 64 bits (where n is
implementation dependant).

The mode in which the decrementer operates is controlled by the LPCR_LD
bit in the logical paritition control register (LPCR).

>From POWER9 onwards the HDEC (hypervisor decrementer) was enlarged to
h-bits, also sign extended to 64 bits (where h is implementation
dependant). Note this isn't configurable and is always enabled.

On POWER9 the large decrementer and hdec are both 56 bits, as
represented by the lrg_decr_bits cpu class property. Since they are the
same size we only add one property for now, which could be extended in
the case they ever differ in the future.

We also add the lrg_decr_bits property for POWER5+/7/8 since it is used
to determine the size of the hdec, which is only generated on the
POWER5+ processor and later. On these processors it is 32 bits.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190301024317.22137-2-sjitindarsingh@gmail.com>
[dwg: Small style fixes]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 12:07:49 +11:00
Suraj Jitindar Singh
c982f5cf9a target/ppc/spapr: Add SPAPR_CAP_LARGE_DECREMENTER
Add spapr_cap SPAPR_CAP_LARGE_DECREMENTER to be used to control the
availability of the large decrementer for a guest.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Message-Id: <20190301024317.22137-1-sjitindarsingh@gmail.com>
[dwg: Trivial style fix]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 12:07:49 +11:00
Greg Kurz
c65ecfe2f3 Revert "spapr: support memory unplug for qtest"
Commit b8165118f5 broke CPU hotplug tests for old machine types:

$ QTEST_QEMU_BINARY=ppc64-softmmu/qemu-system-ppc64 ./tests/cpu-plug-test -m=slow
/ppc64/cpu-plug/pseries-3.1/device-add/2x3x1&maxcpus=6: OK
/ppc64/cpu-plug/pseries-2.12-sxxm/device-add/2x3x1&maxcpus=6: OK
/ppc64/cpu-plug/pseries-3.0/device-add/2x3x1&maxcpus=6: OK
/ppc64/cpu-plug/pseries-2.10/device-add/2x3x1&maxcpus=6: OK
/ppc64/cpu-plug/pseries-2.11/device-add/2x3x1&maxcpus=6: OK
/ppc64/cpu-plug/pseries-2.12/device-add/2x3x1&maxcpus=6: OK
/ppc64/cpu-plug/pseries-2.9/device-add/2x3x1&maxcpus=6: OK
/ppc64/cpu-plug/pseries-2.7/device-add/2x3x1&maxcpus=6: **
ERROR:/home/thuth/devel/qemu/hw/ppc/spapr_events.c:313:rtas_event_log_to_source: assertion failed: (source->enabled)
Broken pipe
/home/thuth/devel/qemu/tests/libqtest.c:143: kill_qemu() detected QEMU death from signal 6 (Aborted) (core dumped)
Aborted (core dumped)

The approach of faking the availability of OV5_HP_EVT causes the
code to assume the hotplug event source is enabled, which is wrong
for older machines.

We've now fixed CAS under qtest with a different approach.  Therefore,
this reverts commit b8165118f5.

A subsequent patch will address the problem of CAS under qtest from
a different angle.

Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155146875097.147873.1732264036668112686.stgit@bahia.lan>
Tested-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 12:06:36 +11:00
Greg Kurz
23ff81bdfd spapr: Simulate CAS for qtest
The RTAS event hotplug code for machine types 2.8 and newer depends on
the CAS negotiated ov5 in order to work properly. However, there's no
CAS when running under qtest. There has been a tentative to trick the
code by faking the OV5_HP_EVT bit, but it turned out to break other
assumptions in the code and the change got reverted.

Go for a more general approach and simulate a CAS when running under
qtest. For simplicity, this pseudo CAS simple simulates the case where
the guest supports the same features as the machine. It is done at
reset time, just before we reset the DRCs, which could potentially
exercise the unplug code.

This allows to test unplug on spapr with both older and newer machine
types.

Suggested-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155146875704.147873.10563808578795890265.stgit@bahia.lan>
Tested-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 10:50:59 +11:00
Markus Armbruster
ce14710f4f pflash: Clean up after commit 368a354f02, part 2
Our pflash devices are simplistically modelled has having
"num-blocks" sectors of equal size "sector-length".  Real hardware
commonly has sectors of different sizes.  How our "sector-length"
property is related to the physical device's multiple sector sizes
is unclear.

Helper functions pflash_cfi01_register() and pflash_cfi02_register()
create a pflash device, set properties including "sector-length" and
"num-blocks", and realize.  They take parameters @size, @sector_len
and @nb_blocs.

QOMification left parameter @size unused.  Obviously, @size should
match @sector_len and @nb_blocs, i.e. size == sector_len * nb_blocs.
All callers satisfy this.

Remove @nb_blocs and compute it from @size and @sector_len.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20190308094610.21210-16-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-03-11 22:53:44 +01:00
Markus Armbruster
940d5b132f pflash: Clean up after commit 368a354f02, part 1
QOMification left parameter @qdev unused in pflash_cfi01_register()
and pflash_cfi02_register().  All callers pass NULL.  Remove.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190308094610.21210-15-armbru@redhat.com>
2019-03-11 22:53:44 +01:00
Markus Armbruster
dd59bcae76 ppc405_boards: Don't size flash memory to match backing image
Machine "ref405ep" maps its flash memory at address 2^32 - image size.
Image size is rounded up to the next multiple of 64KiB.  Useless,
because pflash_cfi02_realize() fails with "failed to read the initial
flash content" unless the rounding is a no-op.

If the image size exceeds 0x80000 Bytes, we overlap first SRAM, then
other stuff.  No idea how that would play out, but useful outcomes
seem unlikely.

Map the flash memory at fixed address 0xFFF80000 with size 512KiB,
regardless of image size, to match the physical hardware.

Machine "taihu" maps its boot flash memory similarly.  The code even
has a comment /* XXX: should check that size is 2MB */, followed by
disabled code to adjust the size to 2MiB regardless of image size.

Its code to map its application flash memory looks the same, except
there the XXX comment asks for 32MiB, and the code to adjust the size
isn't disabled.  Note that pflash_cfi02_realize() fails with "failed
to read the initial flash content" for images smaller than 32MiB.

Map the boot flash memory at fixed address 0xFFE00000 with size 2MiB,
to match the physical hardware.  Delete dead code from application
flash mapping, and simplify some.

Cc: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20190308094610.21210-9-armbru@redhat.com>
2019-03-11 22:53:44 +01:00
Markus Armbruster
886db7c55c ppc405_boards: Delete stale, disabled DEBUG_BOARD_INIT code
The disabled DEBUG_BOARD_INIT code goes back to the initial commit
1a6c088620, and has since seen only mechanical updates.  It sure
feels like useless clutter now.  Delete it.

Suggested-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190308094610.21210-8-armbru@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
2019-03-11 22:53:44 +01:00
Markus Armbruster
f30bc99559 sam460ex: Don't size flash memory to match backing image
Machine "sam460ex" maps its flash memory at address 0xFFF00000.  When
no image is supplied, its size is 1MiB (0x100000), and 512KiB of ROM
get mapped on top of its second half.  Else, it's the size of the
image rounded up to the next multiple of 64KiB.

The rounding is actually useless: pflash_cfi01_realize() fails with
"failed to read the initial flash content" unless it's a no-op.

I have no idea what happens when the pflash's size exceeds 1MiB.
Useful outcomes seem unlikely.

I guess memory at the end of the address space remains unmapped when
it's smaller than 1MiB.  Again, useful outcomes seem unlikely.

The physical hardware appears to have 512KiB of flash memory:
https://eu.mouser.com/datasheet/2/268/atmel_AT49BV040B-1180330.pdf

For now, just set the flash memory size to 1MiB regardless of image
size, and document the mess.

Cc: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20190308094610.21210-7-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-03-11 22:53:44 +01:00
Peter Maydell
234afe7828 - qtest fixes
- Some generic clean-ups by Philippe
 - macOS CI testing via cirrus-ci.com
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJcgi7HAAoJEC7Z13T+cC21Y00P/1/m7FcVVfMlDw85+rYjkUri
 QWPvWUORhGbAkv87AfsFezCzoO/n3KX+AefPDWbnIM1Ixt8MvS/8zPOWAXwHUKVy
 ira5jP7CNJDPGr13qoO0lNrvU5cmxRWdmLOMbMsqW3Aparc5RBgDPn0bvcm5l2vX
 i90fdxpXvpQ/FgoX0J1j//awa3JXf94pijBb3pL985qXI670ZkRq13JIlmVZ1+Gw
 Fmx4XvpIwajo2HM1G+CcG8ElAxTgYmjC9bkKJW1fddOkwP7wRnZtAdLZpRTzojCb
 CUNBaTSM/xjinVzOhwgiHFtak/ZMOdUZrGjrbin1e/p+Xppw75P7FdUoiSnJNhga
 BJr8LbGcJwcIXfpMdEw7ZGlWACd+D0+G7363jNWOPyff3by6xx4gdCrBsYc4qwSR
 MJ8Wyb5o4oSisUg06VxghGyPTE/xBgog/YgLb4Bu6FXjCPKsl0mKQMxG0ROZLvT+
 dFiaHeeCKEn7Yw6OkdqW9Sa1uGfna7gRCC7hZErDA3URe+02dUBb4VCtnjAaCLx3
 0Jq8jpb2T57N8roP23QFQBxA+Y859qlZPrWzwRqbgdADZCnFsSJlmBxjDmhbYuF0
 4qAQtGFTgdmhjdG/FjJkcMQkCcx4h6V62kqi8HtP+vCd43SFwLPqHH/HKq5cU/Zt
 YIXF2oo6z5k7iqx1H26G
 =DEp5
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/huth-gitlab/tags/pull-request-2019-03-08' into staging

- qtest fixes
- Some generic clean-ups by Philippe
- macOS CI testing via cirrus-ci.com

# gpg: Signature made Fri 08 Mar 2019 08:58:47 GMT
# gpg:                using RSA key 2ED9D774FE702DB5
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg:                 aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg:                 aka "Thomas Huth <huth@tuxfamily.org>" [full]
# gpg:                 aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3  EAB9 2ED9 D774 FE70 2DB5

* remotes/huth-gitlab/tags/pull-request-2019-03-08:
  cirrus.yml: Add macOS continuous integration task
  tests/bios-tables: Improve portability by searching bash in the $PATH
  vhost-user-test: fix leaks
  tests: Do not use "\n" in g_test_message() strings
  hw/devices: Remove unused TC6393XB_RAM definition
  hw: Remove unused 'hw/devices.h' include
  tests: Move qdict-test-data.txt to tests/data/qobject/

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

# Conflicts:
#	tests/vhost-user-test.c
2019-03-08 16:31:34 +00:00
Philippe Mathieu-Daudé
04f3c0084d hw: Remove unused 'hw/devices.h' include
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu>
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2019-03-07 22:16:11 +01:00
Thomas Huth
98bd1db99f ppc: Express dependencies of the embedded machines with kconfig
This makes it much easier if the users want to disable some of
the embedded machines for their builds.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-03-07 21:46:19 +01:00
Thomas Huth
1f40cc5e84 ppc: Express dependencies of the Sam460EX machines with kconfig
Most of the dependencies are now directly selected by the SAM460EX
switch. We can drop CONFIG_VGA_CIRRUS since this device is already
selected automatically when CONFIG_PCI_DEVICES is set.

Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-03-07 21:46:19 +01:00
Thomas Huth
d7cfb520cf ppc: Express dependencies of the Mac machines with kconfig
This will make it for example easier if the users want to disable
one of the two machines for their builds.

Cc: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-03-07 21:46:19 +01:00
Thomas Huth
12bb3a9008 ppc: Express dependencies of the 'prep' and '40p' machines with kconfig
Select the required devices in hw/ppc/Kconfig instead, so that
ppc-softmmu.mak only contains the user-selectable PREP switch.
Plug-in devices like NE2000_ISA are pulled in automatically by the
Kconfig build system now.

Cc: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-03-07 21:46:13 +01:00
Thomas Huth
87f9108bad ppc64: Express dependencies of 'pseries' and 'powernv' machines with kconfig
The POWERNV switch should always select ISA_IPMI_BT, then the other
IPMI options are turned on automatically now.
CONFIG_DIMM should always be selected by the pseries machine,
which in turn depends on CONFIG_MEM_DEVICE since DIMM implements
this interface.
CONFIG_VIRTIO_VGA can be dropped from default-configs/ppc64-softmmu.mak
completely since this device is already automatically enabled via
hw/display/Kconfig now.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-03-07 21:45:53 +01:00
Paolo Bonzini
e0e312f352 build: switch to Kconfig
The make_device_config.sh script is replaced by minikconf, which
is modified to support the same command line as its predecessor.

The roots of the parsing are default-configs/*.mak, Kconfig.host and
hw/Kconfig.  One difference with make_device_config.sh is that all symbols
have to be defined in a Kconfig file, including those coming from the
configure script.  This is the reason for the Kconfig.host file introduced
in the previous patch. Whenever a file in default-configs/*.mak used
$(...) to refer to a config-host.mak symbol, this is replaced by a
Kconfig dependency; this part must be done already in this patch
for bisectability.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Acked-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20190123065618.3520-28-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-03-07 21:45:53 +01:00
Paolo Bonzini
82f5181777 kconfig: introduce kconfig files
The Kconfig files were generated mostly with this script:

  for i in `grep -ho CONFIG_[A-Z0-9_]* default-configs/* | sort -u`; do
    set fnord `git grep -lw $i -- 'hw/*/Makefile.objs' `
    shift
    if test $# = 1; then
      cat >> $(dirname $1)/Kconfig << EOF
config ${i#CONFIG_}
    bool

EOF
      git add $(dirname $1)/Kconfig
    else
      echo $i $*
    fi
  done
  sed -i '$d' hw/*/Kconfig
  for i in hw/*; do
    if test -d $i && ! test -f $i/Kconfig; then
      touch $i/Kconfig
      git add $i/Kconfig
    fi
  done

Whenever a symbol is referenced from multiple subdirectories, the
script prints the list of directories that reference the symbol.
These symbols have to be added manually to the Kconfig files.

Kconfig.host and hw/Kconfig were created manually.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20190123065618.3520-27-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-03-07 21:45:53 +01:00
David Hildenbrand
07578b0ad6 qdev: Let the hotplug_handler_unplug() caller delete the device
When unplugging a device, at one point the device will be destroyed
via object_unparent(). This will, one the one hand, unrealize the
removed device hierarchy, and on the other hand, destroy/free the
device hierarchy.

When chaining hotplug handlers, we want to overwrite a bus hotplug
handler by the machine hotplug handler, to be able to perform
some part of the plug/unplug and to forward the calls to the bus hotplug
handler.

For now, the bus hotplug handler would trigger an object_unparent(), not
allowing us to perform some unplug action on a device after we forwarded
the call to the bus hotplug handler. The device would be gone at that
point.

machine_unplug_handler(dev)
    /* eventually do unplug stuff */
    bus_unplug_handler(dev)
    /* dev is gone, we can't do more unplug stuff */

So move the object_unparent() to the original caller of the unplug. For
now, keep the unrealize() at the original places of the
object_unparent(). For implicitly chained hotplug handlers (e.g. pc
code calling acpi hotplug handlers), the object_unparent() has to be
done by the outermost caller. So when calling hotplug_handler_unplug()
from inside an unplug handler, nothing is to be done.

hotplug_handler_unplug(dev) -> calls machine_unplug_handler()
    machine_unplug_handler(dev) {
        /* eventually do unplug stuff */
        bus_unplug_handler(dev) -> calls unrealize(dev)
        /* we can do more unplug stuff but device already unrealized */
    }
object_unparent(dev)

In the long run, every unplug action should be factored out of the
unrealize() function into the unplug handler (especially for PCI). Then
we can get rid of the additonal unrealize() calls and object_unparent()
will properly unrealize the device hierarchy after the device has been
unplugged.

hotplug_handler_unplug(dev) -> calls machine_unplug_handler()
    machine_unplug_handler(dev) {
        /* eventually do unplug stuff */
        bus_unplug_handler(dev) -> only unplugs, does not unrealize
        /* we can do more unplug stuff */
    }
object_unparent(dev) -> will unrealize

The original approach was suggested by Igor Mammedov for the PCI
part, but I extended it to all hotplug handlers. I consider this one
step into the right direction.

To summarize:
- object_unparent() on synchronous unplugs is done by common code
-- "Caller of hotplug_handler_unplug"
- object_unparent() on asynchronous unplugs ("unplug requests") has to
  be done manually
-- "Caller of hotplug_handler_unplug"

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190228122849.4296-2-david@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-03-06 11:51:08 -03:00
Eric Auger
dc0ca80eb1 hw/boards: Add a MachineState parameter to kvm_type callback
On ARM, the kvm_type will be resolved by querying the KVMState.
Let's add the MachineState handle to the callback so that we
can retrieve the  KVMState handle. in kvm_init, when the callback
is called, the kvm_state variable is not yet set.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-id: 20190304101339.25970-5-eric.auger@redhat.com
[ppc parts]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-03-05 15:55:09 +00:00
Peter Maydell
20b084c4b1 This has been out there long enough, I need to get this in.
This was changed a little bit since my post on Feb 20 (to which
 there were no comments) due to changes I had to work around:
 
 Change b296b664ab "smbus: Add a helper to generate SPD EEPROM
 data" added a function to include/hw/i2c/smbus.h, which I had to move to
 include/hw/smbus_eeprom.h.
 
 There were some changes to hw/i2c/Makefile.objs that I had to fix up.
 
 Beyond that, no changes.
 
 Thanks,
 
 -corey
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE/Q1c5nzg9ZpmiCaGYfOMkJGb/4EFAlx4Iv0ACgkQYfOMkJGb
 /4HoUw/+IcrfemAuaEt0f7hOENpeWD4HYFCk0wgzXraSLaurREQHNP4KmYxz2xOS
 ISLqgTty3dEjo95VXuSQUMm9ZaV1p8LquO+I1FnNGIt0otO3SMEh6/nOyrH1zY74
 Q+6IlUzTQlU8dQCsZOd5FqGxmH/nvIVufC1WCauwfHP0hEIx0F631i2l/DeZRhYj
 7SO+idIwHljKyiDgS+CtKygSXjEnwOqV9rVQiLWYrCu0+wXBv2WIDH66xPRnYA3F
 WM3MI3ViYekCw2jWLrkaM5sjgfQ/FhTpEFC8uCJXYBF6/FggCEfkd+Yp7G9RnXq+
 ZbezRw0HCNmm7inWWGW3hfaVUFS3QVapoppJTDAAsUCspj+TQ9NkbVWdqIqCqUtU
 GFgVzwMwSgoW8rekF4A4VxE9IAWPfh9KVKT6JVIYizx0Z/F7P+VmTAvbTlHZGHYX
 QtMzyDyIpj0FtZ7yL+6LIywGR4zOP37d97xlKiYQS2JAZMiLnDr0v+avY/Ps/rmV
 fFC0sNwctD22gXIW+OecEOEckv/dSIL2PlzZ2gSuJ5xGzyfw2OPa6C1CaoD7y3qn
 xbv0zY2jBvm5hLBG/GgorlSkQOyQwLupUYl8hf5EVNjjrOcWk0/Se7Pp2HMp+RrG
 krnc4CNhfmyiJxd7GvVA23GHUgC4jMOq6P0qlUu2XcDDQC/jnbs=
 =XTkI
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/cminyard/tags/i2c-for-release-20190228' into staging

This has been out there long enough, I need to get this in.

This was changed a little bit since my post on Feb 20 (to which
there were no comments) due to changes I had to work around:

Change b296b664ab "smbus: Add a helper to generate SPD EEPROM
data" added a function to include/hw/i2c/smbus.h, which I had to move to
include/hw/smbus_eeprom.h.

There were some changes to hw/i2c/Makefile.objs that I had to fix up.

Beyond that, no changes.

Thanks,

-corey

# gpg: Signature made Thu 28 Feb 2019 18:05:49 GMT
# gpg:                using RSA key FD0D5CE67CE0F59A6688268661F38C90919BFF81
# gpg: Good signature from "Corey Minyard <cminyard@mvista.com>" [unknown]
# gpg:                 aka "Corey Minyard <minyard@acm.org>" [unknown]
# gpg:                 aka "Corey Minyard <corey@minyard.net>" [unknown]
# gpg:                 aka "Corey Minyard <minyard@mvista.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: FD0D 5CE6 7CE0 F59A 6688  2686 61F3 8C90 919B FF81

* remotes/cminyard/tags/i2c-for-release-20190228:
  i2c: Verify that the count passed in to smbus_eeprom_init() is valid
  i2c:smbus_eeprom: Add a reset function to smbus_eeprom
  i2c:smbus_eeprom: Add vmstate handling to the smbus eeprom
  i2c:smbus_eeprom: Add a size constant for the smbus_eeprom size
  i2c:smbus_eeprom: Add normal type name and cast to smbus_eeprom.c
  i2c:smbus_slave: Add an SMBus vmstate structure
  i2c:pm_smbus: Fix state transfer
  migration: Add a VMSTATE_BOOL_TEST() macro
  i2c:pm_smbus: Fix pm_smbus handling of I2C block read
  boards.h: Ignore migration for SMBus devices on older machines
  i2c:smbus: Make white space in switch statements consistent
  i2c:smbus_eeprom: Get rid of the quick command
  i2c:smbus: Simplify read handling
  i2c:smbus: Simplify write operation
  i2c:smbus: Correct the working of quick commands
  i2c: Don't check return value from i2c_recv()
  arm:i2c: Don't mask return from i2c_recv()
  i2c: have I2C receive operation return uint8_t
  i2c: Split smbus into parts

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-03-01 11:20:49 +00:00
Corey Minyard
93198b6cad i2c: Split smbus into parts
smbus.c and smbus.h had device side code, master side code, and
smbus.h has some smbus_eeprom.c definitions.  Split them into
separate files.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-02-27 21:06:08 -06:00
Murilo Opsfelder Araujo
b268a6162d ppc/pnv: use IEC binary prefixes to represent sizes
Using IEC binary prefixes from qemu/units.h provides a more human-friendly value
to size constants.

Suggested-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com>
Message-Id: <20190225170155.1972-4-muriloo@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 14:20:30 +11:00
Murilo Opsfelder Araujo
584ea7e76f ppc/pnv: add INITRD_MAX_SIZE constant
The current 0x10000000 value is actually 256MiB, not 128MB as the comment
suggests. Move it to a constant and fix the comment (no change in the size
value).

Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com>
Message-Id: <20190225170155.1972-3-muriloo@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 14:20:30 +11:00
Murilo Opsfelder Araujo
b45b56baee ppc/pnv: increase kernel size limit to 256MiB
Building kernel with CONFIG_DEBUG_INFO_REDUCED can generate a ~90MB image and
building with CONFIG_DEBUG_INFO can generate a ~225M one, both exceeds the
current limit of 32MiB.

Increasing kernel size limit to 256MiB should fit for now.

Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com>
Message-Id: <20190225170155.1972-2-muriloo@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 14:20:30 +11:00
Thomas Huth
f6d4dca807 hw/ppc: Use object_initialize_child for correct reference counting
Both functions, object_initialize() and object_property_add_child() increase
the reference counter of the new object, so one of the references has to be
dropped afterwards to get the reference counting right. Otherwise the child
object will not be properly cleaned up when the parent gets destroyed.
Thus let's use now object_initialize_child() instead to get the reference
counting here right.

Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1550748288-30598-1-git-send-email-thuth@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 09:21:25 +11:00
Michael Roth
dae5e39ada spapr: enable PHB hotplug for default pseries machine type
The 'dr_phb_enabled' field of that class can be set as part of
machine-specific init code. It will be used to conditionally
enable creation of DRC objects and device-tree description to
facilitate hotplug of PHBs.

Since we can't migrate this state to older machine types,
default the option to true and disable it for older machine
types.

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <155059673433.1466090.6188091133769611501.stgit@bahia.lab.toulouse-stg.fr.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 09:21:25 +11:00
Greg Kurz
bb2bdd812e spapr: add hotplug hooks for PHB hotplug
Hotplugging PHBs is a machine-level operation, but PHBs reside on the
main system bus, so we register spapr machine as the handler for the
main system bus.

Provide the usual pre-plug, plug and unplug-request handlers.

Move the checking of the PHB index to the pre-plug handler. It is okay
to do that and assert in the realize function because the pre-plug
handler is always called, even for the oldest machine types we support.

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
(Fixed interrupt controller phandle in "interrupt-map" and
 TCE table size in "ibm,dma-window" FDT fragment, Greg Kurz)
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155059672926.1466090.13612804072190051439.stgit@bahia.lab.toulouse-stg.fr.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 09:21:25 +11:00
Michael Roth
f130928d2a spapr_pci: add ibm, my-drc-index property for PHB hotplug
This is needed to denote a boot-time PHB as being hot-pluggable.

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155059672420.1466090.15147504040270659866.stgit@bahia.lab.toulouse-stg.fr.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 09:21:25 +11:00
Michael Roth
0a0a66cd1b spapr_pci: provide node start offset via spapr_populate_pci_dt()
PHB hotplug re-uses PHB device tree generation code and passes
it to a guest via RTAS. Doing this requires knowledge of where
exactly in the device tree the node describing the PHB begins.

Provide this via a new optional pointer that can be used to
store the PHB node's start offset.

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155059671912.1466090.10891589403973703473.stgit@bahia.lab.toulouse-stg.fr.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 09:21:25 +11:00
Michael Roth
4b6d336f2c spapr_events: add support for phb hotplug events
Extend the existing EPOW event format we use for PCI
devices to emit PHB plug/unplug events.

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155059671405.1466090.535964535260503283.stgit@bahia.lab.toulouse-stg.fr.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 09:21:25 +11:00
Nathan Fontenot
3998ccd092 spapr: populate PHB DRC entries for root DT node
This add entries to the root OF node to advertise our PHBs as being
DR-capable in accordance with PAPR specification.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155059670897.1466090.10843921337591637414.stgit@bahia.lab.toulouse-stg.fr.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 09:21:25 +11:00
Michael Roth
962b6c3650 spapr: create DR connectors for PHBs
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155059670389.1466090.10015601248906623076.stgit@bahia.lab.toulouse-stg.fr.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 09:21:25 +11:00
Greg Kurz
ef28b98d58 spapr_pci: add PHB unrealize
To support PHB hotplug we need to clean up lingering references,
memory, child properties, etc. prior to the PHB object being
finalized. Generally this will be called as a result of calling
object_unparent() on the PHB object, which in turn would normally
be called as the result of an unplug() operation.

When the PHB is finalized, child objects will be unparented in
turn, and finalized if the PHB was the only reference holder. so
we don't bother to explicitly unparent child objects of the PHB,
with the notable exception of DRCs. This is needed to avoid a QEMU
crash when unplugging a PHB and resetting the machine before the
guest could handle the event. The DRCs are removed from the QOM tree
by  pci_unregister_root_bus() and we must make sure we're not leaving
stale aliases under the global /dr-connector path.

The formula that gives the number of DMA windows is moved to an
inline function in the hw/pci-host/spapr.h header because it
will have other users.

The unrealize function is able to cope with partially realized PHBs.
It is hence used to implement proper rollback on the realize error
path.

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <155059669881.1466090.13515030705986041517.stgit@bahia.lab.toulouse-stg.fr.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 09:21:25 +11:00
Greg Kurz
ad62bff638 spapr_irq: Expose the phandle of the interrupt controller
This will be used by PHB hotplug in order to create the "interrupt-map"
property of the PHB node.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155059669374.1466090.12943228478046223856.stgit@bahia.lab.toulouse-stg.fr.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 09:21:25 +11:00
Greg Kurz
743ed566c1 spapr: Expose the name of the interrupt controller node
This will be needed by PHB hotplug in order to access the "phandle"
property of the interrupt controller node.

Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <155059668867.1466090.6339199751719123386.stgit@bahia.lab.toulouse-stg.fr.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 09:21:25 +11:00
Greg Kurz
09d876ce2c spapr/drc: Drop spapr_drc_attach() fdt argument
All DRC subtypes have been converted to generate the FDT fragment at
configure connector time instead of attach time. The fdt and fdt_offset
arguments of spapr_drc_attach() aren't needed anymore. Drop them and
make the implementation of the dt_populate() method mandatory.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155059667853.1466090.16527852453054217565.stgit@bahia.lab.toulouse-stg.fr.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 09:21:25 +11:00
Greg Kurz
46fd02990d spapr/pci: Generate FDT fragment at configure connector time
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155059667346.1466090.326696113231137772.stgit@bahia.lab.toulouse-stg.fr.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 09:21:25 +11:00
Greg Kurz
345b12b99e spapr: Generate FDT fragment for CPUs at configure connector time
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155059666839.1466090.3833376527523126752.stgit@bahia.lab.toulouse-stg.fr.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 09:21:25 +11:00
Greg Kurz
62d38c9bd3 spapr: Generate FDT fragment for LMBs at configure connector time
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155059666331.1466090.6766540766297333313.stgit@bahia.lab.toulouse-stg.fr.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 09:21:25 +11:00
Greg Kurz
d9c95c71ac spapr_drc: Allow FDT fragment to be added later
The current logic is to provide the FDT fragment when attaching a device
to a DRC. This works perfectly fine for our current hotplug support, but
soon we will add support for PHB hotplug which has some constraints, that
CPU, PCI and LMB devices don't seem to have.

The first constraint is that the "ibm,dma-window" property of the PHB
node requires the IOMMU to be configured, ie, spapr_tce_table_enable()
has been called, which happens during PHB reset. It is okay in the case
of hotplug since the device is reset before the hotplug handler is
called. On the contrary with coldplug, the hotplug handler is called
first and device is only reset during the initial system reset. Trying
to create the FDT fragment on the hotplug path in this case, would
result in somthing like this:

ibm,dma-window = < 0x80000000 0x00 0x00 0x00 0x00 >;

This will cause linux in the guest to panic, by simply removing and
re-adding the PHB using the drmgr command:

	page = alloc_pages_node(nid, GFP_KERNEL, get_order(sz));
	if (!page)
		panic("iommu_init_table: Can't allocate %ld bytes\n", sz);

The second and maybe more problematic constraint is that the
"interrupt-map" property needs to reference the interrupt controller
node using the very same phandle that SLOF has already exposed to the
guest. QEMU requires SLOF to call the private KVMPPC_H_UPDATE_DT hcall
at some point to know about this phandle. With the latest QEMU and SLOF,
this happens when SLOF gets quiesced. This means that if the PHB gets
hotplugged after CAS but before SLOF quiesce, then we're sure that the
phandle is not known when the hotplug handler is called.

The FDT is only needed when the guest first invokes RTAS to configure
the connector actually, long after SLOF quiesce. Let's postpone the
creation of FDT fragments for PHBs to rtas_ibm_configure_connector().

Since we only need this for PHBs, introduce a new method in the base
DRC class for that. DRC subtypes will be converted to use it in
subsequent patches.

Allow spapr_drc_attach() to be passed a NULL fdt argument if the method
is available. When all DRC subtypes have been converted, the fdt argument
will eventually disappear.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155059665823.1466090.18358845122627355537.stgit@bahia.lab.toulouse-stg.fr.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 09:21:25 +11:00
Benjamin Herrenschmidt
79825f4d58 target/ppc: Rename PATB/PATBE -> PATE
That "b" means "base address" and thus shouldn't be in the name
of actual entries and related constants.

This patch keeps the synthetic patb_entry field of the spapr
virtual hypervisor unchanged until I figure out if that has
an impact on the migration stream.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190215170029.15641-11-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 09:21:25 +11:00
Benjamin Herrenschmidt
3054b0ca4b target/ppc: Fix ordering of hash MMU accesses
With mttcg, we can have MMU lookups happening at the same time
as the guest modifying the page tables.

Since the HPTEs of the hash table MMU contains two words (or
double worlds on 64-bit), we need to make sure we read them
in the right order, with the correct memory barrier.

Additionally, when using emulated SPAPR mode, the hypercalls
writing to the hash table must also perform the udpates in
the right order.

Note: This part is still not entirely correct

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190215170029.15641-7-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 09:21:25 +11:00
Benjamin Herrenschmidt
00fd075e18 target/ppc/spapr: Set LPCR:HR when using Radix mode
The HW relies on LPCR:HR along with the PATE to determine whether
to use Radix or Hash mode. In fact it uses LPCR:HR more commonly
than the PATE.

For us, it's also more efficient to do so, especially since unlike
the HW we do not maintain a cache of the current PATE and HV PATE
in a generic place.

Prepare the grounds for that by ensuring that LPCR:HR is set
properly on SPAPR machines.

Another option would have been to use a callback to get the PATE
but this gets messy when implementing bare metal support, it's
much simpler (and faster) to use LPCR.

Since existing migration streams may not have it, fix it up in
spapr_post_load() as well based on the pseudo-PATE entry that
we keep.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190215170029.15641-2-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 09:21:25 +11:00
David Hildenbrand
b8165118f5 spapr: support memory unplug for qtest
Fake availability of OV5_HP_EVT, so we can test memory unplug in qtest.

Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190218092202.26683-3-david@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 09:21:25 +11:00