This value is constant across all thread-local copies of TCGContext,
so we might as well move it out of thread-local storage.
Reviewed-by: Joelle van Dyne <j@getutm.app>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This value is constant across all thread-local copies of TCGContext,
so we might as well move it out of thread-local storage.
Use the correct function pointer type, and name the variable
tcg_qemu_tb_exec, which means that we are able to remove the
macro that does the casting.
Replace HAVE_TCG_QEMU_TB_EXEC with CONFIG_TCG_INTERPRETER,
as this is somewhat clearer in intent.
Reviewed-by: Joelle van Dyne <j@getutm.app>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
For darwin, the CTR_EL0 register is not accessible, but there
are system routines that we can use.
For other hosts, copy the single pointer implementation from
libgcc and modify it to support the double pointer interface
we require. This halves the number of cache operations required
when split-rwx is enabled.
Reviewed-by: Joelle van Dyne <j@getutm.app>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We are shortly going to have a split rw/rx jit buffer. Depending
on the host, we need to flush the dcache at the rw data pointer and
flush the icache at the rx code pointer.
For now, the two passed pointers are identical, so there is no
effective change in behaviour.
Reviewed-by: Joelle van Dyne <j@getutm.app>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This is currently a no-op within tci/tcg-target.h, but
is about to be moved to a more generic location.
Reviewed-by: Joelle van Dyne <j@getutm.app>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
qemu_try_memalign() expects a power of 2 alignment:
- posix_memalign(3):
The address of the allocated memory will be a multiple of alignment,
which must be a power of two and a multiple of sizeof(void *).
- _aligned_malloc()
The alignment value, which must be an integer power of 2.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20201021173803.2619054-3-philmd@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We do not need or want to be allocating page sized quanta.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Message-Id: <20201018164836.1149452-1-richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Enable this on i386 to restrict the set of input registers
for an 8-bit store, as required by the architecture. This
removes the last use of scratch registers for user-only mode.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Always true when movbe is available, otherwise leave
this to generic code.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This fixes the build for older ppc64 kernel headers.
Fixes: 6addf06a3c
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
- include ccache in Debian 10 docker image
- iotests: drop 312 from auto group
- bound reading of s390x framebuffer file
- cirrus: drop non-x86 tests so we complete
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAl/18+IACgkQ+9DbCVqe
KkQh6Qf/TUHYf/eSCXIdMEE6Zim9DKFCzWqkhdfOd4n/59SirxhnrvUh27BZVKhb
LW/W5Q+UZ+kYCjwRG231+S9kWp5Cr/FTVm+l/77le7dDaBOU9RzpK0cCCXv4OFsZ
DztiLfv869lDe0URHs4yk9TSu9u5DfqVcvrSiWvr7SeGW7QBW9BpdeibY9OmEVUU
d0io0ueEMq7iRHwIAanfuqc2tbh94jDhg3SK7Lc2cIf9iyhonDcEY2aweNk8k/Kd
p5QceiUcsz8FaLPQYQRhQxjxvZoFqQqMwP7NpyltbkHYU69Mh633v424/Dz67oqt
4Mq0JHgPMfKAbPAWlib4f06dSLkHAg==
=pbhj
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/stsquad/tags/pull-testing-060121-4' into staging
Testing updates (back to green)
- include ccache in Debian 10 docker image
- iotests: drop 312 from auto group
- bound reading of s390x framebuffer file
- cirrus: drop non-x86 tests so we complete
# gpg: Signature made Wed 06 Jan 2021 17:31:14 GMT
# gpg: using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [full]
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44
* remotes/stsquad/tags/pull-testing-060121-4:
cirrus: don't run full qtest on macOS
tests/acceptance: bound the size of readline in s390_ccw_virtio
tests/iotests: drop test 312 from auto group
tests/docker: Include 'ccache' in Debian base image
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The Cirrus CI macOS build hosts have exhibited a serious performance
degradation in recent months. For example the "qom-test" qtest takes
over an hour for only the qemu-system-aarch64 binary. This is as much
20-40 times slower than other environments. The other qtests all show
similar performance degradation, as do many of the unit tests.
This does not appear related to QEMU code changes, since older git
commits which were known to fully complete in less than 1 hour on
Cirrus CI now also show similar bad performance. Either Cirrus CI
performance has degraded, or an change in its environment has exposed
a latent bug widely affecting all of QEMU. Debugging the qom-test
showed no easily identified large bottleneck - every step of the
test execution was simply slower.
macOS builds/tests run outside Cirrus CI show normal performance.
With an inability to identify any obvious problem, the only viable
way to get a reliably completing Cirrus CI macOS job is to cut out
almost all of the qtests. We choose to run the x86_64 target only,
since that has very few machine types and thus is least badly
impacted in the qom-test execution.
With this change, the macOS jobs complete in approx 35 minutes.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Willian Rampazzo <willianr@redhat.com>
Message-Id: <20210106114159.981538-1-berrange@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
The read binary data as text via a PPM export of the frame buffer
seems a bit sketchy and it did blow up in the real world when the
assertion failed:
https://gitlab.com/qemu-project/qemu/-/jobs/943183183
However short of cleaning up the test to be more binary focused at
least limit the attempt to dump the whole file as hexified zeros in
the logs.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Willian Rampazzo <willianr@redhat.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Acked-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210105124405.15424-1-alex.bennee@linaro.org>
The "auto" documentation states:
That means they should run with every QEMU binary (also non-x86)
which is not the case as the check-system-fedora build which only
includes a rag tag group of rare and deprecated targets doesn't
support the virtio device required.
Fixes: ef9bba1484 ("quorum: Implement bdrv_co_block_status()")
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210105100402.12350-1-alex.bennee@linaro.org>
Include the 'ccache' package to speed up compilation.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20201213211601.253530-1-f4bug@amsat.org>
Fixes: d6db2a1cdf ("docker: add debian-buster-arm64-cross")
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
The sun4m board code connects both of the IRQ outputs of each ESCC to the
same slavio input qemu_irq. Connecting two qemu_irqs outputs directly to the
same input is not valid as it produces subtly wrong behaviour (for instance
if both the IRQ lines are high, and then one goes low, the PIC input will see
this as a high-to-low transition even though the second IRQ line should still
be holding it high).
This kind of wiring needs an explicitly created OR gate; add one.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20201219111934.5540-1-mark.cave-ayland@ilande.co.uk>
Reviewed-by: Artyom Tarasenko <atar4qemu@gmail.com>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
The grlib.h header defines a set_pil_in_fn typedef which is never
used; remove it.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20201212144134.29594-3-peter.maydell@linaro.org>
Reviewed-by: KONRAD Frederic <frederic.konrad@adacore.com>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Currently the GRLIB_IRQMP device is used in one place (the leon3 board),
but instead of the device providing inbound gpio lines for the board
to wire up, the board code itself calls qemu_allocate_irqs() with
the handler function being a set_irq function defined in the code
for the device.
Refactor this into the standard setup of a device having input
gpio lines.
This fixes a trivial Coverity memory leak report (the leon3
board code leaks the IRQ array returned from qemu_allocate_irqs()).
Fixes: Coverity CID 1421922
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20201212144134.29594-2-peter.maydell@linaro.org>
Reviewed-by: KONRAD Frederic <frederic.konrad@adacore.com>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Per the "NCR89C105 Chip Specification" referenced in the header:
Chip-level Address Map
------------------------------------------------------------------
| 1D0 0000 -> | Counter/Timers | W,D |
| 1DF FFFF | | |
...
The address map indicated the allowed accesses at each address.
[...] W indicates a word access, and D indicates a double-word
access.
The SLAVIO timer controller is implemented expecting 32-bit accesses.
Commit a3d12d073e restricted the memory accesses to 32-bit, while
the device allows 64-bit accesses.
This was not an issue until commit 5d971f9e67 which reverted
("memory: accept mismatching sizes in memory_region_access_valid").
Fix by renaming .valid MemoryRegionOps as .impl, and add the valid
access range (W -> 4, D -> 8).
Since commit 21786c7e59 ("memory: Log invalid memory accesses")
this class of bug can be quickly debugged displaying 'guest_errors'
accesses, as:
$ qemu-system-sparc -M SS-20 -m 256 -bios ss20_v2.25_rom -serial stdio -d guest_errors
Power-ON Reset
Invalid access at addr 0x0, size 8, region 'timer-1', reason: invalid size (min:4 max:4)
$ qemu-system-sparc -M SS-20 -m 256 -bios ss20_v2.25_rom -monitor stdio -S
(qemu) info mtree
address-space: memory
0000000000000000-ffffffffffffffff (prio 0, i/o): system
...
0000000ff1300000-0000000ff130000f (prio 0, i/o): timer-1
^^^^^^^^^ ^^^^^^^
\ memory region base address and name /
(qemu) info qtree
bus: main-system-bus
dev: slavio_timer, id "" <-- device type name
gpio-out "sysbus-irq" 17
num_cpus = 1 (0x1)
mmio 0000000ff1310000/0000000000000014
mmio 0000000ff1300000/0000000000000010 <--- base address
mmio 0000000ff1301000/0000000000000010
mmio 0000000ff1302000/0000000000000010
...
Reported-by: Yap KV <yapkv@yahoo.com>
Buglink: https://bugs.launchpad.net/bugs/1906905
Fixes: a3d12d073e ("slavio_timer: convert to memory API")
CC: qemu-stable@nongnu.org
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20201205150903.3062711-1-f4bug@amsat.org>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
First pull request for 2021, which has a bunch of things accumulated
over the holidays. Includes:
* A number of cleanups to sam460ex and ppc440 code from BALATON Zoltan
* Several fixes for builds with --without-default-devices from Greg Kurz
* Fixes for some DRC reset problems from Greg Kurz
* QOM conversion of the PPC 4xx UIC devices from Peter Maydell
* Some other assorted fixes and cleanups
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEdfRlhq5hpmzETofcbDjKyiDZs5IFAl/1L38ACgkQbDjKyiDZ
s5JmvQ//RddzvCrHewdtRys+XnLDsKbWng3rQGKh2rSpKMYM11ilmo7FOGoOMNwq
aiZXm5z3t2lpSUTGZorVuPAnYKExzkuAQkPsFZ65uf9wfDhB2wg3BIr97GgZBF2S
MvK9DlxhUNJI+1W8Y+hwj9xDMOX3oFqZp24g2i6EQPRcpqE7GtRpOzt6PdL15sNz
KiJtIeyZ32uGDQaqNlWHJ/pBiYECEQTVpaZIztg2WLdfMICzgYMSCSZzbUrYXCii
WPDJ9sr69sMFwX2oEAgmfmJeFaTOFMt/xTOwFvi2ex4Rd1Rzqb9XToZ+ihOeOAFr
c4a7fpZzx0ePYLIAfOAZ2exV8Nh04dWjRyr2ykgo1ik3DaJ1Ck80O7jYyPQN1Dir
wKpWW59a3pjdABa/ZAoMoFwJh1zPAwGuiN4Higy87Ux8X+JOlTzzkP9ja9v2fgRC
DNb8VYvehUbY6bbHkqs57JcVyYLX56yphfq6Pr2D3DE6y1Ekph2G2vR8YXnqbRmY
Pw5VJ9q1SdYypGVZdMmIXseM7XerFA9YlIfIAQ7DiEW5wH9sx5QjDxlSt07l56J0
TlK6m9Fgc3koLLtVqDlK0NPx39xqVa1JUkrvPeWNKqn1FG/0tfPU6oPVjdQx3ouk
X2cv4A99MJsWSoyUMCH5r5+CHdMCscILOSOZ6OiWAHMEdqCxH0Q=
=7Eiy
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/dg-gitlab/tags/ppc-for-6.0-20210106' into staging
ppc patch queue 2021-01-06
First pull request for 2021, which has a bunch of things accumulated
over the holidays. Includes:
* A number of cleanups to sam460ex and ppc440 code from BALATON Zoltan
* Several fixes for builds with --without-default-devices from Greg Kurz
* Fixes for some DRC reset problems from Greg Kurz
* QOM conversion of the PPC 4xx UIC devices from Peter Maydell
* Some other assorted fixes and cleanups
# gpg: Signature made Wed 06 Jan 2021 03:33:19 GMT
# gpg: using RSA key 75F46586AE61A66CC44E87DC6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" [full]
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" [full]
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" [full]
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>" [unknown]
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dg-gitlab/tags/ppc-for-6.0-20210106: (22 commits)
ppc440_pcix: Fix up pci config access
ppc440_pcix: Fix register write trace event
ppc440_pcix: Improve comment for IRQ mapping
sam460ex: Remove FDT_PPC dependency from KConfig
ppc4xx: Move common dependency on serial to common option
pnv: Fix reverse dependency on PCI express root ports
ppc: Simplify reverse dependencies of POWERNV and PSERIES on XICS and XIVE
ppc: Fix build with --without-default-devices
spapr: Add drc_ prefix to the DRC realize and unrealize functions
spapr: Use spapr_drc_reset_all() at machine reset
spapr: Introduce spapr_drc_reset_all()
spapr: Fix reset of transient DR connectors
spapr: Call spapr_drc_reset() for all DRCs at CAS
spapr: Fix buffer overflow in spapr_numa_associativity_init()
spapr: Allow memory unplug to always succeed
spapr: Fix DR properties of the root node
spapr/xive: Make spapr_xive_pic_print_info() static
spapr: DRC lookup cannot fail
hw/ppc/ppc440_bamboo: Drop use of ppcuic_init()
hw/ppc/virtex_ml507: Drop use of ppcuic_init()
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Low-level fd users from QEMU use aio_set_fd_handler(), which handles
event registration with the main loop; qemu_fd_register() is only
needed together with the main loop's poll notifiers, of which SLIRP
is the only user.
This removes a dependency from oslib-win32.c to main-loop.c.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20201218135712.674094-1-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This fixes a long standing issue with MorphOS booting on sam460ex
which turns out to be because of suspicious values written to PCI
config address that apparently works on real machine but caused wrong
access on this device model. This replaces a previous work around for
this with a better fix that makes it work.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Message-Id: <6fd215ab2bc5f8d4455cd20ed1a2f059e4415fe5.1609636173.git.balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The trace event for pci_host_config_write() was also using the trace
event for read. Add corresponding trace and correct this.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Message-Id: <a6c7dcf7153cc537123ed8ceac060f2f64a883cb.1609636173.git.balaton@eik.bme.hu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The code mapping all PCI interrupts to a single CPU IRQ works but is
not trivial so document it in a comment.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Message-Id: <c25c0310510672b58466e795fd701e65e8f1ff97.1609636173.git.balaton@eik.bme.hu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Dependency on FDT_PPC was added in commit b0048f7609
("hw/ppc/Kconfig: Only select FDT helper for machines using it") but
it does not seem to be really necessary so remove it again.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <7461a20b129a912aeacdb9ad115a55f0b84c8726.1609636173.git.balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
All machines that select SERIAL also select PPC4XX so we can just add
this common dependency there once.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Message-Id: <94f1eb7cfb7f315bd883d825f3ce7e0cfc2f2b69.1609636173.git.balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
qemu-system-ppc64 built with --without-default-devices crashes:
Type 'pnv-phb4-root-port' is missing its parent 'pcie-root-port-base'
Aborted (core dumped)
Have POWERNV to select PCIE_PORT. This is done through a
new PCI_POWERNV config in hw/pci-host/Kconfig since POWERNV
doesn't have a direct dependency on PCI. For this reason,
PCI_EXPRESS and MSI_NONBROKEN are also moved under
PCI_POWERNV.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <160883058299.253005.342913177952681375.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Have PSERIES to select XICS and XIVE, and directly check PSERIES
in hw/intc/meson.build to enable build of the XICS and XIVE sPAPR
backends, like POWERNV already does. This allows to get rid of the
intermediate XICS_SPAPR and XIVE_SPAPR.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <160883057560.253005.4206568349917633920.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Linking of the qemu-system-ppc64 fails on a POWER9 host when
--without-default-devices is passed to configure:
$ ./configure --without-default-devices \
--target-list=ppc64-softmmu && make
...
libqemu-ppc64-softmmu.fa.p/hw_ppc_e500.c.o: In function `ppce500_init_mpic_kvm':
/home/greg/Work/qemu/qemu-ppc/build/../hw/ppc/e500.c:777: undefined reference to `kvm_openpic_connect_vcpu'
libqemu-ppc64-softmmu.fa.p/hw_ppc_spapr_irq.c.o: In function `spapr_irq_check':
/home/greg/Work/qemu/qemu-ppc/build/../hw/ppc/spapr_irq.c:189: undefined reference to `xics_kvm_has_broken_disconnect'
libqemu-ppc64-softmmu.fa.p/hw_intc_spapr_xive.c.o: In function `spapr_xive_post_load':
/home/greg/Work/qemu/qemu-ppc/build/../hw/intc/spapr_xive.c:530: undefined reference to `kvmppc_xive_post_load'
... and tons of other symbols belonging to the KVM backend of the
openpic, XICS and XIVE interrupt controllers.
It turns out that OPENPIC_KVM, XICS_KVM and XIVE_KVM are marked
to depend on KVM but this has no effect when minikconf runs in
allnoconfig mode. Such reverse dependencies should rather be
handled with a 'select' statement, eg.
config OPENPIC
select OPENPIC_KVM if KVM
or even better by getting rid of the intermediate _KVM config
and directly checking CONFIG_KVM in the meson.build file:
specific_ss.add(when: ['CONFIG_KVM', 'CONFIG_OPENPIC'],
if_true: files('openpic_kvm.c'))
Go for the latter with OPENPIC, XICS and XIVE.
This went unnoticed so far because CI doesn't test the build with
--without-default-devices and KVM enabled on a POWER host.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <160883056791.253005.14924294027763955653.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Use a less generic name for an easier experience with tools such as
cscope or grep.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20201218103400.689660-6-groug@kaod.org>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Tested-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Documentation of object_child_foreach_recursive() clearly stipulates
that "it is forbidden to add or remove children from @obj from the @fn
callback". But this is exactly what we do during machine reset. The call
to spapr_drc_reset() can finalize the hot-unplug sequence of a PHB or a
PCI bridge, both of which will then in turn destroy their PCI DRCs. This
could potentially invalidate the iterator used by do_object_child_foreach().
It is pure luck that this haven't caused any issues so far.
Use spapr_drc_reset_all() since it can cope with DRC removal.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20201218103400.689660-5-groug@kaod.org>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Tested-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
No need to expose the way DRCs are traversed outside of spapr_drc.c.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20201218103400.689660-4-groug@kaod.org>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Tested-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Documentation of object_property_iter_init() clearly stipulates that
"it is forbidden to modify the property list while iterating". But this
is exactly what we do when resetting transient DR connectors during CAS.
The call to spapr_drc_reset() can finalize the hot-unplug sequence of a
PHB or a PCI bridge, both of which will then in turn destroy their PCI
DRCs. This could potentially invalidate the iterator. It is pure luck
that this haven't caused any issues so far.
Change spapr_drc_reset() to return true if it caused a device to be
removed. Restart from scratch in this case. This can potentially
increase the overall DRC reset time, especially with a high maxmem
which generates a lot of LMB DRCs. But this kind of setup is rare,
and so is the use case of rebooting a guest while doing hot-unplug.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20201218103400.689660-3-groug@kaod.org>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Tested-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Non-transient DRCs are either in the empty or the ready state,
which means spapr_drc_reset() doesn't change their state. It
is thus not needed to do any checking. Call spapr_drc_reset()
unconditionally and squash spapr_drc_transient() into its
only user, spapr_drc_needed().
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20201218103400.689660-2-groug@kaod.org>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Tested-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Running a guest with 128 NUMA nodes crashes QEMU:
../../util/error.c:59: error_setv: Assertion `*errp == NULL' failed.
The crash happens when setting the FWNMI migration blocker:
2861 if (spapr_get_cap(spapr, SPAPR_CAP_FWNMI) == SPAPR_CAP_ON) {
2862 /* Create the error string for live migration blocker */
2863 error_setg(&spapr->fwnmi_migration_blocker,
2864 "A machine check is being handled during migration. The handler"
2865 "may run and log hardware error on the destination");
2866 }
Inspection reveals that papr->fwnmi_migration_blocker isn't NULL:
(gdb) p spapr->fwnmi_migration_blocker
$1 = (Error *) 0x8000000004000000
Since this is the only place where papr->fwnmi_migration_blocker is
set, this means someone wrote there in our back. Further analysis
points to spapr_numa_associativity_init(), especially the part
that initializes the associative arrays for NVLink GPUs:
max_nodes_with_gpus = nb_numa_nodes + NVGPU_MAX_NUM;
ie. max_nodes_with_gpus = 128 + 6, but the array isn't sized to
accommodate the 6 extra nodes:
struct SpaprMachineState {
.
.
.
uint32_t numa_assoc_array[MAX_NODES][NUMA_ASSOC_SIZE];
Error *fwnmi_migration_blocker;
};
and the following loops happily overwrite spapr->fwnmi_migration_blocker,
and probably more:
for (i = nb_numa_nodes; i < max_nodes_with_gpus; i++) {
spapr->numa_assoc_array[i][0] = cpu_to_be32(MAX_DISTANCE_REF_POINTS);
for (j = 1; j < MAX_DISTANCE_REF_POINTS; j++) {
uint32_t gpu_assoc = smc->pre_5_1_assoc_refpoints ?
SPAPR_GPU_NUMA_ID : cpu_to_be32(i);
spapr->numa_assoc_array[i][j] = gpu_assoc;
}
spapr->numa_assoc_array[i][MAX_DISTANCE_REF_POINTS] = cpu_to_be32(i);
}
Fix the size of the array. This requires "hw/ppc/spapr.h" to see
NVGPU_MAX_NUM. Including "hw/pci-host/spapr.h" introduces a
circular dependency that breaks the build, so this moves the
definition of NVGPU_MAX_NUM to "hw/ppc/spapr.h" instead.
Reported-by: Min Deng <mdeng@redhat.com>
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1908693
Fixes: dd7e1d7ae4 ("spapr_numa: move NVLink2 associativity handling to spapr_numa.c")
Cc: danielhb413@gmail.com
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <160829960428.734871.12634150161215429514.stgit@bahia.lan>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
It is currently impossible to hot-unplug a memory device between
machine reset and CAS.
(qemu) device_del dimm1
Error: Memory hot unplug not supported for this guest
This limitation was introduced in order to provide an explicit
error path for older guests that didn't support hot-plug event
sources (and thus memory hot-unplug).
The linux kernel has been supporting these since 4.11. All recent
enough guests are thus capable of handling the removal of a memory
device at all time, including during early boot.
Lift the limitation for the latest machine type. This means that
trying to unplug memory from a guest that doesn't support it will
likely just do nothing and the memory will only get removed at
next reboot. Such older guests can still get the existing behavior
by using an older machine type.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <160794035064.23292.17560963281911312439.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Section 13.5.2 of LoPAPR mandates various DR related indentifiers
for all hot-pluggable entities to be exposed in the "ibm,drc-indexes",
"ibm,drc-power-domains", "ibm,drc-names" and "ibm,drc-types" properties
of their parent node. These properties are created with spapr_dt_drc().
PHBs and LMBs are both children of the machine. Their DR identifiers
are thus supposed to be exposed in the afore mentioned properties of
the root node.
When PHB hot-plug support was added, an extra call to spapr_dt_drc()
was introduced: this overwrites the existing properties, previously
populated with the LMB identifiers, and they end up containing only
PHB identifiers. This went unseen so far because linux doesn't care,
but this is still not conformant with LoPAPR.
Fortunately spapr_dt_drc() is able to handle multiple DR entity types
at the same time. Use that to handle DR indentifiers for PHBs and LMBs
with a single call to spapr_dt_drc(). While here also account for PMEM
DR identifiers, which were forgotten when NVDIMM hot-plug support was
added. Also add an assert to prevent further misuse of spapr_dt_drc().
With -m 1G,maxmem=2G,slots=8 passed on the QEMU command line we get:
Without this patch:
/proc/device-tree/ibm,drc-indexes
0000001f 20000001 20000002 20000003
20000000 20000005 20000006 20000007
20000004 20000009 20000008 20000010
20000011 20000012 20000013 20000014
20000015 20000016 20000017 20000018
20000019 2000000a 2000000b 2000000c
2000000d 2000000e 2000000f 2000001a
2000001b 2000001c 2000001d 2000001e
These are the DRC indexes for the 31 possible PHBs.
With this patch:
/proc/device-tree/ibm,drc-indexes
0000002b 90000000 90000001 90000002
90000003 90000004 90000005 90000006
90000007 20000001 20000002 20000003
20000000 20000005 20000006 20000007
20000004 20000009 20000008 20000010
20000011 20000012 20000013 20000014
20000015 20000016 20000017 20000018
20000019 2000000a 2000000b 2000000c
2000000d 2000000e 2000000f 2000001a
2000001b 2000001c 2000001d 2000001e
80000004 80000005 80000006 80000007
And now we also have the 4 ((2G - 1G) / 256M) LMBs and the
8 (slots) PMEMs.
Fixes: 3998ccd092 ("spapr: populate PHB DRC entries for root DT node")
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <160794479566.35245.17809158217760761558.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>