Commit Graph

30461 Commits

Author SHA1 Message Date
Nicholas Piggin
120f738a46 spapr: implement nested-hv capability for the virtual hypervisor
This implements the Nested KVM HV hcall API for spapr under TCG.

The L2 is switched in when the H_ENTER_NESTED hcall is made, and the
L1 is switched back in returned from the hcall when a HV exception
is sent to the vhyp. Register state is copied in and out according to
the nested KVM HV hcall API specification.

The hdecr timer is started when the L2 is switched in, and it provides
the HDEC / 0x980 return to L1.

The MMU re-uses the bare metal radix 2-level page table walker by
using the get_pate method to point the MMU to the nested partition
table entry. MMU faults due to partition scope errors raise HV
exceptions and accordingly are routed back to the L1.

The MMU does not tag translations for the L1 (direct) vs L2 (nested)
guests, so the TLB is flushed on any L1<->L2 transition (hcall entry
and exit).

Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
[ clg: checkpatch fixes ]
Message-Id: <20220216102545.1808018-10-npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18 08:34:14 +01:00
Nicholas Piggin
7cebc5db2e target/ppc: Introduce a vhyp framework for nested HV support
Introduce virtual hypervisor methods that can support a "Nested KVM HV"
implementation using the bare metal 2-level radix MMU, and using HV
exceptions to return from H_ENTER_NESTED (rather than cause interrupts).

HV exceptions can now be raised in the TCG spapr machine when running a
nested KVM HV guest. The main ones are the lev==1 syscall, the hdecr,
hdsi and hisi, hv fu, and hv emu, and h_virt external interrupts.

HV exceptions are intercepted in the exception handler code and instead
of causing interrupts in the guest and switching the machine to HV mode,
they go to the vhyp where it may exit the H_ENTER_NESTED hcall with the
interrupt vector numer as return value as required by the hcall API.

Address translation is provided by the 2-level page table walker that is
implemented for the bare metal radix MMU. The partition scope page table
is pointed to the L1's partition scope by the get_pate vhc method.

Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20220216102545.1808018-9-npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18 08:34:14 +01:00
Nicholas Piggin
f32d4ab41c target/ppc: make vhyp get_pate method take lpid and return success
In prepartion for implementing a full partition table option for
vhyp, update the get_pate method to take an lpid and return a
success/fail indicator.

The spapr implementation currently just asserts lpid is always 0
and always return success.

Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[ clg: checkpatch fixes ]
Message-Id: <20220216102545.1808018-6-npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18 08:34:14 +01:00
Nicholas Piggin
93aeb70210 ppc: allow the hdecr timer to be created/destroyed
Machines which don't emulate the HDEC facility are able to use the
timer for something else. Provide functions to start and stop the
hdecr timer.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[ clg: checkpatch fixes ]
Message-Id: <20220216102545.1808018-4-npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18 08:34:14 +01:00
Nicholas Piggin
5ff40b0124 spapr: prevent hdec timer being set up under virtual hypervisor
The spapr virtual hypervisor does not require the hdecr timer.
Remove it.

Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20220216102545.1808018-3-npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18 08:34:14 +01:00
Shivaprasad G Bhat
8601b4f11d spapr: nvdimm: Introduce spapr-nvdimm device
If the device backend is not persistent memory for the nvdimm, there is
need for explicit IO flushes on the backend to ensure persistence.

On SPAPR, the issue is addressed by adding a new hcall to request for
an explicit flush from the guest when the backend is not pmem. So, the
approach here is to convey when the hcall flush is required in a device
tree property. The guest once it knows the device backend is not pmem,
makes the hcall whenever flush is required.

To set the device tree property, a new PAPR specific device type inheriting
the nvdimm device is implemented. When the backend doesn't have pmem=on
the device tree property "ibm,hcall-flush-required" is set, and the guest
makes hcall H_SCM_FLUSH requesting for an explicit flush. The new device
has boolean property pmem-override which when "on" advertises the device
tree property even when pmem=on for the backend. The flush function
invokes the fdatasync or pmem_persist() based on the type of backend.

The vmstate structures are made part of the spapr-nvdimm device object.
The patch attempts to keep the migration compatibility between source and
destination while rejecting the incompatibles ones with failures.

Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <164396256092.109112.17933240273840803354.stgit@ltczzess4.aus.stglabs.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18 08:34:14 +01:00
Shivaprasad G Bhat
b5513584a0 spapr: nvdimm: Implement H_SCM_FLUSH hcall
The patch adds support for the SCM flush hcall for the nvdimm devices.
To be available for exploitation by guest through the next patch. The
hcall is applicable only for new SPAPR specific device class which is
also introduced in this patch.

The hcall expects the semantics such that the flush to return with
H_LONG_BUSY_ORDER_10_MSEC when the operation is expected to take longer
time along with a continue_token. The hcall to be called again by providing
the continue_token to get the status. So, all fresh requests are put into
a 'pending' list and flush worker is submitted to the thread pool. The
thread pool completion callbacks move the requests to 'completed' list,
which are cleaned up after collecting the return status for the guest
in subsequent hcall from the guest.

The semantics makes it necessary to preserve the continue_tokens and
their return status across migrations. So, the completed flush states
are forwarded to the destination and the pending ones are restarted
at the destination in post_load. The necessary nvdimm flush specific
vmstate structures are also introduced in this patch which are to be
saved in the new SPAPR specific nvdimm device to be introduced in the
following patch.

Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <164396254862.109112.16675611182159105748.stgit@ltczzess4.aus.stglabs.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18 08:34:14 +01:00
Shivaprasad G Bhat
3e35960bf1 nvdimm: Add realize, unrealize callbacks to NVDIMMDevice class
A new subclass inheriting NVDIMMDevice is going to be introduced in
subsequent patches. The new subclass uses the realize and unrealize
callbacks. Add them on NVDIMMClass to appropriately call them as part
of plug-unplug.

Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Acked-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <164396253158.109112.1926755104259023743.stgit@ltczzess4.aus.stglabs.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18 08:34:13 +01:00
Anup Patel
e8f79343cf hw/intc: Add RISC-V AIA APLIC device emulation
The RISC-V AIA (Advanced Interrupt Architecture) defines a new
interrupt controller for wired interrupts called APLIC (Advanced
Platform Level Interrupt Controller). The APLIC is capabable of
forwarding wired interupts to RISC-V HARTs directly or as MSIs
(Message Signaled Interupts).

This patch adds device emulation for RISC-V AIA APLIC.

Signed-off-by: Anup Patel <anup.patel@wdc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Message-id: 20220204174700.534953-19-anup@brainfault.org
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2022-02-16 12:24:19 +10:00
Anup Patel
d207863cd3 hw/riscv: virt: Use AIA INTC compatible string when available
We should use the AIA INTC compatible string in the CPU INTC
DT nodes when the CPUs support AIA feature. This will allow
Linux INTC driver to use AIA local interrupt CSRs.

Signed-off-by: Anup Patel <anup.patel@wdc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Message-id: 20220204174700.534953-17-anup@brainfault.org
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2022-02-16 12:24:19 +10:00
Petr Tesarik
f42483d776 Allow setting up to 8 bytes with the generic loader
The documentation for the generic loader says that "the maximum size of
the data is 8 bytes". However, attempts to set data-len=8 trigger the
following assertion failure:

../hw/core/generic-loader.c:59: generic_loader_reset: Assertion `s->data_len < sizeof(s->data)' failed.

The type of s->data is uint64_t (i.e. 8 bytes long), so I believe this
assert should use <= instead of <.

Fixes: e481a1f63c ("generic-loader: Add a generic loader")
Signed-off-by: Petr Tesarik <ptesarik@suse.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20220120092715.7805-1-ptesarik@suse.com
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2022-02-16 12:24:18 +10:00
Peter Maydell
ad38520bde Pull request
This contains coroutine poll size scaling, virtiofsd rseq seccomp for new glibc
 versions, and the QEMU C virtiofsd deprecation notice.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEhpWov9P5fNqsNXdanKSrs4Grc8gFAmIKje0ACgkQnKSrs4Gr
 c8iPZQgAouxAvwRyTpZnRNLANB5QoHovgLqw7EdWvfdCP9r/EQsjJ1NSkOvYx9AH
 LnxxF4ReciEO5KaNK6C397ktTnE30iPGXm+MHC4m1u7/FFthxXjIJj5As2It9Wyk
 9M3R78vkcVuXf6SyAJfUQspav6GIcdLaX1yOXOHY+5VMGogubLIOaFfL+J/tIF85
 Z1FPGogOBPLZnOkhRNTQkZn9tuW8U45Cwo4zggthIbRnoPBIaCfjyv0qRXeGdczi
 qM5NC81/VhSzUcvuJ8VYZA2gyDKTumq451VHfHy0uAywCvjk281nUcL37C8U2yvS
 OJtW5XnOr0UUlwjLhxPT4qZilH9hQw==
 =i6e5
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/stefanha-gitlab/tags/block-pull-request' into staging

Pull request

This contains coroutine poll size scaling, virtiofsd rseq seccomp for new glibc
versions, and the QEMU C virtiofsd deprecation notice.

# gpg: Signature made Mon 14 Feb 2022 17:14:21 GMT
# gpg:                using RSA key 8695A8BFD3F97CDAAC35775A9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" [full]
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>" [full]
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha-gitlab/tags/block-pull-request:
  util: adjust coroutine pool size to virtio block queue
  Deprecate C virtiofsd
  tools/virtiofsd: Add rseq syscall to the seccomp allowlist

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-02-15 19:30:33 +00:00
Peter Maydell
cc6721e449 hw/nvme updates
- fix CVE-2021-3929
   - add zone random write area support
   - misc cleanups from Philippe
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEUigzqnXi3OaiR2bATeGvMW1PDekFAmIKDF4ACgkQTeGvMW1P
 DenhhQgAm8bq39EIaNfVswdHW2ggPfwZHR0udzDoUaOadc0HnS8UMm13//pOtYrs
 fLs6YcUQNdPnRvkOsodfvN8Jw+oxkPudvhdFxWCLqgHIrm6+JeY3VCuDvvvt5Q0z
 3gSMOSLTY1Yb8/tIJvDXK6xx7LjDbxpaDfkL4QsatTfzUcCR6joIlmZTsIoYMi7x
 e+Ag0Nu56QxpP9amzAWVzNuLt3UYcAZaeJ1v1figrijKw27Wgh7CXWlJsFoZrWwa
 KsC6zDn80PgHTWZueQIag5zeQHC2V5/kExx5hA7lYaO48F1JveVwQttiVgW1lXjR
 aGmn6mB6PANNiAEdBEf6CBX60PiwYQ==
 =pn4q
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/nvme/tags/nvme-next-pull-request' into staging

hw/nvme updates

  - fix CVE-2021-3929
  - add zone random write area support
  - misc cleanups from Philippe

# gpg: Signature made Mon 14 Feb 2022 08:01:34 GMT
# gpg:                using RSA key 522833AA75E2DCE6A24766C04DE1AF316D4F0DE9
# gpg: Good signature from "Klaus Jensen <its@irrelevant.dk>" [unknown]
# gpg:                 aka "Klaus Jensen <k.jensen@samsung.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: DDCA 4D9C 9EF9 31CC 3468  4272 63D5 6FC5 E55D A838
#      Subkey fingerprint: 5228 33AA 75E2 DCE6 A247  66C0 4DE1 AF31 6D4F 0DE9

* remotes/nvme/tags/nvme-next-pull-request:
  hw/nvme: add support for zoned random write area
  hw/nvme: add ozcs enum
  hw/nvme: add struct for zone management send
  hw/nvme/ctrl: Pass buffers as 'void *' types
  hw/nvme/ctrl: Have nvme_addr_write() take const buffer
  hw/nvme: fix CVE-2021-3929

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-02-15 13:51:35 +00:00
Peter Maydell
e56d873f0e -----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
 
 iQEcBAABAgAGBQJiCdGyAAoJEO8Ells5jWIRWL8H/2XOuuv9JJYqboCVPWSpltch
 FgTh2SHWbQueo70xxxIzRbM2RN/G9Eu+4LnpMw+ZRJA6EY6QgNYmgVFbyV1eTkxG
 f0qUyCliCPWzZEZ4GLJ7JOSuHpU4rZ90e5IKuGhtD+yrfT+L0Re1TyluZdWEniOp
 tz6daq31jkF870iPn7X9QOTW0JBcK5ww7Qv5BThAoUmCOq6BMBFxg+xFNto9a5S7
 UjADfhZiqNIbks5hfpldr9g2F2LcBNeSHWOAxhEi24IEaV7AcL2/1B3EZhfMcA/O
 hcbw1oKcJ3anpAD6Ukwy4KnyrZNv1M7wtiN9XkAKKu6idIzrHuIju9j7TOKkxt4=
 =M6Ok
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging

# gpg: Signature made Mon 14 Feb 2022 03:51:14 GMT
# gpg:                using RSA key EF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F  3562 EF04 965B 398D 6211

* remotes/jasowang/tags/net-pull-request:
  net/eth: Don't consider ESP to be an IPv6 option header
  hw/net: e1000e: Clear ICR on read when using non MSI-X interrupts
  net/filter: Optimize filter_send to coroutine
  net/colo-compare.c: Update the default value comments
  net/colo-compare.c: Optimize compare order for performance
  net: Fix uninitialized data usage
  net/tap: Set return code on failure
  hw/net/vmxnet3: Log guest-triggerable errors using LOG_GUEST_ERROR

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-02-15 11:39:54 +00:00
Peter Maydell
2d88a3a595 Block layer patches
- Fix crash in blockdev-reopen with iothreads
 - fdc-isa: Respect QOM properties when building AML
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAmIGoJQRHGt3b2xmQHJl
 ZGhhdC5jb20ACgkQfwmycsiPL9YOzBAAig2bCFiJJds0EpbWPGN/bNGpeAWMSUAi
 lcbp2U+yS/VWEaa/LyD8VfWd8t7gcIXc12bDxHi+/AGgZsR5AQTQclT/kj+Ncmxn
 E2rGTQdXvIZEkfXWYmwbM/Lm2+iK2g0Xw4WfmVj1peT4Mm1hmcle8odnzXywp/dL
 3LLIKVUh7ol/khvQfqR6dbJPhVlbPaKyEmlmdfBjLNYX0FSwRqspKY6GJEj0fZal
 o8wCUA27O8u+ASF3bpk/bFcBcsKAREPi2IXkm+TRFGb+lzolnsO4p7JmOAyE0xaW
 JoMHHU20hGHWMLvvTdOELVNVLp6iz54ZlarUTZFn5pjTbXrT9ELh5d6dfIKQYaGc
 tFpfX+n8dFzqVxAD0/lisAGPxzZYwrHVyVeypJaAMeogPL9+zNKiPdkDEY+1thn8
 Qr3X57qz+saoqH1pGr2Y/x7ZUzA2TKYz0fnN2bHENaAzwNuNJkTKd1+gJ501/ILM
 3v+H2KJKHaKhyxYubRHmCdBod8bOdYCYgZoptEhydhMFQW99dnA6m51h6spWiP1c
 SR+faJxqulnfKu9lTW80+q0akzDArmk8roxNw0Hg0svZfJefKXJKG4oCXy5YU/Fe
 UTAnXx8oWnQpPtnlvCKlMLTKzt6oHGlrE2BED2QPlIk0Mca/9/BwEyu2Yx7AC4Uj
 TkhxAjDdPDg=
 =/1qA
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/kwolf-gitlab/tags/for-upstream' into staging

Block layer patches

- Fix crash in blockdev-reopen with iothreads
- fdc-isa: Respect QOM properties when building AML

# gpg: Signature made Fri 11 Feb 2022 17:44:52 GMT
# gpg:                using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6
# gpg:                issuer "kwolf@redhat.com"
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* remotes/kwolf-gitlab/tags/for-upstream:
  hw/block/fdc-isa: Respect QOM properties when building AML
  iotests: Test blockdev-reopen with iothreads and throttling
  block: Lock AioContext for drain_end in blockdev-reopen

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-02-14 19:54:00 +00:00
Hiroki Narukawa
4c41c69e05 util: adjust coroutine pool size to virtio block queue
Coroutine pool size was 64 from long ago, and the basis was organized in the commit message in 4d68e86b.

At that time, virtio-blk queue-size and num-queue were not configuable, and equivalent values were 128 and 1.

Coroutine pool size 64 was fine then.

Later queue-size and num-queue got configuable, and default values were increased.

Coroutine pool with size 64 exhausts frequently with random disk IO in new size, and slows down.

This commit adjusts coroutine pool size adaptively with new values.

This commit adds 64 by default, but now coroutine is not only for block devices,

and is not too much burdon comparing with new default.

pool size of 128 * vCPUs.

Signed-off-by: Hiroki Narukawa <hnarukaw@yahoo-corp.jp>
Message-id: 20220214115302.13294-2-hnarukaw@yahoo-corp.jp
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-02-14 17:11:25 +00:00
Klaus Jensen
e321b4cdc2 hw/nvme: add support for zoned random write area
Add support for TP 4076 ("Zoned Random Write Area"), v2021.08.23
("Ratified").

This adds three new namespace parameters: "zoned.numzrwa" (number of
zrwa resources, i.e. number of zones that can have a zrwa),
"zoned.zrwas" (zrwa size in LBAs), "zoned.zrwafg" (granularity in LBAs
for flushes).

Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2022-02-14 08:58:29 +01:00
Klaus Jensen
25872031e1 hw/nvme: add ozcs enum
Add enumeration for OZCS values.

Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2022-02-14 08:58:29 +01:00
Klaus Jensen
6190d92ff7 hw/nvme: add struct for zone management send
Add struct for Zone Management Send in preparation for more zone send
flags.

Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2022-02-14 08:58:29 +01:00
Philippe Mathieu-Daudé
8d3a17be6f hw/nvme/ctrl: Pass buffers as 'void *' types
These buffers can be anything, not an array of chars,
so use the 'void *' type for them.

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2022-02-14 08:58:29 +01:00
Philippe Mathieu-Daudé
e080ce8676 hw/nvme/ctrl: Have nvme_addr_write() take const buffer
The 'buf' argument is not modified, so better pass it as const type.

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2022-02-14 08:58:29 +01:00
Klaus Jensen
736b01642d hw/nvme: fix CVE-2021-3929
This fixes CVE-2021-3929 "locally" by denying DMA to the iomem of the
device itself. This still allows DMA to MMIO regions of other devices
(e.g. doing P2P DMA to the controller memory buffer of another NVMe
device).

Fixes: CVE-2021-3929
Reported-by: Qiuhao Li <Qiuhao.Li@outlook.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2022-02-14 08:58:29 +01:00
Nick Hudson
870374214e hw/net: e1000e: Clear ICR on read when using non MSI-X interrupts
In section 7.4.3 of the 82574 datasheet it states that

    "In systems that do not support MSI-X, reading the ICR
     register clears it's bits..."

Some OSes rely on this.

Signed-off-by: Nick Hudson <skrll@netbsd.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-02-14 11:50:44 +08:00
Philippe Mathieu-Daudé
f3e5a17593 hw/net/vmxnet3: Log guest-triggerable errors using LOG_GUEST_ERROR
The "Interrupt Cause" register (VMXNET3_REG_ICR) is read-only.
Write accesses are ignored. Log them with as LOG_GUEST_ERROR
instead of aborting:

  [R +0.239743] writeq 0xe0002031 0x46291a5a55460800
  ERROR:hw/net/vmxnet3.c:1819:vmxnet3_io_bar1_write: code should not be reached
  Thread 1 "qemu-system-i38" received signal SIGABRT, Aborted.
  (gdb) bt
  #3  0x74c397d3 in __GI_abort () at abort.c:79
  #4  0x76d3cd4c in g_assertion_message (domain=<optimized out>, file=<optimized out>, line=<optimized out>, func=<optimized out>, message=<optimized out>) at ../glib/gtestutils.c:3223
  #5  0x76d9d45f in g_assertion_message_expr
      (domain=0x0, file=0x59fc2e53 "hw/net/vmxnet3.c", line=1819, func=0x59fc11e0 <__func__.vmxnet3_io_bar1_write> "vmxnet3_io_bar1_write", expr=<optimized out>)
      at ../glib/gtestutils.c:3249
  #6  0x57e80a3a in vmxnet3_io_bar1_write (opaque=0x62814100, addr=56, val=70, size=4) at hw/net/vmxnet3.c:1819
  #7  0x58c2d894 in memory_region_write_accessor (mr=0x62816b90, addr=56, value=0x7fff9450, size=4, shift=0, mask=4294967295, attrs=...) at softmmu/memory.c:492
  #8  0x58c2d1d2 in access_with_adjusted_size (addr=56, value=0x7fff9450, size=1, access_size_min=4, access_size_max=4, access_fn=
      0x58c2d290 <memory_region_write_accessor>, mr=0x62816b90, attrs=...) at softmmu/memory.c:554
  #9  0x58c2bae7 in memory_region_dispatch_write (mr=0x62816b90, addr=56, data=70, op=MO_8, attrs=...) at softmmu/memory.c:1504
  #10 0x58bfd034 in flatview_write_continue (fv=0x606000181700, addr=0xe0002038, attrs=..., ptr=0x7fffb9e0, len=1, addr1=56, l=1, mr=0x62816b90)
      at softmmu/physmem.c:2782
  #11 0x58beba00 in flatview_write (fv=0x606000181700, addr=0xe0002031, attrs=..., buf=0x7fffb9e0, len=8) at softmmu/physmem.c:2822
  #12 0x58beb589 in address_space_write (as=0x608000015f20, addr=0xe0002031, attrs=..., buf=0x7fffb9e0, len=8) at softmmu/physmem.c:2914

Reported-by: Dike <dike199774@qq.com>
Reported-by: Duhao <504224090@qq.com>
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=2032932
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-02-14 11:50:44 +08:00
Bernhard Beschow
fdb8541b2e hw/block/fdc-isa: Respect QOM properties when building AML
Other ISA devices such as serial-isa use the properties in their
build_aml functions. fdc-isa not using them is probably an oversight.

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20220209191558.30393-1-shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-02-11 17:37:26 +01:00
Cédric Le Goater
005b69fdcc target/ppc: Remove PowerPC 601 CPUs
The PowerPC 601 processor is the first generation of processors to
implement the PowerPC architecture. It was designed as a bridge
processor and also could execute most of the instructions of the
previous POWER architecture. It was found on the first Macs and IBM
RS/6000 workstations.

There is not much interest in keeping the CPU model of this
POWER-PowerPC bridge processor. We have the 603 and 604 CPU models of
the 60x family which implement the complete PowerPC instruction set.

Cc: "Hervé Poussineau" <hpoussin@reactos.org>
Cc: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <20220203142756.1302515-1-clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-09 09:08:55 +01:00
Kevin Townsend
4fd1ebb105 hw/sensor: Add lsm303dlhc magnetometer device
This commit adds emulation of the magnetometer on the LSM303DLHC.
It allows the magnetometer's X, Y and Z outputs to be set via the
mag-x, mag-y and mag-z properties, as well as the 12-bit
temperature output via the temperature property. Sensor can be
enabled with 'CONFIG_LSM303DLHC_MAG=y'.

Signed-off-by: Kevin Townsend <kevin.townsend@linaro.org>
Message-id: 20220130095032.35392-1-kevin.townsend@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-02-08 10:56:29 +00:00
Peter Maydell
d7d359c4ac hw/intc/arm_gicv3_its: Split error checks
In most of the ITS command processing, we check different error
possibilities one at a time and log them appropriately. In
process_mapti() and process_mapd() we have code which checks
multiple error cases at once, which means the logging is less
specific than it could be. Split those cases up.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220201193207.2771604-14-peter.maydell@linaro.org
2022-02-08 10:56:29 +00:00
Peter Maydell
3330241407 hw/intc/arm_gicv3_its: Don't allow intid 1023 in MAPI/MAPTI
When handling MAPI/MAPTI, we allow the supplied interrupt ID to be
either 1023 or something in the valid LPI range.  This is a mistake:
only a real valid LPI is allowed.  (The general behaviour of the ITS
is that most interrupt ID fields require a value in the LPI range;
the exception is that fields specifying a doorbell value, which are
all in GICv4 commands, allow also 1023 to mean "no doorbell".)
Remove the condition that incorrectly allows 1023 here.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220201193207.2771604-13-peter.maydell@linaro.org
2022-02-08 10:56:29 +00:00
Peter Maydell
84d43d2e82 hw/intc/arm_gicv3_its: In MAPC with V=0, don't check rdbase field
In the MAPC command, if V=0 this is a request to delete a collection
table entry and the rdbase field of the command packet will not be
used.  In particular, the specification says that the "UNPREDICTABLE
if rdbase is not valid" only applies for V=1.

We were doing a check-and-log-guest-error on rdbase regardless of
whether the V bit was set, and also (harmlessly but confusingly)
storing the contents of the rdbase field into the updated collection
table entry.  Update the code so that if V=0 we don't check or use
the rdbase field value.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220201193207.2771604-12-peter.maydell@linaro.org
2022-02-08 10:56:29 +00:00
Peter Maydell
da4680ce3a hw/intc/arm_gicv3_its: Drop TableDesc and CmdQDesc valid fields
Currently we track in the TableDesc and CmdQDesc structs the state of
the GITS_BASER<n> and GITS_CBASER Valid bits.  However we aren't very
consistent abut checking the valid field: we test it in update_cte()
and update_dte(), but not anywhere else we look things up in tables.

The GIC specification says that it is UNPREDICTABLE if a guest fails
to set any of these Valid bits before enabling the ITS via
GITS_CTLR.Enabled.  So we can choose to handle Valid == 0 as
equivalent to a zero-length table.  This is in fact how we're already
catching this case in most of the table-access paths: when Valid is 0
we leave the num_entries fields in TableDesc or CmdQDesc set to zero,
and then the out-of-bounds check "index >= num_entries" that we have
to do anyway before doing any of these table lookups will always be
true, catching the no-valid-table case without any extra code.

So we can remove the checks on the valid field from update_cte()
and update_dte(): since these happen after the bounds check there
was never any case when the test could fail. That means the valid
fields would be entirely unused, so just remove them.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220201193207.2771604-11-peter.maydell@linaro.org
2022-02-08 10:56:29 +00:00
Peter Maydell
7eb54267f2 hw/intc/arm_gicv3_its: Make update_ite() use ITEntry
Make the update_ite() struct use the new ITEntry struct, so that
callers don't need to assemble the in-memory ITE data themselves, and
only get_ite() and update_ite() need to care about that in-memory
layout.  We can then drop the no-longer-used IteEntry struct
definition.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220201193207.2771604-10-peter.maydell@linaro.org
2022-02-08 10:56:29 +00:00
Peter Maydell
244194fe24 hw/intc/arm_gicv3_its: Pass ITE values back from get_ite() via a struct
In get_ite() we currently return the caller some of the fields of an
Interrupt Table Entry via a set of pointer arguments, and validate
some of them internally (interrupt type and valid bit) to return a
simple true/false 'valid' indication. Define a new ITEntry struct
which has all the fields that the in-memory ITE has, and bring the
get_ite() function in to line with get_dte() and get_cte().

This paves the way for handling virtual interrupts, which will want
a different subset of the fields in the ITE. Handling them under
the old "lots of pointer arguments" scheme would have meant a
confusingly large set of arguments for this function.

The new struct ITEntry is obviously confusably similar to the
existing IteEntry struct, whose fields are the raw 12 bytes
of the in-memory ITE. In the next commit we will make update_ite()
use ITEntry instead of IteEntry, which will allow us to delete
the IteEntry struct and remove the confusion.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220201193207.2771604-9-peter.maydell@linaro.org
2022-02-08 10:56:29 +00:00
Peter Maydell
2954b93fe6 hw/intc/arm_gicv3_its: Avoid nested ifs in get_ite()
The get_ite() code has some awkward nested if statements; clean
them up by returning early if the memory accesses fail.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220201193207.2771604-8-peter.maydell@linaro.org
2022-02-08 10:56:29 +00:00
Peter Maydell
a1ce993da6 hw/intc/arm_gicv3_its: Fix address calculation in get_ite() and update_ite()
In get_ite() and update_ite() we work with a 12-byte in-guest-memory
table entry, which we intend to handle as an 8-byte value followed by
a 4-byte value.  Unfortunately the calculation of the address of the
4-byte value is wrong, because we write it as:

 table_base_address + (index * entrysize) + 4
(obfuscated by the way the expression has been written)

when it should be + 8.  This bug meant that we overwrote the top
bytes of the 8-byte value with the 4-byte value.  There are no
guest-visible effects because the top half of the 8-byte value
contains only the doorbell interrupt field, which is used only in
GICv4, and the two bugs in the "write ITE" and "read ITE" codepaths
cancel each other out.

We can't simply change the calculation, because this would break
migration of a (TCG) guest from the old version of QEMU which had
in-guest-memory interrupt tables written using the buggy version of
update_ite().  We must also at the same time change the layout of the
fields within the ITE_L and ITE_H values so that the in-memory
locations of the fields we care about (VALID, INTTYPE, INTID and
ICID) stay the same.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220201193207.2771604-7-peter.maydell@linaro.org
2022-02-08 10:56:29 +00:00
Peter Maydell
06985cc3fe hw/intc/arm_gicv3_its: Pass CTEntry to update_cte()
Make update_cte() take a CTEntry struct rather than all the fields
of the new CTE as separate arguments.

This brings it into line with the update_dte() API.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220201193207.2771604-6-peter.maydell@linaro.org
2022-02-08 10:56:28 +00:00
Peter Maydell
d37cf49b11 hw/intc/arm_gicv3_its: Keep CTEs as a struct, not a raw uint64_t
In the ITS, a CTE is an entry in the collection table, which contains
multiple fields. Currently the function get_cte() which reads one
entry from the device table returns a success/failure boolean and
passes back the raw 64-bit integer CTE value via a pointer argument.
We then extract fields from the CTE as we need them.

Create a real C struct with the same fields as the CTE, and
populate it in get_cte(), so that that function and update_cte()
are the only ones which need to care about the in-guest-memory
format of the CTE.

This brings get_cte()'s API into line with get_dte().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220201193207.2771604-5-peter.maydell@linaro.org
2022-02-08 10:56:28 +00:00
Peter Maydell
22d62b08ba hw/intc/arm_gicv3_its: Pass DTEntry to update_dte()
Make update_dte() take a DTEntry struct rather than all the fields of
the new DTE as separate arguments.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220201193207.2771604-4-peter.maydell@linaro.org
2022-02-08 10:56:28 +00:00
Peter Maydell
4acf93e193 hw/intc/arm_gicv3_its: Keep DTEs as a struct, not a raw uint64_t
In the ITS, a DTE is an entry in the device table, which contains
multiple fields. Currently the function get_dte() which reads one
entry from the device table returns it as a raw 64-bit integer,
which we then pass around in that form, only extracting fields
from it as we need them.

Create a real C struct with the same fields as the DTE, and
populate it in get_dte(), so that that function and update_dte()
are the only ones that need to care about the in-guest-memory
format of the DTE.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220201193207.2771604-3-peter.maydell@linaro.org
2022-02-08 10:56:28 +00:00
Peter Maydell
b6f96009ac hw/intc/arm_gicv3_its: Use address_space_map() to access command queue packets
Currently the ITS accesses each 8-byte doubleword in a 4-doubleword
command packet with a separate address_space_ldq_le() call.  This is
awkward because the individual command processing functions have
ended up with code to handle "load more doublewords out of the
packet", which is both unwieldy and also a potential source of bugs
because it's not obvious when looking at a line that pulls a field
out of the 'value' variable which of the 4 doublewords that variable
currently holds.

Switch to using address_space_map() to map the whole command packet
at once and fish the four doublewords out of it.  Then each process_*
function can start with a few lines of code that extract the fields
it cares about.

This requires us to split out the guts of process_its_cmd() into a
new do_process_its_cmd(), because we were previously overloading the
value and offset arguments as a backdoor way to directly pass the
devid and eventid from a write to GITS_TRANSLATER.  The new
do_process_its_cmd() takes those arguments directly, and
process_its_cmd() is just a wrapper that does the "read fields from
command packet" part.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220201193207.2771604-2-peter.maydell@linaro.org
2022-02-08 10:56:28 +00:00
Eric Auger
43530095e1 hw/arm/smmuv3: Fix device reset
We currently miss a bunch of register resets in the device reset
function. This sometimes prevents the guest from rebooting after
a system_reset (with virtio-blk-pci). For instance, we may get
the following errors:

invalid STE
smmuv3-iommu-memory-region-0-0 translation failed for iova=0x13a9d2000(SMMU_EVT_C_BAD_STE)
Invalid read at addr 0x13A9D2000, size 2, region '(null)', reason: rejected
invalid STE
smmuv3-iommu-memory-region-0-0 translation failed for iova=0x13a9d2000(SMMU_EVT_C_BAD_STE)
Invalid write at addr 0x13A9D2000, size 2, region '(null)', reason: rejected
invalid STE

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20220202111602.627429-1-eric.auger@redhat.com
Fixes: 10a83cb988 ("hw/arm/smmuv3: Skeleton")
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-02-08 10:56:28 +00:00
Richard Petri
77cd997161 hw/timer/armv7m_systick: Update clock source before enabling timer
Starting the SysTick timer and changing the clock source a the same time
will result in an error, if the previous clock period was zero. For exmaple,
on the mps2-tz platforms, no refclk is present. Right after reset, the
configured ptimer period is zero, and trying to enabling it will turn it off
right away. E.g., code running on the platform setting

    SysTick->CTRL  = SysTick_CTRL_CLKSOURCE_Msk | SysTick_CTRL_ENABLE_Msk;

should change the clock source and enable the timer on real hardware, but
resulted in an error in qemu.

Signed-off-by: Richard Petri <git@rpls.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20220201192650.289584-1-git@rpls.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-02-08 10:56:28 +00:00
Edgar E. Iglesias
40874a383d hw/arm: versal-virt: Always call arm_load_kernel()
Always call arm_load_kernel() regardless of kernel_filename being
set. This is needed because arm_load_kernel() sets up reset for
the CPUs.

Fixes: 6f16da53ff (hw/arm: versal: Add a virtual Xilinx Versal board)
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 20220130110313.4045351-2-edgar.iglesias@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-02-08 10:56:28 +00:00
Peter Maydell
e4b0bb8071 hw/arm/boot: Drop existing dtb /psci node rather than retaining it
If we're using PSCI emulation, we add a /psci node to the device tree
we pass to the guest.  At the moment, if the dtb already has a /psci
node in it, we retain it, rather than replacing it. (This behaviour
was added in commit c39770cd63 in 2018.)

This is a problem if the existing node doesn't match our PSCI
emulation.  In particular, it might specify the wrong method (HVC vs
SMC), or wrong function IDs for cpu_suspend/cpu_off/etc, in which
case the guest will not get the behaviour it wants when it makes PSCI
calls.

An example of this is trying to boot the highbank or midway board
models using the device tree supplied in the kernel sources: this
device tree includes a /psci node that specifies function IDs that
don't match the (PSCI 0.2 compliant) IDs that QEMU uses.  The dtb
cpu_suspend function ID happens to match the PSCI 0.2 cpu_off ID, so
the guest hangs after booting when the kernel tries to idle the CPU
and instead it gets turned off.

Instead of retaining an existing /psci node, delete it entirely
and replace it with a node whose properties match QEMU's PSCI
emulation behaviour. This matches the way we handle /memory nodes,
where we also delete any existing nodes and write in ones that
match the way QEMU is going to behave.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Tested-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Tested-by: Cédric Le Goater <clg@kaod.org>
Tested-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Message-id: 20220127154639.2090164-17-peter.maydell@linaro.org
2022-02-08 10:56:28 +00:00
Peter Maydell
d6dc926e6e hw/arm/boot: Drop nb_cpus field from arm_boot_info
We use the arm_boot_info::nb_cpus field in only one place, and that
place can easily get the number of CPUs locally rather than relying
on the board code to have set the field correctly.  (At least one
board, xlnx-versal-virt, does not set the field despite having more
than one CPU.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Tested-by: Cédric Le Goater <clg@kaod.org>
Tested-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Message-id: 20220127154639.2090164-16-peter.maydell@linaro.org
2022-02-08 10:56:28 +00:00
Peter Maydell
45dd668f23 hw/arm/highbank: Drop unused secondary boot stub code
The highbank and midway board code includes boot-stub code for
handling secondary CPU boot which keeps the secondaries in a pen
until the primary writes to a known location with the address they
should jump to.

This code is never used, because the boards enable QEMU's PSCI
emulation, so secondary CPUs are kept powered off until the PSCI call
which turns them on, and then start execution from the address given
by the guest in that PSCI call.  Delete the unreachable code.

(The code was wrong for midway in any case -- on the Cortex-A15 the
GIC CPU interface registers are at a different offset from PERIPHBASE
compared to the Cortex-A9, and the code baked-in the offsets for
highbank's A9.)

Note that this commit implicitly depends on the preceding "Don't
write secondary boot stub if using PSCI" commit -- the default
secondary-boot stub code overlaps with one of the highbank-specific
bootcode rom blobs, so we must suppress the secondary-boot
stub code entirely, not merely replace the highbank-specific
version with the default.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Tested-by: Cédric Le Goater <clg@kaod.org>
Tested-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Message-id: 20220127154639.2090164-15-peter.maydell@linaro.org
2022-02-08 10:56:28 +00:00
Peter Maydell
d4a29ed6db hw/arm/boot: Don't write secondary boot stub if using PSCI
If we're using PSCI emulation to start secondary CPUs, there is no
point in writing the "secondary boot" stub code, because it will
never be used -- secondary CPUs start powered-off, and when powered
on are set to begin execution at the address specified by the guest's
power-on PSCI call, not at the stub.

Move the call to the hook that writes the secondary boot stub code so
that we can do it only if we're starting a Linux kernel and not using
PSCI.

(None of the users of the hook care about the ordering of its call
relative to anything else: they only use it to write a rom blob to
guest memory.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Tested-by: Cédric Le Goater <clg@kaod.org>
Tested-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Message-id: 20220127154639.2090164-14-peter.maydell@linaro.org
2022-02-08 10:56:28 +00:00
Peter Maydell
dc888dd43b hw/arm/boot: Prevent setting both psci_conduit and secure_board_setup
Now that we have dealt with the one special case (highbank) that needed
to set both psci_conduit and secure_board_setup, we don't need to
allow that combination any more. It doesn't make sense in general,
so use an assertion to ensure we don't add new boards that do it
by accident without thinking through the consequences.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Tested-by: Cédric Le Goater <clg@kaod.org>
Tested-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Message-id: 20220127154639.2090164-13-peter.maydell@linaro.org
2022-02-08 10:56:28 +00:00
Peter Maydell
61b82973e7 hw/arm/highbank: Drop use of secure_board_setup
Guest code on highbank may make non-PSCI SMC calls in order to
enable/disable the L2x0 cache controller (see the Linux kernel's
arch/arm/mach-highbank/highbank.c highbank_l2c310_write_sec()
function).  The ABI for this is documented in kernel commit
8e56130dcb as being borrowed from the OMAP44xx ROM.  The OMAP44xx TRM
documents this function ID as having no return value and potentially
trashing all guest registers except SP and PC. For QEMU's purposes
(where our L2x0 model is a stub and enabling or disabling it doesn't
affect the guest behaviour) a simple "do nothing" SMC is fine.

We currently implement this NOP behaviour using a little bit of
Secure code we run before jumping to the guest kernel, which is
written by arm_write_secure_board_setup_dummy_smc().  The code sets
up a set of Secure vectors where the SMC entry point returns without
doing anything.

Now that the PSCI SMC emulation handles all SMC calls (setting r0 to
an error code if the input r0 function identifier is not recognized),
we can use that default behaviour as sufficient for the highbank
cache controller call.  (Because the guest code assumes r0 has no
interesting value on exit it doesn't matter that we set it to the
error code).  We can therefore delete the highbank board code that
sets secure_board_setup to true and writes the secure-code bootstub.

(Note that because the OMAP44xx ABI puts function-identifiers in
r12 and PSCI uses r0, we only avoid a clash because Linux's code
happens to put the function-identifier in both registers. But this
is true also when the kernel is running on real firmware that
implements both ABIs as far as I can see.)

This change fixes in passing booting on the 'midway' board model,
which has been completely broken since we added support for Hyp
mode to the Cortex-A15 CPU. When we did that boot.c was made to
start running the guest code in Hyp mode; this includes the
board_setup hook, which instantly UNDEFs because the NSACR is
not accessible from Hyp. (Put another way, we never made the
secure_board_setup hook support cope with Hyp mode.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Tested-by: Cédric Le Goater <clg@kaod.org>
Tested-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Message-id: 20220127154639.2090164-12-peter.maydell@linaro.org
2022-02-08 10:56:28 +00:00
Peter Maydell
33284d482c hw/arm: highbank: For EL3 guests, don't enable PSCI, start all cores
Change the highbank/midway boards to use the new boot.c functionality
to allow us to enable psci-conduit only if the guest is being booted
in EL1 or EL2, so that if the user runs guest EL3 firmware code our
PSCI emulation doesn't get in its way.

To do this we stop setting the psci-conduit and start-powered-off
properties on the CPU objects in the board code, and instead set the
psci_conduit field in the arm_boot_info struct to tell the common
boot loader code that we'd like PSCI if the guest is starting at an
EL that it makes sense with (in which case it will set these
properties).

This means that when running guest code at EL3, all the cores
will start execution at once on poweron. This matches the
real hardware behaviour. (A brief description of the hardware
boot process is in the u-boot documentation for these boards:
https://u-boot.readthedocs.io/en/latest/board/highbank/highbank.html#boot-process
 -- in theory one might run the 'a9boot'/'a15boot' secure monitor
code in QEMU, though we probably don't emulate enough for that.)

This affects the highbank and midway boards.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Tested-by: Cédric Le Goater <clg@kaod.org>
Tested-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Message-id: 20220127154639.2090164-10-peter.maydell@linaro.org
2022-02-08 10:56:28 +00:00