QEMU currently crashes when the user tries to do something like:
qemu-system-x86_64 -M x-remote -device piix3-ide
This happens because the "isabus" variable is not initialized with
the x-remote machine yet. Add a proper check for this condition
and propagate the error to the caller, so we can fail there gracefully.
Message-Id: <20210416125256.2039734-1-thuth@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
QEMU currently crashes when it's started like this:
cat << EOF | ./qemu-system-i386 -device vmxnet3 -nodefaults -qtest stdio
outl 0xcf8 0x80001014
outl 0xcfc 0xe0001000
outl 0xcf8 0x80001018
outl 0xcf8 0x80001004
outw 0xcfc 0x7
outl 0xcf8 0x80001083
write 0x0 0x1 0xe1
write 0x1 0x1 0xfe
write 0x2 0x1 0xbe
write 0x3 0x1 0xba
writeq 0xe0001020 0xefefff5ecafe0000
writeq 0xe0001020 0xffff5e5ccafe0002
EOF
It hits this assertion:
qemu-system-i386: ../qemu/hw/net/net_tx_pkt.c:453: net_tx_pkt_reset:
Assertion `pkt->raw' failed.
This happens because net_tx_pkt_init() is called with max_frags == 0 and
thus the allocation
p->raw = g_new(struct iovec, max_frags);
results in a NULL pointer that causes the
assert(pkt->raw);
in net_tx_pkt_reset() to fail later. To fix this issue we can check
that max_raw_frags was not zero before asserting that pkt->raw is
a non-NULL pointer.
Buglink: https://bugs.launchpad.net/qemu/+bug/1890157
Message-Id: <20210715193219.1132571-1-thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
QEMU should never abort just because the guest is doing something odd.
Let's simply log the error and ignore the bad transmit queue instead.
Buglink: https://bugs.launchpad.net/qemu/+bug/1926111
Message-Id: <20210715103755.1035566-1-thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Lots of last minute stuff.
vhost-user-i2c.
vhost-vsock SOCK_SEQPACKET support.
IOMMU bypass.
ACPI based pci hotplug.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmDxoZ8PHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRp76oH/1NJm5vFD+IjXC2sjKfxgqhi4QZGZOgU5rP6
fDQA4HA0VM0iO7wUqIwc77Dn2BJLDnHG2biNlr1E0LnUWcpTATjX9qlonMz4gFn0
1j49t1CmlkXinW55rVLsD5tjish1dR4mdzZMLQAxuUVxb5lH+rI2RDcbS2xrBs9G
jPOm5TBl1FDREqnToVpWD+/gaCQAvFyjpXbSBKDahsxAZDL9GCpcZKNGgyBQrnPK
1cYxPJZwTjVPwDXVArbFNCqUFemux2f5Hgx8gNFHomiNk9+mW0Q2quuVLwWHkbHC
KEI7ZTfpjJkDXsQayhQ7HCTsiFl0qwj44BNUX07ZvpNkxNXXRHM=
=4iA8
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream3' into staging
pc,pci,virtio: lots of new features
Lots of last minute stuff.
vhost-user-i2c.
vhost-vsock SOCK_SEQPACKET support.
IOMMU bypass.
ACPI based pci hotplug.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Fri 16 Jul 2021 16:11:27 BST
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream3:
vhost-vsock: SOCK_SEQPACKET feature bit support
docs: Add documentation for iommu bypass
hw/i386/acpi-build: Add IVRS support to bypass iommu
hw/i386/acpi-build: Add DMAR support to bypass iommu
hw/arm/virt-acpi-build: Add IORT support to bypass SMMUv3
hw/pci: Add pci_bus_range() to get PCI bus number range
hw/i386: Add a default_bus_bypass_iommu pc machine option
hw/arm/virt: Add default_bus_bypass_iommu machine option
hw/pxb: Add a bypass iommu property
hw/pci/pci_host: Allow PCI host to bypass iommu
docs: Add '-device intel-iommu' entry
hw/virtio: add vhost-user-i2c-pci boilerplate
hw/virtio: add boilerplate for vhost-user-i2c device
bios-tables-test: Update golden binaries
hw/acpi/ich9: Set ACPI PCI hot-plug as default on Q35
bios-tables-test: Allow changes in DSDT ACPI tables
hw/pci/pcie: Do not set HPC flag if acpihp is used
hw/acpi/ich9: Enable ACPI PCI hot-plug
hw/i386/acpi-build: Add ACPI PCI hot-plug methods to Q35
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This adds processing of VIRTIO_VSOCK_F_SEQPACKET features bit. Guest
negotiates it with vhost, thus both will know that SOCK_SEQPACKET
supported by peer.
Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
Message-Id: <20210622144747.2949134-1-arseny.krasnov@kaspersky.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Check bypass_iommu to exclude the devices which will bypass iommu.
Signed-off-by: Xingang Wang <wangxingang5@huawei.com>
Message-Id: <1625748919-52456-9-git-send-email-wangxingang5@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In DMAR table, the drhd is set to cover all PCI devices when intel_iommu
is on. To support bypass iommu feature, we need to walk the PCI bus with
bypass_iommu disabled and add explicit scope data in DMAR drhd structure.
/mnt/sdb/wxg/qemu-next/qemu/build/x86_64-softmmu/qemu-system-x86_64 \
-machine q35,accel=kvm,default_bus_bypass_iommu=true \
-cpu host \
-m 16G \
-smp 36,sockets=2,cores=18,threads=1 \
-device pxb-pcie,bus_nr=0x10,id=pci.10,bus=pcie.0,addr=0x3 \
-device pxb-pcie,bus_nr=0x20,id=pci.20,bus=pcie.0,addr=0x4,bypass_iommu=true \
-device pcie-root-port,port=0x1,chassis=1,id=pci.11,bus=pci.10,addr=0x0 \
-device pcie-root-port,port=0x2,chassis=2,id=pci.21,bus=pci.20,addr=0x0 \
-device virtio-scsi-pci,id=scsi0,bus=pci.11,addr=0x0 \
-device virtio-scsi-pci,id=scsi1,bus=pci.21,addr=0x0 \
-drive file=/mnt/sdb/wxg/fedora-48g.qcow2,format=qcow2,if=none,id=drive-scsi0-0-0-0,cache=none,aio=native \
-device scsi-hd,bus=scsi1.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1 \
-device intel-iommu \
-nographic \
And we get the guest configuration:
~ lspci -vt
-+-[0000:20]---00.0-[21]----00.0 Red Hat, Inc. Virtio SCSI
+-[0000:10]---00.0-[11]----00.0 Red Hat, Inc. Virtio SCSI
\-[0000:00]-+-00.0 Intel Corporation 82G33/G31/P35/P31 Express DRAM Controller
+-01.0 Device 1234:1111
+-02.0 Intel Corporation 82574L Gigabit Network Connection
+-03.0 Red Hat, Inc. QEMU PCIe Expander bridge
+-04.0 Red Hat, Inc. QEMU PCIe Expander bridge
+-1f.0 Intel Corporation 82801IB (ICH9) LPC Interface Controller
+-1f.2 Intel Corporation 82801IR/IO/IH (ICH9R/DO/DH) 6 port SATA Controller [AHCI mode]
\-1f.3 Intel Corporation 82801I (ICH9 Family) SMBus Controller
With bypass_iommu enabled on root bus, the attached devices will bypass iommu:
/sys/class/iommu/dmar0
├── devices
│ ├── 0000:10:00.0 -> ../../../../pci0000:10/0000:10:00.0
│ └── 0000:11:00.0 -> ../../../../pci0000:10/0000:10:00.0/0000:11:00.0
Signed-off-by: Xingang Wang <wangxingang5@huawei.com>
Message-Id: <1625748919-52456-8-git-send-email-wangxingang5@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
When we build IORT table with SMMUv3 and bypass iommu feature enabled,
we can no longer setup one map from RC to SMMUv3 covering the whole RIDs.
We need to walk the PCI bus and check whether the root bus will bypass
iommu, setup RC -> SMMUv3 -> ITS map for RC which will not bypass iommu.
When a SMMUv3 node exist, we setup the idmap from SMMUv3 to ITS
covering the whole RIDs, and only modify the map from RC to SMMUv3.
We build RC -> SMMUv3 -> ITS map for root bus with bypass_iommu
disabled, and build idmap from RC to ITS directly for the rest of
the whole RID space.
For example we run qemu with command line:
qemu/build/aarch64-softmmu/qemu-system-aarch64 \
-kernel arch/arm64/boot/Image \
-enable-kvm \
-cpu host \
-m 8G \
-smp 8,sockets=2,cores=4,threads=1 \
-machine virt,kernel_irqchip=on,gic-version=3,iommu=smmuv3,default_bus_bypass_iommu=true \
-drive file=./QEMU_EFI-pflash.raw,if=pflash,format=raw,unit=0,readonly=on \
-device pxb-pcie,bus_nr=0x10,id=pci.10,bus=pcie.0,addr=0x3.0x1 \
-device pxb-pcie,bus_nr=0x20,id=pci.20,bus=pcie.0,addr=0x3.0x2,bypass_iommu=true \
-device pcie-root-port,port=0x20,chassis=1,id=pci.1,bus=pcie.0,addr=0x2 \
-device pcie-root-port,port=0x20,chassis=11,id=pci.11,bus=pci.10,addr=0x1 \
-device pcie-root-port,port=0x20,chassis=21,id=pci.21,bus=pci.20,addr=0x1 \
-device virtio-scsi-pci,id=scsi0,bus=pci.1,addr=0x1 \
-device virtio-scsi-pci,id=scsi1,bus=pci.11,addr=0x1 \
-device virtio-scsi-pci,id=scsi2,bus=pci.21,addr=0x1 \
-initrd /mnt/davinci/wxg/kill-linux/rootfs/mfs.cpio.gz \
-nographic \
-append "rdinit=init console=ttyAMA0 earlycon=pl011,0x9000000 nokaslr" \
And we get guest configuration:
-+-[0000:20]---01.0-[21]--
+-[0000:10]---01.0-[11]--
\-[0000:00]-+-00.0 Device 1b36:0008
+-01.0 Device 1af4:1000
\-02.0-[01]--
With bypass_iommu enabled, the attached devices will bypass iommu.
/sys/class/iommu/smmu3.0x0000000009050000/
|-- device -> ../../../arm-smmu-v3.0.auto
|-- devices
| `-- 0000:10:01.0 -> ../../../../../pci0000:10/0000:10:01.0
Signed-off-by: Xingang Wang <wangxingang5@huawei.com>
Message-Id: <1625748919-52456-7-git-send-email-wangxingang5@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This helps to get the min and max bus number of a PCI bus hierarchy.
Signed-off-by: Xingang Wang <wangxingang5@huawei.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <1625748919-52456-6-git-send-email-wangxingang5@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add a default_bus_bypass_iommu pc machine option to enable/disable
bypass_iommu for default root bus. The option is disabled by default
and can be enabled with:
$QEMU -machine q35,default_bus_bypass_iommu=true
Signed-off-by: Xingang Wang <wangxingang5@huawei.com>
Message-Id: <1625748919-52456-5-git-send-email-wangxingang5@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add a default_bus_bypass_iommu machine option to enable/disable
bypass_iommu for default root bus. The option is disabled by
default and can be enabled with:
$QEMU -machine virt,iommu=smmuv3,default_bus_bypass_iommu=true
Signed-off-by: Xingang Wang <wangxingang5@huawei.com>
Message-Id: <1625748919-52456-4-git-send-email-wangxingang5@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add a bypass_iommu property for pci_expander_bridge, the property
is used to indicate whether pxb root bus will bypass iommu. By
default the bypass_iommu is disabled, and it can be enabled with:
qemu -device pxb-pcie,bus_nr=0x10,addr=0x1,bypass_iommu=true
Signed-off-by: Xingang Wang <wangxingang5@huawei.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <1625748919-52456-3-git-send-email-wangxingang5@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add a new bypass_iommu property for PCI host and use it to check
whether devices attached to the PCI root bus will bypass iommu.
In pci_device_iommu_address_space(), check the property and
avoid getting iommu address space for devices bypass iommu.
Signed-off-by: Xingang Wang <wangxingang5@huawei.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <1625748919-52456-2-git-send-email-wangxingang5@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This allows is to instantiate a vhost-user-i2c device as part of a PCI
bus. It is mostly boilerplate which looks pretty similar to the
vhost-user-fs-pci device.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Message-Id: <8a083eaa57d93feaab12acd1f94b225879212f20.1625806763.git.viresh.kumar@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This creates the QEMU side of the vhost-user-i2c device which connects
to the remote daemon. It is based of vhost-user-fs code.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Message-Id: <e80591b52fea4b51631818bb92a798a3daf90399.1625806763.git.viresh.kumar@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Q35 has three different types of PCI devices hot-plug: PCIe Native,
SHPC Native and ACPI hot-plug. This patch changes the default choice
for cold-plugged bridges from PCIe Native to ACPI Hot-plug with
ability to use SHPC and PCIe Native for hot-plugged bridges.
This is a list of the PCIe Native hot-plug issues that led to this
change:
* no racy behavior during boot (see 110c477c2e)
* no delay during deleting - after the actual power off software
must wait at least 1 second before indicating about it. This case
is quite important for users, it even has its own bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1594168
* no timer-based behavior - in addition to the previous example,
the attention button has a 5-second waiting period, during which
the operation can be canceled with a second press. While this
looks fine for manual button control, automation will result in
the need to queue or drop events, and the software receiving
events in all sort of unspecified combinations of attention/power
indicator states, which is racy and uppredictable.
* fixes:
* https://bugzilla.redhat.com/show_bug.cgi?id=1752465
* https://bugzilla.redhat.com/show_bug.cgi?id=1690256
To return to PCIe Native hot-plug:
-global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=off
Known issue: older linux guests need the following flag
to allow hotplugged pci express devices to use io:
-device pcie-root-port,io-reserve=4096.
io is unusual for pci express so this seems minor.
We'll fix this by a follow up patch.
Signed-off-by: Julia Suvorova <jusual@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20210713004205.775386-6-jusual@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Instead of changing the hot-plug type in _OSC register, do not
set the 'Hot-Plug Capable' flag. This way guest will choose ACPI
hot-plug if it is preferred and leave the option to use SHPC with
pcie-pci-bridge.
The ability to control hot-plug for each downstream port is retained,
while 'hotplug=off' on the port means all hot-plug types are disabled.
Signed-off-by: Julia Suvorova <jusual@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20210713004205.775386-4-jusual@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add acpi_pcihp to ich9_pm as part of
'acpi-pci-hotplug-with-bridge-support' option. Set default to false.
Signed-off-by: Julia Suvorova <jusual@redhat.com>
Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20210713004205.775386-3-jusual@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Implement notifications and gpe to support q35 ACPI PCI hot-plug.
Use 0xcc4 - 0xcd7 range for 'acpi-pci-hotplug' io ports.
Signed-off-by: Julia Suvorova <jusual@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Message-Id: <20210713004205.775386-2-jusual@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Coverity reports that we don't check the error result of fdt_pack(), so
let's save the result and assert that it is 0.
Fixes: Coverity CID 1458136
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Message-id: 07325315b49d5555269f76094e4bc5296e0643b9.1626303527.git.alistair.francis@wdc.com
Currently the firmware dynamic info (fw_dyn) is put right after
the reset vector, which is not 8-byte aligned on RV64. OpenSBI
fw_dynamic uses ld to read contents from 'struct fw_dynamic_info',
which expects fw_dyn to be on the 8-byte boundary, otherwise the
misaligned load exception may happen. Fortunately this does not
cause any issue on QEMU, as QEMU does support misaligned load.
RV32 does not have any issue as it is 4-byte aligned already.
Change to make sure it is 8-byte aligned which works for both
RV32 and RV64.
Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20210708143319.10441-1-bmeng.cn@gmail.com
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
At present the CLINT timebase frequency is set to 10MHz on sifive_u,
but on the real hardware the timebase frequency is 1Mhz.
Signed-off-by: Bin Meng <bin.meng@windriver.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20210706102616.1922469-1-bmeng.cn@gmail.com
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Introduce the X86_FW_OVMF Kconfig symbol for OVMF-specific code.
Move the OVMF-specific code from pc_sysfw.c to pc_sysfw_ovmf.c,
adding a pair of stubs.
Update MAINTAINERS to reach OVMF maintainers when these new
files are modified.
This fixes when building the microvm machine standalone:
/usr/bin/ld: libqemu-i386-softmmu.fa.p/target_i386_monitor.c.o: in
function `qmp_sev_inject_launch_secret':
target/i386/monitor.c:749: undefined reference to `pc_system_ovmf_table_find'
Fixes: f522cef9b3 ("sev: update sev-inject-launch-secret to make gpa optional")
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20210616204328.2611406-22-philmd@redhat.com>
Add assertion in pc_system_ovmf_table_find that verifies that the flash
was indeed previously parsed (looking for the OVMF table) by
pc_system_parse_ovmf_flash.
Now pc_system_ovmf_table_find distinguishes between "no one called
pc_system_parse_ovmf_flash" (which will abort due to assertion failure)
and "the flash was parsed but no OVMF table was found, or it is invalid"
(which will return false).
Suggested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Dov Murik <dovmurik@linux.ibm.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210701052749.934744-2-dovmurik@linux.ibm.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Fix pba_offset initialization value for BAIDU KUNLUN Virtual
Function device. The KUNLUN hardware returns an incorrect
value for the VF PBA offset, and add a quirk to instead
return a hardcoded value of 0xb400.
Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Link: https://lore.kernel.org/r/20210713093743.942-1-caihuoqing@baidu.com
[aw: comment & whitespace tuning]
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
CID 1458134: Integer handling issues (BAD_SHIFT)
In expression "1 << ctz64(container->pgsizes)", left shifting by more
than 31 bits has undefined behavior. The shift amount,
"ctz64(container->pgsizes)", is 64.
Commit 5e3b981c33 ("vfio: Support for RamDiscardManager in the !vIOMMU
case") added an assertion that our granularity is at least as big as the
page size.
Although unlikely, we could have a page size that does not fit into
32 bit. In that case, we'd try shifting by more than 31 bit.
Let's use 1ULL instead and make sure we're not shifting by more than 63
bit by asserting that any bit in container->pgsizes is set.
Fixes: CID 1458134
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Auger Eric <eric.auger@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: teawater <teawaterz@linux.alibaba.com>
Cc: Marek Kedzierski <mkedzier@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com>
Link: https://lore.kernel.org/r/20210712083135.15755-1-david@redhat.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
The GDateTime APIs provided by GLib avoid portability pitfalls, such
as some platforms where 'struct timeval.tv_sec' field is still 'long'
instead of 'time_t'. When combined with automatic cleanup, GDateTime
often results in simpler code too.
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
I thought I'd sent the last PR before the 6.1 soft freeze, but
unfortunately I need one more. This last minute one puts in a SLOF
update, along with a couple of bugfixes.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEdfRlhq5hpmzETofcbDjKyiDZs5IFAmDs9VgACgkQbDjKyiDZ
s5JPHRAApamC4lmoHD9eENznNKPvggAQ10h0OMNEvATyq4boAQ6rZRdAkeBqYAGA
5DF0sWIyRy7/IZUSEcHHlHiw1KQqem3lBUYWQ1L78nt6AphcRQciTeQ73WMIsduA
ruFxYlCHxFQ/2wixUWmyCnGyKqFsinrbc9DrAfPFnuf3SMwr0gl4x+V+mwQbcvRZ
dn/rR8RXmOnqgX8dsViyftnmijqoyIUSWWPL7jk5WiaRdRcdCd8ly9pmkinPj6IX
k+Cgty3DSV0mn9d8zH+tDkqXwU8R/HHY8TWkmLSTtR1nXtbDBIphcmeLQ8j8Eugy
SNWxZb3ft2fmfPJICcCYOy0qcPyNekRRkmQhADqtoA4OVAdd5QQmVNXtmAV+jKp7
WX4Ozsbt4P1FXSuvhzyOTIumNsz9NxgtuGmnEl09suJ2WdzN4XOI1SzC9/JzPM/s
K/0dalIQf9NymyWQMpbVUFcPiAqGr+yuHXy5FZssTa/lgD76Odds5EVFmua95HMl
J1XRMRYmsKzRq/TCOZFr72cCzGOixzYY/Oe/yoa48oPX5HMchCsm5h5ljgqKgTh2
R7uAHmqNvsvJ0PuH9DWCPEMGr0f1f16m4ayIELysyvd1geSL/SQ9nuT/phaqmUKO
Myo0unIcuJbagf9JwG19j9fVp1dpNee/AhR38jlaNgMNX2sXI5M=
=i+2e
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/dg-gitlab/tags/ppc-for-6.1-20210713' into staging
ppc patch queue 2021-07-13
I thought I'd sent the last PR before the 6.1 soft freeze, but
unfortunately I need one more. This last minute one puts in a SLOF
update, along with a couple of bugfixes.
# gpg: Signature made Tue 13 Jul 2021 03:07:20 BST
# gpg: using RSA key 75F46586AE61A66CC44E87DC6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" [full]
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" [full]
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" [full]
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>" [unknown]
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dg-gitlab/tags/ppc-for-6.1-20210713:
mv64361: Remove extra break from a switch case
pseries: Update SLOF firmware image
ppc/pegasos2: Allow setprop in VOF
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When parsing cpus= attribute of -numa object couple of checks
is performed, such as correct initiator setting (see the if()
statement at the end of for() loop in
machine_set_cpu_numa_node()).
However, with the current code cpus= attribute is parsed before
initiator= attribute and thus the check may fail even though it
is not obvious why. But since parsing the initiator= attribute
does not depend on the cpus= attribute we can swap the order of
the two.
It's fairly easy to reproduce with the following command line
(snippet of an actual cmd line):
-smp 4,sockets=4,cores=1,threads=1 \
-object '{"qom-type":"memory-backend-ram","id":"ram-node0","size":2147483648}' \
-numa node,nodeid=0,cpus=0-1,initiator=0,memdev=ram-node0 \
-object '{"qom-type":"memory-backend-ram","id":"ram-node1","size":2147483648}' \
-numa node,nodeid=1,cpus=2-3,initiator=1,memdev=ram-node1 \
-numa hmat-lb,initiator=0,target=0,hierarchy=memory,data-type=access-latency,latency=5 \
-numa hmat-lb,initiator=0,target=0,hierarchy=first-level,data-type=access-latency,latency=10 \
-numa hmat-lb,initiator=1,target=1,hierarchy=memory,data-type=access-latency,latency=5 \
-numa hmat-lb,initiator=1,target=1,hierarchy=first-level,data-type=access-latency,latency=10 \
-numa hmat-lb,initiator=0,target=0,hierarchy=memory,data-type=access-bandwidth,bandwidth=204800K \
-numa hmat-lb,initiator=0,target=0,hierarchy=first-level,data-type=access-bandwidth,bandwidth=208896K \
-numa hmat-lb,initiator=1,target=1,hierarchy=memory,data-type=access-bandwidth,bandwidth=204800K \
-numa hmat-lb,initiator=1,target=1,hierarchy=first-level,data-type=access-bandwidth,bandwidth=208896K \
-numa hmat-cache,node-id=0,size=10K,level=1,associativity=direct,policy=write-back,line=8 \
-numa hmat-cache,node-id=1,size=10K,level=1,associativity=direct,policy=write-back,line=8 \
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <b27a6a88986d63e3f610a728c845e01ff8d92e2e.1625662776.git.mprivozn@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
When setting up NUMA with HMAT enabled there's a check performed
in machine_set_cpu_numa_node() that reports an error when a NUMA
node has a CPU but the node's initiator is not itself. The error
message reported contains only the expected value and not the
actual value (which is different because an error is being
reported). Report both values in the error message.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com>
Message-Id: <ebdf871551ea995bafa7a858899a26aa9bc153d3.1625662776.git.mprivozn@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Currently all built-in CPUs report cache information via CPUID leaves 2
and 4, but these have never been defined for AMD. In the case of
SEV-SNP this can cause issues with CPUID enforcement. Address this by
allowing CPU types to suppress these via a new "x-vendor-cpuid-only"
CPU property, which is true by default, but switched off for older
machine types to maintain compatibility.
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: zhenwei pi <pizhenwei@bytedance.com>
Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Message-Id: <20210708003623.18665-1-michael.roth@amd.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The switch case of writing PCI 1 IO base address had an extra break
statement that made part of the code unreachable. This did not cause a
problem as guests ususally leave this register at its default value.
Fixes: dcdf98a901 ("Add emulation of Marvell MV64361 PPC system
controller")
Reported-by: Coverity (CID 1458135)
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Message-Id: <20210712131259.B705B7456E3@zero.eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Linux needs setprop to fix up the device tree, otherwise it's not
finding devices and cannot boot. Since recent VOF change now we need
to add a callback to allow this which is what this patch does.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Message-Id: <20210709132920.6544E7457EF@zero.eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
OSS-Fuzz found sending illegal addresses when querying the write
protection bits triggers an assertion:
qemu-fuzz-i386: hw/sd/sd.c:824: uint32_t sd_wpbits(SDState *, uint64_t): Assertion `wpnum < sd->wpgrps_size' failed.
==11578== ERROR: libFuzzer: deadly signal
#8 0x7ffff628e091 in __assert_fail
#9 0x5555588f1a3c in sd_wpbits hw/sd/sd.c:824:9
#10 0x5555588dd271 in sd_normal_command hw/sd/sd.c:1383:38
#11 0x5555588d777c in sd_do_command hw/sd/sd.c
#12 0x555558cb25a0 in sdbus_do_command hw/sd/core.c💯16
#13 0x555558e02a9a in sdhci_send_command hw/sd/sdhci.c:337:12
#14 0x555558dffa46 in sdhci_write hw/sd/sdhci.c:1187:9
#15 0x5555598b9d76 in memory_region_write_accessor softmmu/memory.c:489:5
Similarly to commit 8573378e62 ("hw/sd: fix out-of-bounds check
for multi block reads"), check the address range before sending
the status of the write protection bits.
Include the qtest reproducer provided by Alexander Bulekov:
$ make check-qtest-i386
...
Running test qtest-i386/fuzz-sdcard-test
qemu-system-i386: ../hw/sd/sd.c:824: sd_wpbits: Assertion `wpnum < sd->wpgrps_size' failed.
Reported-by: OSS-Fuzz (Issue 29225)
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/450
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Reviewed-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <20210702155900.148665-4-f4bug@amsat.org>
Multiple commands have to check the address requested is valid.
Extract this code pattern as a new address_in_range() helper, and
log invalid accesses as guest errors.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Message-Id: <20210624142209.1193073-3-f4bug@amsat.org>
We report the card is in an inconsistent state, but don't precise
in which state it is. Add this information, as it is useful when
debugging problems.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Message-Id: <20210624142209.1193073-2-f4bug@amsat.org>
Reviewed-by: Alexander Bulekov <alxndr@bu.edu>
Commit 3fe9a838ec "dp8393x: Always use 32-bit accesses" set .impl.min_access_size
and .impl.max_access_size to 4 to try and fix the Linux jazzsonic driver which uses
32-bit accesses.
The problem with forcing the register access to 32-bit in this way is that since the
dp8393x uses 16-bit registers, a manual endian swap is required for devices on big
endian machines with 32-bit accesses.
For both access sizes and machine endians the QEMU memory API can do the right thing
automatically: all that is needed is to set .impl.min_access_size to 2 to declare that
the dp8393x implements 16-bit registers.
Normally .impl.max_access_size should also be set to 2, however that doesn't quite
work in this case since the register stride is specified using a (dynamic) it_shift
property which is applied during the MMIO access itself. The effect of this is that
for a 32-bit access the memory API performs 2 x 16-bit accesses, but the use of
it_shift within the MMIO access itself causes the register value to be repeated in both
the top 16-bits and bottom 16-bits. The Linux jazzsonic driver expects the stride to be
zero-extended up to access size and therefore fails to correctly detect the dp8393x
device due to the extra data in the top 16-bits.
The solution here is to remove .impl.max_access_size so that the memory API will
correctly zero-extend the 16-bit registers to the access size up to and including
it_shift. Since it_shift is never greater than 2 than this will always do the right
thing for both 16-bit and 32-bit accesses regardless of the machine endian, allowing
the manual endian swap code to be removed.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Fixes: 3fe9a838ec ("dp8393x: Always use 32-bit accesses")
Message-Id: <20210705214929.17222-2-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Finn Thain <fthain@linux-m68k.org>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Instead of accessing N registers via a single address_space API
call using a temporary buffer (stored in the device state) and
updating each register, move the address_space call in the
register put/get. The load/store and word size checks are moved
to put/get too. This simplifies a bit, making the code easier
to read.
Co-developed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Co-developed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Finn Thain <fthain@linux-m68k.org>
Message-Id: <20210710174954.2577195-8-f4bug@amsat.org>
Per the DP83932C datasheet from July 1995:
4.0 SONIC Registers
4.1 THE CAM UNIT
The Content Addressable Memory (CAM) consists of sixteen
48-bit entries for complete address filtering of network
packets. Each entry corresponds to a 48-bit destination
address that is user programmable and can contain any
combination of Multicast or Physical addresses. Each entry
is partitioned into three 16-bit CAM cells accessible
through CAM Address Ports (CAP 2, CAP 1 and CAP 0) with
CAP0 corresponding to the least significant 16 bits of
the Destination Address and CAP2 corresponding to the
most significant bits.
Store the CAM registers as 16-bit as it simplifies the code.
Having now the CAM registers as arrays of 3 uint16_t, we can avoid
using the VMSTATE_BUFFER_UNSAFE macro by using VMSTATE_UINT16_2DARRAY
which is more appropriate. This breaks the migration stream however.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Finn Thain <fthain@linux-m68k.org>
Message-Id: <20210710174954.2577195-5-f4bug@amsat.org>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Finn Thain <fthain@linux-m68k.org>
Message-Id: <20210710174954.2577195-3-f4bug@amsat.org>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Replace address_space_rw(is_write=1) by address_space_write()
and remove pointless cast.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Finn Thain <fthain@linux-m68k.org>
Message-Id: <20210710174954.2577195-2-f4bug@amsat.org>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Currently when a LOAD CAM command is executed the entries are loaded into the
CAM from memory in order which is incorrect. According to the datasheet the
first entry in the CAM descriptor is the entry index which means that each
descriptor may update any single entry in the CAM rather than the Nth entry.
Decode the CAM entry index and use it store the descriptor in the appropriate
slot in the CAM. This fixes the issue where the MacOS toolbox loads a single
CAM descriptor into the final slot in order to perform a loopback test which
must succeed before the Ethernet port is enabled.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Finn Thain <fthain@linux-m68k.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210625065401.30170-10-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Rather than using the magic 0x80000000 number for the PCI I/O BAR
physical address on the main system bus, use a definition.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20210417103028.601124-6-f4bug@amsat.org>
The ASIC PCI bridge chipset from Motorola is named 'Raven'.
This chipset is used in the PowerPC Reference Platform (PReP),
but not restricted to it. Rename it accordingly.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20210417103028.601124-5-f4bug@amsat.org>
Move some ADC file to where they belong and move some sensors to a
sensor directory, since with new BMCs coming in lots of different
sensors should be coming in. Keep from cluttering things up.
Add support for I2C PMBus devices.
Replace the confusing and error-prone i2c_send_recv and i2c_transfer with
specific send and receive functions. Several errors have already been
made with these, avoid any new errors.
Fix the watchdog_expired field in the IPMI watchdog, it's not a bool,
it's a u8. After a vmstate transfer, the new value could be wrong.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE/Q1c5nzg9ZpmiCaGYfOMkJGb/4EFAmDoeGAACgkQYfOMkJGb
/4HpkQ//YLAddy3Kqo5LbyI0BPxcx567htHH34XXp4Pz22DcAoPvZRyRlsGiAAcR
HRmqjSXNBan+KhlA82coczHUami8C+9NQAfnABf7Mv4CQnvf9Hziy9YR0riTAgEk
Xfc25eYYK5dzfOHVVbp2KJ2IWPo/sY3UqbIKu9MyYzriNpzIuT5NW1aGjIrhW7Js
2feeIHxH4fEXzhkkgMKMJfXdxkYyD78mB3IfnpVCnoODq3fMZXtMskRDHWjdMEnT
0u53zeQd5T0jElLZXlwDKAd9QpiPWVnfTJwND3fwWjSMIAVkd7kTqJbVtHpYJMWB
amOR9qsw72r/WIOow4lLbSjS57GbyEjB344dtlrzHhhW68y9SGbtaSXWD/imkAfB
NhLlMzNzsIb+dHZhAwdEgmg4Nz/1HZngiK4Awk2YFBVCn7G6BqAvpMwLGdYYr3Az
P4EPs0aVdGYBbcN5XxAwkKEqESrzF+iF7iZvNTocdSJSCV+ebFNLlZWIq/aXS1TA
IED1mBmyc/drWF+jA0D520uafT1m+zN4HhPvHpRyK65bqqiH2veDpK20kz0lwQe8
e2ZeDVG2oVTMIQ+mmyu7kD9xgxjmapoHz64K1VkUwsnXE3jp9yqfa17B9vHOjqkq
e/iaHiek8xeTWWyoShGk2zoAHBChkwSy/Y2JuZA6FXRnoU6kKsg=
=8708
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/cminyard/tags/for-qemu-6.1-2' into staging
Some qemu updates for IPMI and I2C
Move some ADC file to where they belong and move some sensors to a
sensor directory, since with new BMCs coming in lots of different
sensors should be coming in. Keep from cluttering things up.
Add support for I2C PMBus devices.
Replace the confusing and error-prone i2c_send_recv and i2c_transfer with
specific send and receive functions. Several errors have already been
made with these, avoid any new errors.
Fix the watchdog_expired field in the IPMI watchdog, it's not a bool,
it's a u8. After a vmstate transfer, the new value could be wrong.
# gpg: Signature made Fri 09 Jul 2021 17:25:04 BST
# gpg: using RSA key FD0D5CE67CE0F59A6688268661F38C90919BFF81
# gpg: Good signature from "Corey Minyard <cminyard@mvista.com>" [unknown]
# gpg: aka "Corey Minyard <minyard@acm.org>" [unknown]
# gpg: aka "Corey Minyard <corey@minyard.net>" [unknown]
# gpg: aka "Corey Minyard <minyard@mvista.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: FD0D 5CE6 7CE0 F59A 6688 2686 61F3 8C90 919B FF81
* remotes/cminyard/tags/for-qemu-6.1-2: (24 commits)
tests/qtest: add tests for MAX34451 device model
hw/misc: add MAX34451 device
tests/qtest: add tests for ADM1272 device model
hw/misc: add ADM1272 device
hw/i2c: add support for PMBus
ipmi/sim: fix watchdog_expired data type error in IPMIBmcSim struct
hw/i2c: Introduce i2c_start_recv() and i2c_start_send()
hw/i2c: Extract i2c_do_start_transfer() from i2c_start_transfer()
hw/i2c: Make i2c_start_transfer() direction argument a boolean
hw/i2c: Rename i2c_set_slave_address() -> i2c_slave_set_address()
hw/i2c: Remove confusing i2c_send_recv()
hw/misc/auxbus: Replace i2c_send_recv() by i2c_recv() & i2c_send()
hw/misc/auxbus: Replace 'is_write' boolean by its value
hw/misc/auxbus: Explode READ_I2C / WRITE_I2C_MOT cases
hw/misc/auxbus: Fix MOT/classic I2C mode
hw/i2c/ppc4xx_i2c: Replace i2c_send_recv() by i2c_recv() & i2c_send()
hw/i2c/ppc4xx_i2c: Add reference to datasheet
hw/display/sm501: Replace i2c_send_recv() by i2c_recv() & i2c_send()
hw/display/sm501: Simplify sm501_i2c_write() logic
hw/input/lm832x: Define TYPE_LM8323 in public header
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
- Make blockdev-reopen stable
- Remove deprecated qemu-img backing file without format
- rbd: Convert to coroutines and add write zeroes support
- rbd: Updated MAINTAINERS
- export/fuse: Allow other users access to the export
- vhost-user: Fix backends without multiqueue support
- Fix drive-backup transaction endless drained section
-----BEGIN PGP SIGNATURE-----
iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAmDoRdIRHGt3b2xmQHJl
ZGhhdC5jb20ACgkQfwmycsiPL9bvgQ/+Ogq24n1UOQc8FEKRYfyhajNToQ9ofzWN
iLiblSGx2QDq+CauD3qdu6z7DLlqEXeoM4NYM462oIPumptQj+9XZt7ftfh6FLWW
4yJEbjfnVKOba+vFdJ+E0DStwnPaxYdnrPGd53cwHZfbZh4ZmkpTM350mzHHiLTb
KYKOgWd+UHZbkYeCVNYTGe30SRBiKeAecTpsVZ5HVhe7LstjByuy5stk8dytLpdV
YqdKOToZfOp77XiHr8YcLLp1HHBGlr5hw73V4SDas0beCp7hqtnAqsTYyXBue4xO
4zfD4Gujr5JVOCb0crDTyOmOQY5E+y2dqFoOUF00D5AoN2vj4nfQ9ESkbqlE9BVh
mgJ1izSokYlN2X8rIwGXNR5fbxRmxxfkAA4rScNRytj1KxDHyrDxrp/k8YFemxSQ
qwgb/FBm0fcr69evPRzovKwZFhcyPremksluHQE4rZZ66qBQ2cGuDJPE7PWVTpPH
67JCrIVK/O6n5p+4ilFHmQQ3aP3ol0frMFcboYVRchJ2MhIDTsfFL3F/tTK8hy86
AmrrdQ1BQIAoKNOKnAmOSOUdExM55OcfPmX69+AhEk2GeWP6kgz5Pks4H3qCiKGf
YoRk8F1V+N4q+C0mFFovB61bNQ6COIlBuzmD9EtmpDD/Ta3Wib+3ZnoGVIdPS+OI
jyj+qJxd9z4=
=kH+r
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches
- Make blockdev-reopen stable
- Remove deprecated qemu-img backing file without format
- rbd: Convert to coroutines and add write zeroes support
- rbd: Updated MAINTAINERS
- export/fuse: Allow other users access to the export
- vhost-user: Fix backends without multiqueue support
- Fix drive-backup transaction endless drained section
# gpg: Signature made Fri 09 Jul 2021 13:49:22 BST
# gpg: using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6
# gpg: issuer "kwolf@redhat.com"
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream: (28 commits)
block: Make blockdev-reopen stable API
iotests: Test reopening multiple devices at the same time
block: Support multiple reopening with x-blockdev-reopen
block: Acquire AioContexts during bdrv_reopen_multiple()
block: Add bdrv_reopen_queue_free()
qcow2: Fix dangling pointer after reopen for 'file'
qemu-img: Improve error for rebase without backing format
qemu-img: Require -F with -b backing image
qcow2: Prohibit backing file changes in 'qemu-img amend'
blockdev: fix drive-backup transaction endless drained section
vhost-user: Fix backends without multiqueue support
MAINTAINERS: add block/rbd.c reviewer
block/rbd: fix type of task->complete
iotests/fuse-allow-other: Test allow-other
iotests/308: Test +w on read-only FUSE exports
export/fuse: Let permissions be adjustable
export/fuse: Give SET_ATTR_SIZE its own branch
export/fuse: Add allow-other option
export/fuse: Pass default_permissions for mount
util/uri: do not check argument of uri_free()
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Here's a (probably) final pull request before the qemu-6.1 soft
freeze. Includes:
* Implementation of the new H_RPT_INVALIDATE hypercall
* Virtual Open Firmware for pSeries and pegasos2 machine types.
This is an experimental minimal Open Firmware implementation which
works by delegating nearly everything to qemu itself via a special
hypercall.
* A number of cleanups to the ppc soft MMU code
* Fix to handling of two-level radix mode translations for the
powernv machine type
* Update the H_GET_CPU_CHARACTERISTICS call with newly defined bits.
This will allow more flexible handling of possible future CPU
Spectre-like flaws
* Correctly treat mtmsrd as an illegal instruction on BookE cpus
* Firmware update for the ppce500 machine type
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEdfRlhq5hpmzETofcbDjKyiDZs5IFAmDn27oACgkQbDjKyiDZ
s5I3eRAA2q76JMP1wH/orAS4gwgJVKxpdQ8F29xgtUnmL1w5RlVs2E0gXSEHYdt2
8rwmxtaz2iCzvc3hv6jZMjFz6A+otrEPFUqlE030mruxQDj2JXFnNLQP2dir3ZPg
Nn0K2U+ChSr2MXjSyUzbB0vQJSVyLxFmR43MsyCbeHSxq2kfSuZ2dNfclzUJ0IXD
8QtCnjZrnOLHtaJ2Vkr/11Yb7rFmbDVZkA1c/ljE3NHGiYjWyZBgSG/Mk/SLeEZe
7wVblUFKZtuiqGCyg2BBAnoWJXPDzDO/ZHFsn5NeUf2d5KTgoeKO3MYfVKQLv3d2
W8JdI09S1OL6g1XEMWvm80S8NPCi0YxUGBXCJaKnuofiU+qwzBMUoj7Xk/2gheT3
uWZCSATUWiKLmOzksR4PbKmHCG9J1EiEMLma7IoNuVw6+pLwMgurM3hlYZtrXGSh
35oBsUT5fMkAM3BtkKh/ZKOfvKfgb1M5FmickI9O0L9BXbzPrmXre5fENhI0ROVs
JeKNPjk/QUG1ftEMqpoYms+JR1rUiUN+jQBh+sFfJTi3CJFAbomoPitV6iGGPWZR
sbDCRqkOyqZ9fR3hEXHzO+ThhwoI2xJPIs6+3/8XbVnFd74siqxOornWJZPpKjcf
CSuOL2n81KJab8h/ERnb9QmJJWb9IJGv6YgZ5E4EFARDWvdSE3c=
=9cX3
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/dg-gitlab/tags/ppc-for-6.1-20210709' into staging
ppc patch queue 2021-07-09
Here's a (probably) final pull request before the qemu-6.1 soft
freeze. Includes:
* Implementation of the new H_RPT_INVALIDATE hypercall
* Virtual Open Firmware for pSeries and pegasos2 machine types.
This is an experimental minimal Open Firmware implementation which
works by delegating nearly everything to qemu itself via a special
hypercall.
* A number of cleanups to the ppc soft MMU code
* Fix to handling of two-level radix mode translations for the
powernv machine type
* Update the H_GET_CPU_CHARACTERISTICS call with newly defined bits.
This will allow more flexible handling of possible future CPU
Spectre-like flaws
* Correctly treat mtmsrd as an illegal instruction on BookE cpus
* Firmware update for the ppce500 machine type
# gpg: Signature made Fri 09 Jul 2021 06:16:42 BST
# gpg: using RSA key 75F46586AE61A66CC44E87DC6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" [full]
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" [full]
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" [full]
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>" [unknown]
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dg-gitlab/tags/ppc-for-6.1-20210709: (33 commits)
target/ppc: Support for H_RPT_INVALIDATE hcall
linux-headers: Update
spapr: Fix implementation of Open Firmware client interface
target/ppc: Don't compile ppc_tlb_invalid_all without TCG
ppc/pegasos2: Implement some RTAS functions with VOF
ppc/pegasos2: Fix use of && instead of &
ppc/pegasos2: Use Virtual Open Firmware as firmware replacement
target/ppc/spapr: Update H_GET_CPU_CHARACTERISTICS L1D cache flush bits
target/ppc: Allow virtual hypervisor on CPU without HV
ppc/pegasos2: Introduce Pegasos2MachineState structure
target/ppc: mtmsrd is an illegal instruction on BookE
spapr: Implement Open Firmware client interface
docs/system: ppc: Update ppce500 documentation with eTSEC support
roms/u-boot: Bump ppce500 u-boot to v2021.07 to add eTSEC support
target/ppc: change ppc_hash32_xlate to use mmu_idx
target/ppc: introduce mmu-books.h
target/ppc: changed ppc_hash64_xlate to use mmu_idx
target/ppc: fix address translation bug for radix mmus
target/ppc: Fix compilation with DEBUG_BATS debug option
target/ppc: Fix compilation with FLUSH_ALL_TLBS debug option
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We currently select CONFIG_V7M for a bunch of our m-profile devices.
The last sticking point is translate.c which cannot be compiled
without expecting v7m support. Express this dependency in Kconfig
rather than in default devices as a stepping stone to a fully
configurable translate.c.
While we are at it we also need to select ARM_COMPATIBLE_SEMIHOSTING
as that is implied for M profile machines.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210707131744.26027-4-alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We need this functionality due to:
/* XRAM IRQs get ORed into a single line. */
object_initialize_child(OBJECT(s), "xram-irq-orgate",
&s->lpd.xram.irq_orgate, TYPE_OR_IRQ);
Fixes: a55b441b2c ("hw/arm: versal: Add support for the XRAMs")
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210707131744.26027-3-alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Drop one more shared library dependency (libusb) from core qemu.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Jose R. Ziviani <jziviani@suse.de>
Message-Id: <20210624103836.2382472-34-kraxel@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Jose R. Ziviani <jziviani@suse.de>
Message-Id: <20210624103836.2382472-33-kraxel@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Introduce an usb device flag instead, set it when usb-host looks at the
device descriptors anyway. Also set it for emulated storage devices,
for consistency. Add an inline helper function to check the flag.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Jose R. Ziviani <jziviani@suse.de>
Message-Id: <20210624103836.2382472-32-kraxel@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Jose R. Ziviani <jziviani@suse.de>
Message-Id: <20210624103836.2382472-13-kraxel@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Jose R. Ziviani <jziviani@suse.de>
Message-Id: <20210624103836.2382472-11-kraxel@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Jose R. Ziviani <jziviani@suse.de>
Message-Id: <20210624103836.2382472-10-kraxel@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Jose R. Ziviani <jziviani@suse.de>
Message-Id: <20210624103836.2382472-7-kraxel@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Jose R. Ziviani <jziviani@suse.de>
Message-Id: <20210624103836.2382472-6-kraxel@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add a space in the message printed when gicr_read*/gicr_write* returns
MEMTX_ERROR in arm_gicv3_redist.c.
Signed-off-by: Rebecca Cran <rebecca@nuviainc.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20210706211432.31902-1-rebecca@nuviainc.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The stellaris board doesn't emulate the handling of the OLED
chipselect line correctly. Expand the comment describing this,
including a sketch of the theoretical correct way to do it.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The Luminary PL061s in the Stellaris LM3S9695 don't all have the same
reset value for GPIOPUR. We can get away with not letting the board
configure the PUR reset value because we don't actually wire anything
up to the lines which should reset to pull-up. Add a comment noting
this omission.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
The PL061 comes out of reset with all its lines configured as input,
which means they might need to be pulled to 0 or 1 depending on the
'pullups' and 'pulldowns' properties. Currently we do not assert
these lines on reset; they will only be set whenever the guest first
touches a register that triggers a call to pl061_update().
Convert the device to three-phase reset so we have a place where we
can safely call qemu_set_irq() to set the floating lines to their
correct values.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
For the virt board we have two PL061 devices -- one for NonSecure which
is inputs only, and one for Secure which is outputs only. For the former,
we don't care whether its outputs are pulled low or high when the line is
configured as an input, because we don't connect them. For the latter,
we do care, because we wire the lines up to the gpio-pwr device, which
assumes that level 1 means "do the action" and 1 means "do nothing".
For consistency in case we add more outputs in future, configure both
PL061s to pull GPIO lines down to 0.
Reported-by: Maxim Uvarov <maxim.uvarov@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
The PL061 GPIO does not itself include pullup or pulldown resistors
to set the value of a GPIO line treated as an output when it is
configured as an input (ie when the PL061 itself is not driving it).
In real hardware it is up to the board to add suitable pullups or
pulldowns. Currently our implementation hardwires this to "outputs
pulled high", which is correct for some boards (eg the realview ones:
see figure 3-29 in the "RealView Platform Baseboard for ARM926EJ-S
User Guide" DUI0224I), but wrong for others.
In particular, the wiring in the 'virt' board and the gpio-pwr device
assumes that wires should be pulled low, because otherwise the
pull-to-high will trigger a shutdown or reset action. (The only
reason this doesn't happen immediately on startup is due to another
bug in the PL061, where we don't assert the GPIOs to the correct
value on reset, but will do so as soon as the guest touches a
register and pl061_update() gets called.)
Add properties to the pl061 so the board can configure whether it
wants GPIO lines to have pullup, pulldown, or neither.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
The Luminary variant of the PL061 has registers GPIOPUR and GPIOPDR
which lets the guest configure whether the GPIO lines are pull-up,
pull-down, or truly floating. Instead of assuming all lines are pulled
high, honour the PUR and PDR registers.
For the plain PL061, continue to assume that lines have an external
pull-up resistor, as we did before.
The stellaris board actually relies on this behaviour -- the CD line
of the ssd0323 display device is connected to GPIO output C7, and it
is only because of a different bug which we're about to fix that we
weren't incorrectly driving this line high on reset and putting the
ssd0323 into data mode.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Add a comment documenting the "QEMU interface" of this device:
which MMIO regions, IRQ lines, GPIO lines, etc it exposes.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Add tracepoints for reads and writes to the PL061 registers. This requires
restructuring pl061_read() to only return after the tracepoint, rather
than having lots of early-returns.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Currently the pl061_read() and pl061_write() functions handle offsets
using a combination of three if() statements and a switch(). Clean
this up to use just a switch, using case ranges.
This requires that instead of catching accesses to the luminary-only
registers on a stock PL061 via a check on s->rsvd_start we use
an "is this luminary?" check in the cases for each luminary-only
register.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Convert the use of the DPRINTF debug macro in the PL061 model to
use tracepoints.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
icv_eoir_write() and icv_dir_write() ignore invalid virtual IRQ numbers
(like LPIs). The issue is that these functions check against the number
of implemented IRQs (QEMU's default is num_irq=288) which can be lower
than the maximum virtual IRQ number (1020 - 1). The consequence is that
if a hypervisor creates an LR for an IRQ between 288 and 1020, then the
guest is unable to deactivate the resulting IRQ. Note that other
functions that deal with large IRQ numbers, like icv_iar_read, check
against 1020 and not against num_irq.
Fix the checks by using GICV3_MAXIRQ (1020) instead of the number of
implemented IRQs.
Signed-off-by: Ricardo Koller <ricarkol@google.com>
Message-id: 20210702233701.3369-1-ricarkol@google.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This SoC is similar to stm32f205 SoC.
This will be used by the STM32VLDISCOVERY to create a machine.
Signed-off-by: Alexandre Iooss <erdnaxe@crans.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20210617165647.2575955-2-erdnaxe@crans.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
dev->max_queues was never initialised for backends that don't support
VHOST_USER_PROTOCOL_F_MQ, so it would use 0 as the maximum number of
queues to check against and consequently fail for any such backend.
Set it to 1 if the backend doesn't have multiqueue support.
Fixes: c90bd505a3
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210705171429.29286-1-kwolf@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If KVM_CAP_RPT_INVALIDATE KVM capability is enabled, then
- indicate the availability of H_RPT_INVALIDATE hcall to the guest via
ibm,hypertas-functions property.
- Enable the hcall
Both the above are done only if the new sPAPR machine capability
cap-rpt-invalidate is set.
Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
Message-Id: <20210706112440.1449562-3-bharata@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This addresses the comments from v22.
The functional changes are (the VOF ones need retesting with Pegasos2):
(VOF) setprop will start failing if the machine class callback
did not handle it;
(VOF) unit addresses are lowered in path_offset();
(SPAPR) /chosen/bootargs is initialized from kernel_cmdline if
the client did not change it.
Fixes: 5c991e5d4378 ("spapr: Implement Open Firmware client interface")
Cc: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20210708065625.548396-1-aik@ozlabs.ru>
Tested-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Linux uses RTAS functions to access PCI devices so we need to provide
these with VOF. Implement some of the most important functions to
allow booting Linux with VOF. With this the board is now usable
without a binary ROM image and we can enable it by default as other
boards.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Message-Id: <20210708215113.B3F747456E3@zero.eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The pegasos2 board comes with an Open Firmware compliant ROM based on
SmartFirmware but it has some changes that are not open source
therefore the ROM binary cannot be included in QEMU. Guests running on
the board however depend on services provided by the firmware. The
Virtual Open Firmware recently added to QEMU implements a minimal set
of these services to allow some guests to boot without the original
firmware. This patch adds VOF as the default firmware for pegasos2
which allows booting Linux and MorphOS via -kernel option while a ROM
image can still be used with -bios for guests that don't run with VOF.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Message-Id: <1d6ed6f290c5c1f0b5a1e1c51cf1151452d70d9a.1624811233.git.balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
There are several new L1D cache flush bits added to the hcall which reflect
hardware security features for speculative cache access issues.
These behaviours are now being specified as negative in order to simplify
patched kernel compatibility with older firmware (a new problem found in
existing systems would automatically be vulnerable).
[dwg: Technically this changes behaviour for existing machine types.
After discussion with Nick, we've determined this is safe, because
the worst that will happen if a guest gets the wrong information due
to a migration is that it will perform some unnecessary workarounds,
but will remain correct and secure (well, as secure as it was going
to be anyway). In addition the change only affects cap-cfpc=safe
which is not enabled by default, and in fact is not possible to set
on any current hardware (though it's expected it will be possible on
POWER10)]
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20210615044107.1481608-1-npiggin@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Add own machine state structure which will be used to store state
needed for firmware emulation.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <7f6d5fbf4f70c64dba001483174a2921dd616ecd.1624811233.git.balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The PAPR platform describes an OS environment that's presented by
a combination of a hypervisor and firmware. The features it specifies
require collaboration between the firmware and the hypervisor.
Since the beginning, the runtime component of the firmware (RTAS) has
been implemented as a 20 byte shim which simply forwards it to
a hypercall implemented in qemu. The boot time firmware component is
SLOF - but a build that's specific to qemu, and has always needed to be
updated in sync with it. Even though we've managed to limit the amount
of runtime communication we need between qemu and SLOF, there's some,
and it has become increasingly awkward to handle as we've implemented
new features.
This implements a boot time OF client interface (CI) which is
enabled by a new "x-vof" pseries machine option (stands for "Virtual Open
Firmware). When enabled, QEMU implements the custom H_OF_CLIENT hcall
which implements Open Firmware Client Interface (OF CI). This allows
using a smaller stateless firmware which does not have to manage
the device tree.
The new "vof.bin" firmware image is included with source code under
pc-bios/. It also includes RTAS blob.
This implements a handful of CI methods just to get -kernel/-initrd
working. In particular, this implements the device tree fetching and
simple memory allocator - "claim" (an OF CI memory allocator) and updates
"/memory@0/available" to report the client about available memory.
This implements changing some device tree properties which we know how
to deal with, the rest is ignored. To allow changes, this skips
fdt_pack() when x-vof=on as not packing the blob leaves some room for
appending.
In absence of SLOF, this assigns phandles to device tree nodes to make
device tree traversing work.
When x-vof=on, this adds "/chosen" every time QEMU (re)builds a tree.
This adds basic instances support which are managed by a hash map
ihandle -> [phandle].
Before the guest started, the used memory is:
0..e60 - the initial firmware
8000..10000 - stack
400000.. - kernel
3ea0000.. - initramdisk
This OF CI does not implement "interpret".
Unlike SLOF, this does not format uninitialized nvram. Instead, this
includes a disk image with pre-formatted nvram.
With this basic support, this can only boot into kernel directly.
However this is just enough for the petitboot kernel and initradmdisk to
boot from any possible source. Note this requires reasonably recent guest
kernel with:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=df5be5be8735
The immediate benefit is much faster booting time which especially
crucial with fully emulated early CPU bring up environments. Also this
may come handy when/if GRUB-in-the-userspace sees light of the day.
This separates VOF and sPAPR in a hope that VOF bits may be reused by
other POWERPC boards which do not support pSeries.
This assumes potential support for booting from QEMU backends
such as blockdev or netdev without devices/drivers used.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20210625055155.2252896-1-aik@ozlabs.ru>
Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu>
[dwg: Adjusted some includes which broke compile in some more obscure
compilation setups]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
QEMU reserves space for RTAS via /rtas/rtas-size which tells the client
how much space the RTAS requires to work which includes the RTAS binary
blob implementing RTAS runtime. Because pseries supports FWNMI which
requires plenty of space, QEMU reserves more than 2KB which is
enough for the RTAS blob as it is just 20 bytes (under QEMU).
Since FWNMI reset delivery was added, RTAS_SIZE macro is not used anymore.
This replaces RTAS_SIZE with RTAS_MIN_SIZE and uses it in
the /rtas/rtas-size calculation to account for the RTAS blob.
Fixes: 0e236d3477 ("ppc/spapr: Implement FWNMI System Reset delivery")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20210622070336.1463250-1-aik@ozlabs.ru>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>