When the DSM operation is cancelled asynchronously, we set iocb->ret to
-ECANCELED. However, the callback function only checks the return value
of the completed aio, which may have completed succesfully prior to the
cancellation and thus the callback ends up continuing the dsm operation
instead of bailing out. Fix this.
Secondly, fix a potential use-after-free by removing the bottom half and
enqueuing the completion directly.
Fixes: d7d1474fd8 ("hw/nvme: reimplement dsm to allow cancellation")
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
If the zone reset operation is cancelled but the block unmap operation
completes normally, the callback will continue resetting the next zone
since it neglects to check iocb->ret which will have been set to
-ECANCELED. Make sure that this is checked and bail out if an error is
present.
Secondly, fix a potential use-after-free by removing the bottom half and
enqueuing the completion directly.
Fixes: 63d96e4ffd ("hw/nvme: reimplement zone reset to allow cancellation")
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Make sure that iocb->aiocb is NULL'ed when cancelling.
Fix a potential use-after-free by removing the bottom half and enqueuing
the completion directly.
Fixes: 38f4ac65ac ("hw/nvme: reimplement flush to allow cancellation")
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
There are several bugs in the async cancel code for the Format command.
Firstly, cancelling a format operation neglects to set iocb->ret as well
as clearing the iocb->aiocb after cancelling the underlying aiocb which
causes the aio callback to ignore the cancellation. Trivial fix.
Secondly, and worse, because the request is queued up for posting to the
CQ in a bottom half, if the cancellation is due to the submission queue
being deleted (which calls blk_aio_cancel), the req structure is
deallocated in nvme_del_sq prior to the bottom half being schedulued.
Fix this by simply removing the bottom half, there is no reason to defer
it anyway.
Fixes: 3bcf26d3d6 ("hw/nvme: reimplement format nvm to allow cancellation")
Reported-by: Jonathan Derrick <jonathan.derrick@linux.dev>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
..and use for both virtio-user-blk and virtio-user-gpio. This avoids
the circular close by deferring shutdown due to disconnection until a
later point. virtio-user-blk already had this mechanism in place so
generalise it as a vhost-user helper function and use for both blk and
gpio devices.
While we are at it we also fix up vhost-user-gpio to re-establish the
event handler after close down so we can reconnect later.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Message-Id: <20221130112439.2527228-5-alex.bennee@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
As per the fix to vhost-user-blk in f5b22d06fb (vhost: recheck dev
state in the vhost_migration_log routine) we really should track the
connection and starting separately.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20221130112439.2527228-4-alex.bennee@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Commit 02b61f38d3 ("hw/virtio: incorporate backend features in features")
properly negotiates VHOST_USER_F_PROTOCOL_FEATURES with the vhost-user
backend, but we forgot to enable vrings as specified in
docs/interop/vhost-user.rst:
If ``VHOST_USER_F_PROTOCOL_FEATURES`` has not been negotiated, the
ring starts directly in the enabled state.
If ``VHOST_USER_F_PROTOCOL_FEATURES`` has been negotiated, the ring is
initialized in a disabled state and is enabled by
``VHOST_USER_SET_VRING_ENABLE`` with parameter 1.
Some vhost-user front-ends already did this by calling
vhost_ops.vhost_set_vring_enable() directly:
- backends/cryptodev-vhost.c
- hw/net/virtio-net.c
- hw/virtio/vhost-user-gpio.c
But most didn't do that, so we would leave the vrings disabled and some
backends would not work. We observed this issue with the rust version of
virtiofsd [1], which uses the event loop [2] provided by the
vhost-user-backend crate where requests are not processed if vring is
not enabled.
Let's fix this issue by enabling the vrings in vhost_dev_start() for
vhost-user front-ends that don't already do this directly. Same thing
also in vhost_dev_stop() where we disable vrings.
[1] https://gitlab.com/virtio-fs/virtiofsd
[2] https://github.com/rust-vmm/vhost/blob/240fc2966/crates/vhost-user-backend/src/event_loop.rs#L217
Fixes: 02b61f38d3 ("hw/virtio: incorporate backend features in features")
Reported-by: German Maglione <gmaglione@redhat.com>
Tested-by: German Maglione <gmaglione@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Message-Id: <20221123131630.52020-1-sgarzare@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20221130112439.2527228-3-alex.bennee@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20221128202741.4945-6-philmd@linaro.org>
Have qxl_get_check_slot_offset() return false if the requested
buffer size does not fit within the slot memory region.
Similarly qxl_phys2virt() now returns NULL in such case, and
qxl_dirty_one_surface() aborts.
This avoids buffer overrun in the host pointer returned by
memory_region_get_ram_ptr().
Fixes: CVE-2022-4144 (out-of-bounds read)
Reported-by: Wenxu Yin (@awxylitol)
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1336
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20221128202741.4945-5-philmd@linaro.org>
Currently qxl_phys2virt() doesn't check for buffer overrun.
In order to do so in the next commit, pass the buffer size
as argument.
For QXLCursor in qxl_render_cursor() -> qxl_cursor() we
verify the size of the chunked data ahead, checking we can
access 'sizeof(QXLCursor) + chunk->data_size' bytes.
Since in the SPICE_CURSOR_TYPE_MONO case the cursor is
assumed to fit in one chunk, no change are required.
In SPICE_CURSOR_TYPE_ALPHA the ahead read is handled in
qxl_unpack_chunks().
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20221128202741.4945-4-philmd@linaro.org>
Only 3 command types are logged: no need to call qxl_phys2virt()
for the other types. Using different cases will help to pass
different structure sizes to qxl_phys2virt() in a pair of commits.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20221128202741.4945-2-philmd@linaro.org>
The only use of intel_hda_reset() is after its definition, so we
don't need to separately declare its prototype at the top of the
file; drop the unnecessary line.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20221014142632.2092404-3-peter.maydell@linaro.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Currently the intel-hda device has a reset method which manually
resets all the codecs by calling device_legacy_reset() on them. This
means they get reset twice, once because child devices on a qbus get
reset before the parent device's reset method is called, and then
again because we're manually resetting them.
Drop the manual reset call, and ensure that codecs are still reset
when the guest does a reset via ICH6_GCTL_RESET by using
device_cold_reset() (which resets all the devices on the qbus as well
as the device itself) instead of a direct call to the reset function.
This is a slight ordering change because the (only) codec reset now
happens before the controller registers etc are reset, rather than
once before and then once after, but the codec reset function
hda_audio_reset() doesn't care.
This lets us drop a use of device_legacy_reset(), which is
deprecated.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20221014142632.2092404-2-peter.maydell@linaro.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Currently the hcd-xhci-pci and hcd-xhci-sysbus devices, which are
mostly wrappers around the TYPE_XHCI device, which is a direct
subclass of TYPE_DEVICE. Since TYPE_DEVICE devices are not on any
qbus and do not get automatically reset, the wrapper devices both
reset the TYPE_XHCI device in their own reset functions. However,
they do this using device_legacy_reset(), which will reset the device
itself but not any bus it has.
Switch to device_cold_reset(), which avoids using a deprecated
function and also propagates reset along any child buses.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20221014145423.2102706-1-peter.maydell@linaro.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Fixes: effaf5a240
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Message-Id: <20221105114851.306206-1-mjt@msgid.tls.msk.ru>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Run state is also in RUN_STATE_PRELAUNCH while "-S" is used.
This reverts commit 0631d4b448
Signed-off-by: Joelle van Dyne <j@getutm.app>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The original commit broke the usage of usbredir with libvirt, which
starts every domain with "-S".
This workaround is no longer needed because the usbredir behavior
has been fixed in the meantime:
https://gitlab.freedesktop.org/spice/usbredir/-/merge_requests/61
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Message-Id: <1689cec3eadcea87255e390cb236033aca72e168.1669193161.git.jtomko@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
-----BEGIN PGP SIGNATURE-----
iLMEAAEIAB0WIQS4/x2g0v3LLaCcbCxAov/yOSY+3wUCY3zJLQAKCRBAov/yOSY+
3wHkBADTgirzsosyvVGza6nHD1QTEK6ZqnvYB0wCoxKBt1btvzLoKO5gbe7S3Pq4
obFurS+5QclwIuLR6ZkihNGMBL+uq6XZryUGG3pU8h1XZigdduiEk7l7YGF+N+hz
/7ESJGHzCTiXyTTXBNgLJEJYfUYdV+kWbXkIQuJGT+kdlzUifA==
=Uod8
-----END PGP SIGNATURE-----
Merge tag 'pull-loongarch-20221122' of https://gitlab.com/gaosong/qemu into staging
LoongArch pull for 7.2-rc2
# -----BEGIN PGP SIGNATURE-----
#
# iLMEAAEIAB0WIQS4/x2g0v3LLaCcbCxAov/yOSY+3wUCY3zJLQAKCRBAov/yOSY+
# 3wHkBADTgirzsosyvVGza6nHD1QTEK6ZqnvYB0wCoxKBt1btvzLoKO5gbe7S3Pq4
# obFurS+5QclwIuLR6ZkihNGMBL+uq6XZryUGG3pU8h1XZigdduiEk7l7YGF+N+hz
# /7ESJGHzCTiXyTTXBNgLJEJYfUYdV+kWbXkIQuJGT+kdlzUifA==
# =Uod8
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 22 Nov 2022 08:05:49 EST
# gpg: using RSA key B8FF1DA0D2FDCB2DA09C6C2C40A2FFF239263EDF
# gpg: Good signature from "Song Gao <m17746591750@163.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B8FF 1DA0 D2FD CB2D A09C 6C2C 40A2 FFF2 3926 3EDF
* tag 'pull-loongarch-20221122' of https://gitlab.com/gaosong/qemu:
hw/loongarch: Replace the value of uart info with macro
hw/loongarch: Fix setprop_sized method in fdt rtc node.
hw/loongarch: Add default stdout uart in fdt
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
fixes regressions:
virtio error message triggered by seabios
failure in vhost due to VIRTIO_F_RING_RESET
broken keyboard under seabios
some biosbits test fixes
there's still a known regression with migration and vsock,
not fixed yet.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmN8os4PHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRpw+UH/2JOoxqzkkq4QyV5PzmjsiyyMbrBqFSOTro2
98k9MwxNWHIk6fi3FVIz+LX1EdwDbMl44IHQ88xo7k39V7ThvVvLfjaQySSeoCfE
04++AvkOQ/O8W7sob17r2iPWIHKPIndq8uB652T89o1I/nhmFMsUkdRWmVQ3XsDk
DnPZxjBQnKY0oSvBd7SmSI3C+eA4lF1tht50ZxmOshLx4OSjK/maFVpIyLsUGTMn
6OMxKffDjfvw5z94+pGN1RQm9xu7OwtFZyUnH16xoQUmWSXjzcYXUZ3buTY+eAyJ
I5j9ox27Cqn1wGcypUiTCKLIExEMBfvNY4ovyRa1xvW+MXluVyE=
=bF2U
-----END PGP SIGNATURE-----
Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging
pc,virtio: regression, test fixes
fixes regressions:
virtio error message triggered by seabios
failure in vhost due to VIRTIO_F_RING_RESET
broken keyboard under seabios
some biosbits test fixes
there's still a known regression with migration and vsock,
not fixed yet.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmN8os4PHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRpw+UH/2JOoxqzkkq4QyV5PzmjsiyyMbrBqFSOTro2
# 98k9MwxNWHIk6fi3FVIz+LX1EdwDbMl44IHQ88xo7k39V7ThvVvLfjaQySSeoCfE
# 04++AvkOQ/O8W7sob17r2iPWIHKPIndq8uB652T89o1I/nhmFMsUkdRWmVQ3XsDk
# DnPZxjBQnKY0oSvBd7SmSI3C+eA4lF1tht50ZxmOshLx4OSjK/maFVpIyLsUGTMn
# 6OMxKffDjfvw5z94+pGN1RQm9xu7OwtFZyUnH16xoQUmWSXjzcYXUZ3buTY+eAyJ
# I5j9ox27Cqn1wGcypUiTCKLIExEMBfvNY4ovyRa1xvW+MXluVyE=
# =bF2U
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 22 Nov 2022 05:22:06 EST
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu:
virtio: disable error for out of spec queue-enable
acpi/tests/avocado/bits: keep the work directory when BITS_DEBUG is set in env
tests/avocado: configure acpi-bits to use avocado timeout
MAINTAINERS: add mst to list of biosbits maintainers
tests: acpi: x86: update expected DSDT after moving PRQx fields in _SB scope
acpi: x86: move RPQx field back to _SB scope
tests: acpi: whitelist DSDT before moving PRQx to _SB scope
vhost: mask VIRTIO_F_RING_RESET for vhost and vhost-user devices
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Using macro to replace the value of uart info such as addr, size
in acpi_build method.
Signed-off-by: Xiaojuan Yang <yangxiaojuan@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20221115115008.3372489-1-yangxiaojuan@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
Fix setprop_sized method in fdt rtc node.
Signed-off-by: Xiaojuan Yang <yangxiaojuan@loongson.cn>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20221116040300.3459818-1-yangxiaojuan@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
Add "chosen" subnode into LoongArch fdt, and set it's
"stdout-path" prop to uart node.
Signed-off-by: Xiaojuan Yang <yangxiaojuan@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20221115114923.3372414-1-yangxiaojuan@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
Virtio 1.0 is pretty clear that features have to be
negotiated before enabling VQs. Unfortunately Seabios
ignored this ever since gaining 1.0 support (UEFI is ok).
Comment the error out for now, and add a TODO.
Fixes: 3c37f8b8d1 ("virtio: introduce virtio_queue_enable()")
Cc: "Kangjie Xu" <kangjie.xu@linux.alibaba.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20221121200339.362452-1-mst@redhat.com>
Commit 47a373faa6 (acpi: pc/q35: drop ad-hoc PCI-ISA bridge AML routines and let bus ennumeration generate AML)
moved ISA bridge AML generation to respective devices and was using
aml_alias() to provide PRQx fields in _SB. scope. However, it turned
out that SeaBIOS was not able to process Alias opcode when parsing DSDT,
resulting in lack of keyboard during boot (SeaBIOS console, grub, FreeDOS).
While fix for SeaBIOS is posted
https://mail.coreboot.org/hyperkitty/list/seabios@seabios.org/thread/RGPL7HESH5U5JRLEO6FP77CZVHZK5J65/
fixed SeaBIOS might not make into QEMU-7.2 in time.
Hence this workaround that puts PRQx back into _SB scope
and gets rid of aliases in ISA bridge description, so
DSDT will be parsable by broken SeaBIOS.
That brings back hardcoded references to ISA bridge
PCI0.S08.P40C/PCI0.SF8.PIRQ
where middle part now is auto generated based on slot it's
plugged in, but it should be fine as bridge initialization
also hardcodes PCI address of the bridge so it can't ever
move. Once QEMU tree has fixed SeaBIOS blob, we should be able
to drop this part and revert back to alias based approach
Reported-by: Volker Rümelin <vr_qemu@t-online.de>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20221121153613.3972225-3-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Commit 69e1c14aa2 ("virtio: core: vq reset feature negotation support")
enabled VIRTIO_F_RING_RESET by default for all virtio devices.
This feature is not currently emulated by QEMU, so for vhost and
vhost-user devices we need to make sure it is supported by the offloaded
device emulation (in-kernel or in another process).
To do this we need to add VIRTIO_F_RING_RESET to the features bitmap
passed to vhost_get_features(). This way it will be masked if the device
does not support it.
This issue was initially discovered with vhost-vsock and vhost-user-vsock,
and then also tested with vhost-user-rng which confirmed the same issue.
They fail when sending features through VHOST_SET_FEATURES ioctl or
VHOST_USER_SET_FEATURES message, since VIRTIO_F_RING_RESET is negotiated
by the guest (Linux >= v6.0), but not supported by the device.
Fixes: 69e1c14aa2 ("virtio: core: vq reset feature negotation support")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1318
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20221121101101.29400-1-sgarzare@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Acked-by: Jason Wang <jasowang@redhat.com>
The Large-Send Task Offload Tx Descriptor (9.2.1 Transmit) has a
Large-Send MSS value where the driver specifies the MSS. See the
datasheet here:
http://realtek.info/pdf/rtl8139cp.pdf
The code ignores this value and uses a hardcoded MSS of 1500 bytes
instead. When the MTU is less than 1500 bytes the hardcoded value
results in IP fragmentation and poor performance.
Use the Large-Send MSS value to correctly size Large-Send packets.
Jason Wang <jasowang@redhat.com> noticed that the Large-Send MSS value
mask was incorrect so it is adjusted to match the datasheet and Linux
8139cp driver.
This issue was discussed in the past here:
https://lore.kernel.org/all/20161114162505.GD26664@stefanha-x1.localdomain/
Reported-by: Russell King - ARM Linux <linux@armlinux.org.uk>
Reported-by: Tobias Fiebig <tobias+git@fiebig.nl>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1312
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20221117165554.1773409-4-stefanha@redhat.com>
There are two Tx Descriptor formats called mode 0 and mode 1. The mode
is determined by the Large Send bit.
CP_TX_IPCS (bit 18) is defined in mode 1 but the code checks the bit
unconditionally. In mode 0 bit 18 is part of the Large Send MSS value.
Explicitly check the Large Send bit to distinguish Tx command modes.
This avoids bugs where modes are confused. Note that I didn't find any
actual bugs aside from needlessly computing the IP checksum when the
Large Send bit is enabled.
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20221117165554.1773409-3-stefanha@redhat.com>
The device turns the Tx Descriptor into a Tx Status descriptor after
fully reading the descriptor. This involves clearing Tx Own (bit 31) to
indicate that the driver has ownership of the descriptor again as well
as several other bits.
The code keeps the first dword of the Tx Descriptor in the txdw0 local
variable. txdw0 is reused to build the first word of the Tx Status
descriptor. Later on the code uses txdw0 again, incorrectly assuming
that it still contains the first dword of the Tx Descriptor. The tx
offloading code misbehaves because it sees bogus bits in txdw0.
Use a separate local variable for Tx Status and preserve Tx Descriptor
in txdw0.
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20221117165554.1773409-2-stefanha@redhat.com>
a66a24585f (hw/intc/arm_gic: Implement read of GICC_IIDR) implemented
this for the CPU interface register. The fact we don't implement it
shows up when running Xen with -d guest_error which is definitely
wrong because the guest is perfectly entitled to read it.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
gic_dist_readb was returning a word value which just happened to work
as a result of the way we OR the data together. Lets fix it so only
the explicit byte is returned for each part of GICD_TYPER. I've
changed the return type to uint8_t although the overflow is only
detected with an explicit -Wconversion.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Trying to run U-Boot for Cubieboard (Allwinner A10) fails because it cannot
access SD card. The problem is that FIFO register in current
allwinner-sdhost implementation is at the address corresponding to
Allwinner H3, but not A10.
Linux kernel is not affected since Linux driver uses DMA access and does
not use FIFO register for reading/writing.
This patch adds new class parameter `is_sun4i` and based on that
parameter uses register at offset 0x100 either as FIFO register (if
sun4i) or as threshold register (if not sun4i; in this case register at
0x200 is FIFO register).
Tested with U-Boot and Linux kernel image built for Cubieboard and
OrangePi PC.
Signed-off-by: Strahinja Jankovic <strahinja.p.jankovic@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20221112214900.24152-1-strahinja.p.jankovic@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Fix typos (discovered with the 'codespell' utility).
Note: Though "migrateable" still seems to be a valid spelling, we change
it to "migratable" since this is the way more common spelling here.
Message-Id: <20221111182828.282251-1-thuth@redhat.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
With commit 39f29e5993 ("hw/intc/arm_gicv3: Use correct number of
priority bits for the CPU") the number of priority bits was changed from
the maximum value 8 to typically 5. As a consequence a few of the lowest
bits in ICC_PMR_EL1 becomes RAZ/WI. However prior to this patch one of
these bits was still used since the supplied priority value is masked
before it's eventually right shifted with one bit. So the bit is not
lost as one might expect when the register is read again.
The Linux kernel depends on lowest valid bit to be reset to zero, see
commit 33625282adaa ("irqchip/gic-v3: Probe for SCR_EL3 being clear
before resetting AP0Rn") for details.
So fix this by masking the priority value after it may have been right
shifted by one bit.
Cc: qemu-stable@nongnu.org
Fixes: 39f29e5993 ("hw/intc/arm_gicv3: Use correct number of priority bits for the CPU")
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Fix LoongArch check-tcg error:
TEST hello on loongarch64
qemu-system-loongarch64: Some ROM regions are overlapping
These ROM regions might have been loaded by direct user request or by default.
They could be BIOS/firmware images, a guest kernel, initrd or some other file loaded into guest memory.
Check whether you intended to load all this guest code, and whether it has been built to load to the correct addresses.
The following two regions overlap (in the memory address space):
hello ELF program header segment 0 (addresses 0x0000000000200000 - 0x0000000000242000)
fdt (addresses 0x0000000000200000 - 0x0000000000300000)
make[1]: *** [Makefile:177: run-hello] Error 1
Fixes: 021836936e ("hw/loongarch: Load FDT table into dram memory space")
Reported-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Song Gao <gaosong@loongson.cn>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221109020449.978064-1-gaosong@loongson.cn>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Short queue with just a single pnv-phb fix from Thomas Huth.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQQX6/+ZI9AYAK8oOBk82cqW3gMxZAUCY24dtwAKCRA82cqW3gMx
ZNlDAQC+yqONSkYvoANSPNDuMtcK0Lk7KNXFTx5cg8ASNym0twEAkA/YuNv4t0m2
9IRfh/xJ+AhKf6VYKbUwftAsZGPTpAc=
=U0me
-----END PGP SIGNATURE-----
Merge tag 'pull-ppc-20221111' of https://gitlab.com/danielhb/qemu into staging
ppc patch queue for 2022-11-11:
Short queue with just a single pnv-phb fix from Thomas Huth.
# -----BEGIN PGP SIGNATURE-----
#
# iHUEABYKAB0WIQQX6/+ZI9AYAK8oOBk82cqW3gMxZAUCY24dtwAKCRA82cqW3gMx
# ZNlDAQC+yqONSkYvoANSPNDuMtcK0Lk7KNXFTx5cg8ASNym0twEAkA/YuNv4t0m2
# 9IRfh/xJ+AhKf6VYKbUwftAsZGPTpAc=
# =U0me
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 11 Nov 2022 05:02:31 EST
# gpg: using EDDSA key 17EBFF9923D01800AF2838193CD9CA96DE033164
# gpg: Good signature from "Daniel Henrique Barboza <danielhb413@gmail.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 17EB FF99 23D0 1800 AF28 3819 3CD9 CA96 DE03 3164
* tag 'pull-ppc-20221111' of https://gitlab.com/danielhb/qemu:
hw/pci-host/pnv_phb: Avoid quitting QEMU if hotplug of pnv-phb-root-port fails
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Those typos are in files which are used to generate the QEMU manual.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Message-Id: <20221110190825.879620-1-sw@weilnetz.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Ani Sinha <ani@anisinha.ca>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
[thuth: update sentence in can.rst as suggested by Peter]
Signed-off-by: Thomas Huth <thuth@redhat.com>
Replaces TABs with spaces, making sure to have a consistent coding style
of 4 space indentations in the net subsystem.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/377
Signed-off-by: Ahmed Abouzied <email@aabouzied.com>
Message-Id: <20210614183849.20622-1-email@aabouzied.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
[thuth: Fixed mis-aligned indentation in some of the files]
Signed-off-by: Thomas Huth <thuth@redhat.com>
Variable unconnected used in usb_host_auto_check function is only incremented
but never read as line where it is read was disabled since introducing the code.
This causes 'Unused but set variable' warning on Clang 15.0.1 compiler.
Removing the variable and disabled code to prevent the warning.
Signed-off-by: Miroslav Rezanina <mrezanin@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <00df0db69ff9167d38bac81f6d03281955bd861a.1668009030.git.mrezanin@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Variable n used in tulip_idblock_crc function is only incremented but never read.
This causes 'Unused but set variable' warning on Clang 15.0.1 compiler.
Removing the variable to prevent the warning.
Signed-off-by: Miroslav Rezanina <mrezanin@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <02e1560d115c208df32236df8916fed98429fda1.1668009030.git.mrezanin@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Variable send_count used in rtl8139_cplus_transmit_one function is only
incremented but never read. This causes 'Unused but set variable' warning
on Clang 15.0.1 compiler.
Removing the variable to prevent the warning.
Signed-off-by: Miroslav Rezanina <mrezanin@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <15a32dd06c492216cbf27cd3ddcbe1e9afb8d8f5.1668009030.git.mrezanin@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Currently QEMU terminates if you try to hotplug pnv-phb-root-port in
an environment where it is not supported, e.g. if doing this:
echo "device_add pnv-phb-root-port" | \
./qemu-system-ppc64 -monitor stdio -M powernv9
To avoid this problem, the pnv_phb_root_port_realize() function should
not use error_fatal when trying to set the properties which might not
be available.
Fixes: c2f3f78af5 ("ppc/pnv: set root port chassis and slot using Bus properties")
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20221109122210.115667-1-thuth@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Run shell script:
cat << EOF | valgrind qemu-system-i386 -display none -machine accel=qtest, -m \
512M -M q35 -nodefaults -device virtio-net,netdev=net0 -netdev \
user,id=net0 -qtest stdio
outl 0xcf8 0x80000810
outl 0xcfc 0xc000
outl 0xcf8 0x80000804
outl 0xcfc 0x01
outl 0xc00d 0x0200
outl 0xcf8 0x80000890
outb 0xcfc 0x4
outl 0xcf8 0x80000889
outl 0xcfc 0x1c000000
outl 0xcf8 0x80000893
outw 0xcfc 0x100
EOF
Got:
==68666== Invalid read of size 8
==68666== at 0x688536: virtio_net_queue_enable (virtio-net.c:575)
==68666== by 0x6E31AE: memory_region_write_accessor (memory.c:492)
==68666== by 0x6E098D: access_with_adjusted_size (memory.c:554)
==68666== by 0x6E4DB3: memory_region_dispatch_write (memory.c:1521)
==68666== by 0x6E31AE: memory_region_write_accessor (memory.c:492)
==68666== by 0x6E098D: access_with_adjusted_size (memory.c:554)
==68666== by 0x6E4DB3: memory_region_dispatch_write (memory.c:1521)
==68666== by 0x6EBCD3: flatview_write_continue (physmem.c:2820)
==68666== by 0x6EBFBF: flatview_write (physmem.c:2862)
==68666== by 0x6EF5E7: address_space_write (physmem.c:2958)
==68666== by 0x6DFDEC: cpu_outw (ioport.c:70)
==68666== by 0x6F6DF0: qtest_process_command (qtest.c:480)
==68666== Address 0x29087fe8 is 24 bytes after a block of size 416 in arena "client"
That is reported by Alexander Bulekov. https://gitlab.com/qemu-project/qemu/-/issues/1309
Here, the queue_index is the index of the cvq, but in some cases cvq
does not have the corresponding NetClientState, so overflow appears.
I add a check here, ignore illegal queue_index and cvq queue_index.
Note the queue_index is below the VIRTIO_QUEUE_MAX but greater or equal
than cvq index could hit this. Other devices are similar.
Fixes: 7f863302 ("virtio-net: support queue_enable")
Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1309
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Message-Id: <20221110095739.130393-1-xuanzhuo@linux.alibaba.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
acpi-vga-stub.c pulls in vga_int.h
However that currently pulls in ui/console.h which
breaks e.g. on systems without pixman.
It's better to remove ui/console.h from vga_int.h
and directly include it where it's used.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20221109222112.74519-1-mst@redhat.com>
Tested-by: Laurent Vivier <lvivier@redhat.com>
Reported-by: Miroslav Rezanina <mrezanin@redhat.com>
Reported-by: Frederic Bezies <fredbezies@gmail.com>
Reported-by: Laurent Vivier <lvivier@redhat.com>
Fixes: cfead31326 ("AcpiDevAmlIf interface to build VGA device descs")
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Commit 334c388f25 ("pflash_cfi: Error out if device length
isn't a power of two") aimed to finish the effort started by
commit 06f1521795 ("pflash: Require backend size to match device,
improve errors"), but unfortunately we are not quite there since
various machines are still ready to accept incomplete / oversized
pflash backend images, and now fail, i.e. on Debian bullseye:
$ qemu-system-x86_64 \
-drive \
if=pflash,format=raw,unit=0,readonly=on,file=/usr/share/OVMF/OVMF_CODE.fd
qemu-system-x86_64: Device size must be a power of two.
where OVMF_CODE.fd comes from the ovmf package, which doesn't
pad the firmware images to the flash size:
$ ls -lh /usr/share/OVMF/
-rw-r--r-- 1 root root 3.5M Aug 19 2021 OVMF_CODE_4M.fd
-rw-r--r-- 1 root root 1.9M Aug 19 2021 OVMF_CODE.fd
-rw-r--r-- 1 root root 128K Aug 19 2021 OVMF_VARS.fd
Since we entered the freeze period to prepare the v7.2.0 release,
the safest is to revert commit 334c388f25.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1294
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20221108175755.95141-1-philmd@linaro.org>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20221108172633.860700-1-danielhb413@gmail.com>
The TABs should be replaced with spaces, to make sure that we have a
consistent coding style with an indentation of 4 spaces everywhere.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/370
Signed-off-by: Amarjargal Gundjalam <amarjargal16@gmail.com>
Message-Id: <6c993f57800f8fef7a910074620f6e80e077a3d1.1666707782.git.amarjargal16@gmail.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The TABs should be replaced with spaces, to make sure that we have a
consistent coding style with an indentation of 4 spaces everywhere.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/370
Signed-off-by: Amarjargal Gundjalam <amarjargal16@gmail.com>
Message-Id: <5cefd05b4d3721d416e48e6df19df18cb6338933.1666707782.git.amarjargal16@gmail.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
zPCI enhancement features (interpretation and forward assist) were
recently introduced to improve performance on PCI passthrough devices.
To maintain the same behaviour on older Z machines, deactivate the
features with the associated properties.
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Message-Id: <20221107161349.1032730-3-clg@kaod.org>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
This reverts commit 59d1ce4439.
The "zpcii-disable" machine property is redundant with the "interpret"
zPCI device property. Remove it for clarification.
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Message-Id: <20221107161349.1032730-2-clg@kaod.org>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
lots of acpi rework
first version of biosbits infrastructure
ASID support in vhost-vdpa
core_count2 support in smbios
PCIe DOE emulation
virtio vq reset
HMAT support
part of infrastructure for viommu support in vhost-vdpa
VTD PASID support
fixes, tests all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmNpXDkPHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRpD0AH/2G8ZPrgrxJC9y3uD5/5J6QRzO+TsDYbg5ut
uBf4rKSHHzcu6zdyAfsrhbAKKzyD4HrEGNXZrBjnKM1xCiB/SGBcDIWntwrca2+s
5Dpbi4xvd4tg6tVD4b47XNDCcn2uUbeI0e2M5QIbtCmzdi/xKbFAfl5G8DQp431X
Kmz79G4CdKWyjVlM0HoYmdCw/4FxkdjD02tE/Uc5YMrePNaEg5Bw4hjCHbx1b6ur
6gjeXAtncm9s4sO0l+sIdyiqlxiTry9FSr35WaQ0qPU+Og5zaf1EiWfdl8TRo4qU
EAATw5A4hyw11GfOGp7oOVkTGvcNB/H7aIxD7emdWZV8+BMRPKo=
=zTCn
-----END PGP SIGNATURE-----
Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging
pci,pc,virtio: features, tests, fixes, cleanups
lots of acpi rework
first version of biosbits infrastructure
ASID support in vhost-vdpa
core_count2 support in smbios
PCIe DOE emulation
virtio vq reset
HMAT support
part of infrastructure for viommu support in vhost-vdpa
VTD PASID support
fixes, tests all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmNpXDkPHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRpD0AH/2G8ZPrgrxJC9y3uD5/5J6QRzO+TsDYbg5ut
# uBf4rKSHHzcu6zdyAfsrhbAKKzyD4HrEGNXZrBjnKM1xCiB/SGBcDIWntwrca2+s
# 5Dpbi4xvd4tg6tVD4b47XNDCcn2uUbeI0e2M5QIbtCmzdi/xKbFAfl5G8DQp431X
# Kmz79G4CdKWyjVlM0HoYmdCw/4FxkdjD02tE/Uc5YMrePNaEg5Bw4hjCHbx1b6ur
# 6gjeXAtncm9s4sO0l+sIdyiqlxiTry9FSr35WaQ0qPU+Og5zaf1EiWfdl8TRo4qU
# EAATw5A4hyw11GfOGp7oOVkTGvcNB/H7aIxD7emdWZV8+BMRPKo=
# =zTCn
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 07 Nov 2022 14:27:53 EST
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (83 commits)
checkpatch: better pattern for inline comments
hw/virtio: introduce virtio_device_should_start
tests/acpi: update tables for new core count test
bios-tables-test: add test for number of cores > 255
tests/acpi: allow changes for core_count2 test
bios-tables-test: teach test to use smbios 3.0 tables
hw/smbios: add core_count2 to smbios table type 4
vhost-user: Support vhost_dev_start
vhost: Change the sequence of device start
intel-iommu: PASID support
intel-iommu: convert VTD_PE_GET_FPD_ERR() to be a function
intel-iommu: drop VTDBus
intel-iommu: don't warn guest errors when getting rid2pasid entry
vfio: move implement of vfio_get_xlat_addr() to memory.c
tests: virt: Update expected *.acpihmatvirt tables
tests: acpi: aarch64/virt: add a test for hmat nodes with no initiators
hw/arm/virt: Enable HMAT on arm virt machine
tests: Add HMAT AArch64/virt empty table files
tests: acpi: q35: update expected blobs *.hmat-noinitiators expected HMAT:
tests: acpi: q35: add test for hmat nodes without initiators
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The previous fix to virtio_device_started revealed a problem in its
use by both the core and the device code. The core code should be able
to handle the device "starting" while the VM isn't running to handle
the restoration of migration state. To solve this duel use introduce a
new helper for use by the vhost-user backends who all use it to feed a
should_start variable.
We can also pick up a change vhost_user_blk_set_status while we are at
it which follows the same pattern.
Fixes: 9f6bcfd99f (hw/virtio: move vm_running check to virtio_device_started)
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Message-Id: <20221107121407.1010913-1-alex.bennee@linaro.org>
In order to use the increased number of cpus, we need to bring smbios
tables in line with the SMBIOS 3.0 specification. This allows us to
introduce core_count2 which acts as a duplicate of core_count if we have
fewer cores than 256, and contains the actual core number per socket if
we have more.
core_enabled2 and thread_count2 fields work the same way.
Signed-off-by: Julia Suvorova <jusual@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20220731162141.178443-2-jusual@redhat.com>
Message-Id: <20221011111731.101412-2-jusual@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The motivation of adding vhost-user vhost_dev_start support is to
improve backend configuration speed and reduce live migration VM
downtime.
Today VQ configuration is issued one by one. For virtio net with
multi-queue support, backend needs to update RSS (Receive side
scaling) on every rx queue enable. Updating RSS is time-consuming
(typical time like 7ms).
Implement already defined vhost status and message in the vhost
specification [1].
(a) VHOST_USER_PROTOCOL_F_STATUS
(b) VHOST_USER_SET_STATUS
(c) VHOST_USER_GET_STATUS
Send message VHOST_USER_SET_STATUS with VIRTIO_CONFIG_S_DRIVER_OK for
device start and reset(0) for device stop.
On reception of the DRIVER_OK message, backend can apply the needed setting
only once (instead of incremental) and also utilize parallelism on enabling
queues.
This improves QEMU's live migration downtime with vhost user backend
implementation by great margin, specially for the large number of VQs of 64
from 800 msec to 250 msec.
[1] https://qemu-project.gitlab.io/qemu/interop/vhost-user.html
Signed-off-by: Yajun Wu <yajunw@nvidia.com>
Acked-by: Parav Pandit <parav@nvidia.com>
Message-Id: <20221017064452.1226514-3-yajunw@nvidia.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch is part of adding vhost-user vhost_dev_start support. The
motivation is to improve backend configuration speed and reduce live
migration VM downtime.
Moving the device start routines after finishing all the necessary device
and VQ configuration, further aligning to the virtio specification for
"device initialization sequence".
Following patch will add vhost-user vhost_dev_start support.
Signed-off-by: Yajun Wu <yajunw@nvidia.com>
Acked-by: Parav Pandit <parav@nvidia.com>
Message-Id: <20221017064452.1226514-2-yajunw@nvidia.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch introduce ECAP_PASID via "x-pasid-mode". Based on the
existing support for scalable mode, we need to implement the following
missing parts:
1) tag VTDAddressSpace with PASID and support IOMMU/DMA translation
with PASID
2) tag IOTLB with PASID
3) PASID cache and its flush
4) PASID based IOTLB invalidation
For simplicity PASID cache is not implemented so we can simply
implement the PASID cache flush as a no and leave it to be implemented
in the future. For PASID based IOTLB invalidation, since we haven't
had L1 stage support, the PASID based IOTLB invalidation is not
implemented yet. For PASID based device IOTLB invalidation, it
requires the support for vhost so we forbid enabling device IOTLB when
PASID is enabled now. Those work could be done in the future.
Note that though PASID based IOMMU translation is ready but no device
can issue PASID DMA right now. In this case, PCI_NO_PASID is used as
PASID to identify the address without PASID. vtd_find_add_as() has
been extended to provision address space with PASID which could be
utilized by the future extension of PCI core to allow device model to
use PASID based DMA translation.
This feature would be useful for:
1) prototyping PASID support for devices like virtio
2) future vPASID work
3) future PRS and vSVA work
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20221028061436.30093-5-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We used to have a macro for VTD_PE_GET_FPD_ERR() but it has an
internal goto which prevents it from being reused. This patch convert
that macro to a dedicated function and let the caller to decide what
to do (e.g using goto or not). This makes sure it can be re-used for
other function that requires fault reporting.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20221028061436.30093-4-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
We introduce VTDBus structure as an intermediate step for searching
the address space. This works well with SID based matching/lookup. But
when we want to support SID plus PASID based address space lookup,
this intermediate steps turns out to be a burden. So the patch simply
drops the VTDBus structure and use the PCIBus and devfn as the key for
the g_hash_table(). This simplifies the codes and the future PASID
extension.
To prevent being slower for past vtd_find_as_from_bus_num() callers, a
vtd_as cache indexed by the bus number is introduced to store the last
recent search result of a vtd_as belongs to a specific bus.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20221028061436.30093-3-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
We use to warn on wrong rid2pasid entry. But this error could be
triggered by the guest and could happens during initialization. So
let's don't warn in this case.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20221028061436.30093-2-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
- Move the implement vfio_get_xlat_addr to softmmu/memory.c, and
change the name to memory_get_xlat_addr(). So we can use this
function on other devices, such as vDPA device.
- Add a new function vfio_get_xlat_addr in vfio/common.c, and it will check
whether the memory is backed by a discard manager. then device can
have its own warning.
Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20221031031020.1405111-2-lulu@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Since the patchset ("Build ACPI Heterogeneous Memory Attribute Table (HMAT)"),
HMAT is supported, but only x86 is enabled. Enable HMAT on arm virt machine.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Hesham Almatary <hesham.almatary@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20221027100037.251-7-hesham.almatary@huawei.com>
Tested-by: Yicong Yang <yangyicong@hisilicon.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The code currently assumes Q35 iff ICH9 and i440fx iff PIIX. Now that more
AML generation has been moved into the south bridges and since the
machines define themselves primarily through their north bridges, let's
switch to resolving the north bridges for AML generation instead. This
also allows for easier experimentation with different south bridges in
the "pc" machine, e.g. with PIIX4 and VT82xx.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20221028103419.93398-4-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The is_piix4 attribute is set once in one location and read once in
another. Doing both in one location allows for removing the attribute
altogether.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20221026133110.91828-3-shentey@gmail.com>
Message-Id: <20221028103419.93398-3-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Ammends commit b23046abe7 'pc: acpi-build:
simplify PCI bus tree generation'.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20221026133110.91828-2-shentey@gmail.com>
Message-Id: <20221028103419.93398-2-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Early-boot e820 records will be inserted by the bios/efi/early boot
software and be reported to the kernel via insert_resource. Later, when
CXL drivers iterate through the regions again, they will insert another
resource and make the RESERVED memory area a child.
This RESERVED memory area causes the memory region to become unusable,
and as a result attempting to create memory regions with
`cxl create-region ...`
Will fail due to the RESERVED area intersecting with the CXL window.
During boot the following traceback is observed:
0xffffffff81101650 in insert_resource_expand_to_fit ()
0xffffffff83d964c5 in e820__reserve_resources_late ()
0xffffffff83e03210 in pcibios_resource_survey ()
0xffffffff83e04f4a in pcibios_init ()
Which produces a call to reserve the CFMWS area:
(gdb) p *new
$54 = {start = 0x290000000, end = 0x2cfffffff, name = "Reserved",
flags = 0x200, desc = 0x7, parent = 0x0, sibling = 0x0,
child = 0x0}
Later the Kernel parses ACPI tables and reserves the exact same area as
the CXL Fixed Memory Window:
0xffffffff811016a4 in insert_resource_conflict ()
insert_resource ()
0xffffffff81a81389 in cxl_parse_cfmws ()
0xffffffff818c4a81 in call_handler ()
acpi_parse_entries_array ()
(gdb) p/x *new
$59 = {start = 0x290000000, end = 0x2cfffffff, name = "CXL Window 0",
flags = 0x200, desc = 0x0, parent = 0x0, sibling = 0x0,
child = 0x0}
This produces the following output in /proc/iomem:
590000000-68fffffff : CXL Window 0
590000000-68fffffff : Reserved
This reserved area causes `get_free_mem_region()` to fail due to a check
against `__region_intersects()`. Due to this reserved area, the
intersect check will only ever return REGION_INTERSECTS, which causes
`cxl create-region` to always fail.
Signed-off-by: Gregory Price <gregory.price@memverge.com>
Message-Id: <20221026205912.8579-1-gregory.price@memverge.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
There were several different ways to deal with the situation where the
vector specified for a msix function is out of bound:
- early return a function and keep progresssing
- propagate the error to the caller
- mark msix unusable
- assert it is in bound
- just ignore
An out-of-bound vector should not be specified if the device
implementation is correct so let msix functions always assert that the
specified vector is in range.
An exceptional case is virtio-pci, which allows the guest to configure
vectors. For virtio-pci, it is more appropriate to introduce its own
checks because it is sometimes too late to check the vector range in
msix functions.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20220829083524.143640-1-akihiko.odaki@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Yuval Shaia <yuval.shaia.ml@gmail.com>
Signed-off-by: Akihiko Odaki <<a href="mailto:akihiko.odaki@daynix.com" target="_blank">akihiko.odaki@daynix.com</a>><br>
- Fix memset argument order: The second argument is
the value, the length goes last.
- Fix an integer overflow reported by Alexander Bulekov.
Both issues allow the guest to overrun the host buffer
allocated for the ERST memory device.
Cc: Eric DeVolder <eric.devolder@oracle.com
Cc: Alexander Bulekov <alxndr@bu.edu>
Cc: qemu-stable@nongnu.org
Fixes: f7e26ffa59 ("ACPI ERST: support for ACPI ERST feature")
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Christian A. Ehrhardt <lk@c--e.de>
Message-Id: <20221024154233.1043347-1-lk@c--e.de>
Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1268
Reviewed-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Eric DeVolder <eric.devolder@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Move _GPE block declaration before it gets referenced by other
hotplug handlers. While at it move PCI hotplug (_E01) handler
after PCI tree description to avoid forward reference to
to not yet declared methods/devices.
PS:
Forward 'usage' usualy is fine as long as it's hidden within
method, however 'iasl' may print warnings. So be nice
to iasl/guest OS and do things in proper order.
PS2: Also follow up patches will move some of hotplug code
from PCI tree to _E01 and that also requires PCI Device
nodes build first, before Scope can reuse that from
global context.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20221017102146.2254096-11-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
to make that happen (bridge sits at _ADR: 0x001F0003),
relax PCI enumeration logic to include devices with *function* > 0
if device has something to say about itself (i.e. has build_dev_aml
callback set).
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20221017102146.2254096-8-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
PCI-ISA bridges that are built in PIIX/Q35 are building its own AML
using AcpiDevAmlIf interface. Now build_append_pci_bus_devices()
gained AcpiDevAmlIf interface support to get AML of devices atached
to PCI slots.
So drop ad-hoc build_q35_isa_bridge()/build_piix4_isa_bridge()
and let PCI bus enumeration to include PCI-ISA bridge AML
when it's enumerated by build_append_pci_bus_devices().
AML change is mostly contextual, which moves whole ISA hierarchy
directly under PCI host bridge instead of it being described
as separate \SB.PCI0.ISA block.
Note:
If bus/slot that hosts ISA bridge has BSEL set, it will gain new
ASUN and _DMS entries (i.e. acpi-index support, but it should not
cause any functional change and that is fine from PCI Firmware
spec point of view), potentially it's possible to suppress that
by adding a flag to PCIDevice but I don't see a reason to do that
yet, I'd rather treat bridge just as any other PCI device if it's
possible.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20221017102146.2254096-4-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
NB:
We do not expect any functional change in any ACPI tables with this
change. It's only a refactoring.
NB2:
Some targets (or1k) do not support acpi and CONFIG_ACPI is off for them.
However, modules are reused between all architectures so CONFIG_ACPI is
on. For those architectures, dummy stub function definitions help to
resolve symbols. This change uses more of these and so it adds a couple
of dummy stub definitions so that symbols for those can be resolved.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20221017102146.2254096-2-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Ani Sinha <ani@anisinha.ca>
CC: Bernhard Beschow <shentey@gmail.com>
Signed-off-by: Ani Sinha <ani@anisinha.ca>
Message-Id: <20221107152744.868434-1-ani@anisinha.ca>
vhost backend sends host notification for every VQ. If backend creates
VQs in parallel, the VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG may
arrive to QEMU in different order than incremental queue index order.
For example VQ 1's message arrive earlier than VQ 0's:
After alloc VhostUserHostNotifier for VQ 1. GPtrArray becomes
[ nil, VQ1 pointer ]
After alloc VhostUserHostNotifier for VQ 0. GPtrArray becomes
[ VQ0 pointer, nil, VQ1 pointer ]
This is wrong. fetch_notifier will return NULL for VQ 1 in
vhost_user_get_vring_base, causes host notifier miss removal(leak).
The fix is to remove current element from GPtrArray, make the right
position for element to insert.
Fixes: 503e355465 ("virtio/vhost-user: dynamically assign VhostUserHostNotifiers")
Signed-off-by: Yajun Wu <yajunw@nvidia.com>
Acked-by: Parav Pandit <parav@nvidia.com>
Message-Id: <20221018023651.1359420-1-yajunw@nvidia.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Most other virtio-pci devices allow MSI-X, let's have it for rng too.
Signed-off-by: David Daney <david.daney@fungible.com>
Reviewed-by: Marcin Nowakowski <marcin.nowakowski@fungible.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@fungible.com>
Message-Id: <20221014160947.66105-1-philmd@fungible.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add virtqueue reset feature for virtio-net
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Message-Id: <20221017092558.111082-16-xuanzhuo@linux.alibaba.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add virtqueue reset feature for vhost-kernel.
Signed-off-by: Kangjie Xu <kangjie.xu@linux.alibaba.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20221017092558.111082-15-xuanzhuo@linux.alibaba.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Support queue_enable in vhost-kernel scenario. It can be called when
a vq reset operation has been performed and the vq is restared.
It should be noted that we can restart the vq when the vhost has
already started. When launching a new vhost device, the vhost is not
started and all vqs are not initalized until VIRTIO_PCI_COMMON_STATUS
is written. Thus, we should use vhost_started to differentiate the
two cases: vq reset and device start.
Currently it only supports vhost-kernel.
Signed-off-by: Kangjie Xu <kangjie.xu@linux.alibaba.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20221017092558.111082-14-xuanzhuo@linux.alibaba.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
virtio-net and vhost-kernel implement queue reset.
Queued packets in the corresponding queue pair are flushed
or purged.
For virtio-net, userspace datapath will be disabled later in
__virtio_queue_reset(). It will set addr of vring to 0 and idx to 0.
Thus, virtio_net_receive() and virtio_net_flush_tx() will not receive
or send packets.
For vhost-net, the datapath will be disabled in vhost_net_virtqueue_reset().
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Kangjie Xu <kangjie.xu@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20221017092558.111082-13-xuanzhuo@linux.alibaba.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Introduce the fucntion flush_or_purge_queued_packets(), it will be
used in device reset and virtqueue reset. Therefore, we extract the
common logic as a new function.
Signed-off-by: Kangjie Xu <kangjie.xu@linux.alibaba.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20221017092558.111082-12-xuanzhuo@linux.alibaba.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Introduce vhost_net_virtqueue_restart(), which can restart the
specific virtqueue when the vhost net started running before.
If it fails to restart the virtqueue, the device will be stopped.
Here we do not reuse vhost_net_start_one() or vhost_dev_start()
because they work at queue pair level. The mem table and features
do not change, so we can call the vhost_virtqueue_start() to
restart a specific queue.
This patch only considers the case of vhost-kernel, when
NetClientDriver is NET_CLIENT_DRIVER_TAP.
Signed-off-by: Kangjie Xu <kangjie.xu@linux.alibaba.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20221017092558.111082-11-xuanzhuo@linux.alibaba.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Introduce vhost_virtqueue_reset(), which can reset the specific
virtqueue in the device. Then it will unmap vrings and the desc
of the virtqueue.
Here we do not reuse the vhost_net_stop_one() or vhost_dev_stop(),
because they work at queue pair level. We do not use
vhost_virtqueue_stop() because it may stop the device in the
backend.
This patch only considers the case of vhost-kernel, when
NetClientDriver is NET_CLIENT_DRIVER_TAP.
Furthermore, we do not need net->nc->info->poll() because
it enables userspace datapath and we want to stop all
datapaths for this reset virtqueue here.
Signed-off-by: Kangjie Xu <kangjie.xu@linux.alibaba.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20221017092558.111082-10-xuanzhuo@linux.alibaba.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Expose vhost_virtqueue_stop(), we need to use it when resetting a
virtqueue.
Signed-off-by: Kangjie Xu <kangjie.xu@linux.alibaba.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20221017092558.111082-9-xuanzhuo@linux.alibaba.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Expose vhost_virtqueue_start(), we need to use it when restarting a
virtqueue.
Signed-off-by: Kangjie Xu <kangjie.xu@linux.alibaba.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20221017092558.111082-8-xuanzhuo@linux.alibaba.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
PCI devices support device specific vq enable.
Based on this function, the driver can re-enable the virtqueue after the
virtqueue is reset.
Signed-off-by: Kangjie Xu <kangjie.xu@linux.alibaba.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20221017092558.111082-7-xuanzhuo@linux.alibaba.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
PCI devices support vq reset.
Based on this function, the driver can adjust the size of the ring, and
quickly recycle the buffer in the ring.
The migration of the virtio devices will not happen during a reset
operation. This is becuase the global iothread lock is held. Migration
thread also needs the lock. As a result, when migration of virtio
devices starts, the 'reset' status of VirtIOPCIQueue will always be 0.
Thus, we do not need to add it in vmstate_virtio_pci_modern_queue_state.
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Kangjie Xu <kangjie.xu@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20221017092558.111082-6-xuanzhuo@linux.alibaba.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
A a new command line parameter "queue_reset" is added.
Meanwhile, the vq reset feature is disabled for pre-7.2 machines.
Signed-off-by: Kangjie Xu <kangjie.xu@linux.alibaba.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20221017092558.111082-5-xuanzhuo@linux.alibaba.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Introduce the interface queue_enable() in VirtioDeviceClass and the
fucntion virtio_queue_enable() in virtio, it can be called when
VIRTIO_PCI_COMMON_Q_ENABLE is written and related virtqueue can be
started. It only supports the devices of virtio 1 or later. The
not-supported devices can only start the virtqueue when DRIVER_OK.
Signed-off-by: Kangjie Xu <kangjie.xu@linux.alibaba.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20221017092558.111082-4-xuanzhuo@linux.alibaba.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Introduce a new interface function virtio_queue_reset() to implement
reset for vq.
Add a new callback to VirtioDeviceClass for queue reset operation for
each child device.
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20221017092558.111082-3-xuanzhuo@linux.alibaba.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Separate the logic of vq reset. This logic will be called directly
later.
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20221017092558.111082-2-xuanzhuo@linux.alibaba.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In theory the virtio-iommu-pci could be plugged anywhere in the PCIe
topology and as long as the dt/acpi info are properly built this should
work. However at the moment we fail to do that because the
virtio-iommu-pci BDF is not computed at plug time and in that case
vms->virtio_iommu_bdf gets an incorrect value.
For instance if the virtio-iommu-pci is plugged onto a pcie root port
and the virtio-iommu protects a virtio-block-pci device the guest does
not boot.
So let's do not pretend we do support this case and fail the initialize()
if we detect the virtio-iommu-pci is plugged anywhere else than on the
root bus. Anyway this ability is not needed.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20221012163448.121368-1-eric.auger@redhat.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Tested-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This Data Object Exchange Mailbox allows software to query the
latency and bandwidth between ports on the switch. For now
only provide information on routes between the upstream port and
each downstream port (not p2p).
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
--
Changes since v8: Mostly to match the type 3 equivalent
- Move enum out of function and give it a more descriptive namespace.
Message-Id: <20221014151045.24781-6-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The CDAT can be specified in two ways. One is to add ",cdat=<filename>"
in "-device cxl-type3"'s command option. The file is required to provide
the whole CDAT table in binary mode. The other is to use the default
that provides some 'reasonable' numbers based on type of memory and
size.
The DOE capability supporting CDAT is added to hw/mem/cxl_type3.c with
capability offset 0x190. The config read/write to this capability range
can be generated in the OS to request the CDAT data.
Signed-off-by: Huai-Cheng Kuo <hchkuo@avery-design.com.tw>
Signed-off-by: Chris Browy <cbrowy@avery-design.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20221014151045.24781-5-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The Data Object Exchange implementation of CXL Coherent Device Attribute
Table (CDAT). This implementation is referring to "Coherent Device
Attribute Table Specification, Rev. 1.03, July. 2022" and "Compute
Express Link Specification, Rev. 3.0, July. 2022"
This patch adds core support that will be shared by both
end-points and switch port emulation.
Signed-off-by: Huai-Cheng Kuo <hchkuo@avery-design.com.tw>
Signed-off-by: Chris Browy <cbrowy@avery-design.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20221014151045.24781-4-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This will be used by several upcoming patch sets so break it out
such that it doesn't matter which one lands first.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20221014151045.24781-3-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Emulation of PCIe Data Object Exchange (DOE)
PCIE Base Specification r6.0 6.3 Data Object Exchange
Supports multiple DOE PCIe Extended Capabilities for a single PCIe
device. For each capability, a static array of DOEProtocol should be passed
to pcie_doe_init(). The protocols in that array will be registered under
the DOE capability structure. For each protocol, vendor ID, type, and
corresponding callback function (handle_request()) should be implemented.
This callback function represents how the DOE request for corresponding
protocol will be handled.
pcie_doe_{read/write}_config() must be appended to corresponding PCI
device's config_read/write() handler to enable DOE access. In
pcie_doe_read_config(), false will be returned if pci_config_read()
offset is not within DOE capability range. In pcie_doe_write_config(),
the function will have no affect if the address is not within the related
DOE PCIE extended capability.
Signed-off-by: Huai-Cheng Kuo <hchkuo@avery-design.com.tw>
Signed-off-by: Chris Browy <cbrowy@avery-design.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20221014151045.24781-2-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
MADT has been updated with the GIC Structures from ACPI 6.0 Errata A
and so MADT revision and GICC Structure must be updated also.
Fixes: 37f33084ed ("acpi: arm/virt: madt: use build_append_int_noprefix() API to compose MADT table")
Signed-off-by: Miguel Luis <miguel.luis@oracle.com>
Reviewed-by: Ani Sinha <ani@anisinha.ca>
Message-Id: <20221011181730.10885-4-miguel.luis@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Update the Fixed ACPI Description Table (FADT) to revision 6.0 of the ACPI
specification adding the field "Hypervisor Vendor Identity".
This field's description states the following: "64-bit identifier of hypervisor
vendor. All bytes in this field are considered part of the vendor identity.
These identifiers are defined independently by the vendors themselves,
usually following the name of the hypervisor product. Version information
should NOT be included in this field - this shall simply denote the vendor's
name or identifier. Version information can be communicated through a
supplemental vendor-specific hypervisor API. Firmware implementers would
place zero bytes into this field, denoting that no hypervisor is present in
the actual firmware."
Signed-off-by: Miguel Luis <miguel.luis@oracle.com>
Reviewed-by: Ani Sinha <ani@anisinha.ca>
Message-Id: <20221011181730.10885-3-miguel.luis@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* bug fixes for -Wextra
* fix gdb XML for 32-bit x86
* improve error handling for module load
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmNndPcUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroOVVwf+Nfvx9aoDTo6vU4kHmCbh7+BTtcUA
qEkaccL4pfujwNfRl5gHHKtRhgT6py9OjfIX8pAck3jtc5r+5/niFn7CLcOP2G/C
xuyVKPx3ONCMKCLjWwg63I8/t4JZDEnYEzyddAfV7Xb5600aUJlrxScW751K5eYQ
CzHFAASdc00eSiWE2eFL4rV7dcj0NadHnJxfjUcguHX4Qlr1TmK/ihEGv6MMRo5S
9Ak/vLmRs+LsQJm2cXUu5QVQiDwp31xbv3AP9knB+irYxrsp4LSDxV8eKZwLehpW
I4A3gz2xKAprfzVJKZI3Dc/hJSRpoKoCjz+QoI1NeVbWjsnYZAwD+VMSew==
=Nroy
-----END PGP SIGNATURE-----
Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging
* bug fixes for Win32 event loop
* bug fixes for -Wextra
* fix gdb XML for 32-bit x86
* improve error handling for module load
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmNndPcUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroOVVwf+Nfvx9aoDTo6vU4kHmCbh7+BTtcUA
# qEkaccL4pfujwNfRl5gHHKtRhgT6py9OjfIX8pAck3jtc5r+5/niFn7CLcOP2G/C
# xuyVKPx3ONCMKCLjWwg63I8/t4JZDEnYEzyddAfV7Xb5600aUJlrxScW751K5eYQ
# CzHFAASdc00eSiWE2eFL4rV7dcj0NadHnJxfjUcguHX4Qlr1TmK/ihEGv6MMRo5S
# 9Ak/vLmRs+LsQJm2cXUu5QVQiDwp31xbv3AP9knB+irYxrsp4LSDxV8eKZwLehpW
# I4A3gz2xKAprfzVJKZI3Dc/hJSRpoKoCjz+QoI1NeVbWjsnYZAwD+VMSew==
# =Nroy
# -----END PGP SIGNATURE-----
# gpg: Signature made Sun 06 Nov 2022 03:48:55 EST
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu:
accel: abort if we fail to load the accelerator plugin
dmg: warn when opening dmg images containing blocks of unknown type
module: add Error arguments to module_load and module_load_qom
module: rename module_load_one to module_load
module: removed unused function argument "mayfail"
Add missing include statement for global xml_builtin
meson: avoid unused arguments of main() in compiler tests
Fix broken configure with -Wunused-parameter
gdb-xml: Fix size of EFER register on i386 architecture when debugged by GDB
util/aio-win32: Correct the event array size in aio_poll()
util/main-loop: Avoid adding the same HANDLE twice
util/main-loop: Fix maximum number of wait objects for win32
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>