Define the new macro VMSTATE_INSTANCE_ID_ANY for callers who wants to
auto-generate the vmstate instance ID. Previously it was hard coded
as -1 instead of this macro. It helps to change this default value in
the follow up patches. No functional change.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
If the redirected device has this capability, Windows guest may
place the device into D2 and expect it to wake when the device
becomes active, but this will never happen. For example, when
internal Bluetooth adapter is redirected, keyboards and mice
connected to it do not work. Current commit removes this
capability (starting from machine 5.0)
Set 'usb-redir.suppress-remote-wake' property to 'off' to keep
'remote wake' as is or to 'on' to remove 'remote wake' on
4.2 or earlier.
Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com>
Message-id: 20200108091044.18055-3-yuri.benditovich@daynix.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
If the redirected device has this capability, Windows guest may
place the device into D2 and expect it to wake when the device
becomes active, but this will never happen. For example, when
internal Bluetooth adapter is redirected, keyboards and mice
connected to it do not work. Current commit removes this
capability (starting from machine 5.0)
Set 'usb-host.suppress-remote-wake' property to 'off' to keep
'remote wake' as is or to 'on' to remove 'remote wake' on
4.2 or earlier.
Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com>
Message-id: 20200108091044.18055-2-yuri.benditovich@daynix.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Hi,
QDEV_PROP_PTR is marked in multiple places as "FIXME/TODO/remove
me". In most cases, it can be easily replaced with QDEV_PROP_LINK when
the pointer points to an Object.
There are a few places where such substitution isn't possible. For
those places, it seems reasonable to use a specific setter method
instead, and keep the user_creatable = false. In other places,
proper usage of qdev or other facilies is the solution.
The serial code wasn't converted to qdev, which makes it a bit more
archaic to deal with. Let's convert it first, so we can more easily
embed it from other devices, and re-export some properties and drop
QDEV_PROP_PTR usage.
-----BEGIN PGP SIGNATURE-----
iQJQBAABCAA6FiEEh6m9kz+HxgbSdvYt2ujhCXWWnOUFAl4UnUYcHG1hcmNhbmRy
ZS5sdXJlYXVAcmVkaGF0LmNvbQAKCRDa6OEJdZac5cfYEACcTfXklXdxLDj94Q5/
d6MxYqZWckO+vyMqOwonodl9BS3clpDDxbYzyfTpqwKS2cVg1eUUBPR7/eioX6zT
grM0rlgsKWJf9UurJwJWw7Zys7dXZMVJ2BdigLUEZrv9hFF15t344qoKgk4wYmBj
2wC7l7j2WZZ0vtXN7IH4/ZXnaN5/kdoPj6BrF0oNSJaq1AjPByQxmOJhvrxVsm6y
gn3la4XbfMIC68qPjcDJAScGXtCWG1Vydw9cFHwRpMfcvPyL70l6FMjIwrLYNQ9b
j1AkcEXeev5nWT+gLGxt+TGXB0Sd2ID9uRYxhyZRA4fdjHFtlWfdOwepOOlSlTO+
yfpf9STDLuDQGLTJyNZpYGGDDcm4xsJ8arD/7/Mq/35BQl9ZUT+m6uC1tDhxEHzf
+AD/Kh8rMptyAjwtqD2XbqyLoaFJCsPjZbjTj3SY08WaeqClmaAbSD2eaJiNXy4H
+rFg9P/eOB+71R1AoMKfiBFzdGV6TG5PLZOJ/oN02yqp0oW8eDWYcETB3j0tIgS1
u2WVCS2cd8IqYa+UQ7COOpoX0UwICmIWV64kxioD7uFQiK/1nQYw4UnPHv29qY6k
fTa8jUC5hPiDN1rRYqNpNoVJsstSZfSgpo5jV75sxSyDucupu+SM9qmo3+fBab+q
Eol3Ypz4virkNU8IYCYFFiG4Qg==
=iYVd
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/elmarco/tags/prop-ptr-pull-request' into staging
Clean-ups: qom-ify serial and remove QDEV_PROP_PTR
Hi,
QDEV_PROP_PTR is marked in multiple places as "FIXME/TODO/remove
me". In most cases, it can be easily replaced with QDEV_PROP_LINK when
the pointer points to an Object.
There are a few places where such substitution isn't possible. For
those places, it seems reasonable to use a specific setter method
instead, and keep the user_creatable = false. In other places,
proper usage of qdev or other facilies is the solution.
The serial code wasn't converted to qdev, which makes it a bit more
archaic to deal with. Let's convert it first, so we can more easily
embed it from other devices, and re-export some properties and drop
QDEV_PROP_PTR usage.
# gpg: Signature made Tue 07 Jan 2020 15:01:26 GMT
# gpg: using RSA key 87A9BD933F87C606D276F62DDAE8E10975969CE5
# gpg: issuer "marcandre.lureau@redhat.com"
# gpg: Good signature from "Marc-André Lureau <marcandre.lureau@redhat.com>" [full]
# gpg: aka "Marc-André Lureau <marcandre.lureau@gmail.com>" [full]
# Primary key fingerprint: 87A9 BD93 3F87 C606 D276 F62D DAE8 E109 7596 9CE5
* remotes/elmarco/tags/prop-ptr-pull-request: (37 commits)
qdev/qom: remove some TODO limitations now that PROP_PTR is gone
qdev: remove QDEV_PROP_PTR
qdev: remove PROP_MEMORY_REGION
omap-gpio: remove PROP_PTR
omap-i2c: remove PROP_PTR
omap-intc: remove PROP_PTR
smbus-eeprom: remove PROP_PTR
cris: improve passing PIC interrupt vector to the CPU
mips/cps: fix setting saar property
qdev: use g_strcmp0() instead of open-coding it
leon3: use qdev gpio facilities for the PIL
leon3: use qemu_irq framework instead of callback as property
dp8393x: replace PROP_PTR with PROP_LINK
etraxfs: remove PROP_PTR usage
lance: replace PROP_PTR with PROP_LINK
vmmouse: replace PROP_PTR with PROP_LINK
sm501: make SerialMM a child, export chardev property
mips: use sysbus_mmio_get_region() instead of internal fields
mips: use sysbus_add_io()
mips: baudbase is 115200 by default
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Bugfixes all over the place.
HMAT support.
New flags for vhost-user-blk utility.
Auto-tuning of seg max for virtio storage.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAl4TaMEPHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRpvzgH/2LyDAzCa9h93ikSJjmyUk5FUaqve38daEb3
S3JYjwKxQx7u1ydooKhvBQnBCZ2i3S+k62gfYyKB+nBv8xvjs0Eg5D1YJ5E8hciy
lf5OFGWWtX2iPDjZwQwT13kiJe0o3JRGxJJ6XqTEG+1EYOp7cky/FEv4PD030b9m
I2wROZ/Am+onB9YJX8c0Vv1CG+AryuJNXnvwQzTXEjj4U7bEYUyJwVZaCRyAdWQ3
uYXIZN9VwjVX6BFvy9ZAJbEsUVJvOM1/aQaDqcrLz+VlzRT7bRkKHi2G3vakrm1I
r5OpgyLo84132awCncbSykKDH5o8WaxLaJBjGmuBfasMz9wPzAg=
=uL1o
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
virtio, pci, pc: fixes, features
Bugfixes all over the place.
HMAT support.
New flags for vhost-user-blk utility.
Auto-tuning of seg max for virtio storage.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Mon 06 Jan 2020 17:05:05 GMT
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream: (32 commits)
intel_iommu: add present bit check for pasid table entries
intel_iommu: a fix to vtd_find_as_from_bus_num()
virtio-net: delete also control queue when TX/RX deleted
virtio: reset region cache when on queue deletion
virtio-mmio: update queue size on guest write
tests: add virtio-scsi and virtio-blk seg_max_adjust test
virtio: make seg_max virtqueue size dependent
hw: fix using 4.2 compat in 5.0 machine types for i440fx/q35
vhost-user-scsi: reset the device if supported
vhost-user: add VHOST_USER_RESET_DEVICE to reset devices
hw/pci/pci_host: Let pci_data_[read/write] use unsigned 'size' argument
hw/pci/pci_host: Remove redundant PCI_DPRINTF()
virtio-mmio: Clear v2 transport state on soft reset
ACPI: add expected files for HMAT tests (acpihmat)
tests/bios-tables-test: add test cases for ACPI HMAT
tests/numa: Add case for QMP build HMAT
hmat acpi: Build Memory Side Cache Information Structure(s)
hmat acpi: Build System Locality Latency and Bandwidth Information Structure(s)
hmat acpi: Build Memory Proximity Domain Attributes Structure(s)
numa: Extend CLI to provide memory side cache information
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
No longer used in the tree. The comment about user_creatable is still
quite relevant, but there is already a similar comment in qdev-core.h.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Hi,
With external processes or helpers participating to the VM support, it
becomes necessary to handle their migration. Various options exist to
transfer their state:
1) as the VM memory, RAM or devices (we could say that's how
vhost-user devices can be handled today, they are expected to
restore from ring state)
2) other "vmstate" (as with TPM emulator state blobs)
3) left to be handled by management layer
1) is not practical, since an external processes may legitimatelly
need arbitrary state date to back a device or a service, or may not
even have an associated device.
2) needs ad-hoc code for each helper, but is simple and working
3) is complicated for management layer, QEMU has the migration timing
The proposed "dbus-vmstate" object will connect to a given D-Bus
address, and save/load from org.qemu.VMState1 owners on migration.
Thus helpers can easily have their state migrated with QEMU, without
implementing ad-hoc support (such as done for TPM emulation)
D-Bus is ubiquitous on Linux (it is systemd IPC), and can be made to
work on various other OSes. There are several implementations and good
bindings for various languages. (the tests/dbus-vmstate-test.c is a
good example of how simple the implementation of services can be, even
in C)
dbus-vmstate is put into use by the libvirt series "[PATCH 00/23] Use
a slirp helper process".
v2:
- fix build with broken mingw-glib
-----BEGIN PGP SIGNATURE-----
iQJQBAABCAA6FiEEh6m9kz+HxgbSdvYt2ujhCXWWnOUFAl4TR5ccHG1hcmNhbmRy
ZS5sdXJlYXVAcmVkaGF0LmNvbQAKCRDa6OEJdZac5R6EEACFTd4hDG8i/GnxCFut
MGcTusJr+2IklIT/K0qpLf0axNUoIqycwv8m0T9QhoG8h+9lMykOd1YJpNetT5qK
gifOF2gcPK/9WIdFbX7dLSUAWpzO6fG/RzKK65Nc1uJSnXlb8JV0BU/6FrfCE+3U
Bg5PvVtxxtwejQfQPOI7bPxOqxr/SmjUGcbFgacMAMG0Lm/VG/92kdoC6Z4Xf/bd
FcAeiO2CiPoGXG5zD4WF1emwxnSu65PgcFpSpqvvFlmDbYlTwoMt4VWxTfkAzbAM
IES7j2IbhUEe3p0hvMTqmmsmds1QNCBgnQI/LtQiXPTnbfpBcZ0wT6QsSZXWvHz8
ClA9OAimxyELblTGjD9vsi3G5m2DQS+NdfPOX7hfHouVQzDJJaS8jxDItpPgXwSO
fZ9mUO8ps3N2YTakuKNBP/IzDOuyExrBg80GF+HbEc59Uhj8Yq/awyz1XsqjQzVP
54+TUjwC8HZxVWgMeqiJ1njPTfRJo6uAnguLbfAXj8P9vaXLtsy/3JGsmKiziXXW
XzvQDzhfOMjm7Uo7vN7Hp3X/UYJxnaQ3dViqZnv/gqG6yv+igVlqyrTx2IBhN2NW
DZt3c7VqVUBYFShLgfy0zDjzM/s7mFkQKCFHUsBqIwODugYEc3TTdAa60QYjX5i9
negngax45KM6nF3tq74fJpwWVw==
=M4kD
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/elmarco/tags/dbus-vmstate7-pull-request' into staging
Add dbus-vmstate
Hi,
With external processes or helpers participating to the VM support, it
becomes necessary to handle their migration. Various options exist to
transfer their state:
1) as the VM memory, RAM or devices (we could say that's how
vhost-user devices can be handled today, they are expected to
restore from ring state)
2) other "vmstate" (as with TPM emulator state blobs)
3) left to be handled by management layer
1) is not practical, since an external processes may legitimatelly
need arbitrary state date to back a device or a service, or may not
even have an associated device.
2) needs ad-hoc code for each helper, but is simple and working
3) is complicated for management layer, QEMU has the migration timing
The proposed "dbus-vmstate" object will connect to a given D-Bus
address, and save/load from org.qemu.VMState1 owners on migration.
Thus helpers can easily have their state migrated with QEMU, without
implementing ad-hoc support (such as done for TPM emulation)
D-Bus is ubiquitous on Linux (it is systemd IPC), and can be made to
work on various other OSes. There are several implementations and good
bindings for various languages. (the tests/dbus-vmstate-test.c is a
good example of how simple the implementation of services can be, even
in C)
dbus-vmstate is put into use by the libvirt series "[PATCH 00/23] Use
a slirp helper process".
v2:
- fix build with broken mingw-glib
# gpg: Signature made Mon 06 Jan 2020 14:43:35 GMT
# gpg: using RSA key 87A9BD933F87C606D276F62DDAE8E10975969CE5
# gpg: issuer "marcandre.lureau@redhat.com"
# gpg: Good signature from "Marc-André Lureau <marcandre.lureau@redhat.com>" [full]
# gpg: aka "Marc-André Lureau <marcandre.lureau@gmail.com>" [full]
# Primary key fingerprint: 87A9 BD93 3F87 C606 D276 F62D DAE8 E109 7596 9CE5
* remotes/elmarco/tags/dbus-vmstate7-pull-request:
tests: add dbus-vmstate-test
tests: add migration-helpers unit
dockerfiles: add dbus-daemon to some of latest distributions
configure: add GDBUS_CODEGEN
Add dbus-vmstate object
util: add dbus helper unit
docs: start a document to describe D-Bus usage
vmstate: replace DeviceState with VMStateIf
vmstate: add qom interface to get id
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Before the patch, seg_max parameter was immutable and hardcoded
to 126 (128 - 2) without respect to queue size. This has two negative effects:
1. when queue size is < 128, we have Virtio 1.1 specfication violation:
(2.6.5.3.1 Driver Requirements) seq_max must be <= queue_size.
This violation affects the old Linux guests (ver < 4.14). These guests
crash on these queue_size setups.
2. when queue_size > 128, as was pointed out by Denis Lunev <den@virtuozzo.com>,
seg_max restrics guest's block request length which affects guests'
performance making them issues more block request than needed.
https://lists.gnu.org/archive/html/qemu-devel/2017-12/msg03721.html
To mitigate this two effects, the patch adds the property adjusting seg_max
to queue size automaticaly. Since seg_max is a guest visible parameter,
the property is machine type managable and allows to choose between
old (seg_max = 126 always) and new (seg_max = queue_size - 2) behaviors.
Not to change the behavior of the older VMs, prevent setting the default
seg_max_adjust value for older machine types.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
Message-Id: <20191220140905.1718-2-dplotnikov@virtuozzo.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Replace DeviceState dependency with VMStateIf on vmstate API.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Add an interface to get the instance id, instead of depending on
Device and qdev_get_dev_path().
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Add -numa hmat-cache option to provide Memory Side Cache Information.
These memory attributes help to build Memory Side Cache Information
Structure(s) in ACPI Heterogeneous Memory Attribute Table (HMAT).
Before using hmat-cache option, enable HMAT with -machine hmat=on.
Acked-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Liu Jingqi <jingqi.liu@intel.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20191213011929.2520-4-tao3.xu@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Add -numa hmat-lb option to provide System Locality Latency and
Bandwidth Information. These memory attributes help to build
System Locality Latency and Bandwidth Information Structure(s)
in ACPI Heterogeneous Memory Attribute Table (HMAT). Before using
hmat-lb option, enable HMAT with -machine hmat=on.
Acked-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Liu Jingqi <jingqi.liu@intel.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20191213011929.2520-3-tao3.xu@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
In ACPI 6.3 chapter 5.2.27 Heterogeneous Memory Attribute Table (HMAT),
The initiator represents processor which access to memory. And in 5.2.27.3
Memory Proximity Domain Attributes Structure, the attached initiator is
defined as where the memory controller responsible for a memory proximity
domain. With attached initiator information, the topology of heterogeneous
memory can be described. Add new machine property 'hmat' to enable all
HMAT specific options.
Extend CLI of "-numa node" option to indicate the initiator numa node-id.
In the linux kernel, the codes in drivers/acpi/hmat/hmat.c parse and report
the platform's HMAT tables. Before using initiator option, enable HMAT with
-machine hmat=on.
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Jingqi Liu <jingqi.liu@intel.com>
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20191213011929.2520-2-tao3.xu@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently the SLOF firmware for pseries guests will disable/re-enable
a PCI device multiple times via IO/MEM/MASTER bits of PCI_COMMAND
register after the initial probe/feature negotiation, as it tends to
work with a single device at a time at various stages like probing
and running block/network bootloaders without doing a full reset
in-between.
In QEMU, when PCI_COMMAND_MASTER is disabled we disable the
corresponding IOMMU memory region, so DMA accesses (including to vring
fields like idx/flags) will no longer undergo the necessary
translation. Normally we wouldn't expect this to happen since it would
be misbehavior on the driver side to continue driving DMA requests.
However, in the case of pseries, with iommu_platform=on, we trigger the
following sequence when tearing down the virtio-blk dataplane ioeventfd
in response to the guest unsetting PCI_COMMAND_MASTER:
#2 0x0000555555922651 in virtqueue_map_desc (vdev=vdev@entry=0x555556dbcfb0, p_num_sg=p_num_sg@entry=0x7fffe657e1a8, addr=addr@entry=0x7fffe657e240, iov=iov@entry=0x7fffe6580240, max_num_sg=max_num_sg@entry=1024, is_write=is_write@entry=false, pa=0, sz=0)
at /home/mdroth/w/qemu.git/hw/virtio/virtio.c:757
#3 0x0000555555922a89 in virtqueue_pop (vq=vq@entry=0x555556dc8660, sz=sz@entry=184)
at /home/mdroth/w/qemu.git/hw/virtio/virtio.c:950
#4 0x00005555558d3eca in virtio_blk_get_request (vq=0x555556dc8660, s=0x555556dbcfb0)
at /home/mdroth/w/qemu.git/hw/block/virtio-blk.c:255
#5 0x00005555558d3eca in virtio_blk_handle_vq (s=0x555556dbcfb0, vq=0x555556dc8660)
at /home/mdroth/w/qemu.git/hw/block/virtio-blk.c:776
#6 0x000055555591dd66 in virtio_queue_notify_aio_vq (vq=vq@entry=0x555556dc8660)
at /home/mdroth/w/qemu.git/hw/virtio/virtio.c:1550
#7 0x000055555591ecef in virtio_queue_notify_aio_vq (vq=0x555556dc8660)
at /home/mdroth/w/qemu.git/hw/virtio/virtio.c:1546
#8 0x000055555591ecef in virtio_queue_host_notifier_aio_poll (opaque=0x555556dc86c8)
at /home/mdroth/w/qemu.git/hw/virtio/virtio.c:2527
#9 0x0000555555d02164 in run_poll_handlers_once (ctx=ctx@entry=0x55555688bfc0, timeout=timeout@entry=0x7fffe65844a8)
at /home/mdroth/w/qemu.git/util/aio-posix.c:520
#10 0x0000555555d02d1b in try_poll_mode (timeout=0x7fffe65844a8, ctx=0x55555688bfc0)
at /home/mdroth/w/qemu.git/util/aio-posix.c:607
#11 0x0000555555d02d1b in aio_poll (ctx=ctx@entry=0x55555688bfc0, blocking=blocking@entry=true)
at /home/mdroth/w/qemu.git/util/aio-posix.c:639
#12 0x0000555555d0004d in aio_wait_bh_oneshot (ctx=0x55555688bfc0, cb=cb@entry=0x5555558d5130 <virtio_blk_data_plane_stop_bh>, opaque=opaque@entry=0x555556de86f0)
at /home/mdroth/w/qemu.git/util/aio-wait.c:71
#13 0x00005555558d59bf in virtio_blk_data_plane_stop (vdev=<optimized out>)
at /home/mdroth/w/qemu.git/hw/block/dataplane/virtio-blk.c:288
#14 0x0000555555b906a1 in virtio_bus_stop_ioeventfd (bus=bus@entry=0x555556dbcf38)
at /home/mdroth/w/qemu.git/hw/virtio/virtio-bus.c:245
#15 0x0000555555b90dbb in virtio_bus_stop_ioeventfd (bus=bus@entry=0x555556dbcf38)
at /home/mdroth/w/qemu.git/hw/virtio/virtio-bus.c:237
#16 0x0000555555b92a8e in virtio_pci_stop_ioeventfd (proxy=0x555556db4e40)
at /home/mdroth/w/qemu.git/hw/virtio/virtio-pci.c:292
#17 0x0000555555b92a8e in virtio_write_config (pci_dev=0x555556db4e40, address=<optimized out>, val=1048832, len=<optimized out>)
at /home/mdroth/w/qemu.git/hw/virtio/virtio-pci.c:613
I.e. the calling code is only scheduling a one-shot BH for
virtio_blk_data_plane_stop_bh, but somehow we end up trying to process
an additional virtqueue entry before we get there. This is likely due
to the following check in virtio_queue_host_notifier_aio_poll:
static bool virtio_queue_host_notifier_aio_poll(void *opaque)
{
EventNotifier *n = opaque;
VirtQueue *vq = container_of(n, VirtQueue, host_notifier);
bool progress;
if (!vq->vring.desc || virtio_queue_empty(vq)) {
return false;
}
progress = virtio_queue_notify_aio_vq(vq);
namely the call to virtio_queue_empty(). In this case, since no new
requests have actually been issued, shadow_avail_idx == last_avail_idx,
so we actually try to access the vring via vring_avail_idx() to get
the latest non-shadowed idx:
int virtio_queue_empty(VirtQueue *vq)
{
bool empty;
...
if (vq->shadow_avail_idx != vq->last_avail_idx) {
return 0;
}
rcu_read_lock();
empty = vring_avail_idx(vq) == vq->last_avail_idx;
rcu_read_unlock();
return empty;
but since the IOMMU region has been disabled we get a bogus value (0
usually), which causes virtio_queue_empty() to falsely report that
there are entries to be processed, which causes errors such as:
"virtio: zero sized buffers are not allowed"
or
"virtio-blk missing headers"
and puts the device in an error state.
This patch works around the issue by introducing virtio_set_disabled(),
which sets a 'disabled' flag to bypass checks like virtio_queue_empty()
when bus-mastering is disabled. Since we'd check this flag at all the
same sites as vdev->broken, we replace those checks with an inline
function which checks for either vdev->broken or vdev->disabled.
The 'disabled' flag is only migrated when set, which should be fairly
rare, but to maintain migration compatibility we disable it's use for
older machine types. Users requiring the use of the flag in conjunction
with older machine types can set it explicitly as a virtio-device
option.
NOTES:
- This leaves some other oddities in play, like the fact that
DRIVER_OK also gets unset in response to bus-mastering being
disabled, but not restored (however the device seems to continue
working)
- Similarly, we disable the host notifier via
virtio_bus_stop_ioeventfd(), which seems to move the handling out
of virtio-blk dataplane and back into the main IO thread, and it
ends up staying there till a reset (but otherwise continues working
normally)
Cc: David Gibson <david@gibson.dropbear.id.au>,
Cc: Alexey Kardashevskiy <aik@ozlabs.ru>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Message-Id: <20191120005003.27035-1-mdroth@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Commit aa57020774, by mistake used MachineClass::numa_mem_supported
to check if NUMA is supported by machine and also as unrelated change
set it to true for sbsa-ref board.
Luckily change didn't break machines that support NUMA, as the field
is set to true for them.
But the field is not intended for checking if NUMA is supported and
will be flipped to false within this release for new machine types.
Fix it:
- by using previously used condition
!mc->cpu_index_to_instance_props || !mc->get_default_cpu_node_id
the first time and then use MachineState::numa_state down the road
to check if NUMA is supported
- dropping stray sbsa-ref chunk
Fixes: aa57020774
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1576154936-178362-3-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Currently parse_numa_node() is always called from already numa
enabled context.
Drop unnecessary check if numa is supported.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1576154936-178362-2-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Rename Error ** parameter in check_only_migratable to common errp.
In device_set_realized:
- Move "if (local_err != NULL)" closer to error setters.
- Drop 'Error **local_errp': it doesn't save any LoCs, but it's very
unusual.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20191205174635.18758-10-vsementsov@virtuozzo.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
We don't need Error **, as all callers pass local Error object, which
isn't used after the call. Use Error * instead.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20191205174635.18758-5-vsementsov@virtuozzo.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
fit_load_fdt() passes @errp to fit_image_addr(), then recovers from
ENOENT failures. Passing @errp is wrong, because it works only as
long as @errp is neither @error_fatal nor @error_abort. Error
recovery dereferences @errp. That's also wrong; see the big comment
in error.h. Error recovery can leave *errp pointing to a freed
Error object. Wrong, it must be null on success. Messed up in
commit 3eb99edb48 "loader-fit: Wean off error_printf()".
No caller actually passes such values, or uses *errp on success.
Fix anyway: splice in a local Error *err, and error_propagate().
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20191204093625.14836-8-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
The first machine property to fall is Xen's Intel integrated graphics
passthrough. The "-machine igd-passthru" option does not set anymore
a property on the machine object, but desugars to a GlobalProperty on
accelerator objects.
The setter is very simple, since the value ends up in a
global variable, so this patch also provides an example before the more
complicated cases that follow it.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Drop the "accel" property from MachineState, and instead desugar
"-machine accel=" to a list of "-accel" options.
This has a semantic change due to removing merge_lists from -accel.
For example:
- "-accel kvm -accel tcg" all but ignored "-accel kvm". This is a bugfix.
- "-accel kvm -accel thread=single" ignored "thread=single", since it
applied the option to KVM. Now it fails due to not specifying the
accelerator on "-accel thread=single".
- "-accel tcg -accel thread=single" chose single-threaded TCG, while now
it will fail due to not specifying the accelerator on "-accel
thread=single".
Also, "-machine accel" and "-accel" become incompatible.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Virtio spec 1.1 (and earlier), 5.2.5.2 Driver Requirements: Device
Initialization:
"Devices SHOULD always offer VIRTIO_BLK_F_FLUSH, and MUST offer it if
they offer VIRTIO_BLK_F_CONFIG_WCE"
Currently F_CONFIG_WCE and F_WCE are not connected to each other.
Qemu will advertise F_CONFIG_WCE if config-wce argument is
set for virtio-blk device. And F_WCE is advertised only if
underlying block backend actually has it's caching enabled.
Fix this by advertising F_WCE if F_CONFIG_WCE is also advertised.
To preserve backwards compatibility with newer machine types make this
behaviour governed by "x-enable-wce-if-config-wce" virtio-blk-device
property and introduce hw_compat_4_2 with new property being off by
default for all machine types <= 4.2 (but don't introduce 4.3
machine type itself yet).
Signed-off-by: Evgeny Yakovlev <wrfsh@yandex-team.ru>
Message-Id: <1572978137-189218-1-git-send-email-wrfsh@yandex-team.ru>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
If memory allocation fails when using -mem-path, QEMU is supposed to print
out a message to indicate that fallback to anonymous RAM is deprecated. This
is done with error_printf() which does output buffering. As a consequence,
the message is only printed at the next flush, eg. when quiting QEMU, and
it also lacks a trailing newline:
qemu-system-ppc64: unable to map backing store for guest RAM: Cannot allocate memory
qemu-system-ppc64: warning: falling back to regular RAM allocation
QEMU 4.1.50 monitor - type 'help' for more information
(qemu) q
This is deprecated. Make sure that -mem-path specified path has sufficient resources to allocate -m specified RAM amountgreg@boss02:~/Work/qemu/qemu-spapr$
Add the missing \n to fix both issues.
Fixes: cb79224b7e "deprecate -mem-path fallback to anonymous RAM"
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <157304440026.351774.14607704217028190097.stgit@bahia.lan>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Now all the users of ptimers have converted to the transaction-based
API, we can remove ptimer_init_with_bh() and all the code paths
that are used only by bottom-half based ptimers, and tidy up the
documentation comments to consider the transaction-based API the
only possibility.
The code changes result from:
* s->bh no longer exists
* s->callback is now always non-NULL
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191025142411.17085-1-peter.maydell@linaro.org
This function isn't used anymore.
This reverts commit 22ec3283ef.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Esteban Bosse <estebanbosse@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
- use --enable-plugins @ configure
- low impact introspection (-plugin empty.so to measure overhead)
- plugins cannot alter guest state
- example plugins included in source tree (tests/plugins)
- -d plugin to enable plugin output in logs
- check-tcg runs extra tests when plugins enabled
- documentation in docs/devel/plugins.rst
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAl23BZMACgkQ+9DbCVqe
KkRPegf/QHygZ4ER2jOaWEookxiOEcik+dzQKVGNqLNXeMLvo5fGjGVpFoFxSgfv
ZvCAL4xbW44zsYlVfh59tfn4Tu9qK7s7/qM3WXpHsmuvEuhoWef0Lt2jSe+D46Rs
KeG/aX+rHLUR8rr9eCgE+1/MQmxPUj3VUonkUpNkk2ebBbSNoLSOudB4DD9Vcyl7
Pya1kPvA6W9bwI20ZSWihE7flg13o62Pp+LgAFLrsfxXOxOMkPrU8Pp+B0Dvr+hL
5Oh0clZLhiRi75x+KVGZ90TVsoftdjYoOWGMOudS/+NNmqKT1NTLm0K1WJYyRMQ1
V0ne4/OcGNq7x8gcOx/xs09ADu5/VA==
=UXR/
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/stsquad/tags/pull-tcg-plugins-281019-4' into staging
TCG Plugins initial implementation
- use --enable-plugins @ configure
- low impact introspection (-plugin empty.so to measure overhead)
- plugins cannot alter guest state
- example plugins included in source tree (tests/plugins)
- -d plugin to enable plugin output in logs
- check-tcg runs extra tests when plugins enabled
- documentation in docs/devel/plugins.rst
# gpg: Signature made Mon 28 Oct 2019 15:13:23 GMT
# gpg: using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [full]
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44
* remotes/stsquad/tags/pull-tcg-plugins-281019-4: (57 commits)
travis.yml: enable linux-gcc-debug-tcg cache
MAINTAINERS: add me for the TCG plugins code
scripts/checkpatch.pl: don't complain about (foo, /* empty */)
.travis.yml: add --enable-plugins tests
include/exec: wrap cpu_ldst.h in CONFIG_TCG
accel/stubs: reduce headers from tcg-stub
tests/plugin: add hotpages to analyse memory access patterns
tests/plugin: add instruction execution breakdown
tests/plugin: add a hotblocks plugin
tests/tcg: enable plugin testing
tests/tcg: drop test-i386-fprem from TESTS when not SLOW
tests/tcg: move "virtual" tests to EXTRA_TESTS
tests/tcg: set QEMU_OPTS for all cris runs
tests/tcg/Makefile.target: fix path to config-host.mak
tests/plugin: add sample plugins
linux-user: support -plugin option
vl: support -plugin option
plugin: add qemu_plugin_outs helper
plugin: add qemu_plugin_insn_disas helper
plugin: expand the plugin_init function to include an info block
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In "b06424de62 migration: Disable hotplug/unplug during migration" we
added a check to disable unplug for all devices until we have figured
out what works. For failover primary devices qdev_unplug() is called
from the migration handler, i.e. during migration.
This patch adds a flag to DeviceState which is set to false for all
devices and makes an exception for PCI devices that are also
primary devices in a failover pair.
Signed-off-by: Jens Freimann <jfreimann@redhat.com>
Message-Id: <20191029114905.6856-8-jfreimann@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This adds support for hiding a device to the qbus and qdev APIs. The
first user of this will be the virtio-net failover feature but the API
introduced with this patch could be used to implement other features as
well, for example hiding pci devices when a pci bus is powered off.
qdev_device_add() is modified to check for a failover_pair_id
argument in the option string. A DeviceListener callback
should_be_hidden() is added. It can be used by a standby device to
inform qdev that this device should not be added now. The standby device
handler can store the device options to plug the device in at a later
point in time.
One reason for hiding the device is that we don't want to expose both
devices to the guest kernel until the respective virtio feature bit
VIRTIO_NET_F_STANDBY was negotiated and we know that the devices will be
handled correctly by the guest.
More information on the kernel feature this is using:
https://www.kernel.org/doc/html/latest/networking/net_failover.html
An example where the primary device is a vfio-pci device and the standby
device is a virtio-net device:
A device is hidden when it has an "failover_pair_id" option, e.g.
-device virtio-net-pci,...,failover=on,...
-device vfio-pci,...,failover_pair_id=net1,...
Signed-off-by: Jens Freimann <jfreimann@redhat.com>
Message-Id: <20191029114905.6856-2-jfreimann@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Add MachineClass::auto_enable_numa field. When it is true, a NUMA node
is expected to be created implicitly.
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20190905083238.1799-1-tao3.xu@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Provide the new transaction-based API. If a ptimer is created
using ptimer_init() rather than ptimer_init_with_bh(), then
instead of providing a QEMUBH, it provides a pointer to the
callback function directly, and has opted into the transaction
API. All calls to functions which modify ptimer state:
- ptimer_set_period()
- ptimer_set_freq()
- ptimer_set_limit()
- ptimer_set_count()
- ptimer_run()
- ptimer_stop()
must be between matched calls to ptimer_transaction_begin()
and ptimer_transaction_commit(). When ptimer_transaction_commit()
is called it will evaluate the state of the timer after all the
changes in the transaction, and call the callback if necessary.
In the old API the individual update functions generally would
call ptimer_trigger() immediately, which would schedule the QEMUBH.
In the new API the update functions will instead defer the
"set s->next_event and call ptimer_reload()" work to
ptimer_transaction_commit().
Because ptimer_trigger() can now immediately call into the
device code which may then call other ptimer functions that
update ptimer_state fields, we must be more careful in
ptimer_reload() not to cache fields from ptimer_state across
the ptimer_trigger() call. (This was harmless with the QEMUBH
mechanism as the BH would not be invoked until much later.)
We use assertions to check that:
* the functions modifying ptimer state are not called outside
a transaction block
* ptimer_transaction_begin() and _commit() calls are paired
* the transaction API is not used with a QEMUBH ptimer
There is some slight repetition of code:
* most of the set functions have similar looking "if s->bh
call ptimer_reload, otherwise set s->need_reload" code
* ptimer_init() and ptimer_init_with_bh() have similar code
We deliberately don't try to avoid this repetition, because
it will all be deleted when the QEMUBH version of the API
is removed.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191008171740.9679-3-peter.maydell@linaro.org
Currently the ptimer design uses a QEMU bottom-half as its
mechanism for calling back into the device model using the
ptimer when the timer has expired. Unfortunately this design
is fatally flawed, because it means that there is a lag
between the ptimer updating its own state and the device
callback function updating device state, and guest accesses
to device registers between the two can return inconsistent
device state.
We want to replace the bottom-half design with one where
the guest device's callback is called either immediately
(when the ptimer triggers by timeout) or when the device
model code closes a transaction-begin/end section (when the
ptimer triggers because the device model changed the
ptimer's count value or other state). As the first step,
rename ptimer_init() to ptimer_init_with_bh(), to free up
the ptimer_init() name for the new API. We can then convert
all the ptimer users away from ptimer_init_with_bh() before
removing it entirely.
(Commit created with
git grep -l ptimer_init | xargs sed -i -e 's/ptimer_init/ptimer_init_with_bh/'
and three overlong lines folded by hand.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191008171740.9679-2-peter.maydell@linaro.org
Both, "rom->addr" and "addr" are derived from the binary image
that can be loaded with the "-kernel" paramer. The code in
rom_copy() then calculates:
d = dest + (rom->addr - addr);
and uses "d" as destination in a memcpy() some lines later. Now with
bad kernel images, it is possible that rom->addr is smaller than addr,
thus "rom->addr - addr" gets negative and the memcpy() then tries to
copy contents from the image to a bad memory location. This could
maybe be used to inject code from a kernel image into the QEMU binary,
so we better fix it with an additional sanity check here.
Cc: qemu-stable@nongnu.org
Reported-by: Guangming Liu
Buglink: https://bugs.launchpad.net/qemu/+bug/1844635
Message-Id: <20190925130331.27825-1-thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
With the merge of notdirty handling into store_helper,
the last user of cpu->mem_io_vaddr was removed.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This adds a trace point which prints every loaded image. This includes
bios/firmware/kernel/initradmdisk/pcirom.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190613050937.124903-1-aik@ozlabs.ru>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Introduce this new per-machine hook to give any machine class a chance
to do a sanity check on the to-be-hotplugged device as a sanity test.
This will be used for x86 to try to detect some illegal configuration
of devices, e.g., possible conflictions between vfio-pci and x86
vIOMMU.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20190916080718.3299-3-peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch fixes a possible integer overflow when we calculate
the total size of ELF segments loaded.
Reported-by: Coverity (CID 1405299)
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20190910124828.39794-1-sgarzare@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Using FLR becomes convenient in cases where resetting the bus is
impractical, for example, when debugging the behavior of individual
functions.
Signed-off-by: Julia Suvorova <jusual@redhat.com>
Message-Id: <20190820163005.1880-1-jusual@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
When user doesn't request any explicit CPU model with libvirt or QEMU,
a machine type specific CPU model is picked. Currently there is no way
to determine what this QEMU built-in default is, so libvirt cannot
report this back to the user in the XML config.
This extends the "query-machines" QMP command so that it reports the
default CPU model typename for each machine.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20190822100412.23746-1-berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Move existing numa global numa_info (renamed as "nodes") into NumaState.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20190809065731.9097-5-tao3.xu@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>