According to
https://lore.kernel.org/qemu-devel/20200921174118.39352-1-richard.henderson@linaro.org/
there was an issue with Capstone 3.0.4 from Ubuntu 18, which was the reason
for bumping our minimum Capstone requirement to version 4.0. And indeed,
compiling with that version 3.0.4 from Ubuntu 18.04 still fails (after
allowing it with a hack in meson.build). But now that we've dropped support
for Ubuntu 18.04, that issue is not relevant anymore. Compiling with Capstone
version 3.0.5 (e.g. used in Ubuntu 20.04) seems to work fine, so let's allow
that version again.
Message-Id: <20220516145823.148450-3-thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The Capstone library that is shipped with NetBSD and OpenBSD works
fine when compiling QEMU, so let's enable this in our build-test
VMs to get a little bit more build-test coverage.
Message-Id: <20220516145823.148450-2-thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Our support statement for Windows currently talks about "Vista / Server
2008" - which is related to the API of Windows, and this is not easy
to understand for the non-technical users. Additionally, glib sets the
_WIN32_WINNT macro to 0x0601 already, which indicates the Windows 7 API,
so QEMU effectively depends on the Windows 7 API, too.
Thus let's bump the _WIN32_WINNT setting in QEMU to the same level as
glib uses and adjust our support statement in the documentation to
something similar that we're using for Linux and the *BSD systems
(i.e. only the two most recent versions), which should hopefully be
easier to understand for the users now.
And since we're nowadays also compile-testing QEMU with MSYS2 on Windows
itself, I think we could mention this build environment here, too.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/880
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Message-Id: <20220513063958.1181443-1-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Although we register a ABRT handler to kill off QEMU when g_assert()
triggers, we want an extra safety net. The QEMU process might be
non-functional and thus not have responded to SIGTERM. The test script
might also have crashed with SEGV, in which case the cleanup handlers
won't ever run.
Using the Linux specific prctl(PR_SET_PDEATHSIG) syscall, we
can ensure that QEMU gets sent SIGKILL as soon as the controlling
qtest exits, if nothing else has correctly told it to quit.
Note, technically the death signal is sent when the *thread* that
called fork() exits. IOW, if you are calling qtest_init() in one
thread, letting that thread exit, and then expecting to run
qtest_quit() in a different thread, things are not going to work
out. Fortunately that is not a scenario that exists in qtests,
as pairs of qtest_init and qtest_quit are always called from the
same thread.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20220513154906.206715-3-berrange@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
qtest_init registers a hook to cleanup the running QEMU process
should g_assert() fire before qtest_quit is called. When the first
hook is registered, it is supposed to triggere registration of the
SIGABRT handler. Unfortunately the logic in hook_list_is_empty is
inverted, so the SIGABRT handler never gets registered, unless
2 or more QEMU processes are run concurrently. This caused qtest
to leak QEMU processes anytime g_assert triggers.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20220513154906.206715-2-berrange@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
According to our "Supported build platforms" policy, we now do not support
Ubuntu 18.04 anymore. Remove the related container files and entries from
our CI.
Message-Id: <20220516115912.120951-1-thuth@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The 'check-patch' and 'check-dco' jobs only need Python and git for
checking the patches, so it's not really necessary to use a container
here that has all the other build dependencies installed. By using a
lightweight Alpine container, we can improve the runtime here quite a
bit, cutting it down from ca. 1:30 minutes to ca. 45 seconds.
Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20220516082310.33876-1-thuth@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The MAC of the tulip card is stored in the EEPROM and at startup
tulip_fill_eeprom() is called to initialize the EEPROM with the MAC
address given on the command line, e.g.:
-device tulip,mac=00:11:22:33:44:55
In case the mac address was not given on the command line,
tulip_fill_eeprom() initializes the MAC in EEPROM with 00:00:00:00:00:00
which breaks e.g. a HP-UX guest.
Fix this problem by moving qemu_macaddr_default_if_unset() a few lines
up, so that a default mac address is assigned before tulip_fill_eeprom()
initializes the EEPROM.
Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Interaction with vmnet.framework in different modes
differs only on configuration stage, so we can create
common `send`, `receive`, etc. procedures and reuse them.
Reviewed-by: Akihiko Odaki <akihiko.odaki@gmail.com>
Tested-by: Akihiko Odaki <akihiko.odaki@gmail.com>
Signed-off-by: Phillip Tennen <phillip@axleos.com>
Signed-off-by: Vladislav Yaroshchuk <Vladislav.Yaroshchuk@jetbrains.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
vmnet.framework dependency is added with 'vmnet' option
to enable or disable it. Default value is 'auto'.
used vmnet features are available since macOS 11.0,
but new backend can be built and work properly with
subset of them on 10.15 too.
Reviewed-by: Akihiko Odaki <akihiko.odaki@gmail.com>
Tested-by: Akihiko Odaki <akihiko.odaki@gmail.com>
Signed-off-by: Vladislav Yaroshchuk <Vladislav.Yaroshchuk@jetbrains.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
most of CXL support
fixes, cleanups all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmKCuLIPHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRpdDUH/12SmWaAo+0+SdIHgWFFxsmg3t/EdcO38fgi
MV+GpYdbp6TlU3jdQhrMZYmFdkVVydBdxk93ujCLbFS0ixTsKj31j0IbZMfdcGgv
SLqnV+E3JdHqnGP39q9a9rdwYWyqhkgHoldxilIFW76ngOSapaZVvnwnOMAMkf77
1LieL4/Xq7N9Ho86Zrs3IczQcf0czdJRDaFaSIu8GaHl8ELyuPhlSm6CSqqrEEWR
PA/COQsLDbLOMxbfCi5v88r5aaxmGNZcGbXQbiH9qVHw65nlHyLH9UkNTdJn1du1
f2GYwwa7eekfw/LCvvVwxO1znJrj02sfFai7aAtQYbXPvjvQiqA=
=xdSk
-----END PGP SIGNATURE-----
Merge tag 'for_upstream' of git://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging
virtio,pc,pci: fixes,cleanups,features
most of CXL support
fixes, cleanups all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmKCuLIPHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRpdDUH/12SmWaAo+0+SdIHgWFFxsmg3t/EdcO38fgi
# MV+GpYdbp6TlU3jdQhrMZYmFdkVVydBdxk93ujCLbFS0ixTsKj31j0IbZMfdcGgv
# SLqnV+E3JdHqnGP39q9a9rdwYWyqhkgHoldxilIFW76ngOSapaZVvnwnOMAMkf77
# 1LieL4/Xq7N9Ho86Zrs3IczQcf0czdJRDaFaSIu8GaHl8ELyuPhlSm6CSqqrEEWR
# PA/COQsLDbLOMxbfCi5v88r5aaxmGNZcGbXQbiH9qVHw65nlHyLH9UkNTdJn1du1
# f2GYwwa7eekfw/LCvvVwxO1znJrj02sfFai7aAtQYbXPvjvQiqA=
# =xdSk
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 16 May 2022 01:48:50 PM PDT
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [undefined]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of git://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (86 commits)
vhost-user-scsi: avoid unlink(NULL) with fd passing
virtio-net: don't handle mq request in userspace handler for vhost-vdpa
vhost-vdpa: change name and polarity for vhost_vdpa_one_time_request()
vhost-vdpa: backend feature should set only once
vhost-net: fix improper cleanup in vhost_net_start
vhost-vdpa: fix improper cleanup in net_init_vhost_vdpa
virtio-net: align ctrl_vq index for non-mq guest for vhost_vdpa
virtio-net: setup vhost_dev and notifiers for cvq only when feature is negotiated
hw/i386/amd_iommu: Fix IOMMU event log encoding errors
hw/i386: Make pic a property of common x86 base machine type
hw/i386: Make pit a property of common x86 base machine type
include/hw/pci/pcie_host: Correct PCIE_MMCFG_SIZE_MAX
include/hw/pci/pcie_host: Correct PCIE_MMCFG_BUS_MASK
docs/vhost-user: Clarifications for VHOST_USER_ADD/REM_MEM_REG
vhost-user: more master/slave things
virtio: add vhost support for virtio devices
virtio: drop name parameter for virtio_init()
virtio/vhost-user: dynamically assign VhostUserHostNotifiers
hw/virtio/vhost-user: don't suppress F_CONFIG when supported
include/hw: start documenting the vhost API
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* misc qga-vss fixes
* remove the deprecated CPU model 'Icelake-Client'
* support for x86 architectural LBR
* remove deprecated properties
* replace deprecated -soundhw with -audio
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmJ/hZ4UHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroN2Igf/bFs+yluOikt0eFNmXYnshrGBWPXr
oam0iumPox34vTzZnjpSjF6tJGxHWOgi+wbgIvbwOYHA/ONxx8akW580j+1VhEWa
X29VyUzjZBffgFtmlF4fM74/ELYm7s4c1a1/D9TpVP6Dr0fSWbMujbx4dfeVstvf
sONN+A8sVxaNdV9QKPE6BvqfMlPLoCiigrOetf6iY1KuUtkQDF8xDB0MdzdutqAQ
szAtQ0rrzjxDx9EuGN1SECFM1/riDUbtOOoA9g2C7gGKrx3/iUc6pzrkIcAfWLFK
xXbH7+6Wynia0cbUxnrvRdY4daMIxm4N3wUvN7szXgF9kxYxeQcsdgGsNA==
=n4lu
-----END PGP SIGNATURE-----
Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging
* fix WHPX debugging
* misc qga-vss fixes
* remove the deprecated CPU model 'Icelake-Client'
* support for x86 architectural LBR
* remove deprecated properties
* replace deprecated -soundhw with -audio
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmJ/hZ4UHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroN2Igf/bFs+yluOikt0eFNmXYnshrGBWPXr
# oam0iumPox34vTzZnjpSjF6tJGxHWOgi+wbgIvbwOYHA/ONxx8akW580j+1VhEWa
# X29VyUzjZBffgFtmlF4fM74/ELYm7s4c1a1/D9TpVP6Dr0fSWbMujbx4dfeVstvf
# sONN+A8sVxaNdV9QKPE6BvqfMlPLoCiigrOetf6iY1KuUtkQDF8xDB0MdzdutqAQ
# szAtQ0rrzjxDx9EuGN1SECFM1/riDUbtOOoA9g2C7gGKrx3/iUc6pzrkIcAfWLFK
# xXbH7+6Wynia0cbUxnrvRdY4daMIxm4N3wUvN7szXgF9kxYxeQcsdgGsNA==
# =n4lu
# -----END PGP SIGNATURE-----
# gpg: Signature made Sat 14 May 2022 03:34:06 AM PDT
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [undefined]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (23 commits)
configure: remove duplicate help messages
configure: remove another dead variable
build: remove useless dependency
introduce -audio as a replacement for -soundhw
soundhw: move help handling to vl.c
soundhw: unify initialization for ISA and PCI soundhw
soundhw: extract soundhw help to a separate function
soundhw: remove ability to create multiple soundcards
rng: make opened property read-only
crypto: make loaded property read-only
target/i386: Support Arch LBR in CPUID enumeration
target/i386: introduce helper to access supported CPUID
target/i386: Enable Arch LBR migration states in vmstate
target/i386: Add MSR access interface for Arch LBR
target/i386: Add XSAVES support for Arch LBR
target/i386: Enable support for XSAVES based features
target/i386: Add kvm_get_one_msr helper
target/i386: Add lbr-fmt vPMU option to support guest LBR
qdev-properties: Add a new macro with bitmask check for uint64_t property
i386/cpu: Remove the deprecated cpu model 'Icelake-Client'
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Commit 747421e949 ("Implements Backend
Program conventions for vhost-user-scsi") introduced fd-passing support
as part of implementing the vhost-user backend program conventions.
When fd passing is used the UNIX domain socket path is NULL and we must
not call unlink(2).
The unlink(2) call is necessary when the listen socket, lsock, was
created successfully since that means the UNIX domain socket is visible
in the file system.
Fixes: Coverity CID 1488353
Fixes: 747421e949 ("Implements Backend Program conventions for vhost-user-scsi")
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20220516155701.1789638-1-stefanha@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
virtio_queue_host_notifier_read() tends to read pending event
left behind on ioeventfd in the vhost_net_stop() path, and
attempts to handle outstanding kicks from userspace vq handler.
However, in the ctrl_vq handler, virtio_net_handle_mq() has a
recursive call into virtio_net_set_status(), which may lead to
segmentation fault as shown in below stack trace:
0 0x000055f800df1780 in qdev_get_parent_bus (dev=0x0) at ../hw/core/qdev.c:376
1 0x000055f800c68ad8 in virtio_bus_device_iommu_enabled (vdev=vdev@entry=0x0) at ../hw/virtio/virtio-bus.c:331
2 0x000055f800d70d7f in vhost_memory_unmap (dev=<optimized out>) at ../hw/virtio/vhost.c:318
3 0x000055f800d70d7f in vhost_memory_unmap (dev=<optimized out>, buffer=0x7fc19bec5240, len=2052, is_write=1, access_len=2052) at ../hw/virtio/vhost.c:336
4 0x000055f800d71867 in vhost_virtqueue_stop (dev=dev@entry=0x55f8037ccc30, vdev=vdev@entry=0x55f8044ec590, vq=0x55f8037cceb0, idx=0) at ../hw/virtio/vhost.c:1241
5 0x000055f800d7406c in vhost_dev_stop (hdev=hdev@entry=0x55f8037ccc30, vdev=vdev@entry=0x55f8044ec590) at ../hw/virtio/vhost.c:1839
6 0x000055f800bf00a7 in vhost_net_stop_one (net=0x55f8037ccc30, dev=0x55f8044ec590) at ../hw/net/vhost_net.c:315
7 0x000055f800bf0678 in vhost_net_stop (dev=dev@entry=0x55f8044ec590, ncs=0x55f80452bae0, data_queue_pairs=data_queue_pairs@entry=7, cvq=cvq@entry=1)
at ../hw/net/vhost_net.c:423
8 0x000055f800d4e628 in virtio_net_set_status (status=<optimized out>, n=0x55f8044ec590) at ../hw/net/virtio-net.c:296
9 0x000055f800d4e628 in virtio_net_set_status (vdev=vdev@entry=0x55f8044ec590, status=15 '\017') at ../hw/net/virtio-net.c:370
10 0x000055f800d534d8 in virtio_net_handle_ctrl (iov_cnt=<optimized out>, iov=<optimized out>, cmd=0 '\000', n=0x55f8044ec590) at ../hw/net/virtio-net.c:1408
11 0x000055f800d534d8 in virtio_net_handle_ctrl (vdev=0x55f8044ec590, vq=0x7fc1a7e888d0) at ../hw/net/virtio-net.c:1452
12 0x000055f800d69f37 in virtio_queue_host_notifier_read (vq=0x7fc1a7e888d0) at ../hw/virtio/virtio.c:2331
13 0x000055f800d69f37 in virtio_queue_host_notifier_read (n=n@entry=0x7fc1a7e8894c) at ../hw/virtio/virtio.c:3575
14 0x000055f800c688e6 in virtio_bus_cleanup_host_notifier (bus=<optimized out>, n=n@entry=14) at ../hw/virtio/virtio-bus.c:312
15 0x000055f800d73106 in vhost_dev_disable_notifiers (hdev=hdev@entry=0x55f8035b51b0, vdev=vdev@entry=0x55f8044ec590)
at ../../../include/hw/virtio/virtio-bus.h:35
16 0x000055f800bf00b2 in vhost_net_stop_one (net=0x55f8035b51b0, dev=0x55f8044ec590) at ../hw/net/vhost_net.c:316
17 0x000055f800bf0678 in vhost_net_stop (dev=dev@entry=0x55f8044ec590, ncs=0x55f80452bae0, data_queue_pairs=data_queue_pairs@entry=7, cvq=cvq@entry=1)
at ../hw/net/vhost_net.c:423
18 0x000055f800d4e628 in virtio_net_set_status (status=<optimized out>, n=0x55f8044ec590) at ../hw/net/virtio-net.c:296
19 0x000055f800d4e628 in virtio_net_set_status (vdev=0x55f8044ec590, status=15 '\017') at ../hw/net/virtio-net.c:370
20 0x000055f800d6c4b2 in virtio_set_status (vdev=0x55f8044ec590, val=<optimized out>) at ../hw/virtio/virtio.c:1945
21 0x000055f800d11d9d in vm_state_notify (running=running@entry=false, state=state@entry=RUN_STATE_SHUTDOWN) at ../softmmu/runstate.c:333
22 0x000055f800d04e7a in do_vm_stop (state=state@entry=RUN_STATE_SHUTDOWN, send_stop=send_stop@entry=false) at ../softmmu/cpus.c:262
23 0x000055f800d04e99 in vm_shutdown () at ../softmmu/cpus.c:280
24 0x000055f800d126af in qemu_cleanup () at ../softmmu/runstate.c:812
25 0x000055f800ad5b13 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at ../softmmu/main.c:51
For now, temporarily disable handling MQ request from the ctrl_vq
userspace hanlder to avoid the recursive virtio_net_set_status()
call. Some rework is needed to allow changing the number of
queues without going through a full virtio_net_set_status cycle,
particularly for vhost-vdpa backend.
This patch will need to be reverted as soon as future patches of
having the change of #queues handled in userspace is merged.
Fixes: 402378407d ("vhost-vdpa: multiqueue support")
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <1651890498-24478-8-git-send-email-si-wei.liu@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The name vhost_vdpa_one_time_request() was confusing. No
matter whatever it returns, its typical occurrence had
always been at requests that only need to be applied once.
And the name didn't suggest what it actually checks for.
Change it to vhost_vdpa_first_dev() with polarity flipped
for better readibility of code. That way it is able to
reflect what the check is really about.
This call is applicable to request which performs operation
only once, before queues are set up, and usually at the beginning
of the caller function. Document the requirement for it in place.
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Message-Id: <1651890498-24478-7-git-send-email-si-wei.liu@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
The vhost_vdpa_one_time_request() branch in
vhost_vdpa_set_backend_cap() incorrectly sends down
ioctls on vhost_dev with non-zero index. This may
end up with multiple VHOST_SET_BACKEND_FEATURES
ioctl calls sent down on the vhost-vdpa fd that is
shared between all these vhost_dev's.
To fix it, send down ioctl only once via the first
vhost_dev with index 0. Toggle the polarity of the
vhost_vdpa_one_time_request() test should do the
trick.
Fixes: 4d191cfdc7 ("vhost-vdpa: classify one time request")
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <1651890498-24478-6-git-send-email-si-wei.liu@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
vhost_net_start() missed a corresponding stop_one() upon error from
vhost_set_vring_enable(). While at it, make the error handling for
err_start more robust. No real issue was found due to this though.
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <1651890498-24478-5-git-send-email-si-wei.liu@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
... such that no memory leaks on dangling net clients in case of
error.
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <1651890498-24478-4-git-send-email-si-wei.liu@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
With MQ enabled vdpa device and non-MQ supporting guest e.g.
booting vdpa with mq=on over OVMF of single vqp, below assert
failure is seen:
../hw/virtio/vhost-vdpa.c:560: vhost_vdpa_get_vq_index: Assertion `idx >= dev->vq_index && idx < dev->vq_index + dev->nvqs' failed.
0 0x00007f8ce3ff3387 in raise () at /lib64/libc.so.6
1 0x00007f8ce3ff4a78 in abort () at /lib64/libc.so.6
2 0x00007f8ce3fec1a6 in __assert_fail_base () at /lib64/libc.so.6
3 0x00007f8ce3fec252 in () at /lib64/libc.so.6
4 0x0000558f52d79421 in vhost_vdpa_get_vq_index (dev=<optimized out>, idx=<optimized out>) at ../hw/virtio/vhost-vdpa.c:563
5 0x0000558f52d79421 in vhost_vdpa_get_vq_index (dev=<optimized out>, idx=<optimized out>) at ../hw/virtio/vhost-vdpa.c:558
6 0x0000558f52d7329a in vhost_virtqueue_mask (hdev=0x558f55c01800, vdev=0x558f568f91f0, n=2, mask=<optimized out>) at ../hw/virtio/vhost.c:1557
7 0x0000558f52c6b89a in virtio_pci_set_guest_notifier (d=d@entry=0x558f568f0f60, n=n@entry=2, assign=assign@entry=true, with_irqfd=with_irqfd@entry=false)
at ../hw/virtio/virtio-pci.c:974
8 0x0000558f52c6c0d8 in virtio_pci_set_guest_notifiers (d=0x558f568f0f60, nvqs=3, assign=true) at ../hw/virtio/virtio-pci.c:1019
9 0x0000558f52bf091d in vhost_net_start (dev=dev@entry=0x558f568f91f0, ncs=0x558f56937cd0, data_queue_pairs=data_queue_pairs@entry=1, cvq=cvq@entry=1)
at ../hw/net/vhost_net.c:361
10 0x0000558f52d4e5e7 in virtio_net_set_status (status=<optimized out>, n=0x558f568f91f0) at ../hw/net/virtio-net.c:289
11 0x0000558f52d4e5e7 in virtio_net_set_status (vdev=0x558f568f91f0, status=15 '\017') at ../hw/net/virtio-net.c:370
12 0x0000558f52d6c4b2 in virtio_set_status (vdev=vdev@entry=0x558f568f91f0, val=val@entry=15 '\017') at ../hw/virtio/virtio.c:1945
13 0x0000558f52c69eff in virtio_pci_common_write (opaque=0x558f568f0f60, addr=<optimized out>, val=<optimized out>, size=<optimized out>) at ../hw/virtio/virtio-pci.c:1292
14 0x0000558f52d15d6e in memory_region_write_accessor (mr=0x558f568f19d0, addr=20, value=<optimized out>, size=1, shift=<optimized out>, mask=<optimized out>, attrs=...)
at ../softmmu/memory.c:492
15 0x0000558f52d127de in access_with_adjusted_size (addr=addr@entry=20, value=value@entry=0x7f8cdbffe748, size=size@entry=1, access_size_min=<optimized out>, access_size_max=<optimized out>, access_fn=0x558f52d15cf0 <memory_region_write_accessor>, mr=0x558f568f19d0, attrs=...) at ../softmmu/memory.c:554
16 0x0000558f52d157ef in memory_region_dispatch_write (mr=mr@entry=0x558f568f19d0, addr=20, data=<optimized out>, op=<optimized out>, attrs=attrs@entry=...)
at ../softmmu/memory.c:1504
17 0x0000558f52d078e7 in flatview_write_continue (fv=fv@entry=0x7f8accbc3b90, addr=addr@entry=103079215124, attrs=..., ptr=ptr@entry=0x7f8ce6300028, len=len@entry=1, addr1=<optimized out>, l=<optimized out>, mr=0x558f568f19d0) at /home/opc/qemu-upstream/include/qemu/host-utils.h:165
18 0x0000558f52d07b06 in flatview_write (fv=0x7f8accbc3b90, addr=103079215124, attrs=..., buf=0x7f8ce6300028, len=1) at ../softmmu/physmem.c:2822
19 0x0000558f52d0b36b in address_space_write (as=<optimized out>, addr=<optimized out>, attrs=..., buf=buf@entry=0x7f8ce6300028, len=<optimized out>)
at ../softmmu/physmem.c:2914
20 0x0000558f52d0b3da in address_space_rw (as=<optimized out>, addr=<optimized out>, attrs=...,
attrs@entry=..., buf=buf@entry=0x7f8ce6300028, len=<optimized out>, is_write=<optimized out>) at ../softmmu/physmem.c:2924
21 0x0000558f52dced09 in kvm_cpu_exec (cpu=cpu@entry=0x558f55c2da60) at ../accel/kvm/kvm-all.c:2903
22 0x0000558f52dcfabd in kvm_vcpu_thread_fn (arg=arg@entry=0x558f55c2da60) at ../accel/kvm/kvm-accel-ops.c:49
23 0x0000558f52f9f04a in qemu_thread_start (args=<optimized out>) at ../util/qemu-thread-posix.c:556
24 0x00007f8ce4392ea5 in start_thread () at /lib64/libpthread.so.0
25 0x00007f8ce40bb9fd in clone () at /lib64/libc.so.6
The cause for the assert failure is due to that the vhost_dev index
for the ctrl vq was not aligned with actual one in use by the guest.
Upon multiqueue feature negotiation in virtio_net_set_multiqueue(),
if guest doesn't support multiqueue, the guest vq layout would shrink
to a single queue pair, consisting of 3 vqs in total (rx, tx and ctrl).
This results in ctrl_vq taking a different vhost_dev group index than
the default. We can map vq to the correct vhost_dev group by checking
if MQ is supported by guest and successfully negotiated. Since the
MQ feature is only present along with CTRL_VQ, we ensure the index
2 is only meant for the control vq while MQ is not supported by guest.
Fixes: 22288fe ("virtio-net: vhost control virtqueue support")
Suggested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <1651890498-24478-3-git-send-email-si-wei.liu@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
When the control virtqueue feature is absent or not negotiated,
vhost_net_start() still tries to set up vhost_dev and install
vhost notifiers for the control virtqueue, which results in
erroneous ioctl calls with incorrect queue index sending down
to driver. Do that only when needed.
Fixes: 22288fe ("virtio-net: vhost control virtqueue support")
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <1651890498-24478-2-git-send-email-si-wei.liu@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Coverity issues several UNINIT warnings against amd_iommu.c [1]. This
patch fixes them by clearing evt before encoding. On top of it, this
patch changes the event log size to 16 bytes per IOMMU specification,
and fixes the event log entry format in amdvi_encode_event().
[1] CID 1487116/1487200/1487190/1487232/1487115/1487258
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Message-Id: <20220422055146.3312226-1-wei.huang2@amd.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Legacy PIC (8259) cannot be supported for TDX guests since TDX module
doesn't allow directly interrupt injection. Using posted interrupts
for the PIC is not a viable option as the guest BIOS/kernel will not
do EOI for PIC IRQs, i.e. will leave the vIRR bit set.
Make PIC the property of common x86 machine type. Hence all x86
machines, including microvm, can disable it.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Sergio Lopez <slp@redhat.com>
Message-Id: <20220310122811.807794-3-xiaoyao.li@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Both pc and microvm have pit property individually. Let's just make it
the property of common x86 base machine type.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Sergio Lopez <slp@redhat.com>
Message-Id: <20220310122811.807794-2-xiaoyao.li@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
According to 7.2.2 in [1] bit 27 is the last bit that can be part of the
bus number, this makes the ECAM max size equal to '1 << 28'. This patch
restores back this value into the PCIE_MMCFG_SIZE_MAX define (which was
changed in commit 58d5b22bbd ("ppc4xx: Add device models found in PPC440
core SoCs")).
[1] PCI Express® Base Specification Revision 5.0 Version 1.0
Signed-off-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Message-Id: <20220411221836.17699-3-frasse.iglesias@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
According to [1] address bits 27 - 20 are mapped to the bus number (the
TLPs bus number field is 8 bits). Below is the formula taken from Table
7-1 in [1].
"
Memory Address | PCI Express Configuration Space
A[(20+n-1):20] | Bus Number, 1 ≤ n ≤ 8
"
[1] PCI Express® Base Specification Revision 5.0 Version 1.0
Signed-off-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Message-Id: <20220411221836.17699-2-frasse.iglesias@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The specification for VHOST_USER_ADD/REM_MEM_REG messages is unclear
in several points, which has led to clients having incompatible
implementations. This changes the specification to be more explicit
about them:
* VHOST_USER_ADD_MEM_REG is not specified as receiving a file
descriptor, though it obviously does need to do so. All
implementations agree on this one, fix the specification.
* VHOST_USER_REM_MEM_REG is not specified as receiving a file
descriptor either, and it also has no reason to do so. rust-vmm does
not send file descriptors for removing a memory region (in agreement
with the specification), libvhost-user and QEMU do (which is a bug),
though libvhost-user doesn't actually make any use of it.
Change the specification so that for compatibility QEMU's behaviour
becomes legal, even if discouraged, but rust-vmm's behaviour becomes
the explicitly recommended mode of operation.
* VHOST_USER_ADD_MEM_REG doesn't have a documented return value, which
is the desired behaviour in the non-postcopy case. It also implemented
like this in QEMU and rust-vmm, though libvhost-user is buggy and
sometimes sends an unexpected reply. This will be fixed in a separate
patch.
However, in postcopy mode it does reply like VHOST_USER_SET_MEM_TABLE.
This behaviour is shared between libvhost-user and QEMU; rust-vmm
doesn't implement postcopy mode yet. Mention it explicitly in the
spec.
* The specification doesn't mention how VHOST_USER_REM_MEM_REG
identifies the memory region to be removed. Change it to describe the
existing behaviour of libvhost-user (guest address, user address and
size must match).
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20220407133657.155281-2-kwolf@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(This replaces the 28th April through 10th May sets)
Compared to that last set it just has the Alpine
uring check that Leo has added; although that's also
now fixed upstream in Alpine.
It contains:
TLS test fixes from Dan
Zerocopy migration feature from Leo
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEERfXHG0oMt/uXep+pBRYzHrxb/ecFAmKCY80ACgkQBRYzHrxb
/eeEEhAAoUogch7ifxFItr1EA0AU6Sgd3Dcn8wY9pm0NySVg7OcIpk1H++A3CgIh
bubJSwRmpIxGw+5q5w5OvBukFCGYMlAK7J8k1tZmaqdKS8wD0ZwhpPyqTWd14Q/v
xXSGOQfHMMvbBILiXPjSkfNw8yKJhZr+lW39uMz/kZRwZUmTcrdKAT3Q8PW+1DI9
v3mNoFNXqtDlHcQ4nQ1TGk/RDO6oXDlTJwdnjoJT3Dopf8Jhl2etvZgVk2kOf4i5
LmJbSVBr5FNOhJ6P4WL4OEQFOiXXquKdfuGTXIGGhkrW2WkPZulQwB6uO4Gv1wf2
aj9bLDAFoPxFx2zYS6S/9L6rGeBMcTL9xHCfzyylM6YRjoscRdxXc67PClw71JUy
regsoSQej0FpmsGx0uuAsDjCELleVIjeYzuQo5OYOP1BCg/5unLIrMgkyQw7COJI
w+MIZq7IqvUTehU2yXpUGOqPkyDLBlib92dMRgqqG9r9UU7iL3BREbGW4ugW+GM2
a9k8W9HjyDIIODsdXy1ugPHgjr/arHDAPgYosJMLvjTfdJDcIldAw6CbCcqhCDES
UOjMVN9VS+716nY2AqvtEHxf47YwqmeRb+tg4SQ0dHLH5Pvfe2bk1sbZiiQpcelt
Bd88yeBOpcmdzJVur2V4fEZXu5JB/qt/jeJeQa82hS3k93PWm/w=
=Axhk
-----END PGP SIGNATURE-----
Merge tag 'pull-migration-20220516a' of https://gitlab.com/dagrh/qemu into staging
Migration pull 2022-05-16
(This replaces the 28th April through 10th May sets)
Compared to that last set it just has the Alpine
uring check that Leo has added; although that's also
now fixed upstream in Alpine.
It contains:
TLS test fixes from Dan
Zerocopy migration feature from Leo
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEERfXHG0oMt/uXep+pBRYzHrxb/ecFAmKCY80ACgkQBRYzHrxb
# /eeEEhAAoUogch7ifxFItr1EA0AU6Sgd3Dcn8wY9pm0NySVg7OcIpk1H++A3CgIh
# bubJSwRmpIxGw+5q5w5OvBukFCGYMlAK7J8k1tZmaqdKS8wD0ZwhpPyqTWd14Q/v
# xXSGOQfHMMvbBILiXPjSkfNw8yKJhZr+lW39uMz/kZRwZUmTcrdKAT3Q8PW+1DI9
# v3mNoFNXqtDlHcQ4nQ1TGk/RDO6oXDlTJwdnjoJT3Dopf8Jhl2etvZgVk2kOf4i5
# LmJbSVBr5FNOhJ6P4WL4OEQFOiXXquKdfuGTXIGGhkrW2WkPZulQwB6uO4Gv1wf2
# aj9bLDAFoPxFx2zYS6S/9L6rGeBMcTL9xHCfzyylM6YRjoscRdxXc67PClw71JUy
# regsoSQej0FpmsGx0uuAsDjCELleVIjeYzuQo5OYOP1BCg/5unLIrMgkyQw7COJI
# w+MIZq7IqvUTehU2yXpUGOqPkyDLBlib92dMRgqqG9r9UU7iL3BREbGW4ugW+GM2
# a9k8W9HjyDIIODsdXy1ugPHgjr/arHDAPgYosJMLvjTfdJDcIldAw6CbCcqhCDES
# UOjMVN9VS+716nY2AqvtEHxf47YwqmeRb+tg4SQ0dHLH5Pvfe2bk1sbZiiQpcelt
# Bd88yeBOpcmdzJVur2V4fEZXu5JB/qt/jeJeQa82hS3k93PWm/w=
# =Axhk
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 16 May 2022 07:46:37 AM PDT
# gpg: using RSA key 45F5C71B4A0CB7FB977A9FA90516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>" [full]
* tag 'pull-migration-20220516a' of https://gitlab.com/dagrh/qemu:
multifd: Implement zero copy write in multifd migration (multifd-zero-copy)
multifd: Send header packet without flags if zero-copy-send is enabled
multifd: multifd_send_sync_main now returns negative on error
migration: Add migrate_use_tls() helper
migration: Add zero-copy-send parameter for QMP/HMP for Linux
QIOChannelSocket: Implement io_writev zero copy flag & io_flush for CONFIG_LINUX
QIOChannel: Add flags on io_writev and introduce io_flush callback
meson.build: Fix docker-test-build@alpine when including linux/errqueue.h
tests: ensure migration status isn't reported as failed
tests: add multifd migration tests of TLS with x509 credentials
tests: add multifd migration tests of TLS with PSK credentials
tests: convert multifd migration tests to use common helper
tests: convert XBZRLE migration test to use common helper
tests: add migration tests of TLS with x509 credentials
tests: add migration tests of TLS with PSK credentials
tests: add more helper macros for creating TLS x509 certs
tests: fix encoding of IP addresses in x509 certs
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The X cursor postion can be calculated based on the backporch and
interleave values. In the emulation we ignore the HP-UX settings for
backporch and use instead twice the size of the emulated cursor. With
those changes the X-position of the graphics cursor is now finally
working correctly on HP-UX 10 and HP-UX 11.
Based on coding in Xorg X11R6.6
Signed-off-by: Helge Deller <deller@gmx.de>
The misc_video and misc_ctrl registers control the visibility of the
screen. Start with the screen turned on, and hide or show the screen
based on the control registers.
Signed-off-by: Helge Deller <deller@gmx.de>
Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Bit 0x80 in the cursor_cntrl register specifies if the cursor
should be visible. Prevent rendering the cursor if it's invisible.
Signed-off-by: Helge Deller <deller@gmx.de>
Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Drop the hard-coded value of 1146 lines which seems to work with HP-UX
11, but not with HP-UX 10. Instead encode the screen height in byte 0 of
active_lines_low and byte 3 of misc_video as it's expected by the Xorg
X11 graphics driver.
This potentially allows for higher vertical screen resolutions than
1280x1024 with X11.
Signed-off-by: Helge Deller <deller@gmx.de>
Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Convert the variable names of some registers to human-readable and
understandable names.
Signed-off-by: Helge Deller <deller@gmx.de>
Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Add the constant NGLE_MAX_SPRITE_SIZE which defines the currently
maximum supported cursor size.
Signed-off-by: Helge Deller <deller@gmx.de>
Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
New features and fixes in SeaBIOS for hppa/parisc:
* STI firmware now contains additional fonts built-in, which
can be selected with qemu command-line options:
-fw_cfg opt/font,string=1 - a HP 8x16 font
-fw_cfg opt/font,string=2 - a HP 6x13 font
-fw_cfg opt/font,string=3 - a HP 10x20 font
-fw_cfg opt/font,string=4 - a Linux 16x32 font
* Fixed PS/2 keyboard emulation when running in graphical mode.
This allows to type boot commands in the firmware boot menu if
qemu was started with "-boot menu=on" (and no linux kernel was
given on the qemu command line).
* Fix firmware rendenzvous code to clear all pending external intrrupts
before entering the waiting loop.
Signed-off-by: Helge Deller <deller@gmx.de>
Implement zero copy send on nocomp_send_write(), by making use of QIOChannel
writev + flags & flush interface.
Change multifd_send_sync_main() so flush_zero_copy() can be called
after each iteration in order to make sure all dirty pages are sent before
a new iteration is started. It will also flush at the beginning and at the
end of migration.
Also make it return -1 if flush_zero_copy() fails, in order to cancel
the migration process, and avoid resuming the guest in the target host
without receiving all current RAM.
This will work fine on RAM migration because the RAM pages are not usually freed,
and there is no problem on changing the pages content between writev_zero_copy() and
the actual sending of the buffer, because this change will dirty the page and
cause it to be re-sent on a next iteration anyway.
A lot of locked memory may be needed in order to use multifd migration
with zero-copy enabled, so disabling the feature should be necessary for
low-privileged users trying to perform multifd migrations.
Signed-off-by: Leonardo Bras <leobras@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20220513062836.965425-9-leobras@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Since d48c3a0445 ("multifd: Use a single writev on the send side"),
sending the header packet and the memory pages happens in the same
writev, which can potentially make the migration faster.
Using channel-socket as example, this works well with the default copying
mechanism of sendmsg(), but with zero-copy-send=true, it will cause
the migration to often break.
This happens because the header packet buffer gets reused quite often,
and there is a high chance that by the time the MSG_ZEROCOPY mechanism get
to send the buffer, it has already changed, sending the wrong data and
causing the migration to abort.
It means that, as it is, the buffer for the header packet is not suitable
for sending with MSG_ZEROCOPY.
In order to enable zero copy for multifd, send the header packet on an
individual write(), without any flags, and the remanining pages with a
writev(), as it was happening before. This only changes how a migration
with zero-copy-send=true works, not changing any current behavior for
migrations with zero-copy-send=false.
Signed-off-by: Leonardo Bras <leobras@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20220513062836.965425-8-leobras@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Even though multifd_send_sync_main() currently emits error_reports, it's
callers don't really check it before continuing.
Change multifd_send_sync_main() to return -1 on error and 0 on success.
Also change all it's callers to make use of this change and possibly fail
earlier.
(This change is important to next patch on multifd zero copy
implementation, to make it sure an error in zero-copy flush does not go
unnoticed.
Signed-off-by: Leonardo Bras <leobras@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20220513062836.965425-7-leobras@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
A lot of places check parameters.tls_creds in order to evaluate if TLS is
in use, and sometimes call migrate_get_current() just for that test.
Add new helper function migrate_use_tls() in order to simplify testing
for TLS usage.
Signed-off-by: Leonardo Bras <leobras@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20220513062836.965425-6-leobras@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Add property that allows zero-copy migration of memory pages
on the sending side, and also includes a helper function
migrate_use_zero_copy_send() to check if it's enabled.
No code is introduced to actually do the migration, but it allow
future implementations to enable/disable this feature.
On non-Linux builds this parameter is compiled-out.
Signed-off-by: Leonardo Bras <leobras@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20220513062836.965425-5-leobras@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
For CONFIG_LINUX, implement the new zero copy flag and the optional callback
io_flush on QIOChannelSocket, but enables it only when MSG_ZEROCOPY
feature is available in the host kernel, which is checked on
qio_channel_socket_connect_sync()
qio_channel_socket_flush() was implemented by counting how many times
sendmsg(...,MSG_ZEROCOPY) was successfully called, and then reading the
socket's error queue, in order to find how many of them finished sending.
Flush will loop until those counters are the same, or until some error occurs.
Notes on using writev() with QIO_CHANNEL_WRITE_FLAG_ZERO_COPY:
1: Buffer
- As MSG_ZEROCOPY tells the kernel to use the same user buffer to avoid copying,
some caution is necessary to avoid overwriting any buffer before it's sent.
If something like this happen, a newer version of the buffer may be sent instead.
- If this is a problem, it's recommended to call qio_channel_flush() before freeing
or re-using the buffer.
2: Locked memory
- When using MSG_ZERCOCOPY, the buffer memory will be locked after queued, and
unlocked after it's sent.
- Depending on the size of each buffer, and how often it's sent, it may require
a larger amount of locked memory than usually available to non-root user.
- If the required amount of locked memory is not available, writev_zero_copy
will return an error, which can abort an operation like migration,
- Because of this, when an user code wants to add zero copy as a feature, it
requires a mechanism to disable it, so it can still be accessible to less
privileged users.
Signed-off-by: Leonardo Bras <leobras@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20220513062836.965425-4-leobras@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Add flags to io_writev and introduce io_flush as optional callback to
QIOChannelClass, allowing the implementation of zero copy writes by
subclasses.
How to use them:
- Write data using qio_channel_writev*(...,QIO_CHANNEL_WRITE_FLAG_ZERO_COPY),
- Wait write completion with qio_channel_flush().
Notes:
As some zero copy write implementations work asynchronously, it's
recommended to keep the write buffer untouched until the return of
qio_channel_flush(), to avoid the risk of sending an updated buffer
instead of the buffer state during write.
As io_flush callback is optional, if a subclass does not implement it, then:
- io_flush will return 0 without changing anything.
Also, some functions like qio_channel_writev_full_all() were adapted to
receive a flag parameter. That allows shared code between zero copy and
non-zero copy writev, and also an easier implementation on new flags.
Signed-off-by: Leonardo Bras <leobras@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20220513062836.965425-3-leobras@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
A build error happens in alpine CI when linux/errqueue.h is included
in io/channel-socket.c, due to redefining of 'struct __kernel_timespec':
===
ninja: job failed: [...]
In file included from /usr/include/linux/errqueue.h:6,
from ../io/channel-socket.c:29:
/usr/include/linux/time_types.h:7:8: error: redefinition of 'struct __kernel_timespec'
7 | struct __kernel_timespec {
| ^~~~~~~~~~~~~~~~~
In file included from /usr/include/liburing.h:19,
from /builds/user/qemu/include/block/aio.h:18,
from /builds/user/qemu/include/io/channel.h:26,
from /builds/user/qemu/include/io/channel-socket.h:24,
from ../io/channel-socket.c:24:
/usr/include/liburing/compat.h:9:8: note: originally defined here
9 | struct __kernel_timespec {
| ^~~~~~~~~~~~~~~~~
ninja: subcommand failed
===
As above error message suggests, 'struct __kernel_timespec' was already
defined by liburing/compat.h.
Fix alpine CI by adding test to disable liburing in configure step if a
redefinition happens between linux/errqueue.h and liburing/compat.h.
[dgilbert: This has been fixed in Alpine issue 13813 and liburing]
Signed-off-by: Leonardo Bras <leobras@redhat.com>
Message-Id: <20220513062836.965425-2-leobras@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>