Commit Graph

1237 Commits

Author SHA1 Message Date
Hyman Huang(黄勇)
15699cf542 migration: Extend query-migrate to provide dirty page limit info
Extend query-migrate to provide throttle time and estimated
ring full time with dirty-limit capability enabled, through which
we can observe if dirty limit take effect during live migration.

Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-ID: <168733225273.5845.15871826788879741674-8@git.sr.ht>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-07-26 10:55:56 +02:00
Philippe Mathieu-Daudé
217a190eec accel/kvm: Re-include "exec/memattrs.h" header
Commit 1e05888ab5 ("sysemu/kvm: Remove unused headers") was
a bit overzealous while cleaning "sysemu/kvm.h" headers:
kvm_arch_post_run() returns a MemTxAttrs type, so depends on
"exec/memattrs.h" for its definition.

Fixes: 1e05888ab5 ("sysemu/kvm: Remove unused headers")
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230619074153.44268-5-philmd@linaro.org>
2023-06-28 14:14:22 +02:00
Philippe Mathieu-Daudé
3b295bcb32 accel: Rename HVF 'struct hvf_vcpu_state' -> AccelCPUState
We want all accelerators to share the same opaque pointer in
CPUState.

Rename the 'hvf_vcpu_state' structure as 'AccelCPUState'.

Use the generic 'accel' field of CPUState instead of 'hvf'.

Replace g_malloc0() by g_new0() for readability.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20230624174121.11508-17-philmd@linaro.org>
2023-06-28 14:14:22 +02:00
Philippe Mathieu-Daudé
6970030177 accel: Document generic accelerator headers
These headers are meant to be include by any file to check
the availability of accelerators, thus are not accelerator
specific.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230624174121.11508-3-philmd@linaro.org>
2023-06-28 13:55:35 +02:00
Marc-André Lureau
09b4c198b8 console/win32: allocate shareable display surface
Introduce qemu_win32_map_alloc() and qemu_win32_map_free() to allocate
shared memory mapping. The handle can be used to share the mapping with
another process.

Teach qemu_create_displaysurface() to allocate shared memory. Following
patches will introduce other places for shared memory allocation.

Other patches for -display dbus will share the memory when possible with
the client, to avoid expensive memory copy between the processes.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20230606115658.677673-10-marcandre.lureau@redhat.com>
2023-06-27 17:08:56 +02:00
Francesco Cagnin
eb2edc42b1 hvf: add guest debugging handlers for Apple Silicon hosts
Guests can now be debugged through the gdbstub. Support is added for
single-stepping, software breakpoints, hardware breakpoints and
watchpoints. The code has been structured like the KVM counterpart.

While guest debugging is enabled, the guest can still read and write the
DBG*_EL1 registers but they don't have any effect.

Signed-off-by: Francesco Cagnin <fcagnin@quarkslab.com>
Message-id: 20230601153107.81955-5-fcagnin@quarkslab.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2023-06-06 10:19:30 +01:00
Francesco Cagnin
f41520402c hvf: add breakpoint handlers
Required for guest debugging. The code has been structured like the KVM
counterpart.

Signed-off-by: Francesco Cagnin <fcagnin@quarkslab.com>
Message-id: 20230601153107.81955-4-fcagnin@quarkslab.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2023-06-06 10:19:30 +01:00
Vladimir Sementsov-Ogievskiy
e76005a081 runstate: drop unused runstate_store()
The function is unused since previous commit. Drop it.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20230517123752.21615-4-vsementsov@yandex-team.ru>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-06-02 01:03:19 +02:00
Vladimir Sementsov-Ogievskiy
242b74eb69 runstate: add runstate_get()
It's necessary to restore the state after failed/cancelled migration in
further commit.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20230517123752.21615-2-vsementsov@yandex-team.ru>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-06-02 01:03:19 +02:00
Stefan Hajnoczi
41abca8c39 block: add blk_io_plug_call() API
Introduce a new API for thread-local blk_io_plug() that does not
traverse the block graph. The goal is to make blk_io_plug() multi-queue
friendly.

Instead of having block drivers track whether or not we're in a plugged
section, provide an API that allows them to defer a function call until
we're unplugged: blk_io_plug_call(fn, opaque). If blk_io_plug_call() is
called multiple times with the same fn/opaque pair, then fn() is only
called once at the end of the function - resulting in batching.

This patch introduces the API and changes blk_io_plug()/blk_io_unplug().
blk_io_plug()/blk_io_unplug() no longer require a BlockBackend argument
because the plug state is now thread-local.

Later patches convert block drivers to blk_io_plug_call() and then we
can finally remove .bdrv_co_io_plug() once all block drivers have been
converted.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 20230530180959.1108766-2-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-06-01 07:34:03 -04:00
Stefan Hajnoczi
ab61335025 block: drain from main loop thread in bdrv_co_yield_to_drain()
For simplicity, always run BlockDevOps .drained_begin/end/poll()
callbacks in the main loop thread. This makes it easier to implement the
callbacks and avoids extra locks.

Move the function pointer declarations from the I/O Code section to the
Global State section for BlockDevOps, BdrvChildClass, and BlockDriver.

Narrow IO_OR_GS_CODE() to GLOBAL_STATE_CODE() where appropriate.

The test-bdrv-drain test case calls bdrv_drain() from an IOThread. This
is now only allowed from coroutine context, so update the test case to
run in a coroutine.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230516190238.8401-11-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-05-30 17:32:02 +02:00
Stefan Hajnoczi
ff82b7835b block: add blk_in_drain() API
The BlockBackend quiesce_counter is greater than zero during drained
sections. Add an API to check whether the BlockBackend is in a drained
section.

The next patch will use this API.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230516190238.8401-10-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-05-30 17:32:02 +02:00
Marc-André Lureau
d89f30b4df win32: wrap socket close() with an exception handler
Since commit abe34282 ("win32: avoid mixing SOCKET and file descriptor
space"), we set HANDLE_FLAG_PROTECT_FROM_CLOSE on the socket FD, to
prevent closing the HANDLE with CloseHandle. This raises an exception
which under gdb is fatal, and qemu exits.

Let's catch the expected error instead.

Note: this appears to work, but the mingw64 macro is not well documented
or tested, and it's not obvious how it is meant to be used.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20230515132440.1025315-1-marcandre.lureau@redhat.com>
2023-05-28 13:08:25 +04:00
Gavin Shan
b20cc77692 kvm: Synchronize the backup bitmap in the last stage
In the last stage of live migration or memory slot removal, the
backup bitmap needs to be synchronized when it has been enabled.

Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Tested-by: Zhenyu Zhang <zhenyzha@redhat.com>
Message-Id: <20230509022122.20888-3-gshan@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-05-18 08:53:50 +02:00
Philippe Mathieu-Daudé
1e05888ab5 sysemu/kvm: Remove unused headers
All types used are forward-declared in "qemu/typedefs.h".

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230405160454.97436-2-philmd@linaro.org>
[thuth: Add hw/core/cpu.h to migration/dirtyrate.c to fix compile failure]
Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-05-16 09:13:34 +02:00
Sam Li
4751d09adc block: introduce zone append write for zoned devices
A zone append command is a write operation that specifies the first
logical block of a zone as the write position. When writing to a zoned
block device using zone append, the byte offset of the call may point at
any position within the zone to which the data is being appended. Upon
completion the device will respond with the position where the data has
been written in the zone.

Signed-off-by: Sam Li <faithilikerun@gmail.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20230508051510.177850-3-faithilikerun@gmail.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-05-15 08:18:10 -04:00
Sam Li
6d43eaa396 block/block-backend: add block layer APIs resembling Linux ZonedBlockDevice ioctls
Add zoned device option to host_device BlockDriver. It will be presented only
for zoned host block devices. By adding zone management operations to the
host_block_device BlockDriver, users can use the new block layer APIs
including Report Zone and four zone management operations
(open, close, finish, reset, reset_all).

Qemu-io uses the new APIs to perform zoned storage commands of the device:
zone_report(zrp), zone_open(zo), zone_close(zc), zone_reset(zrs),
zone_finish(zf).

For example, to test zone_report, use following command:
$ ./build/qemu-io --image-opts -n driver=host_device, filename=/dev/nullb0
-c "zrp offset nr_zones"

Signed-off-by: Sam Li <faithilikerun@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Acked-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20230508045533.175575-4-faithilikerun@gmail.com
Message-id: 20230324090605.28361-4-faithilikerun@gmail.com
[Adjust commit message prefix as suggested by Philippe Mathieu-Daudé
<philmd@linaro.org> and remove spurious ret = -errno in
raw_co_zone_mgmt().
--Stefan]
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-05-15 08:17:03 -04:00
Kevin Wolf
b2ab5f545f block: bdrv/blk_co_unref() for calls in coroutine context
These functions must not be called in coroutine context, because they
need write access to the graph.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230504115750.54437-4-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-05-10 14:16:53 +02:00
Emanuele Giuseppe Esposito
ab50533b69 linux-aio: use LinuxAioState from the running thread
Remove usage of aio_context_acquire by always submitting asynchronous
AIO to the current thread's LinuxAioState.

In order to prevent mistakes from the caller side, avoid passing LinuxAioState
in laio_io_{plug/unplug} and laio_co_submit, and document the functions
to make clear that they work in the current thread's AioContext.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20230203131731.851116-2-eesposit@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-04-25 13:17:28 +02:00
Richard Henderson
45608654aa Merge tpm 2023/04/20 v1
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEuBi5yt+QicLVzsZrda1lgCoLQhEFAmRBLgoACgkQda1lgCoL
 QhEPyQf/WfEg8k2hDLExobsSgup1IsnT+mHHTBOZVJvq2efg2YXUTHA56fmD9X6d
 crqTq68L5oaMES5iYEZhA7EAgfk3RvxDQGrlXBByPzrc6SSwEHHMR4Zzi5zrbCoW
 t6TmaKQrlQqYwkhhsbyqnG46bj0ugCDagkBLfJdVl96fjkYgTspcDxaNwqwy/DPn
 GTmQlvdRY09D1nylIdtcLBIfsM+sIkRslyngbUEIy+Bx8EWRy2a8Qw0BdY9g1XoE
 e0CaRaFMpju1KOIjq0YSIzt0LSQDFfPc1IlUAC0ZALhNmp+PPNtr4E7+4kFfO2ym
 1sT2w25ho8dYDdm/m8tIauCdGoHw4A==
 =ML27
 -----END PGP SIGNATURE-----

Merge tag 'pull-tpm-2023-04-20-1' of https://github.com/stefanberger/qemu-tpm into staging

Merge tpm 2023/04/20 v1

# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCAAdFiEEuBi5yt+QicLVzsZrda1lgCoLQhEFAmRBLgoACgkQda1lgCoL
# QhEPyQf/WfEg8k2hDLExobsSgup1IsnT+mHHTBOZVJvq2efg2YXUTHA56fmD9X6d
# crqTq68L5oaMES5iYEZhA7EAgfk3RvxDQGrlXBByPzrc6SSwEHHMR4Zzi5zrbCoW
# t6TmaKQrlQqYwkhhsbyqnG46bj0ugCDagkBLfJdVl96fjkYgTspcDxaNwqwy/DPn
# GTmQlvdRY09D1nylIdtcLBIfsM+sIkRslyngbUEIy+Bx8EWRy2a8Qw0BdY9g1XoE
# e0CaRaFMpju1KOIjq0YSIzt0LSQDFfPc1IlUAC0ZALhNmp+PPNtr4E7+4kFfO2ym
# 1sT2w25ho8dYDdm/m8tIauCdGoHw4A==
# =ML27
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 20 Apr 2023 01:20:26 PM BST
# gpg:                using RSA key B818B9CADF9089C2D5CEC66B75AD65802A0B4211
# gpg: Good signature from "Stefan Berger <stefanb@linux.vnet.ibm.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B818 B9CA DF90 89C2 D5CE  C66B 75AD 6580 2A0B 4211

* tag 'pull-tpm-2023-04-20-1' of https://github.com/stefanberger/qemu-tpm:
  qtest: Add a test case for TPM TIS I2C connected to Aspeed I2C controller
  qtest: Move tpm_util_tis_transmit() into tpm-tis-utils.c and rename it
  qtest: Add functions for accessing devices on Aspeed I2C controller
  tests/avocado/aspeed: Add TPM TIS I2C test
  tpm: Add support for TPM device over I2C bus
  tpm: Extend common APIs to support TPM TIS I2C
  docs: Add support for TPM devices over I2C bus

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-04-21 20:02:51 +01:00
Ninad Palsule
139fdb3ed8 tpm: Add support for TPM device over I2C bus
Qemu already supports devices attached to ISA and sysbus. This drop adds
support for the I2C bus attached TPM devices. I2C model only supports
TPM2 protocol.

This commit includes changes for the common code.
- Added I2C emulation model. Logic was added in the model to temporarily
  cache the data as I2C interface works per byte basis.
- New tpm type "tpm-tis-i2c" added for I2C support. The user has to
  provide this string on command line.

Testing:
  TPM I2C device module is tested using SWTPM (software based TPM
  package). Qemu uses the rainier machine and is connected to swtpm over
  the socket interface.

  The command to start swtpm is as follows:
  $ swtpm socket --tpmstate dir=/tmp/mytpm1    \
                 --ctrl type=unixio,path=/tmp/mytpm1/swtpm-sock  \
                 --tpm2 --log level=100

  The command to start qemu is as follows:
  $ qemu-system-arm -M rainier-bmc -nographic \
            -kernel ${IMAGEPATH}/fitImage-linux.bin \
            -dtb ${IMAGEPATH}/aspeed-bmc-ibm-rainier.dtb \
            -initrd ${IMAGEPATH}/obmc-phosphor-initramfs.rootfs.cpio.xz \
            -drive file=${IMAGEPATH}/obmc-phosphor-image.rootfs.wic.qcow2,if=sd,index=2 \
            -net nic -net user,hostfwd=:127.0.0.1:2222-:22,hostfwd=:127.0.0.1:2443-:443 \
            -chardev socket,id=chrtpm,path=/tmp/mytpm1/swtpm-sock \
            -tpmdev emulator,id=tpm0,chardev=chrtpm \
            -device tpm-tis-i2c,tpmdev=tpm0,bus=aspeed.i2c.bus.12,address=0x2e

Signed-off-by: Ninad Palsule <ninad@linux.ibm.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Tested-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Tested-by: Joel Stanley <joel@jms.id.au>
Message-id: 20230414220754.1191476-4-ninadpalsule@us.ibm.com
2023-04-20 08:17:15 -04:00
Thomas Huth
c7a6bf5d92 softmmu/qtest: Move the target-specific pseries RTAS code out of qtest.c
Ideally, qtest.c should be independent from target specific code, so
we only have to compile it once for all targets. Thus start improving
the situation by moving the pseries related code to hw/ppc/spapr_rtas.c
instead and allow target code to register a callback handler for such
target specific commands.

Message-Id: <20230411183418.1640500-2-thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-04-20 11:25:32 +02:00
Paolo Bonzini
81f730d4d0 block, block-backend: write some hot coroutine wrappers by hand
The introduction of the graph lock is causing blk_get_geometry, a hot function
used in the I/O path, to create a coroutine.  However, the only part that really
needs to run in coroutine context is the call to bdrv_co_refresh_total_sectors,
which in turn only happens in the rare case of host CD-ROM devices.

So, write by hand the three wrappers on the path from blk_co_get_geometry to
bdrv_co_refresh_total_sectors, so that the coroutine wrapper is only created
if bdrv_nb_sectors actually calls bdrv_refresh_total_sectors.

Reported-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20230407153303.391121-9-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-04-11 16:46:49 +02:00
Kevin Wolf
d8fbf9aa85 block/export: Fix graph locking in blk_get_geometry() call
blk_get_geometry() eventually calls bdrv_nb_sectors(), which is a
co_wrapper_mixed_bdrv_rdlock. This means that when it is called from
coroutine context, it already assume to have the graph locked.

However, virtio_blk_sect_range_ok() in block/export/virtio-blk-handler.c
(used by vhost-user-blk and VDUSE exports) runs in a coroutine, but
doesn't take the graph lock - blk_*() functions are generally expected
to do that internally. This causes an assertion failure when accessing
an export for the first time if it runs in an iothread.

This is an example of the crash:

  $ ./storage-daemon/qemu-storage-daemon --object iothread,id=th0 --blockdev file,filename=/home/kwolf/images/hd.img,node-name=disk --export vhost-user-blk,addr.type=unix,addr.path=/tmp/vhost.sock,node-name=disk,id=exp0,iothread=th0
  qemu-storage-daemon: ../block/graph-lock.c:268: void assert_bdrv_graph_readable(void): Assertion `qemu_in_main_thread() || reader_count()' failed.

  (gdb) bt
  #0  0x00007ffff6eafe5c in __pthread_kill_implementation () from /lib64/libc.so.6
  #1  0x00007ffff6e5fa76 in raise () from /lib64/libc.so.6
  #2  0x00007ffff6e497fc in abort () from /lib64/libc.so.6
  #3  0x00007ffff6e4971b in __assert_fail_base.cold () from /lib64/libc.so.6
  #4  0x00007ffff6e58656 in __assert_fail () from /lib64/libc.so.6
  #5  0x00005555556337a3 in assert_bdrv_graph_readable () at ../block/graph-lock.c:268
  #6  0x00005555555fd5a2 in bdrv_co_nb_sectors (bs=0x5555564c5ef0) at ../block.c:5847
  #7  0x00005555555ee949 in bdrv_nb_sectors (bs=0x5555564c5ef0) at block/block-gen.c:256
  #8  0x00005555555fd6b9 in bdrv_get_geometry (bs=0x5555564c5ef0, nb_sectors_ptr=0x7fffef7fedd0) at ../block.c:5884
  #9  0x000055555562ad6d in blk_get_geometry (blk=0x5555564cb200, nb_sectors_ptr=0x7fffef7fedd0) at ../block/block-backend.c:1624
  #10 0x00005555555ddb74 in virtio_blk_sect_range_ok (blk=0x5555564cb200, block_size=512, sector=0, size=512) at ../block/export/virtio-blk-handler.c:44
  #11 0x00005555555dd80d in virtio_blk_process_req (handler=0x5555564cbb98, in_iov=0x7fffe8003830, out_iov=0x7fffe8003860, in_num=1, out_num=0) at ../block/export/virtio-blk-handler.c:189
  #12 0x00005555555dd546 in vu_blk_virtio_process_req (opaque=0x7fffe8003800) at ../block/export/vhost-user-blk-server.c:66
  #13 0x00005555557bf4a1 in coroutine_trampoline (i0=-402635264, i1=32767) at ../util/coroutine-ucontext.c:177
  #14 0x00007ffff6e75c20 in ?? () from /lib64/libc.so.6
  #15 0x00007fffefffa870 in ?? ()
  #16 0x0000000000000000 in ?? ()

Fix this by creating a new blk_co_get_geometry() that takes the lock,
and changing blk_get_geometry() to be a co_wrapper_mixed around it.

To make the resulting code cleaner, virtio-blk-handler.c can directly
call the coroutine version now (though that wouldn't be necessary for
fixing the bug, taking the lock in blk_co_get_geometry() is what fixes
it).

Fixes: 8ab8140a04
Reported-by: Lukáš Doktor <ldoktor@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230327113959.60071-1-kwolf@redhat.com>
Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-03-27 15:16:05 +02:00
Marc-André Lureau
f3ab43accf win32: add qemu_close_socket_osfhandle()
Close the given file descriptor, but returns the underlying SOCKET.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20230320133643.1618437-2-marcandre.lureau@redhat.com>
2023-03-21 11:16:03 +04:00
Marc-André Lureau
25657fc6c1 win32: replace closesocket() with close() wrapper
Use a close() wrapper instead, so that we don't need to worry about
closesocket() vs close() anymore, let's hope.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Message-Id: <20230221124802.4103554-17-marcandre.lureau@redhat.com>
2023-03-13 15:39:31 +04:00
Marc-André Lureau
b7e5374637 os-posix: remove useless ioctlsocket() define
The API is specific to win32.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20230221124802.4103554-16-marcandre.lureau@redhat.com>
2023-03-13 15:39:31 +04:00
Marc-André Lureau
abe34282b0 win32: avoid mixing SOCKET and file descriptor space
Until now, a win32 SOCKET handle is often cast to an int file
descriptor, as this is what other OS use for sockets. When necessary,
QEMU eventually queries whether it's a socket with the help of
fd_is_socket(). However, there is no guarantee of conflict between the
fd and SOCKET space. Such conflict would have surprising consequences,
we shouldn't mix them.

Also, it is often forgotten that SOCKET must be closed with
closesocket(), and not close().

Instead, let's make the win32 socket wrapper functions return and take a
file descriptor, and let util/ wrappers do the fd/SOCKET conversion as
necessary. A bit of adaptation is necessary in io/ as well.

Unfortunately, we can't drop closesocket() usage, despite
_open_osfhandle() documentation claiming transfer of ownership, testing
shows bad behaviour if you forget to call closesocket().

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Message-Id: <20230221124802.4103554-15-marcandre.lureau@redhat.com>
2023-03-13 15:39:31 +04:00
Marc-André Lureau
a4aafea261 win32/socket: introduce qemu_socket_unselect() helper
A more explicit version of qemu_socket_select() with no events.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Message-Id: <20230221124802.4103554-8-marcandre.lureau@redhat.com>
2023-03-13 15:23:37 +04:00
Marc-André Lureau
f5fd677ae7 win32/socket: introduce qemu_socket_select() helper
This is a wrapper for WSAEventSelect, with Error handling. By default,
it will produce a warning, so callers don't have to be modified
now, and yet we can spot potential mis-use.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Message-Id: <20230221124802.4103554-7-marcandre.lureau@redhat.com>
2023-03-13 15:23:37 +04:00
Peter Maydell
674acdd178 virtio,pc,pci: features, fixes
Several features that landed at the last possible moment:
 
 Passthrough HDM decoder emulation
 Refactor cryptodev
 RAS error emulation and injection
 acpi-index support on non-hotpluggable slots
 Dynamically switch to vhost shadow virtqueues at vdpa net migration
 
 Plus a couple of bugfixes that look important to have in the release.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmQJ8TYPHG1zdEByZWRo
 YXQuY29tAAoJECgfDbjSjVRp37YIAMpQA5/ddmKKz/ABtBMHB5JX/SVYcG+1xkBR
 j9IFYusOfmmDfmgAhv0Qxi9+Wik95lszVZUnphvocSGd0PXH47pK7yv9RZ1ttaYX
 oAbXrGqXo8rUhl1ksQsJ8Iasj2di1BLP0byPuozbRkg1Kkz5TqRd9+hBqSBGEx21
 tsP5708UVCDAriwYYO78Cx0ZasmB9bqqeom5FdEsg9sYJ5aElOOvitp9YO1p2xhU
 gRvhD+k/aqNi+mfOUF7qGDBanxKgx75VV/KU1cjjS9R1vNtwRhfc/26PBrROY00a
 wkZWnAxmzDFKRS6cEfeb+eDGEVjC3IqLAjcFeuAIT/78CwdvIiY=
 =e1yv
 -----END PGP SIGNATURE-----

Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging

virtio,pc,pci: features, fixes

Several features that landed at the last possible moment:

Passthrough HDM decoder emulation
Refactor cryptodev
RAS error emulation and injection
acpi-index support on non-hotpluggable slots
Dynamically switch to vhost shadow virtqueues at vdpa net migration

Plus a couple of bugfixes that look important to have in the release.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmQJ8TYPHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRp37YIAMpQA5/ddmKKz/ABtBMHB5JX/SVYcG+1xkBR
# j9IFYusOfmmDfmgAhv0Qxi9+Wik95lszVZUnphvocSGd0PXH47pK7yv9RZ1ttaYX
# oAbXrGqXo8rUhl1ksQsJ8Iasj2di1BLP0byPuozbRkg1Kkz5TqRd9+hBqSBGEx21
# tsP5708UVCDAriwYYO78Cx0ZasmB9bqqeom5FdEsg9sYJ5aElOOvitp9YO1p2xhU
# gRvhD+k/aqNi+mfOUF7qGDBanxKgx75VV/KU1cjjS9R1vNtwRhfc/26PBrROY00a
# wkZWnAxmzDFKRS6cEfeb+eDGEVjC3IqLAjcFeuAIT/78CwdvIiY=
# =e1yv
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 09 Mar 2023 14:46:14 GMT
# gpg:                using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg:                issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (72 commits)
  virtio: fix reachable assertion due to stale value of cached region size
  hw/virtio/vhost-user: avoid using unitialized errp
  hw/pxb-cxl: Support passthrough HDM Decoders unless overridden
  hw/pci: Add pcie_count_ds_port() and pcie_find_port_first() helpers
  hw/mem/cxl_type3: Add CXL RAS Error Injection Support.
  hw/pci/aer: Make PCIE AER error injection facility available for other emulation to use.
  hw/cxl: Fix endian issues in CXL RAS capability defaults / masks
  hw/mem/cxl-type3: Add AER extended capability
  hw/pci-bridge/cxl_root_port: Wire up MSI
  hw/pci-bridge/cxl_root_port: Wire up AER
  hw/pci/aer: Add missing routing for AER errors
  hw/pci/aer: Implement PCI_ERR_UNCOR_MASK register
  pcihp: add ACPI PCI hotplug specific is_hotpluggable_bus() callback
  pcihp: move fields enabling hotplug into AcpiPciHpState
  acpi: pci: move out ACPI PCI hotplug generator from generic slot generator build_append_pci_bus_devices()
  acpi: pci: move BSEL into build_append_pcihp_slots()
  acpi: pci: drop BSEL usage when deciding that device isn't hotpluggable
  pci: move acpi-index uniqueness check to generic PCI device code
  tests: acpi: update expected blobs
  tests: acpi: add non zero function device with acpi-index on non-hotpluggble bus
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2023-03-10 14:31:37 +00:00
Mads Ynddal
412ae12647 gdbstub: move update guest debug to accel ops
Continuing the refactor of a48e7d9e52 (gdbstub: move guest debug support
check to ops) by removing hardcoded kvm_enabled() from generic cpu.c
code, and replace it with a property of AccelOpsClass.

Signed-off-by: Mads Ynddal <m.ynddal@samsung.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230207131721.49233-1-mads@ynddal.dk>
[AJB: add ifdef around update_guest_debug_ops, fix brace]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230302190846.2593720-27-alex.bennee@linaro.org>
Message-Id: <20230303025805.625589-30-richard.henderson@linaro.org>
2023-03-07 20:44:09 +00:00
zhenwei pi
2580b452ff cryptodev: support QoS
Add 'throttle-bps' and 'throttle-ops' limitation to set QoS. The
two arguments work with both QEMU command line and QMP command.

Example of QEMU command line:
-object cryptodev-backend-builtin,id=cryptodev1,throttle-bps=1600,\
throttle-ops=100

Example of QMP command:
virsh qemu-monitor-command buster --hmp qom-set /objects/cryptodev1 \
throttle-ops 100

or cancel limitation:
virsh qemu-monitor-command buster --hmp qom-set /objects/cryptodev1 \
throttle-ops 0

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Message-Id: <20230301105847.253084-11-pizhenwei@bytedance.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
zhenwei pi
e7a775fd9f cryptodev: Account statistics
Account OPS/BPS for crypto device, this will be used for 'query-stats'
QEMU monitor command and QoS in the next step.

Note that a crypto device may support symmetric mode, asymmetric mode,
both symmetric and asymmetric mode. So we use two structure to
describe the statistics of a crypto device.

Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Message-Id: <20230301105847.253084-10-pizhenwei@bytedance.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2023-03-07 12:38:59 -05:00
zhenwei pi
2cb0692768 cryptodev: Use CryptoDevBackendOpInfo for operation
Move queue_index, CryptoDevCompletionFunc and opaque into struct
CryptoDevBackendOpInfo, then cryptodev_backend_crypto_operation()
needs an argument CryptoDevBackendOpInfo *op_info only. And remove
VirtIOCryptoReq from cryptodev. It's also possible to hide
VirtIOCryptoReq into virtio-crypto.c in the next step. (In theory,
VirtIOCryptoReq is a private structure used by virtio-crypto only)

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Message-Id: <20230301105847.253084-9-pizhenwei@bytedance.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
zhenwei pi
999c789f00 cryptodev: Introduce cryptodev alg type in QAPI
Introduce cryptodev alg type in cryptodev.json, then apply this to
related codes, and drop 'enum CryptoDevBackendAlgType'.

There are two options:
1, { 'enum': 'QCryptodevBackendAlgType',
  'prefix': 'CRYPTODEV_BACKEND_ALG',
  'data': ['sym', 'asym']}
Then we can keep 'CRYPTODEV_BACKEND_ALG_SYM' and avoid lots of
changes.
2, changes in this patch(with prefix 'QCRYPTODEV_BACKEND_ALG').

To avoid breaking the rule of QAPI, use 2 here.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Message-Id: <20230301105847.253084-4-pizhenwei@bytedance.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
zhenwei pi
3f478371fd cryptodev: Remove 'name' & 'model' fields
We have already used qapi to generate crypto device types, this allows
to convert type to a string 'model', so the 'model' field is not
needed.

And the 'name' field is not used by any backend driver, drop it.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Message-Id: <20230301105847.253084-3-pizhenwei@bytedance.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
zhenwei pi
14c9fd1673 cryptodev: Introduce cryptodev.json
Introduce QCryptodevBackendType in cryptodev.json, also apply this to
related codes. Then we can drop 'enum CryptoDevBackendOptionsType'.

Note that `CRYPTODEV_BACKEND_TYPE_NONE` is *NOT* used by anywhere, so
drop it(no 'none' enum in QCryptodevBackendType).

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Message-Id: <20230301105847.253084-2-pizhenwei@bytedance.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
David Woodhouse
e16aff4cc2 kvm/i386: Add xen-evtchn-max-pirq property
The default number of PIRQs is set to 256 to avoid issues with 32-bit MSI
devices. Allow it to be increased if the user desires.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-01 09:09:22 +00:00
David Woodhouse
8b57d5c523 i386/xen: Reserve Xen special pages for console, xenstore rings
Xen has eight frames at 0xfeff8000 for this; we only really need two for
now and KVM puts the identity map at 0xfeffc000, so limit ourselves to
four.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-01 09:07:52 +00:00
David Woodhouse
6f43f2ee49 kvm/i386: Add xen-gnttab-max-frames property
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-01 09:07:52 +00:00
David Woodhouse
ddf0fd9ae1 hw/xen: Support HVM_PARAM_CALLBACK_TYPE_GSI callback
The GSI callback (and later PCI_INTX) is a level triggered interrupt. It
is asserted when an event channel is delivered to vCPU0, and is supposed
to be cleared when the vcpu_info->evtchn_upcall_pending field for vCPU0
is cleared again.

Thankfully, Xen does *not* assert the GSI if the guest sets its own
evtchn_upcall_pending field; we only need to assert the GSI when we
have delivered an event for ourselves. So that's the easy part, kind of.

There's a slight complexity in that we need to hold the BQL before we
can call qemu_set_irq(), and we definitely can't do that while holding
our own port_lock (because we'll need to take that from the qemu-side
functions that the PV backend drivers will call). So if we end up
wanting to set the IRQ in a context where we *don't* already hold the
BQL, defer to a BH.

However, we *do* need to poll for the evtchn_upcall_pending flag being
cleared. In an ideal world we would poll that when the EOI happens on
the PIC/IOAPIC. That's how it works in the kernel with the VFIO eventfd
pairs — one is used to trigger the interrupt, and the other works in the
other direction to 'resample' on EOI, and trigger the first eventfd
again if the line is still active.

However, QEMU doesn't seem to do that. Even VFIO level interrupts seem
to be supported by temporarily unmapping the device's BARs from the
guest when an interrupt happens, then trapping *all* MMIO to the device
and sending the 'resample' event on *every* MMIO access until the IRQ
is cleared! Maybe in future we'll plumb the 'resample' concept through
QEMU's irq framework but for now we'll do what Xen itself does: just
check the flag on every vmexit if the upcall GSI is known to be
asserted.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-01 09:06:44 +00:00
David Woodhouse
c723d4c15e hw/xen: Implement EVTCHNOP_bind_virq
Add the array of virq ports to each vCPU so that we can deliver timers,
debug ports, etc. Global virqs are allocated against vCPU 0 initially,
but can be migrated to other vCPUs (when we implement that).

The kernel needs to know about VIRQ_TIMER in order to accelerate timers,
so tell it via KVM_XEN_VCPU_ATTR_TYPE_TIMER. Also save/restore the value
of the singleshot timer across migration, as the kernel will handle the
hypercalls automatically now.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-01 08:22:50 +00:00
David Woodhouse
27d4075dd8 i386/xen: Add support for Xen event channel delivery to vCPU
The kvm_xen_inject_vcpu_callback_vector() function will either deliver
the per-vCPU local APIC vector (as an MSI), or just kick the vCPU out
of the kernel to trigger KVM's automatic delivery of the global vector.
Support for asserting the GSI/PCI_INTX callbacks will come later.

Also add kvm_xen_get_vcpu_info_hva() which returns the vcpu_info of
a given vCPU.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-01 08:22:50 +00:00
David Woodhouse
d40ddd5290 hw/xen: Add xen_overlay device for emulating shared xenheap pages
For the shared info page and for grant tables, Xen shares its own pages
from the "Xen heap" to the guest. The guest requests that a given page
from a certain address space (XENMAPSPACE_shared_info, etc.) be mapped
to a given GPA using the XENMEM_add_to_physmap hypercall.

To support that in qemu when *emulating* Xen, create a memory region
(migratable) and allow it to be mapped as an overlay when requested.

Xen theoretically allows the same page to be mapped multiple times
into the guest, but that's hard to track and reinstate over migration,
so we automatically *unmap* any previous mapping when creating a new
one. This approach has been used in production with.... a non-trivial
number of guests expecting true Xen, without any problems yet being
noticed.

This adds just the shared info page for now. The grant tables will be
a larger region, and will need to be overlaid one page at a time. I
think that means I need to create separate aliases for each page of
the overall grant_frames region, so that they can be mapped individually.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-01 08:22:49 +00:00
Joao Martins
79b7067dc6 i386/xen: implement HYPERVISOR_sched_op, SCHEDOP_shutdown
It allows to shutdown itself via hypercall with any of the 3 reasons:
  1) self-reboot
  2) shutdown
  3) crash

Implementing SCHEDOP_shutdown sub op let us handle crashes gracefully rather
than leading to triple faults if it remains unimplemented.

In addition, the SHUTDOWN_soft_reset reason is used for kexec, to reset
Xen shared pages and other enlightenments and leave a clean slate for the
new kernel without the hypervisor helpfully writing information at
unexpected addresses.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[dwmw2: Ditch sched_op_compat which was never available for HVM guests,
        Add SCHEDOP_soft_reset]
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-01 08:22:49 +00:00
David Woodhouse
61491cf441 i386/kvm: Add xen-version KVM accelerator property and init KVM Xen support
This just initializes the basic Xen support in KVM for now. Only permitted
on TYPE_PC_MACHINE because that's where the sysbus devices for Xen heap
overlay, event channel, grant tables and other stuff will exist. There's
no point having the basic hypercall support if nothing else works.

Provide sysemu/kvm_xen.h and a kvm_xen_get_caps() which will be used
later by support devices.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-01 08:22:49 +00:00
Philippe Mathieu-Daudé
5b5968c477 replay: Extract core API to 'exec/replay-core.h'
replay API is used deeply within TCG common code (common to user
and system emulation). Unfortunately "sysemu/replay.h" requires
some QAPI headers for few system-specific declarations, example:

  void replay_input_event(QemuConsole *src, InputEvent *evt);

Since commit c2651c0eaa ("qapi/meson: Restrict UI module to system
emulation and tools") the QAPI header defining the InputEvent is
not generated anymore.
To keep it simple, extract the 'core' replay prototypes to a new
"exec/replay-core.h" header which we include in the TCG code that
doesn't need the rest of the replay API.

Reviewed-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>
Message-Id: <20221219170806.60580-5-philmd@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2023-02-27 22:29:01 +01:00
Philippe Mathieu-Daudé
5db2dac462 sysemu/kvm: Remove CONFIG_USER_ONLY guard
User emulation shouldn't really include this header; if included
these declarations are guarded by CONFIG_KVM_IS_POSSIBLE.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221216220738.7355-2-philmd@linaro.org>
2023-02-27 22:29:01 +01:00
Philippe Mathieu-Daudé
c32c487a03 exec: Remove unused 'qemu/timer.h' timer
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221219170806.60580-2-philmd@linaro.org>
2023-02-27 22:29:01 +01:00