Host notifiers can now use is_external=false since virtio-blk and
virtio-scsi no longer rely on is_external=true for drained sections.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230516190238.8401-20-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The virtio-scsi Host Bus Adapter provides access to devices on a SCSI
bus. Those SCSI devices typically have a BlockBackend. When the
BlockBackend enters a drained section, the SCSI device must temporarily
stop submitting new I/O requests.
Implement this behavior by temporarily stopping virtio-scsi virtqueue
processing when one of the SCSI devices enters a drained section. The
new scsi_device_drained_begin() API allows scsi-disk to message the
virtio-scsi HBA.
scsi_device_drained_begin() uses a drain counter so that multiple SCSI
devices can have overlapping drained sections. The HBA only sees one
pair of .drained_begin/end() calls.
After this commit, virtio-scsi no longer depends on hw/virtio's
ioeventfd aio_set_event_notifier(is_external=true). This commit is a
step towards removing the aio_disable_external() API.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230516190238.8401-19-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Detach ioeventfds during drained sections to stop I/O submission from
the guest. virtio-blk is no longer reliant on aio_disable_external()
after this patch. This will allow us to remove the
aio_disable_external() API once all other code that relies on it is
converted.
Take extra care to avoid attaching/detaching ioeventfds if the data
plane is started/stopped during a drained section. This should be rare,
but maybe the mirror block job can trigger it.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230516190238.8401-18-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
virtio_queue_aio_detach_host_notifier() does two things:
1. It removes the fd handler from the event loop.
2. It processes the virtqueue one last time.
The first step can be peformed by any thread and without taking the
AioContext lock.
The second step may need the AioContext lock (depending on the device
implementation) and runs in the thread where request processing takes
place. virtio-blk and virtio-scsi therefore call
virtio_queue_aio_detach_host_notifier() from a BH that is scheduled in
AioContext.
The next patch will introduce a .drained_begin() function that needs to
call virtio_queue_aio_detach_host_notifier(). .drained_begin() functions
cannot call aio_poll() to wait synchronously for the BH. It is possible
for a .drained_poll() callback to asynchronously wait for the BH, but
that is more complex than necessary here.
Move the virtqueue processing out to the callers of
virtio_queue_aio_detach_host_notifier() so that the function can be
called from any thread. This is in preparation for the next patch.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230516190238.8401-17-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This is part of ongoing work to remove the aio_disable_external() API.
Use BlockDevOps .drained_begin/end/poll() instead of
aio_set_fd_handler(is_external=true).
As a side-effect the FUSE export now follows AioContext changes like the
other export types.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230516190238.8401-16-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The FUSE export calls blk_exp_ref/unref() without the AioContext lock.
Instead of fixing the FUSE export, adjust blk_exp_ref/unref() so they
work without the AioContext lock. This way it's less error-prone.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230516190238.8401-15-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
vduse_blk_detach_ctx() waits for in-flight requests using
AIO_WAIT_WHILE(). This is not allowed according to a comment in
bdrv_set_aio_context_commit():
/*
* Take the old AioContex when detaching it from bs.
* At this point, new_context lock is already acquired, and we are now
* also taking old_context. This is safe as long as bdrv_detach_aio_context
* does not call AIO_POLL_WHILE().
*/
Use this opportunity to rewrite the drain code in vduse-blk:
- Use the BlockExport refcount so that vduse_blk_exp_delete() is only
called when there are no more requests in flight.
- Implement .drained_poll() so in-flight request coroutines are stopped
by the time .bdrv_detach_aio_context() is called.
- Remove AIO_WAIT_WHILE() from vduse_blk_detach_ctx() to solve the
.bdrv_detach_aio_context() constraint violation. It's no longer
needed due to the previous changes.
- Always handle the VDUSE file descriptor, even in drained sections. The
VDUSE file descriptor doesn't submit I/O, so it's safe to handle it in
drained sections. This ensures that the VDUSE kernel code gets a fast
response.
- Suspend virtqueue fd handlers in .drained_begin() and resume them in
.drained_end(). This eliminates the need for the
aio_set_fd_handler(is_external=true) flag, which is being removed from
QEMU.
This is a long list but splitting it into individual commits would
probably lead to git bisect failures - the changes are all related.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230516190238.8401-14-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
is_external=true suspends fd handlers between aio_disable_external() and
aio_enable_external(). The block layer's drain operation uses this
mechanism to prevent new I/O from sneaking in between
bdrv_drained_begin() and bdrv_drained_end().
The previous commit converted the xen-block device to use BlockDevOps
.drained_begin/end() callbacks. It no longer relies on is_external=true
so it is safe to pass is_external=false.
This is part of ongoing work to remove the aio_disable_external() API.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230516190238.8401-13-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Detach event channels during drained sections to stop I/O submission
from the ring. xen-block is no longer reliant on aio_disable_external()
after this patch. This will allow us to remove the
aio_disable_external() API once all other code that relies on it is
converted.
Extend xen_device_set_event_channel_context() to allow ctx=NULL. The
event channel still exists but the event loop does not monitor the file
descriptor. Event channel processing can resume by calling
xen_device_set_event_channel_context() with a non-NULL ctx.
Factor out xen_device_set_event_channel_context() calls in
hw/block/dataplane/xen-block.c into attach/detach helper functions.
Incidentally, these don't require the AioContext lock because
aio_set_fd_handler() is thread-safe.
It's safer to register BlockDevOps after the dataplane instance has been
created. The BlockDevOps .drained_begin/end() callbacks depend on the
dataplane instance, so move the blk_set_dev_ops() call after
xen_block_dataplane_create().
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230516190238.8401-12-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
For simplicity, always run BlockDevOps .drained_begin/end/poll()
callbacks in the main loop thread. This makes it easier to implement the
callbacks and avoids extra locks.
Move the function pointer declarations from the I/O Code section to the
Global State section for BlockDevOps, BdrvChildClass, and BlockDriver.
Narrow IO_OR_GS_CODE() to GLOBAL_STATE_CODE() where appropriate.
The test-bdrv-drain test case calls bdrv_drain() from an IOThread. This
is now only allowed from coroutine context, so update the test case to
run in a coroutine.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230516190238.8401-11-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The BlockBackend quiesce_counter is greater than zero during drained
sections. Add an API to check whether the BlockBackend is in a drained
section.
The next patch will use this API.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230516190238.8401-10-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
There is no need to suspend activity between aio_disable_external() and
aio_enable_external(), which is mainly used for the block layer's drain
operation.
This is part of ongoing work to remove the aio_disable_external() API.
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230516190238.8401-9-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
vhost-user activity must be suspended during bdrv_drained_begin/end().
This prevents new requests from interfering with whatever is happening
in the drained section.
Previously this was done using aio_set_fd_handler()'s is_external
argument. In a multi-queue block layer world the aio_disable_external()
API cannot be used since multiple AioContext may be processing I/O, not
just one.
Switch to BlockDevOps->drained_begin/end() callbacks.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230516190238.8401-8-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Each vhost-user-blk request runs in a coroutine. When the BlockBackend
enters a drained section we need to enter a quiescent state. Currently
any in-flight requests race with bdrv_drained_begin() because it is
unaware of vhost-user-blk requests.
When blk_co_preadv/pwritev()/etc returns it wakes the
bdrv_drained_begin() thread but vhost-user-blk request processing has
not yet finished. The request coroutine continues executing while the
main loop thread thinks it is in a drained section.
One example where this is unsafe is for blk_set_aio_context() where
bdrv_drained_begin() is called before .aio_context_detached() and
.aio_context_attach(). If request coroutines are still running after
bdrv_drained_begin(), then the AioContext could change underneath them
and they race with new requests processed in the new AioContext. This
could lead to virtqueue corruption, for example.
(This example is theoretical, I came across this while reading the
code and have not tried to reproduce it.)
It's easy to make bdrv_drained_begin() wait for in-flight requests: add
a .drained_poll() callback that checks the VuServer's in-flight counter.
VuServer just needs an API that returns true when there are requests in
flight. The in-flight counter needs to be atomic.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230516190238.8401-7-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The VuServer object has a refcount field and ref/unref APIs. The name is
confusing because it's actually an in-flight request counter instead of
a refcount.
Normally a refcount destroys the object upon reaching zero. The VuServer
counter is used to wake up the vhost-user coroutine when there are no
more requests.
Avoid confusing by renaming refcount and ref/unref to in_flight and
inc/dec.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230516190238.8401-6-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This patch is part of an effort to remove the aio_disable_external()
API because it does not fit in a multi-queue block layer world where
many AioContexts may be submitting requests to the same disk.
The SCSI emulation code is already in good shape to stop using
aio_disable_external(). It was only used by commit 9c5aad84da1c
("virtio-scsi: fixed virtio_scsi_ctx_check failed when detaching scsi
disk") to ensure that virtio_scsi_hotunplug() works while the guest
driver is submitting I/O.
Ensure virtio_scsi_hotunplug() is safe as follows:
1. qdev_simple_device_unplug_cb() -> qdev_unrealize() ->
device_set_realized() calls qatomic_set(&dev->realized, false) so
that future scsi_device_get() calls return NULL because they exclude
SCSIDevices with realized=false.
That means virtio-scsi will reject new I/O requests to this
SCSIDevice with VIRTIO_SCSI_S_BAD_TARGET even while
virtio_scsi_hotunplug() is still executing. We are protected against
new requests!
2. scsi_qdev_unrealize() already contains a call to
scsi_device_purge_requests() so that in-flight requests are cancelled
synchronously. This ensures that no in-flight requests remain once
qdev_simple_device_unplug_cb() returns.
Thanks to these two conditions we don't need aio_disable_external()
anymore.
Cc: Zhengui Li <lizhengui@huawei.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230516190238.8401-5-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Only report a transport reset event to the guest after the SCSIDevice
has been unrealized by qdev_simple_device_unplug_cb().
qdev_simple_device_unplug_cb() sets the SCSIDevice's qdev.realized field
to false so that scsi_device_find/get() no longer see it.
scsi_target_emulate_report_luns() also needs to be updated to filter out
SCSIDevices that are unrealized.
Change virtio_scsi_push_event() to take event information as an argument
instead of the SCSIDevice. This allows virtio_scsi_hotunplug() to emit a
VIRTIO_SCSI_T_TRANSPORT_RESET event after the SCSIDevice has already
been unrealized.
These changes ensure that the guest driver does not see the SCSIDevice
that's being unplugged if it responds very quickly to the transport
reset event.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230516190238.8401-4-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add a helper function to check whether the device is realized without
requiring the Big QEMU Lock. The next patch adds a second caller. The
goal is to avoid spreading DeviceState field accesses throughout the
code.
Suggested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230516190238.8401-3-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
blk_set_aio_context() is not fully transactional because
blk_do_set_aio_context() updates blk->ctx outside the transaction. Most
of the time this goes unnoticed but a BlockDevOps.drained_end() callback
that invokes blk_get_aio_context() fails assert(ctx == blk->ctx). This
happens because blk->ctx is only assigned after
BlockDevOps.drained_end() is called and we're in an intermediate state
where BlockDrvierState nodes already have the new context and the
BlockBackend still has the old context.
Making blk_set_aio_context() fully transactional solves this assertion
failure because the BlockBackend's context is updated as part of the
transaction (before BlockDevOps.drained_end() is called).
Split blk_do_set_aio_context() in order to solve this assertion failure.
This helper function actually serves two different purposes:
1. It drives blk_set_aio_context().
2. It responds to BdrvChildClass->change_aio_ctx().
Get rid of the helper function. Do #1 inside blk_set_aio_context() and
do #2 inside blk_root_set_aio_ctx_commit(). This simplifies the code.
The only drawback of the fully transactional approach is that
blk_set_aio_context() must contend with blk_root_set_aio_ctx_commit()
being invoked as part of the AioContext change propagation. This can be
solved by temporarily setting blk->allow_aio_context_change to true.
Future patches call blk_get_aio_context() from
BlockDevOps->drained_end(), so this patch will become necessary.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230516190238.8401-2-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If blockdev-create references an existing node in an iothread (e.g. as
it's 'file' child), then suddenly all of the image creation code must
run in that AioContext, too. Test that this actually works.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230525124713.401149-13-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
It has no internal callers, so its only use is being called from
individual test cases. If the name starts with an underscore, it is
considered private and linters warn against calling it. 256 only gets
away with it currently because it's on the exception list for linters.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230525124713.401149-12-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
While calling bdrv_new_open_driver_opts(), the main AioContext lock must
be held, not the lock of the AioContext of the block subtree it will be
added to afterwards.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230525124713.401149-11-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
bdrv_refresh_total_sectors() and bdrv_refresh_limits() expect to be
called under the AioContext lock of the node. Take the lock.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230525124713.401149-10-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The AioContext lock must not be held for bdrv_open_child(), but it is
necessary for the following operations, in particular those using nested
event loops in coroutine wrappers.
Temporarily dropping the main AioContext lock is not necessary because
we know we run in the main thread.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230525124713.401149-9-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When opening the 'file' child moves bs to an iothread, we need to hold
the AioContext lock of it before we can call raw_apply_options() (and
more specifically, bdrv_getlength() inside of it).
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230525124713.401149-8-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
qcow2_open() doesn't work correctly when opening the 'file' child moves
bs to an iothread, for several reasons:
- It uses BDRV_POLL_WHILE() to wait for the qcow2_open_entry()
coroutine, which involves dropping the AioContext lock for bs when it
is not in the main context - but we don't hold it, so this crashes.
- It runs the qcow2_open_entry() coroutine in the current thread instead
of the new AioContext of bs.
- qcow2_open_entry() doesn't notify the main loop when it's done.
This patches fixes these issues around delegating work to a coroutine.
Temporarily dropping the main AioContext lock is not necessary because
we know we run in the main thread.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230525124713.401149-7-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
bdrv_open_backing_file() calls bdrv_open_inherit(), so all callers must
hold the main AioContext lock.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230525124713.401149-6-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This fixes blk_new_open() to not assume that bs is in the main context.
In particular, the BlockBackend must be created with the right
AioContext because it will refuse to move to a different context
afterwards. (blk->allow_aio_context_change is false.)
Use this opportunity to use blk_insert_bs() instead of duplicating the
bdrv_root_attach_child() call. This is consistent with what
blk_new_with_bs() does. Add comments to document the locking rules.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230525124713.401149-5-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The function documentation already says that all callers must hold the
main AioContext lock, but not all of them do. This can cause assertion
failures when functions called by bdrv_open() try to drop the lock. Fix
a few more callers to take the lock before calling bdrv_open().
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230525124713.401149-4-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
These functions specify that the caller must hold the "@filename
AioContext lock". This doesn't make sense, file names don't have an
AioContext. New BlockDriverStates always start in the main AioContext,
so this is what we really need here.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230525124713.401149-3-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
All of the functions that currently take a BlockDriverState, BdrvChild
or BlockBackend as their first parameter expect the associated
AioContext to be locked when they are called. In the case of
no_co_wrappers, they are called from bottom halves directly in the main
loop, so no other caller can be expected to take the lock for them. This
can result in assertion failures because a lock that isn't taken is
released in nested event loops.
Looking at the first parameter is already done by co_wrappers to decide
where the coroutine should run, so doing the same in no_co_wrappers is
only consistent. Take the lock in the generated bottom halves to fix the
problem.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230525124713.401149-2-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* fsl-imx6: Add SNVS support for i.MX6 boards
* smmuv3: Add support for stage 2 translations
* hw/dma/xilinx_axidma: Check DMASR.HALTED to prevent infinite loop
* hw/arm/xlnx-zynqmp: fix unsigned error when checking the RPUs number
* cleanups for recent Kconfig changes
* target/arm: Explicitly select short-format FSR for M-profile
* tests/qtest: Run arm-specific tests only if the required machine is available
* hw/arm/sbsa-ref: add GIC node into DT
* docs: sbsa: correct graphics card name
* Update copyright dates to 2023
-----BEGIN PGP SIGNATURE-----
iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAmR2DYsZHHBldGVyLm1h
eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3ubED/40MFaRWfJuVhD3NzWltzhD
5Y2/kxd3Bm51ki56hiBWXBXeovR3Exve9rP8OOGJ5RUK0SoEb4xdIjwMAGRt1Ksi
Ln4MUqjv0tqUNv1hBDKgnGJ4dW34bhmJAnU/Jdzt8yrpGuSmN+LCWoPC+vTNCWYm
sNFm8VLB+nmVq/sjTKwQc/Uo+7l9JZ+aY6poyHfN7kKpITUHtoCPgwz34btRrXEk
4+eNYQV1UvofRhLRVsIrvA89bd7Rcn5iHbhY+xYHaJDEaoYy7iBfUJeDlUtEgW8k
0fXt5Z5bXUXpz7jmzXdbq//68p8HcqinarIFH4r0Nbu+u2UgkZDJZRns+p5i8Wmv
qE+hLGOgEg8s9n2e6chGuvw6wX49T3Xtr7tNpKQfi5ou5VT7qZIwl50m/JefuoPI
eHu4uPj7pS0z/s8KDk0mNtbfcHkzmT5KpZkbvS2JOzg9o2t1fwGCbKPlcgJPxcIV
Ro7R3rNvd6XSSQBlmcYNXWE7P7zuJyfjfSN7D7b0MdFP/hBTpLGKI2LBggZEdcce
21fiEkEE6d1L2oN+Eiq3q8xQNoVjYSGaE5LJ34+997z7W1JRB/dyJhZM0AkabSMl
mkgyi9kBKxU4S9pxtZ//Uh9B/5blpMQAI4U8S/svuGqzwfI6luY/Qxue+YzRUD0H
XsDSBnq1x2LW2Fhu7YVW3Q==
=/OdY
-----END PGP SIGNATURE-----
Merge tag 'pull-target-arm-20230530-1' of https://git.linaro.org/people/pmaydell/qemu-arm into staging
target-arm queue:
* fsl-imx6: Add SNVS support for i.MX6 boards
* smmuv3: Add support for stage 2 translations
* hw/dma/xilinx_axidma: Check DMASR.HALTED to prevent infinite loop
* hw/arm/xlnx-zynqmp: fix unsigned error when checking the RPUs number
* cleanups for recent Kconfig changes
* target/arm: Explicitly select short-format FSR for M-profile
* tests/qtest: Run arm-specific tests only if the required machine is available
* hw/arm/sbsa-ref: add GIC node into DT
* docs: sbsa: correct graphics card name
* Update copyright dates to 2023
# -----BEGIN PGP SIGNATURE-----
#
# iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAmR2DYsZHHBldGVyLm1h
# eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3ubED/40MFaRWfJuVhD3NzWltzhD
# 5Y2/kxd3Bm51ki56hiBWXBXeovR3Exve9rP8OOGJ5RUK0SoEb4xdIjwMAGRt1Ksi
# Ln4MUqjv0tqUNv1hBDKgnGJ4dW34bhmJAnU/Jdzt8yrpGuSmN+LCWoPC+vTNCWYm
# sNFm8VLB+nmVq/sjTKwQc/Uo+7l9JZ+aY6poyHfN7kKpITUHtoCPgwz34btRrXEk
# 4+eNYQV1UvofRhLRVsIrvA89bd7Rcn5iHbhY+xYHaJDEaoYy7iBfUJeDlUtEgW8k
# 0fXt5Z5bXUXpz7jmzXdbq//68p8HcqinarIFH4r0Nbu+u2UgkZDJZRns+p5i8Wmv
# qE+hLGOgEg8s9n2e6chGuvw6wX49T3Xtr7tNpKQfi5ou5VT7qZIwl50m/JefuoPI
# eHu4uPj7pS0z/s8KDk0mNtbfcHkzmT5KpZkbvS2JOzg9o2t1fwGCbKPlcgJPxcIV
# Ro7R3rNvd6XSSQBlmcYNXWE7P7zuJyfjfSN7D7b0MdFP/hBTpLGKI2LBggZEdcce
# 21fiEkEE6d1L2oN+Eiq3q8xQNoVjYSGaE5LJ34+997z7W1JRB/dyJhZM0AkabSMl
# mkgyi9kBKxU4S9pxtZ//Uh9B/5blpMQAI4U8S/svuGqzwfI6luY/Qxue+YzRUD0H
# XsDSBnq1x2LW2Fhu7YVW3Q==
# =/OdY
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 30 May 2023 07:51:55 AM PDT
# gpg: using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE
# gpg: issuer "peter.maydell@linaro.org"
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [full]
# gpg: aka "Peter Maydell <pmaydell@gmail.com>" [full]
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [full]
* tag 'pull-target-arm-20230530-1' of https://git.linaro.org/people/pmaydell/qemu-arm: (21 commits)
docs: sbsa: correct graphics card name
hw/arm/sbsa-ref: add GIC node into DT
Update copyright dates to 2023
arm/Kconfig: Make TCG dependence explicit
arm/Kconfig: Keep Kconfig default entries in default.mak as documentation
target/arm: Explain why we need to select ARM_V7M
target/arm: Explicitly select short-format FSR for M-profile
tests/qtest: Run arm-specific tests only if the required machine is available
hw/arm/xlnx-zynqmp: fix unsigned error when checking the RPUs number
hw/dma/xilinx_axidma: Check DMASR.HALTED to prevent infinite loop.
hw/arm/smmuv3: Add knob to choose translation stage and enable stage-2
hw/arm/smmuv3: Add stage-2 support in iova notifier
hw/arm/smmuv3: Add CMDs related to stage-2
hw/arm/smmuv3: Add VMID to TLB tagging
hw/arm/smmuv3: Make TLB lookup work for stage-2
hw/arm/smmuv3: Parse STE config for stage-2
hw/arm/smmuv3: Add page table walk for stage-2
hw/arm/smmuv3: Refactor stage-1 PTW
hw/arm/smmuv3: Update translation config to hold stage-2
hw/arm/smmuv3: Add missing fields for IDR0
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We moved from VGA to Bochs to have PCIe card.
Signed-off-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Let add GIC information into DeviceTree as part of SBSA-REF versioning.
Trusted Firmware will read it and provide to next firmware level.
Bumps platform version to 0.1 one so we can check is node is present.
Signed-off-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
I noticed that in the latest version, the copyright string is still
2022, even though 2023 is halfway through. This patch fixes that and
fixes the documentation along with it.
Signed-off-by: Enze Li <lienze@kylinos.cn>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20230525064345.1152801-1-lienze@kylinos.cn
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Replace the 'default y if TCG' pattern with 'default y; depends on
TCG'.
That makes explict that there is a dependence on TCG and enabling
these CONFIGs via .mak files without TCG present will fail earlier.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20230523180525.29994-4-farosas@suse.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When we moved the arm default CONFIGs into Kconfig and removed them
from default.mak, we made it harder to identify which CONFIGs are
selected by default in case users want to disable them.
Bring back the default entries into default.mak, but keep them
commented out. This way users can keep their workflows of editing
default.mak to remove build options without needing to search through
Kconfig.
Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 20230523180525.29994-3-farosas@suse.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We currently need to select ARM_V7M unconditionally when TCG is
present in the build because some translate.c helpers and the whole of
m_helpers.c are not yet under CONFIG_ARM_V7M.
Suggested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20230523180525.29994-2-farosas@suse.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
For M-profile, there is no guest-facing A-profile format FSR, but we
still use the env->exception.fsr field to pass fault information from
the point where a fault is raised to the code in
arm_v7m_cpu_do_interrupt() which interprets it and sets the M-profile
specific fault status registers. So it doesn't matter whether we
fill in env->exception.fsr in the short format or the LPAE format, as
long as both sides agree. As it happens arm_v7m_cpu_do_interrupt()
assumes short-form.
In compute_fsr_fsc() we weren't explicitly choosing short-form for
M-profile, but instead relied on it falling out in the wash because
arm_s1_regime_using_lpae_format() would be false. This was broken in
commit 452c67a4 when we added v8R support, because we said "PMSAv8 is
always LPAE format" (as it is for v8R), forgetting that we were
implicitly using this code path on M-profile. At that point we would
hit a g_assert_not_reached():
ERROR:../../target/arm/internals.h:549:arm_fi_to_lfsc: code should not be reached
#7 0x0000555555e055f7 in arm_fi_to_lfsc (fi=0x7fffecff9a90) at ../../target/arm/internals.h:549
#8 0x0000555555e05a27 in compute_fsr_fsc (env=0x555557356670, fi=0x7fffecff9a90, target_el=1, mmu_idx=1, ret_fsc=0x7fffecff9a1c)
at ../../target/arm/tlb_helper.c:95
#9 0x0000555555e05b62 in arm_deliver_fault (cpu=0x555557354800, addr=268961344, access_type=MMU_INST_FETCH, mmu_idx=1, fi=0x7fffecff9a90)
at ../../target/arm/tlb_helper.c:132
#10 0x0000555555e06095 in arm_cpu_tlb_fill (cs=0x555557354800, address=268961344, size=1, access_type=MMU_INST_FETCH, mmu_idx=1, probe=false, retaddr=0)
at ../../target/arm/tlb_helper.c:260
The specific assertion changed when commit fcc7404eff24b4c added
"assert not M-profile" to arm_is_secure_below_el3(), because the
conditions being checked in compute_fsr_fsc() include
arm_el_is_aa64(), which will end up calling arm_is_secure_below_el3()
and asserting before we try to call arm_fi_to_lfsc():
#7 0x0000555555efaf43 in arm_is_secure_below_el3 (env=0x5555574665a0) at ../../target/arm/cpu.h:2396
#8 0x0000555555efb103 in arm_is_el2_enabled (env=0x5555574665a0) at ../../target/arm/cpu.h:2448
#9 0x0000555555efb204 in arm_el_is_aa64 (env=0x5555574665a0, el=1) at ../../target/arm/cpu.h:2509
#10 0x0000555555efbdfd in compute_fsr_fsc (env=0x5555574665a0, fi=0x7fffecff99e0, target_el=1, mmu_idx=1, ret_fsc=0x7fffecff996c)
Avoid the assertion and the incorrect FSR format selection by
explicitly making M-profile use the short-format in this function.
Fixes: 452c67a42704 ("target/arm: Enable TTBCR_EAE for ARMv8-R AArch32")a
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1658
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230523131726.866635-1-peter.maydell@linaro.org
pflash-cfi02-test.c always uses the "musicpal" machine for testing,
test-arm-mptimer.c always uses the "vexpress-a9" machine, and
microbit-test.c requires the "microbit" machine, so we should only
run these tests if the machines have been enabled in the configuration.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Message-id: 20230524080600.1618137-1-thuth@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When passing --smp with a number lower than XLNX_ZYNQMP_NUM_APU_CPUS,
the expression (ms->smp.cpus - XLNX_ZYNQMP_NUM_APU_CPUS) will result
in a positive number as ms->smp.cpus is a unsigned int.
This will raise the following error afterwards, as Qemu will try to
instantiate some additional RPUs.
| $ qemu-system-aarch64 --smp 1 -M xlnx-zcu102
| **
| ERROR:../src/tcg/tcg.c:777:tcg_register_thread:
| assertion failed: (n < tcg_max_ctxs)
Signed-off-by: Clément Chigot <chigot@adacore.com>
Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Tested-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Message-id: 20230524143714.565792-1-chigot@adacore.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When we receive a packet from the xilinx_axienet and then try to s2mem
through the xilinx_axidma, if the descriptor ring buffer is full in the
xilinx axidma driver, we’ll assert the DMASR.HALTED in the
function : stream_process_s2mem and return 0. In the end, we’ll be stuck in
an infinite loop in axienet_eth_rx_notify.
This patch checks the DMASR.HALTED state when we try to push data
from xilinx axi-enet to xilinx axi-dma. When the DMASR.HALTED is asserted,
we will not keep pushing the data and then prevent the infinte loop.
Signed-off-by: Tommy Wu <tommy.wu@sifive.com>
Reviewed-by: Edgar E. Iglesias <edgar@zeroasic.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Message-id: 20230519062137.1251741-1-tommy.wu@sifive.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
As everything is in place, we can use a new system property to
advertise which stage is supported and remove bad_ste from STE
stage2 config.
The property added arm-smmuv3.stage can have 3 values:
- "1": Stage-1 only is advertised.
- "2": Stage-2 only is advertised.
If not passed or an unsupported value is passed, it will default to
stage-1.
Advertise VMID16.
Don't try to decode CD, if stage-2 is configured.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Mostafa Saleh <smostafa@google.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Message-id: 20230516203327.2051088-11-smostafa@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In smmuv3_notify_iova, read the granule based on translation stage
and use VMID if valid value is sent.
Signed-off-by: Mostafa Saleh <smostafa@google.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Message-id: 20230516203327.2051088-10-smostafa@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
CMD_TLBI_S2_IPA: As S1+S2 is not enabled, for now this can be the
same as CMD_TLBI_NH_VAA.
CMD_TLBI_S12_VMALL: Added new function to invalidate TLB by VMID.
For stage-1 only commands, add a check to throw CERROR_ILL if used
when stage-1 is not supported.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Mostafa Saleh <smostafa@google.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Message-id: 20230516203327.2051088-9-smostafa@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Allow TLB to be tagged with VMID.
If stage-1 is only supported, VMID is set to -1 and ignored from STE
and CMD_TLBI_NH* cmds.
Update smmu_iotlb_insert trace event to have vmid.
Signed-off-by: Mostafa Saleh <smostafa@google.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Message-id: 20230516203327.2051088-8-smostafa@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Right now, either stage-1 or stage-2 are supported, this simplifies
how we can deal with TLBs.
This patch makes TLB lookup work if stage-2 is enabled instead of
stage-1.
TLB lookup is done before a PTW, if a valid entry is found we won't
do the PTW.
To be able to do TLB lookup, we need the correct tagging info, as
granularity and input size, so we get this based on the supported
translation stage. The TLB entries are added correctly from each
stage PTW.
When nested translation is supported, this would need to change, for
example if we go with a combined TLB implementation, we would need to
use the min of the granularities in TLB.
As stage-2 shouldn't be tagged by ASID, it will be set to -1 if S1P
is not enabled.
Signed-off-by: Mostafa Saleh <smostafa@google.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Message-id: 20230516203327.2051088-7-smostafa@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Parse stage-2 configuration from STE and populate it in SMMUS2Cfg.
Validity of field values are checked when possible.
Only AA64 tables are supported and Small Translation Tables (STT) are
not supported.
According to SMMUv3 UM(IHI0070E) "5.2 Stream Table Entry": All fields
with an S2 prefix (with the exception of S2VMID) are IGNORED when
stage-2 bypasses translation (Config[1] == 0).
Which means that VMID can be used(for TLB tagging) even if stage-2 is
bypassed, so we parse it unconditionally when S2P exists. Otherwise
it is set to -1.(only S1P)
As stall is not supported, if S2S is set the translation would abort.
For S2R, we reuse the same code used for stage-1 with flag
record_faults. However when nested translation is supported we would
need to separate stage-1 and stage-2 faults.
Fix wrong shift in STE_S2HD, STE_S2HA, STE_S2S.
Signed-off-by: Mostafa Saleh <smostafa@google.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20230516203327.2051088-6-smostafa@google.com
[PMM: fixed format string]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In preparation for adding stage-2 support, add Stage-2 PTW code.
Only Aarch64 format is supported as stage-1.
Nesting stage-1 and stage-2 is not supported right now.
HTTU is not supported, SW is expected to maintain the Access flag.
This is described in the SMMUv3 manual(IHI 0070.E.a)
"5.2. Stream Table Entry" in "[181] S2AFFD".
This flag determines the behavior on access of a stage-2 page whose
descriptor has AF == 0:
- 0b0: An Access flag fault occurs (stall not supported).
- 0b1: An Access flag fault never occurs.
An Access fault takes priority over a Permission fault.
There are 3 address size checks for stage-2 according to
(IHI 0070.E.a) in "3.4. Address sizes".
- As nesting is not supported, input address is passed directly to
stage-2, and is checked against IAS.
We use cfg->oas to hold the OAS when stage-1 is not used, this is set
in the next patch.
This check is done outside of smmu_ptw_64_s2 as it is not part of
stage-2(it throws stage-1 fault), and the stage-2 function shouldn't
change it's behavior when nesting is supported.
When nesting is supported and we figure out how to combine TLB for
stage-1 and stage-2 we can move this check into the stage-1 function
as described in ARM DDI0487I.a in pseudocode
aarch64/translation/vmsa_translation/AArch64.S1Translate
aarch64/translation/vmsa_translation/AArch64.S1DisabledOutput
- Input to stage-2 is checked against s2t0sz, and throws stage-2
transaltion fault if exceeds it.
- Output of stage-2 is checked against effective PA output range.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Mostafa Saleh <smostafa@google.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Message-id: 20230516203327.2051088-5-smostafa@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In preparation for adding stage-2 support, rename smmu_ptw_64 to
smmu_ptw_64_s1 and refactor some of the code so it can be reused in
stage-2 page table walk.
Remove AA64 check from PTW as decode_cd already ensures that AA64 is
used, otherwise it faults with C_BAD_CD.
A stage member is added to SMMUPTWEventInfo to differentiate
between stage-1 and stage-2 ptw faults.
Add stage argument to trace_smmu_ptw_level be consistent with other
trace events.
Signed-off-by: Mostafa Saleh <smostafa@google.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Message-id: 20230516203327.2051088-4-smostafa@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>