In commit 1b7bc9b5c8 we changed handle_vec_simd_sqshrn() so
that instead of starting with a 0 value and depositing in each new
element from the narrowing operation, it instead started with the raw
result of the narrowing operation of the first element.
This is fine in the vector case, because the deposit operations for
the second and subsequent elements will always overwrite any higher
bits that might have been in the first element's result value in
tcg_rd. However in the scalar case we only go through this loop
once. The effect is that for a signed narrowing operation, if the
result is negative then we will now return a value where the bits
above the first element are incorrectly 1 (because the narrowfn
returns a sign-extended result, not one that is truncated to the
element size).
Fix this by using an extract operation to get exactly the correct
bits of the output of the narrowfn for element 1, instead of a
plain move.
Cc: qemu-stable@nongnu.org
Fixes: 1b7bc9b5c8 ("target/arm: Avoid tcg_const_ptr in handle_vec_simd_sqshrn")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2089
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240123153416.877308-1-peter.maydell@linaro.org
(cherry picked from commit 6fffc83785)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
r[id]tlb[01], [iw][id]tlb opcodes use TLB way index passed in a register
by the guest. The host uses 3 bits of the index for ITLB indexing and 4
bits for DTLB, but there's only 7 entries in the ITLB array and 10 in
the DTLB array, so a malicious guest may trigger out-of-bound access to
these arrays.
Change split_tlb_entry_spec return type to bool to indicate whether TLB
way passed to it is valid. Change get_tlb_entry to return NULL in case
invalid TLB way is requested. Add assertion to xtensa_tlb_get_entry that
requested TLB way and entry indices are valid. Add checks to the
[rwi]tlb helpers that requested TLB way is valid and return 0 or do
nothing when it's not.
Cc: qemu-stable@nongnu.org
Fixes: b67ea0cd74 ("target-xtensa: implement memory protection options")
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20231215120307.545381-1-jcmvbkbc@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 604927e357)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
On a loaded system with --enable-debug, this test can take longer than
5 minutes. Raising the timeout to 6 minutes gives greater headroom for
such situations.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
[thuth: Increase the timeout to 6 minutes for very loaded systems]
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20231215070357.10888-11-thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
(cherry picked from commit e8a12fe31f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: context fixup in tests/qtest/meson.build)
monitor_qmp_dispatcher_co() runs in the iohandler AioContext that is not
polled during nested event loops. The coroutine currently reschedules
itself in the main loop's qemu_aio_context AioContext, which is polled
during nested event loops. One known problem is that QMP device-add
calls drain_call_rcu(), which temporarily drops the BQL, leading to all
sorts of havoc like other vCPU threads re-entering device emulation code
while another vCPU thread is waiting in device emulation code with
aio_poll().
Paolo Bonzini suggested running non-coroutine QMP handlers in the
iohandler AioContext. This avoids trouble with nested event loops. His
original idea was to move coroutine rescheduling to
monitor_qmp_dispatch(), but I resorted to moving it to qmp_dispatch()
because we don't know if the QMP handler needs to run in coroutine
context in monitor_qmp_dispatch(). monitor_qmp_dispatch() would have
been nicer since it's associated with the monitor implementation and not
as general as qmp_dispatch(), which is also used by qemu-ga.
A number of qemu-iotests need updated .out files because the order of
QMP events vs QMP responses has changed.
Solves Issue #1933.
Cc: qemu-stable@nongnu.org
Fixes: 7bed89958b ("device_core: use drain_call_rcu in in qmp_device_add")
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2215192
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2214985
Buglink: https://issues.redhat.com/browse/RHEL-17369
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240118144823.1497953-4-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Tested-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit effd60c878)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The common.qemu bash functions allow tests to interact with the QMP
monitor of a QEMU process. I spent two days trying to update 141 when
the order of the test output changed, but found it would still fail
occassionally because printf() and QMP events race with synchronous QMP
communication.
I gave up and ported 141 to the existing Python API for QMP tests. The
Python API is less affected by the order in which QEMU prints output
because it does not print all QMP traffic by default.
The next commit changes the order in which QMP messages are received.
Make 141 reliable first.
Cc: Hanna Czenczek <hreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240118144823.1497953-3-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 9ee2dd4c22)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Add a filter function for QMP responses that contain QEMU's
automatically generated node ids. The ids change between runs and must
be masked in the reference output.
The next commit will use this new function.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240118144823.1497953-2-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit da62b507a2)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
There is a bug in the blklogwrites driver pertaining to logging "write
zeroes" operations, causing log corruption. This can be easily observed
by setting detect-zeroes to something other than "off" for the driver.
The issue is caused by a concurrency bug pertaining to the fact that
"write zeroes" operations have to be logged in two parts: first the log
entry metadata, then the zeroed-out region. While the log entry
metadata is being written by bdrv_co_pwritev(), another operation may
begin in the meanwhile and modify the state of the blklogwrites driver.
This is as intended by the coroutine-driven I/O model in QEMU, of
course.
Unfortunately, this specific scenario is mishandled. A short example:
1. Initially, in the current operation (#1), the current log sector
number in the driver state is only incremented by the number of sectors
taken by the log entry metadata, after which the log entry metadata is
written. The current operation yields.
2. Another operation (#2) may start while the log entry metadata is
being written. It uses the current log position as the start offset for
its log entry. This is in the sector right after the operation #1 log
entry metadata, which is bad!
3. After bdrv_co_pwritev() returns (#1), the current log sector
number is reread from the driver state in order to find out the start
offset for bdrv_co_pwrite_zeroes(). This is an obvious blunder, as the
offset will be the sector right after the (misplaced) operation #2 log
entry, which means that the zeroed-out region begins at the wrong
offset.
4. As a result of the above, the log is corrupt.
Fix this by only reading the driver metadata once, computing the
offsets and sizes in one go (including the optional zeroed-out region)
and setting the log sector number to the appropriate value for the next
operation in line.
Signed-off-by: Ari Sundholm <ari@tuxera.com>
Cc: qemu-stable@nongnu.org
Message-ID: <20240109184646.1128475-1-megari@gmx.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit a9c8ea9547)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
When HASH_REPORT is negotiated, the guest_hdr_len might be larger than
the size of the mergeable rx buffer header. Using
virtio_net_hdr_mrg_rxbuf during the header swap might lead a stack
overflow in this case. Fixing this by using virtio_net_hdr_v1_hash
instead.
Reported-by: Xiao Lei <leixiao.nop@zju.edu.cn>
Cc: Yuri Benditovich <yuri.benditovich@daynix.com>
Cc: qemu-stable@nongnu.org
Cc: Mauro Matteo Cascella <mcascell@redhat.com>
Fixes: CVE-2023-6693
Fixes: e22f0603fb ("virtio-net: reference implementation of hash report")
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 2220e8189f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
When tcg_out_qemu_st_{index,direct} were merged, the direct case for
MO_64 was omitted, causing qemu_st_i64 to be encoded as 0xffffffff due
to underflow when adding h.base and h.index.
Fixes: 1df6d611bd ("tcg/arm: Introduce HostAddress")
Signed-off-by: Joseph Burt <caseorum@gmail.com>
Message-Id: <20240121211439.100829-1-caseorum@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit 9f6523e8e4)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Using fleecing backup like in [0] on a qcow2 image (with metadata
preallocation) can lead to the following assertion failure:
> bdrv_co_do_block_status: Assertion `!(ret & BDRV_BLOCK_ZERO)' failed.
In the reproducer [0], it happens because the BDRV_BLOCK_RECURSE flag
will be set by the qcow2 driver, so the caller will recursively check
the file child. Then the BDRV_BLOCK_ZERO set too. Later up the call
chain, in bdrv_co_do_block_status() for the snapshot-access driver,
the assertion failure will happen, because both flags are set.
To fix it, clear the recurse flag after the recursive check was done.
In detail:
> #0 qcow2_co_block_status
Returns 0x45 = BDRV_BLOCK_RECURSE | BDRV_BLOCK_DATA |
BDRV_BLOCK_OFFSET_VALID.
> #1 bdrv_co_do_block_status
Because of the data flag, bdrv_co_do_block_status() will now also set
BDRV_BLOCK_ALLOCATED. Because of the recurse flag,
bdrv_co_do_block_status() for the bdrv_file child will be called,
which returns 0x16 = BDRV_BLOCK_ALLOCATED | BDRV_BLOCK_OFFSET_VALID |
BDRV_BLOCK_ZERO. Now the return value inherits the zero flag.
Returns 0x57 = BDRV_BLOCK_RECURSE | BDRV_BLOCK_DATA |
BDRV_BLOCK_OFFSET_VALID | BDRV_BLOCK_ALLOCATED | BDRV_BLOCK_ZERO.
> #2 bdrv_co_common_block_status_above
> #3 bdrv_co_block_status_above
> #4 bdrv_co_block_status
> #5 cbw_co_snapshot_block_status
> #6 bdrv_co_snapshot_block_status
> #7 snapshot_access_co_block_status
> #8 bdrv_co_do_block_status
Return value is propagated all the way up to here, where the assertion
failure happens, because BDRV_BLOCK_RECURSE and BDRV_BLOCK_ZERO are
both set.
> #9 bdrv_co_common_block_status_above
> #10 bdrv_co_block_status_above
> #11 block_copy_block_status
> #12 block_copy_dirty_clusters
> #13 block_copy_common
> #14 block_copy_async_co_entry
> #15 coroutine_trampoline
[0]:
> #!/bin/bash
> rm /tmp/disk.qcow2
> ./qemu-img create /tmp/disk.qcow2 -o preallocation=metadata -f qcow2 1G
> ./qemu-img create /tmp/fleecing.qcow2 -f qcow2 1G
> ./qemu-img create /tmp/backup.qcow2 -f qcow2 1G
> ./qemu-system-x86_64 --qmp stdio \
> --blockdev qcow2,node-name=node0,file.driver=file,file.filename=/tmp/disk.qcow2 \
> --blockdev qcow2,node-name=node1,file.driver=file,file.filename=/tmp/fleecing.qcow2 \
> --blockdev qcow2,node-name=node2,file.driver=file,file.filename=/tmp/backup.qcow2 \
> <<EOF
> {"execute": "qmp_capabilities"}
> {"execute": "blockdev-add", "arguments": { "driver": "copy-before-write", "file": "node0", "target": "node1", "node-name": "node3" } }
> {"execute": "blockdev-add", "arguments": { "driver": "snapshot-access", "file": "node3", "node-name": "snap0" } }
> {"execute": "blockdev-backup", "arguments": { "device": "snap0", "target": "node1", "sync": "full", "job-id": "backup0" } }
> EOF
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-id: 20240116154839.401030-1-f.ebner@proxmox.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 8a9be79924)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This is causing regressions that have not been analyzed yet. Revert the
change on stable branches.
Cc: qemu-stable@nongnu.org
Cc: Michael Tokarev <mjt@tls.msk.ru>
Related: https://gitlab.com/qemu-project/qemu/-/issues/2092
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This is now expected by rtd so I've expanded using their example as
22.04 is one of our supported platforms. I tried to work out if there
was an easy way to re-generate a requirements.txt from our
pythondeps.toml but in the end went for the easier solution.
Cc: <qemu-stable@nongnu.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20231221174200.2693694-1-alex.bennee@linaro.org>
(cherry picked from commit b16a45bc5e)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Even though the BLAST command isn't fully implemented in QEMU, the DMA_STAT_BCMBLT
bit should be set after the command has been issued to indicate that the command
has completed.
This fixes an issue with the DC390 DOS driver which issues the BLAST command as
part of its normal error recovery routine at startup, and otherwise sits in a
tight loop waiting for DMA_STAT_BCMBLT to be set before continuing.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Message-ID: <20240112131529.515642-5-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit c2d7de557d)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The setting of DMA_STAT_DONE at the end of a DMA transfer can be configured to
generate an interrupt, however the Linux driver manually checks for DMA_STAT_DONE
being set and if it is, considers that a DMA transfer has completed.
If DMA_STAT_DONE is set but the ESP device isn't indicating an interrupt then
the Linux driver considers this to be a spurious interrupt. However this can
occur in QEMU as there is a delay between the end of DMA transfer where
DMA_STAT_DONE is set, and the ESP device raising its completion interrupt.
This appears to be an incorrect assumption in the Linux driver as the ESP and
PCI DMA interrupt sources are separate (and may not be raised exactly
together), however we can work around this by synchronising the setting of
DMA_STAT_DONE at the end of a DMA transfer with the ESP completion interrupt.
In conjunction with the previous commit Linux is now able to correctly boot
from an am53c974 PCI SCSI device on the hppa C3700 machine without emitting
"iget: checksum invalid" and "Spurious irq, sreg=10" errors.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Message-ID: <20240112131529.515642-4-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit 1e8e6644e0)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The am53c974/dc390 PCI interrupt has two separate sources: the first is from the
internal ESP device, and the second is from the PCI DMA transfer logic.
Update the ESP interrupt handler so that it sets DMA_STAT_SCSIINT rather than
driving the PCI IRQ directly, and introduce a new esp_pci_update_irq() function
to generate the correct PCI IRQ level. In particular this fixes spurious interrupts
being generated by setting DMA_STAT_DONE at the end of a transfer if DMA_CMD_INTE_D
isn't set in the DMA_CMD register.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Message-ID: <20240112131529.515642-3-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit 6b41417d93)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The current code in esp_pci_dma_memory_rw() sets the DMA address to the value
of the DMA_SPA (Starting Physical Address) register which is incorrect: this
means that for each callback from the SCSI layer the DMA address is set back
to the starting address.
In the case where only a single SCSI callback occurs (currently for transfer
lengths < 128kB) this works fine, however for larger transfers the DMA address
wraps back to the initial starting address, corrupting the buffer holding the
data transferred to the guest.
Fix esp_pci_dma_memory_rw() to use the DMA_WAC (Working Address Counter) for
the DMA address which is correctly incremented across multiple SCSI layer
transfers.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Message-ID: <20240112131529.515642-2-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit 84a6835e00)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Solaris has #defines for htonll and ntohll which cause syntax errors
when compiling code that attempts to (re)define these functions..
Signed-off-by: Nick Briggs <nicholas.h.briggs@gmail.com>
Link: https://lore.kernel.org/r/65a04a7d.497ab3.3e7bef1f@gateway.sonic.net
Signed-off-by: Peter Xu <peterx@redhat.com>
(cherry picked from commit 44ce1b5d2f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Add an update buffer where all block updates are staged.
Flush or discard updates properly, so we should never see
half-completed block writes in pflash storage.
Drop a bunch of FIXME comments ;)
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240108160900.104835-4-kraxel@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit 284a7ee2e2)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: drop const in hw/block/pflash_cfi01.c for before
v8.2.0-220-g7d5dc0a367 "hw/block: Constify VMState")
Use the helper functions we have to read/write multi-byte values
in correct byte order.
Suggested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240108160900.104835-3-kraxel@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit 5dd58358a5)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Move the offset calculation, do it once at the start of the function and
let the 'p' variable point directly to the memory location which should
be updated. This makes it simpler to update other buffers than
pfl->storage in an upcoming patch. No functional change.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240108160900.104835-2-kraxel@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit 3b14a555fd)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Both cryptodev_backend_set_throttle() and CryptoDevBackendClass::init()
can set their Error** argument. Do not ignore them, return early
on failure. Without that, running into another failure trips
error_setv()'s assertion. Use the ERRP_GUARD() macro as suggested
in commit ae7c80a7bd ("error: New macro ERRP_GUARD()").
Cc: qemu-stable@nongnu.org
Fixes: e7a775fd9f ("cryptodev: Account statistics")
Fixes: 2580b452ff ("cryptodev: support QoS")
Reviewed-by: zhenwei pi <pizhenwei@bytedance.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20231120150418.93443-1-philmd@linaro.org>
(cherry picked from commit 484aecf2d3)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
For PC-relative translation blocks, env->eip changes during the
execution of a translation block, Therefore, QEMU must be able to
recover an instruction's PC just from the TranslationBlock struct and
the instruction data with. Because a TB will not span two pages, QEMU
stores all the low bits of EIP in the instruction data and replaces them
in x86_restore_state_to_opc. Bits 12 and higher (which may vary between
executions of a PCREL TB, since these only use the physical address in
the hash key) are kept unmodified from env->eip. The assumption is that
these bits of EIP, unlike bits 0-11, will not change as the translation
block executes.
Unfortunately, this is incorrect when the CS base is not aligned to a page.
Then the linear address of the instructions (i.e. the one with the
CS base addred) indeed will never span two pages, but bits 12+ of EIP
can actually change. For example, if CS base is 0x80262200 and EIP =
0x6FF4, the first instruction in the translation block will be at linear
address 0x802691F4. Even a very small TB will cross to EIP = 0x7xxx,
while the linear addresses will remain comfortably within a single page.
The fix is simply to use the low bits of the linear address for data[0],
since those don't change. Then x86_restore_state_to_opc uses tb->cs_base
to compute a temporary linear address (referring to some unknown
instruction in the TB, but with the correct values of bits 12 and higher);
the low bits are replaced with data[0], and EIP is obtained by subtracting
again the CS base.
Huge thanks to Mark Cave-Ayland for the image and initial debugging,
and to Gitlab user @kjliew for help with bisecting another occurrence
of (hopefully!) the same bug.
It should be relatively easy to write a testcase that performs MMIO on
an EIP with different bits 12+ than the first instruction of the translation
block; any help is welcome.
Fixes: e3a79e0e87 ("target/i386: Enable TARGET_TB_PCREL", 2022-10-11)
Cc: qemu-stable@nongnu.org
Cc: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Cc: Richard Henderson <richard.henderson@linaro.org>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1759
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1964
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2012
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 729ba8e933)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The PCREL patches introduced a bug when updating EIP in the !CF_PCREL case.
Using s->pc in func gen_update_eip_next() solves the problem.
Cc: qemu-stable@nongnu.org
Fixes: b5e0d5d22f ("target/i386: Fix 32-bit wrapping of pc/eip computation")
Fixes: 5b2fd6cf37 (b5e0d5d22f in 8.1.4)
Signed-off-by: guoguangyao <guoguangyao18@mails.ucas.ac.cn>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240115020804.30272-1-guoguangyao18@mails.ucas.ac.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 2926eab896)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
With PCREL, we have a page-relative view of EIP, and an
approximation of PC = EIP+CSBASE that is good enough to
detect page crossings. If we try to recompute PC after
masking EIP, we will mess up that approximation and write
a corrupt value to EIP.
We already handled masking properly for PCREL, so the
fix in b5e0d5d2 was only needed for the !PCREL path.
Cc: qemu-stable@nongnu.org
Fixes: b5e0d5d22f ("target/i386: Fix 32-bit wrapping of pc/eip computation")
Fixes: 5b2fd6cf37 (b5e0d5d22f in 8.1.4)
Reported-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240101230617.129349-1-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit a58506b748)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
j is used while loading an ELF file to byteswap segments'
data. If data is larger than 2GB an overflow may happen.
So j should be elf_word.
This commit fixes a minor bug: it's unlikely anybody is trying to
load ELF files with 2GB+ segments for wrong-endianness targets,
but if they did, it wouldn't work correctly.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Cc: qemu-stable@nongnu.org
Fixes: 7ef295ea5b ("loader: Add data swap option to load-elf")
Signed-off-by: Anastasia Belova <abelova@astralinux.ru>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 410c2a4d75)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Sometimes the CI "pages" job fails with a message like this from
htags:
$ htags -anT --tree-view=filetree -m qemu_init -t "Welcome to the QEMU sourcecode"
htags: Negative exec line limit = -371
This is due to a bug in hflags where if the environment is too large it
falls over:
https://lists.gnu.org/archive/html/bug-global/2024-01/msg00000.html
This happens to us because GitLab CI puts the commit message of the
commit under test into the CI_COMMIT_MESSAGE and/or CI_COMMIT_TAG_MESSAGE
environment variables, so the job will fail if the commit happens to
have a verbose commit message.
Work around the htags bug by unsetting these variables while running
htags.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2080
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240111125543.1573473-1-peter.maydell@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 52a21689cd)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
LAE should set the access register corresponding to the first operand,
instead, it always modifies access register 1.
Co-developed-by: Ido Plat <Ido.Plat@ibm.com>
Cc: qemu-stable@nongnu.org
Fixes: a1c7610a68 ("target-s390x: implement LAY and LAEY instructions")
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-ID: <20240111092328.929421-2-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit e358a25a97)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: target/s390x/tcg/translate.c: fixup for
v8.1.0-1189-gad75a51e84 "tcg: Rename cpu_env to tcg_env")
An apparent copy-paste error tests for the presence of the
virtio-rng-ccw device in order to perform tests on the virtio-scsi-ccw
device.
Signed-off-by: Samuel Tardieu <sam@rfc1149.net>
Message-ID: <20240106130121.1244993-1-sam@rfc1149.net>
Fixes: 65331bf5d1 ("tests/qtest: Check for virtio-ccw devices before using them")
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit c98873ee4a)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Use PPC_FEATURE2_ISEL and PPC_FEATURE2_VEC_CRYPTO from linux headers
instead of the GNU specific PPC_FEATURE2_HAS_ISEL and
PPC_FEATURE2_HAS_VEC_CRYPTO. This fixes build with musl libc.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1861
Signed-off-by: Natanael Copa <ncopa@alpinelinux.org>
Fixes: 63922f467a ("tcg/ppc: Replace HAVE_ISEL macro with a variable")
Fixes: 68f340d4cd ("tcg/ppc: Enable Altivec detection")
Message-Id: <20231219105236.7059-1-ncopa@alpinelinux.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit 1d513e06d9)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The hypervisor can deliver (virtual) LPIs to a guest by setting up a
list register to have an intid which is an LPI. The GIC has to treat
these a little differently to standard interrupt IDs, because LPIs
have no Active state, and so the guest will only EOI them, it will
not also deactivate them. So icv_eoir_write() must do two things:
* if the LPI ID is not in any list register, we drop the
priority but do not increment the EOI count
* if the LPI ID is in a list register, we immediately deactivate
it, regardless of the split-drop-and-deactivate control
This can be seen in the VirtualWriteEOIR0() and VirtualWriteEOIR1()
pseudocode in the GICv3 architecture specification.
Without this fix, potentially a hypervisor guest might stall because
LPIs get stuck in a bogus Active+Pending state.
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
(cherry picked from commit 82a65e3188)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Current error message:
qemu-system-x86_64: -chardev spice,id=foo: Parameter 'driver' expects an abstract device type
while in fact the meaning is in reverse, -chardev expects
a non-abstract device type.
Fixes: 777357d758 ("chardev: qom-ify" 2016-12-07)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
(cherry picked from commit 4ad87cd4b2)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The mcycle/minstret counter's stop flag is mistakenly updated on a copy
on stack. Thus the counter increments even when the CY/IR bit in the
mcountinhibit register is set. This commit corrects its behavior.
Fixes: 3780e33732 (target/riscv: Support mcycle/minstret write operation)
Signed-off-by: Xu Lu <luxu.kernel@bytedance.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit 5cb0e7abe1)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
A CAN sja1000 standard frame filter mask has been computed and applied
incorrectly for standard frames when single Acceptance Filter Mode
(MOD_AFM = 1) has been selected. The problem has not been found
by Linux kernel testing because it uses dual filter mode (MOD_AFM = 0)
and leaves falters fully open.
The problem has been noticed by Grant Ramsay when testing with Zephyr
RTOS which uses single filter mode.
Signed-off-by: Pavel Pisa <pisa@cmp.felk.cvut.cz>
Reported-by: Grant Ramsay <gramsay@enphaseenergy.com>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2028
Fixes: 733210e754 ("hw/net/can: SJA1000 chip register level emulation")
Message-ID: <20240103231426.5685-1-pisa@fel.cvut.cz>
(cherry picked from commit 25145a7d77)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
We have a few test cases that include tests for corner case aspects of
internal snapshots, but nothing that tests that they actually function
as snapshots or that involves deleting a snapshot. Add a test for this
kind of basic internal snapshot functionality.
The error cases include a regression test for the crash we just fixed
with snapshot operations on inactive images.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231201142520.32255-4-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit bb6e2511eb)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Currently, the conflict between -incoming and -loadvm is only detected
when loading the snapshot fails because the image is still inactive for
the incoming migration. This results in a suboptimal error message:
$ ./qemu-system-x86_64 -hda /tmp/test.qcow2 -loadvm foo -incoming defer
qemu-system-x86_64: Device 'ide0-hd0' is writable but does not support snapshots
Catch the situation already in qemu_validate_options() to improve the
message:
$ ./qemu-system-x86_64 -hda /tmp/test.qcow2 -loadvm foo -incoming defer
qemu-system-x86_64: 'incoming' and 'loadvm' options are mutually exclusive
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231201142520.32255-3-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 5a7f21efaf)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
bdrv_is_read_only() only checks if the node is configured to be
read-only eventually, but even if it returns false, writing to the node
may not be permitted at the moment (because it's inactive).
bdrv_is_writable() checks that the node can be written to right now, and
this is what the snapshot operations really need.
Change bdrv_can_snapshot() to use bdrv_is_writable() to fix crashes like
the following:
$ ./qemu-system-x86_64 -hda /tmp/test.qcow2 -loadvm foo -incoming defer
qemu-system-x86_64: ../block/io.c:1990: int bdrv_co_write_req_prepare(BdrvChild *, int64_t, int64_t, BdrvTrackedRequest *, int): Assertion `!(bs->open_flags & BDRV_O_INACTIVE)' failed.
The resulting error message after this patch isn't perfect yet, but at
least it doesn't crash any more:
$ ./qemu-system-x86_64 -hda /tmp/test.qcow2 -loadvm foo -incoming defer
qemu-system-x86_64: Device 'ide0-hd0' is writable but does not support snapshots
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231201142520.32255-2-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit d3007d348a)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
MDCR_EL2.HPMN allows an hypervisor to limit the number of PMU counters
available to EL1 and EL0 (to keep the others to itself). QEMU already
implements this split correctly, except for PMCR_EL0.N reads: the number
of counters read by EL1 or EL0 should be the one configured in
MDCR_EL2.HPMN.
Cc: qemu-stable@nongnu.org
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Message-id: 20231215144652.4193815-2-jean-philippe@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 6980c31dec)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
In 32-bit mode, pc = eip + cs_base is also 32-bit, and must wrap.
Failure to do so results in incorrect memory exceptions to the guest.
Before 732d548732, this was implicitly done via truncation to
target_ulong but only in qemu-system-i386, not qemu-system-x86_64.
To fix this, we must add conditional zero-extensions.
Since we have to test for 32 vs 64-bit anyway, note that cs_base
is always zero in 64-bit mode.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2022
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20231212172510.103305-1-richard.henderson@linaro.org>
(cherry picked from commit b5e0d5d22f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: context fix in target/i386/tcg/tcg-cpu.c for v8.1.0-1190-gb77af26e97
"accel/tcg: Replace CPUState.env_ptr with cpu_env()")
The AioContext must be unlocked before calling blk_co_unref(), because
it takes the AioContext lock internally in blk_unref_bh(), which is
scheduled in the main thread. If we don't unlock, the AioContext is
locked twice and nested event loops such as in bdrv_graph_wrlock() will
deadlock.
Cc: <qemu-stable@nongnu.org>
Fixes: https://issues.redhat.com/browse/RHEL-15965
Fixes: 0c7d204f50
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20231208124352.30295-1-kwolf@redhat.com>
(cherry picked from commit 755ae3811f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Using direct pointer dereferencing can allow for unaligned accesses,
which was seen during execution with sanitizers enabled.
Cc: qemu-stable@nongnu.org
Reviewed-by: Chris Rauer <crauer@google.com>
Reviewed-by: Peter Foley <pefoley@google.com>
Signed-off-by: Patrick Venture <venture@google.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-ID: <20231116163633.276671-1-venture@google.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit 2b8fe81b3c)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
There is no architectural requirement that SME implies SVE, but
our implementation currently assumes it. (FEAT_SME_FA64 does
imply SVE.) So if you try to run a CPU with eg "-cpu max,sve=off"
you quickly run into an assert when the guest tries to write to
SMCR_EL1:
#6 0x00007ffff4b38e96 in __GI___assert_fail
(assertion=0x5555566e69cb "sm", file=0x5555566e5b24 "../../target/arm/helper.c", line=6865, function=0x5555566e82f0 <__PRETTY_FUNCTION__.31> "sve_vqm1_for_el_sm") at ./assert/assert.c:101
#7 0x0000555555ee33aa in sve_vqm1_for_el_sm (env=0x555557d291f0, el=2, sm=false) at ../../target/arm/helper.c:6865
#8 0x0000555555ee3407 in sve_vqm1_for_el (env=0x555557d291f0, el=2) at ../../target/arm/helper.c:6871
#9 0x0000555555ee3724 in smcr_write (env=0x555557d291f0, ri=0x555557da23b0, value=2147483663) at ../../target/arm/helper.c:6995
#10 0x0000555555fd1dba in helper_set_cp_reg64 (env=0x555557d291f0, rip=0x555557da23b0, value=2147483663) at ../../target/arm/tcg/op_helper.c:839
#11 0x00007fff60056781 in code_gen_buffer ()
Avoid this unsupported and slightly odd combination by
disabling SME when SVE is not present.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2005
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20231127173318.674758-1-peter.maydell@linaro.org
(cherry picked from commit f7767ca301)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Commit d921fea338 ("ui/vnc-clipboard: fix infinite loop in
inflate_buffer (CVE-2023-3255)") removed this hunk, but it is still
required, because it can happen that stream.avail_in becomes zero
before coming across a return value of Z_STREAM_END in the loop.
This fixes the host->guest direction of the clipboard with noVNC and
TigerVNC as clients.
Fixes: d921fea338 ("ui/vnc-clipboard: fix infinite loop in inflate_buffer (CVE-2023-3255)")
Reported-by: Friedrich Weber <f.weber@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Acked-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20231122125826.228189-1-f.ebner@proxmox.com>
(cherry picked from commit ebfbf39467)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Commit 6f189a08c1 ("ui/gtk-egl: Check EGLSurface before doing
scanout") introduced a regression when QEMU is running with a
virtio-gpu-gl-device on a host under X11. After the guest has
initialized the virtio-gpu-gl-device, the guest screen only
shows "Display output is not active.".
Commit 6f189a08c1 moved all function calls in
gd_egl_scanout_texture() to a code path which is only called
once after gd_egl_init() succeeds in gd_egl_scanout_texture().
Move all function calls in gd_egl_scanout_texture() back to
the regular code path so they get always called if one of the
gd_egl_init() calls was successful.
Fixes: 6f189a08c1 ("ui/gtk-egl: Check EGLSurface before doing scanout")
Signed-off-by: Volker Rümelin <vr_qemu@t-online.de>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20231111104020.26183-1-vr_qemu@t-online.de>
(cherry picked from commit 53a939f1bf)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The first time gd_egl_scanout_texture() is called, there's a possibility
that the GTK drawing area might not be realized yet, in which case its
associated GdkWindow is NULL. This means gd_egl_init() was also skipped
and the EGLContext and EGLSurface stored in the VirtualGfxConsole are
not valid yet.
Continuing with the scanout in this conditions would result in hitting
an assert in libepoxy: "Couldn't find current GLX or EGL context".
A possible workaround is to just ignore the scanout request, giving the
the GTK drawing area some time to finish its realization. At that point,
the gd_egl_init() will succeed and the EGLContext and EGLSurface stored
in the VirtualGfxConsole will be valid.
Signed-off-by: Antonio Caggiano <quic_acaggian@quicinc.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20231016123215.2699269-1-quic_acaggian@quicinc.com>
(cherry picked from commit 6f189a08c1)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
In the rollback in msix_set_vector_notifiers(), original patch forgot to
undo msix_vector_poll_notifier pointer.
Fixes: bbef882cc1 ("msi: add API to get notified about pending bit poll")
Signed-off-by: Robert Hoo <robert.hoo.linux@gmail.com>
Message-Id: <20231113081349.1307-1-robert.hoo.linux@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 2d37fe9e5e)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
erst_realizefn() passes @errp to functions without checking for
failure. If it runs into another failure, it trips error_setv()'s
assertion.
Use the ERRP_GUARD() macro and check *errp, as suggested in commit
ae7c80a7bd ("error: New macro ERRP_GUARD()").
Cc: qemu-stable@nongnu.org
Fixes: f7e26ffa59 ("ACPI ERST: support for ACPI ERST feature")
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20231120130017.81286-1-philmd@linaro.org>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 20bc50137f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
g_new() aborts if the allocation fails so it returns NULL only if the
requested allocation size is zero. register_vfs() makes such an
allocation if NumVFs is zero so it should not assert that g_new()
returns a non-NULL value.
Fixes: 7c0fa8dff8 ("pcie: Add support for Single Root I/O Virtualization (SR/IOV)")
Buglink: https://issues.redhat.com/browse/RHEL-17209
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20231123075630.12057-1-akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Tested-by: Yanghang Liu<yanghliu@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 714a1415d7)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
After a relatively short time, there is an multiplication overflow
when multiplying (now - buft_start) with hda_bytes_per_second().
While the uptime now - buft_start only overflows after 2**63 ns
= 292.27 years, this happens hda_bytes_per_second() times faster
with the multiplication. At 44100 samples/s * 2 channels
* 2 bytes/channel = 176400 bytes/s that is 14.52 hours. After the
multiplication overflow the affected audio stream stalls.
Replace the multiplication and following division with muldiv64()
to prevent a multiplication overflow.
Fixes: 280c1e1cdb ("audio/hda: create millisecond timers that handle IO")
Reported-by: M_O_Bz <m_o_bz@163.com>
Signed-off-by: Volker Rümelin <vr_qemu@t-online.de>
Message-Id: <20231105172552.8405-1-vr_qemu@t-online.de>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 74e8593e7e)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The current implementation initializes the stack pointer of AVR devices
to 0. Although older AVR devices used to be like that, newer ones set
it to RAMEND.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1525
Signed-off-by: Gihun Nam <gihun.nam@outlook.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <PH0P222MB0010877445B594724D40C924DEBDA@PH0P222MB0010.NAMP222.PROD.OUTLOOK.COM>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit 235948bf53)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The VirtioPCIDeviceTypeInfo structure, added in commit a4ee4c8baa
("virtio: Helper for registering virtio device types") got extended
in commit 8ea90ee690 ("virtio: add class_size") with the @class_size
field. Do similarly with the @instance_finalize field.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20231121174051.63038-2-philmd@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 837053a7f4)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Commit 0be6bfac62 ("qdev: Implement variable length array properties")
added the DEFINE_PROP_ARRAY() macro with the following comment:
* It is the responsibility of the device deinit code to free the
* @_arrayfield memory.
Commit 9e4aa1fafe added:
DEFINE_PROP_ARRAY("pg0-lock",
XlnxVersalEFuseCtrl, extra_pg0_lock_n16,
extra_pg0_lock_spec, qdev_prop_uint16, uint16_t),
but forgot to free the 'extra_pg0_lock_spec' array. Do it in the
instance_finalize() handler.
Cc: qemu-stable@nongnu.org
Fixes: 9e4aa1fafe ("hw/nvram: Xilinx Versal eFuse device") # v6.2.0+
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20231121174051.63038-6-philmd@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 4f10c66077)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Commit 0be6bfac62 ("qdev: Implement variable length array properties")
added the DEFINE_PROP_ARRAY() macro with the following comment:
* It is the responsibility of the device deinit code to free the
* @_arrayfield memory.
Commit 68fbcc344e added:
DEFINE_PROP_ARRAY("read-only", XlnxEFuse, ro_bits_cnt, ro_bits,
qdev_prop_uint32, uint32_t),
but forgot to free the 'ro_bits' array. Do it in the instance_finalize
handler.
Cc: qemu-stable@nongnu.org
Fixes: 68fbcc344e ("hw/nvram: Introduce Xilinx eFuse QOM") # v6.2.0+
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20231121174051.63038-5-philmd@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 49b3e28b7b)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Commit 0be6bfac62 ("qdev: Implement variable length array properties")
added the DEFINE_PROP_ARRAY() macro with the following comment:
* It is the responsibility of the device deinit code to free the
* @_arrayfield memory.
Commit 4fb013afcc added:
DEFINE_PROP_ARRAY("oscclk", MPS2SCC, num_oscclk, oscclk_reset,
qdev_prop_uint32, uint32_t),
but forgot to free the 'oscclk_reset' array. Do it in the
instance_finalize() handler.
Cc: qemu-stable@nongnu.org
Fixes: 4fb013afcc ("hw/misc/mps2-scc: Support configurable number of OSCCLK values") # v6.0.0+
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20231121174051.63038-4-philmd@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 896dd6ff7b)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Commit 0be6bfac62 ("qdev: Implement variable length array properties")
added the DEFINE_PROP_ARRAY() macro with the following comment:
* It is the responsibility of the device deinit code to free the
* @_arrayfield memory.
Commit 8077b8e549 added:
DEFINE_PROP_ARRAY("reserved-regions", VirtIOIOMMUPCI,
vdev.nb_reserved_regions, vdev.reserved_regions,
qdev_prop_reserved_region, ReservedRegion),
but forgot to free the 'vdev.reserved_regions' array. Do it in the
instance_finalize() handler.
Cc: qemu-stable@nongnu.org
Fixes: 8077b8e549 ("virtio-iommu-pci: Add array of Interval properties") # v5.1.0+
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20231121174051.63038-3-philmd@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit c9a4aa06df)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: fixup hw/virtio/virtio-iommu-pci.c for before v8.1.0-2552-g41cc70cdf5,
"virtio-iommu: Rename reserved_regions into prop_resv_regions" -- so now
patch subject matches actual change again)
In commit edac4d8a16 back in 2015 when we added support for
the virtual timer offset CNTVOFF_EL2, we didn't correctly update
the timer-recalculation code that figures out when the timer
interrupt is next going to change state. We got it wrong in
two ways:
* for the 0->1 transition, we didn't notice that gt->cval + offset
can overflow a uint64_t
* for the 1->0 transition, we didn't notice that the transition
might now happen before the count rolls over, if offset > count
In the former case, we end up trying to set the next interrupt
for a time in the past, which results in QEMU hanging as the
timer fires continuously.
In the latter case, we would fail to update the interrupt
status when we are supposed to.
Fix the calculations in both cases.
The test case is Alex Bennée's from the bug report, and tests
the 0->1 transition overflow case.
Fixes: edac4d8a16 ("target-arm: Add CNTVOFF_EL2")
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/60
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20231120173506.3729884-1-peter.maydell@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 8d37a1425b)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The syndrome register value always has an IL field at bit 25, which
is 0 for a trap on a 16 bit instruction, and 1 for a trap on a 32
bit instruction (or for exceptions which aren't traps on a known
instruction, like PC alignment faults). This means that our
syn_*() functions should always either take an is_16bit argument to
determine whether to set the IL bit, or else unconditionally set it.
We missed setting the IL bit for the syndrome for three kinds of trap:
* an SVE access exception
* a pointer authentication check failure
* a BTI (branch target identification) check failure
All of these traps are AArch64 only, and so the instruction causing
the trap is always 64 bit. This means we can unconditionally set
the IL bit in the syn_*() function.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20231120150121.3458408-1-peter.maydell@linaro.org
Cc: qemu-stable@nongnu.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 11a3c4a286)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
If the text description file is larger than DESC_SIZE, we force the last
byte in the buffer to be 0 and write it out.
This results in a corruption.
Try to allocate a big buffer in this case.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1923
Signed-off-by: Fam Zheng <fam@euphon.net>
Message-ID: <20231124115654.3239137-1-fam@euphon.net>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 9fb7b350ba)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
According to RISCV Specification sect 9.5 on two stage translation when
V=1 the vsstatus(mstatus in QEMU's terms) field MXR, which makes
execute-only pages readable, only overrides VS-stage page protection.
Setting MXR at HS-level(mstatus_hs), however, overrides both VS-stage
and G-stage execute-only permissions.
The hypervisor extension changes the behavior of MXR\MPV\MPRV bits.
Due to RISCV Specification sect. 9.4.1 when MPRV=1, explicit memory
accesses are translated and protected, and endianness is applied, as
though the current virtualization mode were set to MPV and the current
nominal privilege mode were set to MPP. vsstatus.MXR makes readable
those pages marked executable at the VS translation stage.
Fixes: 36a18664ba ("target/riscv: Implement second stage MMU")
Signed-off-by: Ivan Klokov <ivan.klokov@syntacore.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20231121071757.7178-3-ivan.klokov@syntacore.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
(cherry picked from commit 6bca4d7d1f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
According to RISCV privileged spec sect. 5.3.2 Virtual Address Translation Process
access-fault exceptions may raise only after PMA/PMP check. Current implementation
generates an access-fault for mbare mode even if there were no PMA/PMP errors.
This patch removes the erroneous MMU mode check and generates an access-fault
exception based on the pmp_violation flag only.
Fixes: 1448689c7b ("target/riscv: Allow specifying MMU stage")
Signed-off-by: Ivan Klokov <ivan.klokov@syntacore.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20231121071757.7178-2-ivan.klokov@syntacore.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
(cherry picked from commit 82d53adfbb)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
If you check the manual of SiFive E310 (https://cdn.sparkfun.com/assets/7/f/0/2/7/fe310-g002-manual-v19p05.pdf),
you can see in Figure 1 that the CLINT is connected to the real time clock, which also feeds the AON peripheral (they share the same clock).
In page 43, the docs also say that the timer registers of the CLINT count ticks from the rtcclk.
I am currently playing with bare metal applications both in QEMU and a physical SiFive E310 board and
I confirm that the CLINT clock in the physical board runs at 32.768 kHz.
In QEMU, the same app produces a completely different outcome, as sometimes a new CLINT interrupt is triggered before finishing other tasks.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1978
Signed-off-by: Rom\ufffd\ufffdn C\ufffd\ufffdrdenas <rcardenas.rod@gmail.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20231117082840.55705-1-rcardenas.rod@gmail.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
(cherry picked from commit a7472560ca)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Commit 49554856f0 fixed a problem, where TPM devices were not appearing
in the FDT, by delaying the FDT creation up until virt_machine_done().
This create a side effect (see gitlab #1925) - devices that need access
to the '/chosen' FDT node during realize() stopped working because, at
that point, we don't have a FDT.
This happens because our FDT creation is monolithic, but it doesn't need
to be. We can add the needed FDT components for realize() time and, at
the same time, do another FDT round where we account for dynamic sysbus
devices. In other words, the problem fixed by 49554856f0 could also be
fixed by postponing only create_fdt_sockets() and its dependencies,
leaving everything else from create_fdt() to be done during init().
Split the FDT creation in two parts:
- create_fdt(), now moved back to virt_machine_init(), will create FDT
nodes that doesn't depend on additional (dynamic) devices from the
sysbus;
- a new finalize_fdt() step is added, where create_fdt_sockets() and
friends is executed, accounting for the dynamic sysbus devices that
were added during realize().
This will make both use cases happy: TPM devices are still working as
intended, and devices such as 'guest-loader' have a FDT to work on
during realize().
Fixes: 49554856f0 ("riscv: Generate devicetree only after machine initialization is complete")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1925
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20231110172559.73209-1-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
(cherry picked from commit 7a87ba8956)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Support for probing the Zicboz block size landed in Linux 6.6, which was
released a few weeks ago. This provides the user-configured block size
when Zicboz is enabled.
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20231110173716.24423-1-palmer@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
(cherry picked from commit 301c65f49f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: fixup linux-user/syscall.c to before v8.1.0-2602-ge57039ddab
"target/riscv: rename ext_icboz to ext_zicboz")
assertRegexpMatches() has been removed in Python 3.12 and should be replaced by
assertRegex(). See: https://docs.python.org/3.12/whatsnew/3.12.html#id3
Inspired-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231114144832.71612-1-philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit f0a663b4ce)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: adjust context for before v8.1.0-1582-g684750ab4f
"python/qemu: rename command() to cmd()")
assertEquals() has been removed in Python 3.12 and should be replaced by
assertEqual(). See: https://docs.python.org/3.12/whatsnew/3.12.html#id3
Message-ID: <20231114134326.287242-1-thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 861f724d03)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: adjust context in pc_cpu_hotplug_props.py & cpu_queries.py for before
v8.1.0-1582-g684750ab4f "python/qemu: rename command() to cmd()")
The file offset of the load segment is not relevant to the
low address, only the beginning of the virtual address page.
Cc: qemu-stable@nongnu.org
Fixes: a93934fecd ("elf: take phdr offset into account when calculating the program load address")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1952
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit 82d70a84c8)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Recently MemReentrancyGuard was added to DeviceState to record that the
device is engaging in I/O. The network device backend needs to update it
when delivering a packet to a device.
This implementation follows what bottom half does, but it does not add
a tracepoint for the case that the network device backend started
delivering a packet to a device which is already engaging in I/O. This
is because such reentrancy frequently happens for
qemu_flush_queued_packets() and is insignificant.
Fixes: CVE-2023-3019
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Acked-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 9050f976e4)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Recently MemReentrancyGuard was added to DeviceState to record that the
device is engaging in I/O. The network device backend needs to update it
when delivering a packet to a device.
In preparation for such a change, add MemReentrancyGuard * as a
parameter of qemu_new_nic().
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 7d0fefdf81)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: fixup in hw/net/xen_nic.c due to lack of v8.1.0-2771-g25967ff69f
"hw/xen: update Xen PV NIC to XenDevice model")
Legacy software contains a standard mechanism for generating a reset to a
Serial ATA device - setting the SRST (software reset) bit in the Device
Control register.
Serial ATA has a more robust mechanism called COMRESET, also referred to
as port reset. A port reset is the preferred mechanism for error
recovery and should be used in place of software reset.
Commit e2a5d9b3d9 ("hw/ide/ahci: simplify and document PxCI handling")
(mjt: 16cc9594d2 in stable-8.1 series, v8.1.1)
improved the handling of PxCI, such that PxCI gets cleared after handling
a non-NCQ, or NCQ command (instead of incorrectly clearing PxCI after
receiving anything - even a FIS that failed to parse, which should NOT
clear PxCI, so that you can see which command slot that caused an error).
However, simply clearing PxCI after a non-NCQ, or NCQ command, is not
enough, we also need to clear PxCI when receiving a SRST in the Device
Control register.
A legacy software reset is performed by the host sending two H2D FISes,
the first H2D FIS asserts SRST, and the second H2D FIS deasserts SRST.
The first H2D FIS will not get a D2H reply, and requires the FIS to have
the C bit set to one, such that the HBA itself will clear the bit in PxCI.
The second H2D FIS will get a D2H reply once the diagnostic is completed.
The clearing of the bit in PxCI for this command should ideally be done
in ahci_init_d2h() (if it was a legacy software reset that caused the
reset (a COMRESET does not use a command slot)). However, since the reset
value for PxCI is 0, modify ahci_reset_port() to actually clear PxCI to 0,
that way we can avoid complex logic in ahci_init_d2h().
This fixes an issue for FreeBSD where the device would fail to reset.
The problem was not noticed in Linux, because Linux uses a COMRESET
instead of a legacy software reset by default.
Fixes: e2a5d9b3d9 ("hw/ide/ahci: simplify and document PxCI handling")
Reported-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Message-ID: <20231108222657.117984-1-nks@flawful.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Tested-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit eabb921250)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(mjt: mention 16cc9594d2 for stable-8.1)
In FDPIC signal handlers are passed around as FD pointers. Actual code
address and GOT pointer must be fetched from memory by the QEMU code
that implements kernel signal delivery functionality. This change is
equivalent to the following kernel change:
9c2cc74fb31e ("xtensa: fix signal delivery to FDPIC process")
Cc: qemu-stable@nongnu.org
Fixes: d2796be69d ("linux-user: add support for xtensa FDPIC")
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
(cherry picked from commit 1b173d0606)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
AArch64 permits code at EL3 to use the HVC instruction; however the
exception we take should go to EL3, not down to EL2 (see the pseudocode
AArch64.CallHypervisor()). Fix the target EL.
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar@zeroasic.com>
Message-id: 20231109151917.1925107-1-peter.maydell@linaro.org
(cherry picked from commit fc58891d04)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
If the host kernel lacks vfio DMA limit reporting, do not attempt
to shrink the guest DMA aperture.
Fixes: df202e3ff3 ("s390x/pci: shrink DMA aperture to be bound by vfio DMA limit")
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Message-ID: <20231110175108.465851-3-mjrosato@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 8011b508cf)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
KVM for RISC-V started supporting KVM_GET_REG_LIST in Linux 6.6. It
consists of a KVM ioctl() that retrieves a list of all available regs
for get_one_reg/set_one_reg. Regs that aren't present in the list aren't
supported in the host.
This simplifies our lives when initing the KVM regs since we don't have
to always attempt a KVM_GET_ONE_REG for all regs QEMU knows. We'll only
attempt a get_one_reg() if we're sure the reg is supported, i.e. it was
retrieved by KVM_GET_REG_LIST. Any error in get_one_reg() will then
always considered fatal, instead of having to handle special error codes
that might indicate a non-fatal failure.
Start by moving the current kvm_riscv_init_multiext_cfg() logic into a
new kvm_riscv_read_multiext_legacy() helper. We'll prioritize using
KVM_GET_REG_LIST, so check if we have it available and, in case we
don't, use the legacy() logic.
Otherwise, retrieve the available reg list and use it to check if the
host supports our known KVM regs, doing the usual get_one_reg() for
the supported regs and setting cpu->cfg accordingly.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Message-ID: <20231003132148.797921-3-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
(cherry picked from commit 608bdebb60)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: trivial context tweak in target/riscv/kvm.c)
Our error message is returning the value of 'ret', which will be always
-1 in case of error, and will not be that useful:
qemu-system-riscv64: Unable to read ISA_EXT KVM register ssaia, error -1
Improve the error message by outputting 'errno' instead of 'ret'. Use
strerrorname_np() to output the error name instead of the error code.
This will give us what we need to know right away:
qemu-system-riscv64: Unable to read ISA_EXT KVM register ssaia, error code: ENOENT
Given that we're going to exit(1) in this condition instead of
attempting to recover, remove the 'kvm_riscv_destroy_scratch_vcpu()'
call.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231003132148.797921-2-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
(cherry picked from commit 082e9e4a58)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This is an error in Python 3.12; fix it by using a raw string literal.
Cc: <qemu-stable@nongnu.org>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20231108105649.60453-1-marcandre.lureau@redhat.com>
(cherry picked from commit 4d96307c5b)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Add a small test to prevent regressions.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20231106093605.1349201-5-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit ebc14107f1)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
LAALG uses op_laa() and wout_addu64(). The latter expects cc_src to be
set, but the former does not do it. This can lead to assertion failures
if something sets cc_src to neither 0 nor 1 before.
Fix by introducing op_laa_addu64(), which sets cc_src, and using it for
LAALG.
Fixes: 4dba4d6fef ("target/s390x: Use atomic operations for LOAD AND OP")
Cc: qemu-stable@nongnu.org
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20231106093605.1349201-4-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit bea402482a)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Add a small test to prevent regressions.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20231106093605.1349201-3-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 43fecbe7a5)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
CLC updates cc_src before accessing the second operand; if the latter
is inaccessible, the former ends up containing a bogus value.
Fix by reading cc_src into a temporary first.
Fixes: 4f7403d52b ("target-s390: Convert CLC")
Closes: https://gitlab.com/qemu-project/qemu/-/issues/1865
Cc: qemu-stable@nongnu.org
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-ID: <20231106093605.1349201-2-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit aba2ec341c)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Before commit "hw/ide: reset: cancel async DMA operation before
resetting state", this test would fail, because a reset with a
pending write operation would lead to an unsolicited write to the
first sector of the disk.
The test writes a pattern to the beginning of the disk and verifies
that it is still intact after a reset with a pending operation. It
also checks that the pending operation actually completes correctly.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Message-ID: <20230906130922.142845-2-f.ebner@proxmox.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit cc610857bb)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
If there is a pending DMA operation during ide_bus_reset(), the fact
that the IDEState is already reset before the operation is canceled
can be problematic. In particular, ide_dma_cb() might be called and
then use the reset IDEState which contains the signature after the
reset. When used to construct the IO operation this leads to
ide_get_sector() returning 0 and nsector being 1. This is particularly
bad, because a write command will thus destroy the first sector which
often contains a partition table or similar.
Traces showing the unsolicited write happening with IDEState
0x5595af6949d0 being used after reset:
> ahci_port_write ahci(0x5595af6923f0)[0]: port write [reg:PxSCTL] @ 0x2c: 0x00000300
> ahci_reset_port ahci(0x5595af6923f0)[0]: reset port
> ide_reset IDEstate 0x5595af6949d0
> ide_reset IDEstate 0x5595af694da8
> ide_bus_reset_aio aio_cancel
> dma_aio_cancel dbs=0x7f64600089a0
> dma_blk_cb dbs=0x7f64600089a0 ret=0
> dma_complete dbs=0x7f64600089a0 ret=0 cb=0x5595acd40b30
> ahci_populate_sglist ahci(0x5595af6923f0)[0]
> ahci_dma_prepare_buf ahci(0x5595af6923f0)[0]: prepare buf limit=512 prepared=512
> ide_dma_cb IDEState 0x5595af6949d0; sector_num=0 n=1 cmd=DMA WRITE
> dma_blk_io dbs=0x7f6420802010 bs=0x5595ae2c6c30 offset=0 to_dev=1
> dma_blk_cb dbs=0x7f6420802010 ret=0
> (gdb) p *qiov
> $11 = {iov = 0x7f647c76d840, niov = 1, {{nalloc = 1, local_iov = {iov_base = 0x0,
> iov_len = 512}}, {__pad = "\001\000\000\000\000\000\000\000\000\000\000",
> size = 512}}}
> (gdb) bt
> #0 blk_aio_pwritev (blk=0x5595ae2c6c30, offset=0, qiov=0x7f6420802070, flags=0,
> cb=0x5595ace6f0b0 <dma_blk_cb>, opaque=0x7f6420802010)
> at ../block/block-backend.c:1682
> #1 0x00005595ace6f185 in dma_blk_cb (opaque=0x7f6420802010, ret=<optimized out>)
> at ../softmmu/dma-helpers.c:179
> #2 0x00005595ace6f778 in dma_blk_io (ctx=0x5595ae0609f0,
> sg=sg@entry=0x5595af694d00, offset=offset@entry=0, align=align@entry=512,
> io_func=io_func@entry=0x5595ace6ee30 <dma_blk_write_io_func>,
> io_func_opaque=io_func_opaque@entry=0x5595ae2c6c30,
> cb=0x5595acd40b30 <ide_dma_cb>, opaque=0x5595af6949d0,
> dir=DMA_DIRECTION_TO_DEVICE) at ../softmmu/dma-helpers.c:244
> #3 0x00005595ace6f90a in dma_blk_write (blk=0x5595ae2c6c30,
> sg=sg@entry=0x5595af694d00, offset=offset@entry=0, align=align@entry=512,
> cb=cb@entry=0x5595acd40b30 <ide_dma_cb>, opaque=opaque@entry=0x5595af6949d0)
> at ../softmmu/dma-helpers.c:280
> #4 0x00005595acd40e18 in ide_dma_cb (opaque=0x5595af6949d0, ret=<optimized out>)
> at ../hw/ide/core.c:953
> #5 0x00005595ace6f319 in dma_complete (ret=0, dbs=0x7f64600089a0)
> at ../softmmu/dma-helpers.c:107
> #6 dma_blk_cb (opaque=0x7f64600089a0, ret=0) at ../softmmu/dma-helpers.c:127
> #7 0x00005595ad12227d in blk_aio_complete (acb=0x7f6460005b10)
> at ../block/block-backend.c:1527
> #8 blk_aio_complete (acb=0x7f6460005b10) at ../block/block-backend.c:1524
> #9 blk_aio_write_entry (opaque=0x7f6460005b10) at ../block/block-backend.c:1594
> #10 0x00005595ad258cfb in coroutine_trampoline (i0=<optimized out>,
> i1=<optimized out>) at ../util/coroutine-ucontext.c:177
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: simon.rowe@nutanix.com
Message-ID: <20230906130922.142845-1-f.ebner@proxmox.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit 7d7512019f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The PC offset is *signed*.
Cc: qemu-stable@nongnu.org
Reported-by: Sergey Evlashev <vectorchiefrocks@gmail.com>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1624
Fixes: c7a9ef7517 ("target/mips: Introduce decode tree bindings for MSA ASE")
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230914085807.12241-1-philmd@linaro.org>
(cherry picked from commit 04591b3ddd)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Scale factor needs to be applied when calculating width/height of the
GTK windows.
Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20231012222643.13996-1-dongwon.kim@intel.com>
(cherry picked from commit 47fd6ab1e3)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Fixes the GL context creation from a widget that isn't yet realized (in
a hidden tab for example).
Resolves:
https://gitlab.com/qemu-project/qemu/-/issues/1727
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Antonio Caggiano <quic_acaggian@quicinc.com>
Message-Id: <20231017111642.1155545-1-marcandre.lureau@redhat.com>
(cherry picked from commit 565f85a9c2)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Pixman routines can fail if no implementation is available and it will
become optional soon so add fallbacks when pixman does not work.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Acked-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-ID: <ed0fba3f74e48143f02228b83bf8796ca49f3e7d.1698871239.git.balaton@eik.bme.hu>
(cherry picked from commit 08730ee0cc)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
raw_co_zone_append() sets "s->offset" where "BDRVRawState *s". This pointer
is used later at raw_co_prw() to save the block address where the data is
written.
When multiple IOs are on-going at the same time, a later IO's
raw_co_zone_append() call over-writes a former IO's offset address before
raw_co_prw() completes. As a result, the former zone append IO returns the
initial value (= the start address of the writing zone), instead of the
proper address.
Fix the issue by passing the offset pointer to raw_co_prw() instead of
passing it through s->offset. Also, remove "offset" from BDRVRawState as
there is no usage anymore.
Fixes: 4751d09adc ("block: introduce zone append write for zoned devices")
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Message-Id: <20231030073853.2601162-1-naohiro.aota@wdc.com>
Reviewed-by: Sam Li <faithilikerun@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
(cherry picked from commit ad4feaca61)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
When the zoned request fail, it needs to update only the wp of
the target zones for not disrupting the in-flight writes on
these other zones. The wp is updated successfully after the
request completes.
Fixed the callers with right offset and nr_zones.
Signed-off-by: Sam Li <faithilikerun@gmail.com>
Message-Id: <20230825040556.4217-1-faithilikerun@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
[hreitz: Rebased and fixed comment spelling]
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
(cherry picked from commit 10b9e0802a)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
When the discard-no-unref flag is enabled, we keep the reference for
normal discard requests.
But when a discard is executed on a snapshot/qcow2 image with backing,
the discards are saved as zero clusters in the snapshot image.
When committing the snapshot to the backing file, not
discard_in_l2_slice is called but zero_in_l2_slice. Which did not had
any logic to keep the reference when discard-no-unref is enabled.
Therefor we add logic in the zero_in_l2_slice call to keep the reference
on commit.
Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1621
Signed-off-by: Jean-Louis Dupond <jean-louis@dupond.be>
Message-Id: <20231003125236.216473-2-jean-louis@dupond.be>
[hreitz: Made the documentation change more verbose, as discussed
on-list]
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
(cherry picked from commit b2b109041e)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
In commit be23a049 in the conversion to decodetree we broke the
decoding of the immediate value in the LDRA instruction. This should
be a 10 bit signed value that is scaled by 8, but in the conversion
we incorrectly ended up scaling it only by 2. Fix the scaling
factor.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1970
Fixes: be23a049 ("target/arm: Convert load (pointer auth) insns to decodetree")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20231106113445.1163063-1-peter.maydell@linaro.org
(cherry picked from commit 5722fc4712)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
NVMeQueuePair::reqs has length NVME_NUM_REQS, which less than
NVME_QUEUE_SIZE by 1.
Fixes: 1086e95da1 ("block/nvme: switch to a NVMeRequest freelist")
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Maksim Davydov <davydov-max@yandex-team.ru>
Message-id: 20231017125941.810461-5-vsementsov@yandex-team.ru
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit cc8fb0c3ae)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
"blob" resources don't have an associated pixman image:
#0 pixman_image_get_stride (image=0x0) at ../pixman/pixman-image.c:921
#1 0x0000562327c25236 in virtio_gpu_save (f=0x56232bb13b00, opaque=0x56232b555a60, size=0, field=0x5623289ab6c8 <__compound_literal.3+104>, vmdesc=0x56232ab59fe0) at ../hw/display/virtio-gpu.c:1225
Related to:
https://bugzilla.redhat.com/show_bug.cgi?id=2236353
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
(cherry picked from commit 9c549ab689)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Even on x86_64 the default protocol is the x86-32 one if the guest doesn't
specifically ask for x86-64.
Cc: qemu-stable@nongnu.org
Fixes: b6af8926fb ("xen: add implementations of xen-block connect and disconnect functions...")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
(cherry picked from commit a1c1082908)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The xen_evtchn_soft_reset() function requires the iothread mutex, but is
also called for the EVTCHNOP_reset hypercall. Ensure the mutex is taken
in that case.
Cc: qemu-stable@nongnu.org
Fixes: a15b10978f ("hw/xen: Implement EVTCHNOP_reset")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
(cherry picked from commit debc995e88)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
When fire_watch_cb() found the response buffer empty, it would call
deliver_watch() to generate the XS_WATCH_EVENT message in the response
buffer and send an event channel notification to the guest… without
actually *copying* the response buffer into the ring. So there was
nothing for the guest to see. The pending response didn't actually get
processed into the ring until the guest next triggered some activity
from its side.
Add the missing call to put_rsp().
It might have been slightly nicer to call xen_xenstore_event() here,
which would *almost* have worked. Except for the fact that it calls
xen_be_evtchn_pending() to check that it really does have an event
pending (and clear the eventfd for next time). And under Xen it's
defined that setting that fd to O_NONBLOCK isn't guaranteed to work,
so the emu implementation follows suit.
This fixes Xen device hot-unplug.
Cc: qemu-stable@nongnu.org
Fixes: 0254c4d19d ("hw/xen: Add xenstore wire implementation and implementation stubs")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
(cherry picked from commit 4a5780f520)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The refcounts actually correspond to 'active_ref' structures stored in a
GHashTable per "user" on the backend side (mostly, per XenDevice).
If we zero map_track[] on reset, then when the backend drivers get torn
down and release their mapping we hit the assert(s->map_track[ref] != 0)
in gnt_unref().
So leave them in place. Each backend driver will disconnect and reconnect
as the guest comes back up again and reconnects, and it all works out OK
in the end as the old refs get dropped.
Cc: qemu-stable@nongnu.org
Fixes: de26b26197 ("hw/xen: Implement soft reset for emulated gnttab")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
(cherry picked from commit 3de75ed352)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
A guest which has configured the per-vCPU upcall vector may set the
HVM_PARAM_CALLBACK_IRQ param to fairly much anything other than zero.
For example, Linux v6.0+ after commit b1c3497e604 ("x86/xen: Add support
for HVMOP_set_evtchn_upcall_vector") will just do this after setting the
vector:
/* Trick toolstack to think we are enlightened. */
if (!cpu)
rc = xen_set_callback_via(1);
That's explicitly setting the delivery to GSI#1, but it's supposed to be
overridden by the per-vCPU vector setting. This mostly works in Qemu
*except* for the logic to enable the in-kernel handling of event channels,
which falsely determines that the kernel cannot accelerate GSI delivery
in this case.
Add a kvm_xen_has_vcpu_callback_vector() to report whether vCPU#0 has
the vector set, and use that in xen_evtchn_set_callback_param() to
enable the kernel acceleration features even when the param *appears*
to be set to target a GSI.
Preserve the Xen behaviour that when HVM_PARAM_CALLBACK_IRQ is set to
*zero* the event channel delivery is disabled completely. (Which is
what that bizarre guest behaviour is working round in the first place.)
Cc: qemu-stable@nongnu.org
Fixes: 91cce75617 ("hw/xen: Add xen_evtchn device for event channel emulation")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
(cherry picked from commit 18e83f28bf)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The per-vCPU upcall vector support had three problems. Firstly it was
using the wrong hypercall argument and would always return -EFAULT when
the guest tried to set it up. Secondly it was using the wrong ioctl() to
pass the vector to the kernel and thus the *kernel* would always return
-EINVAL. Finally, even when delivering the event directly from userspace
with an MSI, it put the destination CPU ID into the wrong bits of the
MSI address.
Linux doesn't (yet) use this mode so it went without decent testing
for a while.
Cc: qemu-stable@nongnu.org
Fixes: 105b47fdf2 ("i386/xen: implement HVMOP_set_evtchn_upcall_vector")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
(cherry picked from commit e7dbb62ff1)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This confuses lscpu into thinking it's running in PVH mode.
Cc: qemu-stable@nongnu.org
Fixes: bedcc13924 ("i386/xen: implement HYPERVISOR_xen_version")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
(cherry picked from commit e969f992c6)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Dangerous and now unused.
Cc: Fam Zheng <fam@euphon.net>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: "Denis V. Lunev" <den@openvz.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
(cherry picked from commit 4ef9d97b1a)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
qemu_uuid_unparse() includes a trailing NUL when writing the uuid
string and the buffer size should be UUID_FMT_LEN + 1 bytes. Use the
recently added UUID_STR_LEN which defines the correct size.
Fixes: CID 1522913
Fixes: 2dca1b37a7 ("vfio/pci: add support for VF token")
Cc: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: "Denis V. Lunev" <den@openvz.org>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
(cherry picked from commit f8d6f3b16c)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
qemu_uuid_unparse() includes a trailing NUL when writing the uuid
string and the buffer size should be UUID_FMT_LEN + 1 bytes. Add a
define for this size and use it where required.
Cc: Fam Zheng <fam@euphon.net>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: "Denis V. Lunev" <den@openvz.org>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
(cherry picked from commit 721da0396c)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
In a two-stage translation, the result of the BTI guarded bit should
be the guarded bit from the first stage of translation, as there is
no BTI guard information in stage two. Our code tried to do this,
but got it wrong, because we currently have two fields where the GP
bit information might live (ARMCacheAttrs::guarded and
CPUTLBEntryFull::extra::arm::guarded), and we were storing the GP bit
in the latter during the stage 1 walk but trying to copy the former
in combine_cacheattrs().
Remove the duplicated storage, and always use the field in
CPUTLBEntryFull; correctly propagate the stage 1 value to the output
in get_phys_addr_twostage().
Note for stable backports: in v8.0 and earlier the field is named
result->f.guarded, not result->f.extra.arm.guarded.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1950
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20231031173723.26582-1-peter.maydell@linaro.org
(cherry picked from commit 4c09abeae8)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: replace f.extra.arm.guarded -> f.guarded due to v8.1.0-1179-ga81fef4b64)
The previous change missed updating one of the increments and
one of the MemOps. Add a test case for all vector lengths.
Cc: qemu-stable@nongnu.org
Fixes: e6dd5e782b ("target/arm: Use tcg_gen_qemu_{ld, st}_i128 in gen_sve_{ld, st}r")
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20231031143215.29764-1-richard.henderson@linaro.org
[PMM: fixed checkpatch nit]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit b11293c212)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: context fix in tests/tcg/aarch64/Makefile.target)
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Before previous commit, rebase was getting infitely stuck in case of
rebasing within the same backing chain and when overlay_size > backing_size.
Let's add this case to the rebasing test 024 to make sure it doesn't
break again.
Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Message-ID: <20230919165804.439110-3-andrey.drobyshev@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 827171c318)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
In case when we're rebasing within one backing chain, and when target image
is larger than old backing file, bdrv_is_allocated_above() ends up setting
*pnum = 0. As a result, target offset isn't getting incremented, and we
get stuck in an infinite for loop. Let's detect this case and proceed
further down the loop body, as the offsets beyond the old backing size need
to be explicitly zeroed.
Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Message-ID: <20230919165804.439110-2-andrey.drobyshev@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 8b097fd6b0)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
A build of GCC 13.2 will have stack protector enabled by default if it
was configured with --enable-default-ssp option. For such a compiler,
it is necessary to explicitly disable stack protector when linking
without standard libraries.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20230731091042.139159-3-akihiko.odaki@daynix.com>
[AJB: fix comment string typo]
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20231029145033.592566-3-alex.bennee@linaro.org>
(cherry picked from commit 580731dcc8)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
blk_insert_bs() requires that the caller holds the AioContext lock for
the node to be inserted. Since commit c066e808e1, neglecting to do so
causes a crash when the child has to be moved to a different AioContext
to attach it to the BlockBackend.
This fixes qmp_blockdev_insert_anon_medium(), which is called for the
QMP commands 'blockdev-insert-medium' and 'blockdev-change-medium', to
correctly take the lock.
Cc: qemu-stable@nongnu.org
Fixes: https://issues.redhat.com/browse/RHEL-3922
Fixes: c066e808e1
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231013153302.39234-2-kwolf@redhat.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit fed8245015)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Testing of the LED state showed that when the LED polarity was
set to GPIO_POLARITY_ACTIVE_LOW and a low logic value was set on
the input GPIO of the LED, the LED was being turn off when it was
expected to be turned on.
Fixes: ddb67f6402 ("hw/misc/led: Allow connecting from GPIO output")
Signed-off-by: Glenn Miles <milesg@linux.vnet.ibm.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Andrew Jeffery <andrew@codeconstruct.com.au>
Message-id: 20231024191945.4135036-1-milesg@linux.vnet.ibm.com
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 6f83dc6716)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
In commit 442c9d682c when we converted the ERET, ERETAA, ERETAB
instructions to decodetree, the conversion accidentally lost the
correct setting of the syndrome register when taking a trap because
of the FEAT_FGT HFGITR_EL1.ERET bit. Instead of reporting a correct
full syndrome value with the EC and IL bits, we only reported the low
two bits of the syndrome, because the call to syn_erettrap() got
dropped.
Fix the syndrome values for these traps by reinstating the
syn_erettrap() calls.
Fixes: 442c9d682c ("target/arm: Convert ERET, ERETAA, ERETAB to decodetree")
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20231024172438.2990945-1-peter.maydell@linaro.org
(cherry picked from commit 307521d6e2)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
With pairs of jmp+rett, pc == DYNAMIC_PC_LOOKUP and
npc == DYNAMIC_PC. Make sure that we exit for interrupts.
Cc: qemu-stable@nongnu.org
Fixes: 633c42834c ("target/sparc: Introduce DYNAMIC_PC_LOOKUP")
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit 930f1865cc)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
In query_port() we pass the address of a local pvrdma_port_attr
struct to the rdma_query_backend_port() function. Unfortunately,
rdma_backend_query_port() wants a pointer to a struct ibv_port_attr,
and the two are not the same length.
Coverity spotted this (CID 1507146): pvrdma_port_attr is 48 bytes
long, and ibv_port_attr is 52 bytes, because it has a few extra
fields at the end.
Fortunately, all we do with the attrs struct after the call is to
read a few specific fields out of it which are all at the same
offsets in both structs, so we can simply make the local variable the
correct type. This also lets us drop the cast (which should have
been a bit of a warning flag that we were doing something wrong
here).
We do however need to add extra casts for the fields of the
struct that are enums: clang will complain about the implicit
cast to a different enum type otherwise.
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit 4ab9a7429b)
Block Size Register bits [14:12] is SDMA Buffer Boundary, it is missed
in register write, but it is needed in SDMA transfer. e.g. it will be
used in sdhci_sdma_transfer_multi_blocks to calculate boundary_ variables.
Missing this field will cause wrong operation for different SDMA Buffer
Boundary settings.
Fixes: d7dfca0807 ("hw/sdhci: introduce standard SD host controller")
Fixes: dfba99f17f ("hw/sdhci: Fix DMA Transfer Block Size field")
Signed-off-by: Lu Gao <lu.gao@verisilicon.com>
Signed-off-by: Jianxian Wen <jianxian.wen@verisilicon.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-ID: <20220321055618.4026-1-lu.gao@verisilicon.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit ae5f70baf5)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
On an attempt to access CNTPCT_EL0 from EL0 using a guest running on top
of Xen, a trap from EL2 was observed which is something not reproducible
on HW (also, Xen does not trap accesses to physical counter).
This is because gt_counter_access() checks for an incorrect bit (1
instead of 0) of CNTHCTL_EL2 if HCR_EL2.E2H is 0 and access is made to
physical counter. Refer ARM ARM DDI 0487J.a, D19.12.2:
When HCR_EL2.E2H is 0:
- EL1PCTEN, bit [0]: refers to physical counter
- EL1PCEN, bit [1]: refers to physical timer registers
Drop entire block "if (hcr & HCR_E2H) {...} else {...}" from EL0 case
and fall through to EL1 case, given that after fixing checking for the
correct bit, the handling is the same.
Fixes: 5bc8437136 ("target/arm: Update timer access for VHE")
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Tested-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Message-id: 20230928094404.20802-1-michal.orzel@amd.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit d01448c79d)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Those PS/2 ports are created with the LASI controller when
a 32-bit PA-RISC machine is created.
Mark them not user-createable to avoid showing them in
the qemu device list.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: qemu-stable@nongnu.org
(cherry picked from commit a1e6a5c462)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
sh4 uses gUSA (general UserSpace Atomicity) to provide atomicity on CPUs
that don't have atomic instructions. A gUSA region that adds 1 to an
atomic variable stored in @R2 looks like this:
4004b6: 03 c7 mova 4004c4 <gusa+0x10>,r0
4004b8: f3 61 mov r15,r1
4004ba: 09 00 nop
4004bc: fa ef mov #-6,r15
4004be: 22 63 mov.l @r2,r3
4004c0: 01 73 add #1,r3
4004c2: 32 22 mov.l r3,@r2
4004c4: 13 6f mov r1,r15
R0 contains a pointer to the end of the gUSA region
R1 contains the saved stack pointer
R15 contains negative length of the gUSA region
When this region is interrupted by a signal, the kernel detects if
R15 >= -128U. If yes, the kernel rolls back PC to the beginning of the
region and restores SP by copying R1 to R15.
The problem happens if we are interrupted by a signal at address 4004c4.
R15 still holds the value -6, but the atomic value was already written by
an instruction at address 4004c2. In this situation we can't undo the
gUSA. The function unwind_gusa does nothing, the signal handler attempts
to push a signal frame to the address -6 and crashes.
This patch fixes it, so that if we are interrupted at the last instruction
in a gUSA region, we copy R1 to R15 to restore the correct stack pointer
and avoid crashing.
There's another bug: if we are interrupted in a delay slot, we save the
address of the instruction in the delay slot. We must save the address of
the previous instruction.
Cc: qemu-stable@nongnu.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reviewed-by: Yoshinori Sato <ysato@users.sourcefoege.jp>
Message-Id: <b16389f7-6c62-70b7-59b3-87533c0bcc@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit 3b894b699c)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
QEMU mips userspace emulation crashes with "qemu: unhandled CPU exception
0x15 - aborting" when one of the integer arithmetic instructions detects
an overflow.
This patch fixes it so that it delivers SIGFPE with FPE_INTOVF instead.
Cc: qemu-stable@nongnu.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Message-Id: <3ef979a8-3ee1-eb2d-71f7-d788ff88dd11@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit 6fad9b4bb9)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The previous change, 2d385be615, assumed !PAGE_VALID meant that
the page would be unmapped by the elf image. However, since we
reserved the entire image space via mmap, PAGE_VALID will always
be set. Instead, assume PROT_NONE for the same condition.
Furthermore, assume bss is only ever present for writable segments,
and that there is no page overlap between PT_LOAD segments.
Instead of an assert, return false to indicate failure.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1854
Fixes: 2d385be615 ("linux-user: Do not adjust zero_bss for host page size")
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit e6e66b0328)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This is an error in Python 3.12; fix it by using a raw string literal.
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit e6d8e5e6e3)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This is an error in Python 3.12; fix it by using a raw string literal
or by double-escaping the backslash.
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 86a8989d45)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This is an error in Python 3.12; fix it by using a raw string literal.
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 1b5f3f65cc)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This is an error in Python 3.12; fix it by using a raw string literal.
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit e41c40d101)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This is an error in Python 3.12; fix it by using a raw string literal.
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit e4b6532cc0)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This is an error in Python 3.12; fix it by using a raw string literal.
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit a5e3cb3b90)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This patch is a backport from
e03a3334b6
According to Guido in https://github.com/python/cpython/issues/104344 ,
this call was never meant to wait for the server to shut down - that is
handled synchronously - but instead, this waits for all connections to
close. Or, it would have, if it wasn't broken since it was introduced.
3.12 fixes the bug, which now causes a hang in our code. The fix is just
to remove the wait.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-id: 20231006195243.3131140-3-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit acf873873a)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
RDMA was having trouble because
migrate_multifd_flush_after_each_section() can only be true or false,
but we don't want to send any flush when we are not in multifd
migration.
CC: Fabiano Rosas <farosas@suse.de
Fixes: 294e5a4034 ("multifd: Only flush once each full round of memory")
Reported-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231011205548.10571-2-quintela@redhat.com>
(cherry picked from commit d4f34485ca)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The migration code uses unsigned values for 16, 32 and 64-bit
operations. Fix the script to do the same.
This was causing an issue when parsing the migration stream generated
on the ppc64 target because one of instance_ids was larger than the
32bit signed maximum:
Traceback (most recent call last):
File "/home/fabiano/kvm/qemu/build/scripts/analyze-migration.py", line 658, in <module>
dump.read(dump_memory = args.memory)
File "/home/fabiano/kvm/qemu/build/scripts/analyze-migration.py", line 592, in read
classdesc = self.section_classes[section_key]
KeyError: ('spapr_iommu', -2147483648)
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231009184326.15777-6-farosas@suse.de>
(cherry picked from commit caea03279e)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Guest driver allocates and initialize page tables to be used as a ring
of descriptors for CQ and async events.
The page table that represents the ring, along with the number of pages
in the page table is passed to the device.
Currently our device supports only one page table for a ring.
Let's make sure that the number of page table entries the driver
reports, do not exceeds the one page table size.
Reported-by: Soul Chen <soulchen8650@gmail.com>
Signed-off-by: Yuval Shaia <yuval.shaia.ml@gmail.com>
Fixes: CVE-2023-1544
Message-ID: <20230301142926.18686-1-yuval.shaia.ml@gmail.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 85fc35afa9)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The operator (fwmacc16) of vfwmaccbf16.vf helper function should be
replaced by fwmaccbf16.
Fixes: adf772b0f7 ("target/riscv: Add support for Zvfbfwma extension")
Signed-off-by: Max Chou <max.chou@sifive.com>
Reviewed-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231005095734.567575-1-max.chou@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
(cherry picked from commit 837570cef2)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Fix the inverted order of pmpaddr13 and pmpaddr14 in csr_name().
Signed-off-by: Alvin Chang <alvinga@andestech.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20230907084500.328-1-alvinga@andestech.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
(cherry picked from commit cffa995490)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
python3 may not be the expected python version.
Use PYTHON to invoke python.
Fixes: 22e11539e1 ("edk2: replace build scripts")
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit 17b8d8ac33)
Reset the current sample counter when writing the Channel Sample
Count Register. The Linux ens1370 driver and the AROS sb128
driver expect the current sample counter counts down from sample
count to 0 after a write to the Channel Sample Count Register.
Currently the current sample counter starts from 0 after a reset
or the last count when the counter was stopped.
The current sample counter is used to raise an interrupt whenever
a complete buffer was transferred. When the counter starts with a
value lower than the reload value, the interrupt triggeres before
the buffer was completly transferred. This may lead to corrupted
audio streams.
Tested-by: Rene Engel <ReneEngel80@emailn.de>
Signed-off-by: Volker Rümelin <vr_qemu@t-online.de>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Tested-by: BALATON Zoltan <balaton@eik.bme.hu>
Message-Id: <20230917065813.6692-1-vr_qemu@t-online.de>
(cherry picked from commit 00e3b29d06)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
QEMU will crash if anyone tries to set tls-authz (which is a type
StrOrNull) with 'null' value. Fix it in the easy way by converting it to
qstring just like the other two tls parameters.
Cc: qemu-stable@nongnu.org # v4.0+
Fixes: d2f1d29b95 ("migration: add support for a "tls-authz" migration parameter")
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230905162335.235619-2-peterx@redhat.com>
(cherry picked from commit 86dec715a7)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Commit 59bde21374 ("util/log: do not close and reopen log files when
flags are turned off") prevented switching away from stderr on a
subsequent invocation of qemu_set_log_internal(). This prevented
switching away from stderr with the 'logfile' monitor command as well
as an invocation like
> ./qemu-system-x86_64 -trace 'qemu_mutex_lock,file=log'
from opening the specified log file.
Fixes: 59bde21374 ("util/log: do not close and reopen log files when flags are turned off")
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Message-ID: <20231004124446.491481-1-f.ebner@proxmox.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit f05142d511)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The below referenced commit renames scanout_width/height to
backing_width/height, but also promotes these fields in various portions
of the egl interface. Meanwhile vfio dmabuf support has never used the
previous scanout fields and is therefore missed in the update. This
results in a black screen when transitioning from ramfb to dmabuf display
when using Intel vGPU with these features.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1891
Link: https://lists.gnu.org/archive/html/qemu-devel/2023-08/msg02726.html
Fixes: 9ac06df8b6 ("virtio-gpu-udmabuf: correct naming of QemuDmaBuf size properties")
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Tested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
(cherry picked from commit 931150e56b)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
An MSI from I/O APIC may not exactly equal to APIC_DEFAULT_ADDRESS. In
fact, Windows 17763.3650 configures I/O APIC to set the dest_mode bit.
Cover the range assigned to APIC.
Fixes: 577c470f43 ("x86_iommu/amd: Prepare for interrupt remap support")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20230921114612.40671-1-akihiko.odaki@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 0114c45130)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This patch solves a few issues. The most obvious is that the feature
set was done previous to ACKNOWLEDGE | DRIVER status bit set. Current
vdpa devices are permissive with this, but it is better to follow the
standard.
Fixes: 152128d646 ("vdpa: move CVQ isolation check to net_init_vhost_vdpa")
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230915170836.3078172-4-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 845ec38ae1)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Otherwise it continues the CVQ isolation probing.
Fixes: 152128d646 ("vdpa: move CVQ isolation check to net_init_vhost_vdpa")
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230915170836.3078172-3-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit f1085882d0)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
It incorrectly prints "error setting features", probably because a copy
paste miss.
Fixes: 152128d646 ("vdpa: move CVQ isolation check to net_init_vhost_vdpa")
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230915170836.3078172-2-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit cbc9ae87b5)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Not zeroing it causes a SIGSEGV if the live migration is cancelled, at
net device restart.
This is caused because CVQ tries to reuse the iova_tree that is present
in the first vhost_vdpa device at the end of vhost_vdpa_net_cvq_start.
As a consequence, it tries to access an iova_tree that has been already
free.
Fixes: 00ef422e9f ("vdpa net: move iova tree creation from init to start")
Reported-by: Yanhui Ma <yama@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230913123408.2819185-1-eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 0a7a164bc3)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Use abi_ullong not uint64_t so that the alignment of the field
and therefore the layout of the struct is correct.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit 33bc4fa78b)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
When starting a guest via libvirt with "virsh start --console ...",
the first second of the console output is missing. This is especially
annoying on s390x that only has a text console by default and no graphical
output - if the bios fails to boot here, the information about what went
wrong is completely lost.
One part of the problem (there is also some things to be done on the
libvirt side) is that QEMU only checks with a 1 second timer whether
the other side of the pty is already connected, so the first second of
the console output is always lost.
This likely used to work better in the past, since the code once checked
for a re-connection during write, but this has been removed in commit
f8278c7d74 ("char-pty: remove the check for connection on write") to avoid
some locking.
To ease the situation here at least a little bit, let's check with g_poll()
whether we could send out the data anyway, even if the connection has not
been marked as "connected" yet. The file descriptor is marked as non-blocking
anyway since commit fac6688a18 ("Do not hang on full PTY"), so this should
not cause any trouble if the other side is not ready for receiving yet.
With this patch applied, I can now successfully see the bios output of
a s390x guest when running it with "virsh start --console" (with a patched
version of virsh that fixes the remaining issues there, too).
Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20230816210743.1319018-1-thuth@redhat.com>
(cherry picked from commit 4f7689f081)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The fw_cfg DMA write callback in ramfb prepares a new display surface in
QEMU; this new surface is put to use ("swapped in") upon the next display
update. At that time, the old surface (if any) is released.
If the guest triggers the fw_cfg DMA write callback at least twice between
two adjacent display updates, then the second callback (and further such
callbacks) will leak the previously prepared (but not yet swapped in)
display surface.
The issue can be shown by:
(1) starting QEMU with "-trace displaysurface_free", and
(2) running the following program in the guest UEFI shell:
> #include <Library/ShellCEntryLib.h> // ShellAppMain()
> #include <Library/UefiBootServicesTableLib.h> // gBS
> #include <Protocol/GraphicsOutput.h> // EFI_GRAPHICS_OUTPUT_PROTOCOL
>
> INTN
> EFIAPI
> ShellAppMain (
> IN UINTN Argc,
> IN CHAR16 **Argv
> )
> {
> EFI_STATUS Status;
> VOID *Interface;
> EFI_GRAPHICS_OUTPUT_PROTOCOL *Gop;
> UINT32 Mode;
>
> Status = gBS->LocateProtocol (
> &gEfiGraphicsOutputProtocolGuid,
> NULL,
> &Interface
> );
> if (EFI_ERROR (Status)) {
> return 1;
> }
>
> Gop = Interface;
>
> Mode = 1;
> for ( ; ;) {
> Status = Gop->SetMode (Gop, Mode);
> if (EFI_ERROR (Status)) {
> break;
> }
>
> Mode = 1 - Mode;
> }
>
> return 1;
> }
The symptom is then that:
- only one trace message appears periodically,
- the time between adjacent messages keeps increasing -- implying that
some list structure (containing the leaked resources) keeps growing,
- the "surface" pointer is ever different.
> 18566@1695127471.449586:displaysurface_free surface=0x7f2fcc09a7c0
> 18566@1695127471.529559:displaysurface_free surface=0x7f2fcc9dac10
> 18566@1695127471.659812:displaysurface_free surface=0x7f2fcc441dd0
> 18566@1695127471.839669:displaysurface_free surface=0x7f2fcc0363d0
> 18566@1695127472.069674:displaysurface_free surface=0x7f2fcc413a80
> 18566@1695127472.349580:displaysurface_free surface=0x7f2fcc09cd00
> 18566@1695127472.679783:displaysurface_free surface=0x7f2fcc1395f0
> 18566@1695127473.059848:displaysurface_free surface=0x7f2fcc1cae50
> 18566@1695127473.489724:displaysurface_free surface=0x7f2fcc42fc50
> 18566@1695127473.969791:displaysurface_free surface=0x7f2fcc45dcc0
> 18566@1695127474.499708:displaysurface_free surface=0x7f2fcc70b9d0
> 18566@1695127475.079769:displaysurface_free surface=0x7f2fcc82acc0
> 18566@1695127475.709941:displaysurface_free surface=0x7f2fcc369c00
> 18566@1695127476.389619:displaysurface_free surface=0x7f2fcc32b910
> 18566@1695127477.119772:displaysurface_free surface=0x7f2fcc0d5a20
> 18566@1695127477.899517:displaysurface_free surface=0x7f2fcc086c40
> 18566@1695127478.729962:displaysurface_free surface=0x7f2fccc72020
> 18566@1695127479.609839:displaysurface_free surface=0x7f2fcc185160
> 18566@1695127480.539688:displaysurface_free surface=0x7f2fcc23a7e0
> 18566@1695127481.519759:displaysurface_free surface=0x7f2fcc3ec870
> 18566@1695127482.549930:displaysurface_free surface=0x7f2fcc634960
> 18566@1695127483.629661:displaysurface_free surface=0x7f2fcc26b140
> 18566@1695127484.759987:displaysurface_free surface=0x7f2fcc321700
> 18566@1695127485.940289:displaysurface_free surface=0x7f2fccaad100
We figured this wasn't a CVE-worthy problem, as only small amounts of
memory were leaked (the framebuffer itself is mapped from guest RAM, QEMU
only allocates administrative structures), plus libvirt restricts QEMU
memory footprint anyway, thus the guest can only DoS itself.
Plug the leak, by releasing the last prepared (not yet swapped in) display
surface, if any, in the fw_cfg DMA write callback.
Regarding the "reproducer", with the fix in place, the log is flooded with
trace messages (one per fw_cfg write), *and* the trace message alternates
between just two "surface" pointer values (i.e., nothing is leaked, the
allocator flip-flops between two objects in effect).
This issue appears to date back to the introducion of ramfb (995b30179b,
"hw/display: add ramfb, a simple boot framebuffer living in guest ram",
2018-06-18).
Cc: Gerd Hoffmann <kraxel@redhat.com> (maintainer:ramfb)
Cc: qemu-stable@nongnu.org
Fixes: 995b30179b
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-ID: <20230919131955.27223-1-lersek@redhat.com>
(cherry picked from commit e0288a7784)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
In all likelihood, the compiler with lto doesn't see the function being
used, from assembly macro __try1. Help it by marking the function has
being used.
Resolves:
https://gitlab.com/qemu-project/qemu/-/issues/1904
Fixes: commit d89f30b4df ("win32: wrap socket close() with an exception handler")
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 75b773d84c)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(mjt: trivial context fixup in include/qemu/compiler.h)
CVTPS2PD only loads a half-register for memory, unlike the other
operations under 0x0F 0x5A. "Unpack" the group into separate
emission functions instead of using gen_unary_fp_sse.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit abd41884c5)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
CVTPS2PD only loads a half-register for memory, like CVTPH2PS. It can
reuse the "ph" packed half-precision size to load a half-register,
but rename it to "xh" because it is now a variation of "x" (it is not
used only for half-precision values).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit a48b26978a)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Update the berkeley-testfloat-3 wrap to include a patch provided by
Olaf Hering. This fixes a problem with "control reaches end of non-void
function [-Werror=return-type]" compiler warning/errors that are now
enabled by default in certain versions of GCC.
Reported-by: Olaf Hering <olaf@aepfle.de>
Message-Id: <20230816091522.1292029-1-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit c01196bddd)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Otherwise when a FORMAT UNIT command is issued, the SCSI layer can become
confused because it can find itself in the situation where it thinks there
is still data to be transferred which can cause the next emulated SCSI
command to fail.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Fixes: 6ab71761 ("scsi-disk: add FORMAT UNIT command")
Tested-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20230913204410.65650-4-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit be2b619a17)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
In the case where a SCSI layer transfer is incorrectly terminated, it is
possible for a TI command to cause a SCSI buffer overflow due to the
expected transfer data length being less than the available data in the
FIFO. When this occurs the unsigned async_len variable underflows and
becomes a large offset which writes past the end of the allocated SCSI
buffer.
Restrict the non-DMA transfer length to be the smallest of the expected
transfer length and the available FIFO data to ensure that it is no longer
possible for the SCSI buffer overflow to occur.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1810
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20230913204410.65650-3-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 77668e4b9b)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The call to esp_dma_enable() was being made with the SYSBUS_ESP type instead of
the ESP type. This meant that when GPIO 1 was being used to trigger a DMA
request from an external DMA controller, the setting of ESPState's dma_enabled
field would clobber unknown memory whilst the dma_cb callback pointer would
typically return NULL so the DMA request would never start.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20230913204410.65650-2-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit b86dc5cb0b)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Our linker script for optionroms specifies only the placement of the
.text section, leaving the linker free to place the remaining sections
at arbitrary places in the file.
Since at least binutils 2.39, the .note.gnu.build-id section is now
being placed at the start of the file, which causes label addresses to
be shifted. For linuxboot_dma.bin that means that the PnP header
(among others) will not be found when determining the type of ROM at
optionrom_setup():
(0x1c is the label _pnph, where the magic "PnP" is)
$ xxd /usr/share/qemu/linuxboot_dma.bin | grep "PnP"
00000010: 0000 0000 0000 0000 0000 1c00 2450 6e50 ............$PnP
$ xxd pc-bios/optionrom/linuxboot_dma.bin | grep "PnP"
00000010: 0000 0000 0000 0000 0000 4c00 2450 6e50 ............$PnP
^bad
Using a freshly built linuxboot_dma.bin ROM results in a broken boot:
SeaBIOS (version rel-1.16.2-0-gea1b7a073390-prebuilt.qemu.org)
Booting from Hard Disk...
Boot failed: could not read the boot disk
Booting from Floppy...
Boot failed: could not read the boot disk
No bootable device.
We're not using the build-id section, so pass the --build-id=none
option to the linker to remove it entirely.
Note: In theory, this same issue could happen with any other
section. The ideal solution would be to have all unused sections
discarded in the linker script. However that would be a larger change,
specially for the pvh rom which uses the .bss and COMMON sections so
I'm addressing only the immediate issue here.
Reported-by: Vasiliy Ulyanov <vulyanov@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20230926192502.15986-1-farosas@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 35ed01ba54)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(mjt: remove unrelated stable@vger)
we would crash if width was 0 for these insns, as tcg_gen_deposit() is
undefined for that case. For TriCore, width = 0 is a mov from the src reg
to the dst reg, so we special case this here.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Message-ID: <20230828112651.522058-9-kbastian@mail.uni-paderborn.de>
(cherry picked from commit 23fa6f56b3)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Require i/o as the last insn of a TranslationBlock always,
not only with icount. This is required for i/o that alters
the address space, such as a pci config space write.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1866
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit 18a536f1f8)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Without this we can get see loops through cpu_io_recompile,
in which the cpu makes no progress.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit 200c1f904f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Initialize can_do_io to true if this the TB has CF_LAST_IO
and will consist of a single instruction. This avoids a
set to 0 followed immediately by a set to 1.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit a2f99d484c)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Simplify translator_io_start by recording the current
known value of can_do_io within DisasContextBase.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit 0ca41ccf1c)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The condition checked is loop invariant; check it only once.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit 5d97e94638)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
With CF_NOIRQ and without !CF_USE_ICOUNT, the load isn't used.
Avoid emitting it.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit f47a90dacc)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
After system startup, run the update to memory_dispatch
and the tlb_flush on the cpu. This eliminates a race,
wherein a running cpu sees the memory_dispatch change
but has not yet seen the tlb_flush.
Since the update now happens on the cpu, we need not use
qatomic_rcu_read to protect the read of memory_dispatch.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1826
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1834
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1846
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit 0d58c66068)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Now that the return path thread is allowed to finish during a paused
migration, we can move the cleanup of the QEMUFiles to the main
migration thread.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230918172822.19052-9-farosas@suse.de>
(cherry picked from commit 36e9aab3c5)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Replace the return path retry logic with finishing and restarting the
thread. This fixes a race when resuming the migration that leads to a
segfault.
Currently when doing postcopy we consider that an IO error on the
return path file could be due to a network intermittency. We then keep
the thread alive but have it do cleanup of the 'from_dst_file' and
wait on the 'postcopy_pause_rp' semaphore. When the user issues a
migrate resume, a new return path is opened and the thread is allowed
to continue.
There's a race condition in the above mechanism. It is possible for
the new return path file to be setup *before* the cleanup code in the
return path thread has had a chance to run, leading to the *new* file
being closed and the pointer set to NULL. When the thread is released
after the resume, it tries to dereference 'from_dst_file' and crashes:
Thread 7 "return path" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffd1dbf700 (LWP 9611)]
0x00005555560e4893 in qemu_file_get_error_obj (f=0x0, errp=0x0) at ../migration/qemu-file.c:154
154 return f->last_error;
(gdb) bt
#0 0x00005555560e4893 in qemu_file_get_error_obj (f=0x0, errp=0x0) at ../migration/qemu-file.c:154
#1 0x00005555560e4983 in qemu_file_get_error (f=0x0) at ../migration/qemu-file.c:206
#2 0x0000555555b9a1df in source_return_path_thread (opaque=0x555556e06000) at ../migration/migration.c:1876
#3 0x000055555602e14f in qemu_thread_start (args=0x55555782e780) at ../util/qemu-thread-posix.c:541
#4 0x00007ffff38d76ea in start_thread (arg=0x7fffd1dbf700) at pthread_create.c:477
#5 0x00007ffff35efa6f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Here's the race (important bit is open_return_path happening before
migration_release_dst_files):
migration | qmp | return path
--------------------------+-----------------------------+---------------------------------
qmp_migrate_pause()
shutdown(ms->to_dst_file)
f->last_error = -EIO
migrate_detect_error()
postcopy_pause()
set_state(PAUSED)
wait(postcopy_pause_sem)
qmp_migrate(resume)
migrate_fd_connect()
resume = state == PAUSED
open_return_path <-- TOO SOON!
set_state(RECOVER)
post(postcopy_pause_sem)
(incoming closes to_src_file)
res = qemu_file_get_error(rp)
migration_release_dst_files()
ms->rp_state.from_dst_file = NULL
post(postcopy_pause_rp_sem)
postcopy_pause_return_path_thread()
wait(postcopy_pause_rp_sem)
rp = ms->rp_state.from_dst_file
goto retry
qemu_file_get_error(rp)
SIGSEGV
-------------------------------------------------------------------------------------------
We can keep the retry logic without having the thread alive and
waiting. The only piece of data used by it is the 'from_dst_file' and
it is only allowed to proceed after a migrate resume is issued and the
semaphore released at migrate_fd_connect().
Move the retry logic to outside the thread by waiting for the thread
to finish before pausing the migration.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230918172822.19052-8-farosas@suse.de>
(cherry picked from commit ef796ee93b)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
We'll start calling the await_return_path_close_on_source() function
from other parts of the code, so move all of the related checks and
tracepoints into it.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230918172822.19052-7-farosas@suse.de>
(cherry picked from commit d50f5dc075)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This file is owned by the return path thread which is already doing
cleanup.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230918172822.19052-6-farosas@suse.de>
(cherry picked from commit b3b101157d)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
It's not safe to call qemu_file_shutdown() on the to_dst_file without
first checking for the file's presence under the lock. The cleanup of
this file happens at postcopy_pause() and migrate_fd_cleanup() which
are not necessarily running in the same thread as migrate_fd_cancel().
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230918172822.19052-5-farosas@suse.de>
(cherry picked from commit 7478fb0df9)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
We cannot call qemu_file_shutdown() on the return path file without
taking the file lock. The return path thread could be running it's
cleanup code and have just cleared the from_dst_file pointer.
Checking ms->to_dst_file for errors could also race with
migrate_fd_cleanup() which clears the to_dst_file pointer.
Protect both accesses by taking the file lock.
This was caught by inspection, it should be rare, but the next patches
will start calling this code from other places, so let's do the
correct thing.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230918172822.19052-4-farosas@suse.de>
(cherry picked from commit 639decf529)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
We don't need to set the rp_state.error right after a shutdown because
qemu_file_shutdown() always sets the QEMUFile error, so the return
path thread would have seen it and set the rp error itself.
Setting the error outside of the thread is also racy because the
thread could clear it after we set it.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230918172822.19052-3-farosas@suse.de>
(cherry picked from commit 28a8347281)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
We hit intermit CI issue on failing at migration-test over the unit test
preempt/plain:
qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Memory content inconsistency at 5b43000 first_byte = bd last_byte = bc current = 4f hit_edge = 1
**
ERROR:../tests/qtest/migration-test.c:300:check_guests_ram: assertion failed: (bad == 0)
(test program exited with status code -6)
Fabiano debugged into it and found that the preempt thread can quit even
without receiving all the pages, which can cause guest not receiving all
the pages and corrupt the guest memory.
To make sure preempt thread finished receiving all the pages, we can rely
on the page_requested_count being zero because preempt channel will only
receive requested page faults. Note, not all the faulted pages are required
to be sent via the preempt channel/thread; imagine the case when a
requested page is just queued into the background main channel for
migration, the src qemu will just still send it via the background channel.
Here instead of spinning over reading the count, we add a condvar so the
main thread can wait on it if that unusual case happened, without burning
the cpu for no good reason, even if the duration is short; so even if we
spin in this rare case is probably fine. It's just better to not do so.
The condvar is only used when that special case is triggered. Some memory
ordering trick is needed to guarantee it from happening (against the
preempt thread status field), so the main thread will always get a kick
when that triggers correctly.
Closes: https://gitlab.com/qemu-project/qemu/-/issues/1886
Debugged-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230918172822.19052-2-farosas@suse.de>
(cherry picked from commit cf02f29e1e)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
VNC_FEATURE_XVP was not shifted left before adding it to vs->features,
so it was never enabled; but it was also checked the wrong way with
a logical AND instead of vnc_has_feature. Fix both places.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 477b301000)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The debug message was cut and pasted from the invalid audio format
case, but the audio message is at bytes 2-3.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 0cb9c5880e)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
We are doing things like
nb_sectors /= (s->qdev.blocksize / BDRV_SECTOR_SIZE);
in the code here (e.g. in scsi_disk_emulate_mode_sense()), so if
the blocksize is smaller than BDRV_SECTOR_SIZE (=512), this crashes
with a division by 0 exception. Thus disallow block sizes of 256
bytes to avoid this situation.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1813
CVE: 2023-42467
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20230925091854.49198-1-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 7cfcc79b0a)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
mttcg asserts that an execution ending with EXCP_HALTED must have
cpu->halted. However between the event or instruction that sets
cpu->halted and requests exit and the assertion here, an
asynchronous event could clear cpu->halted.
This leads to crashes running AIX on ppc/pseries because it uses
H_CEDE/H_PROD hcalls, where H_CEDE sets self->halted = 1 and
H_PROD sets other cpu->halted = 0 and kicks it.
H_PROD could be turned into an interrupt to wake, but several other
places in ppc, sparc, and semihosting follow what looks like a similar
pattern setting halted = 0 directly. So remove this assertion.
Reported-by: Ivan Warren <ivan@vmfacility.fr>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20230829010658.8252-1-npiggin@gmail.com>
[rth: Keep the case label and adjust the comment.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit 0e5903436d)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Commit 0db0fbb5cf ("Add conditional dependency for libkeyutils")
tried to provide a possibility for the user to disable keyutils
if not required by makeing it depend on the keyring feature. This
looked reasonable at a first glance (the unit test in tests/unit/
needs both), but the condition in meson.build fails if the feature
is meant to be detected automatically, and there is also another
spot in backends/meson.build where keyutils is used independently
from keyring. So let's remove the dependency on keyring again and
introduce a proper meson build option instead.
Cc: qemu-stable@nongnu.org
Fixes: 0db0fbb5cf ("Add conditional dependency for libkeyutils")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1842
Message-ID: <20230824094208.255279-1-thuth@redhat.com>
Reviewed-by: "Daniel P. Berrangé" <berrange@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit c64023b0ba)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The LDRT/STRT "unprivileged load/store" instructions behave like
normal ones if executed at EL0. We handle this correctly for
the load/store semantics, but get the MTE checking wrong.
We always look at s->mte_active[is_unpriv] to see whether we should
be doing MTE checks, but in hflags.c when we set the TB flags that
will be used to fill the mte_active[] array we only set the
MTE0_ACTIVE bit if UNPRIV is true (i.e. we are not at EL0).
This means that a LDRT at EL0 will see s->mte_active[1] as 0,
and will not do MTE checks even when MTE is enabled.
To avoid the translate-time code having to do an explicit check on
s->unpriv to see if it is OK to index into the mte_active[] array,
duplicate MTE_ACTIVE into MTE0_ACTIVE when UNPRIV is false.
(This isn't a very serious bug because generally nobody executes
LDRT/STRT at EL0, because they have no use there.)
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230912140434.1333369-2-peter.maydell@linaro.org
(cherry picked from commit 903dbefc2b)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Just like d7ef5e16a1 sets SCR_EL3.HXEn for FEAT_HCX, this commit
handles SCR_EL3.FGTEn for FEAT_FGT:
When we direct boot a kernel on a CPU which emulates EL3, we need to
set up the EL3 system registers as the Linux kernel documentation
specifies:
https://www.kernel.org/doc/Documentation/arm64/booting.rst
> For CPUs with the Fine Grained Traps (FEAT_FGT) extension present:
> - If EL3 is present and the kernel is entered at EL2:
> - SCR_EL3.FGTEn (bit 27) must be initialised to 0b1.
Cc: qemu-stable@nongnu.org
Signed-off-by: Fabian Vogt <fvogt@suse.de>
Message-id: 4831384.GXAFRqVoOG@linux-e202.suse.de
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 32b214384e)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
tlb_addr is changed from target_ulong to uint64_t to match the type of
a CPUTLBEntry value, and the addressed is changed to vaddr.
Signed-off-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230807155706.9580-8-anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit c78edb5639)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This is a regression test for
https://bugzilla.redhat.com/show_bug.cgi?id=2234374.
All this test needs to do is trigger an I/O error inside of file-posix
(specifically raw_co_prw()). One reliable way to do this without
requiring special privileges is to use a FUSE export, which allows us to
inject any error that we want, e.g. via blkdebug.
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-Id: <20230824155345.109765-6-hreitz@redhat.com>
[hreitz: Fixed test to be skipped when there is no FUSE support, to
suppress fusermount's allow_other warning, and to be skipped
with $IMGOPTSSYNTAX enabled]
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
(cherry picked from commit 380448464d)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
We duplicate the same condition three times here, pull it out to the top
level.
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-Id: <20230824155345.109765-5-hreitz@redhat.com>
Reviewed-by: Sam Li <faithilikerun@gmail.com>
(cherry picked from commit d31b50a15d)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
We must check that zone information is present before running
update_zones_wp().
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2234374
Fixes: Coverity CID 1512459
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-Id: <20230824155345.109765-4-hreitz@redhat.com>
Reviewed-by: Sam Li <faithilikerun@gmail.com>
(cherry picked from commit deab5c9a4e)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Instead of checking bs->wps or bs->bl.zone_size for whether zone
information is present, check bs->bl.zoned. That is the flag that
raw_refresh_zoned_limits() reliably sets to indicate zone support. If
it is set to something other than BLK_Z_NONE, other values and objects
like bs->wps and bs->bl.zone_size must be non-null/zero and valid; if it
is not, we cannot rely on their validity.
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-Id: <20230824155345.109765-3-hreitz@redhat.com>
Reviewed-by: Sam Li <faithilikerun@gmail.com>
(cherry picked from commit 4b5d80f3d0)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
bs->bl.zoned is what indicates whether the zone information is present
and valid; it is the only thing that raw_refresh_zoned_limits() sets if
CONFIG_BLKZONED is not defined, and it is also the only thing that it
sets if CONFIG_BLKZONED is defined, but there are no zones.
Make sure that it is always set to BLK_Z_NONE if there is an error
anywhere in raw_refresh_zoned_limits() so that we do not accidentally
announce zones while our information is incomplete or invalid.
This also fixes a memory leak in the last error path in
raw_refresh_zoned_limits().
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-Id: <20230824155345.109765-2-hreitz@redhat.com>
Reviewed-by: Sam Li <faithilikerun@gmail.com>
(cherry picked from commit 56d1a022a7)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
According to cxl_interleave_ways_enc(), fw->num_targets is allowed to be up
to 16. This also corresponds to CXL r3.0 spec. So, the fw->target_hbs[]
array is iterated from 0 to 15. But it is statically declared of length 8.
Thus, out of bound array access may occur.
Fixes: c28db9e000 ("hw/pci-bridge: Make PCIe and CXL PXB Devices inherit from TYPE_PXB_DEV")
Signed-off-by: Dmitry Frolov <frolov@swemel.ru>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Link: https://lore.kernel.org/r/20230913101055.754709-1-frolov@swemel.ru
Cc: qemu-stable@nongnu.org
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit de5bbfc602)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Allocate targets and targets[n] resources when all sanity checks are
passed to avoid memory leaks.
Cc: qemu-stable@nongnu.org
Suggested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit 7b165fa164)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The code in setup_rt_frame reads two words at haddr, but locks only one.
This patch fixes it to lock both.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Acked-by: Helge Deller <deller@gmx.de>
Cc: qemu-stable@nongnu.org
Signed-off-by: Helge Deller <deller@gmx.de>
(cherry picked from commit 5b1270ef14)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
qemu-hppa may crash when delivering a signal. It can be demonstrated with
this program. Compile the program with "hppa-linux-gnu-gcc -O2 signal.c"
and run it with "qemu-hppa -one-insn-per-tb a.out". It reports that the
address of the flag is 0xb4 and it crashes when attempting to touch it.
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <signal.h>
sig_atomic_t flag;
void sig(int n)
{
printf("&flag: %p\n", &flag);
flag = 1;
}
int main(void)
{
struct sigaction sa;
struct itimerval it;
sa.sa_handler = sig;
sigemptyset(&sa.sa_mask);
sa.sa_flags = SA_RESTART;
if (sigaction(SIGALRM, &sa, NULL)) perror("sigaction"), exit(1);
it.it_interval.tv_sec = 0;
it.it_interval.tv_usec = 100;
it.it_value.tv_sec = it.it_interval.tv_sec;
it.it_value.tv_usec = it.it_interval.tv_usec;
if (setitimer(ITIMER_REAL, &it, NULL)) perror("setitimer"), exit(1);
while (1) {
}
}
The reason for the crash is that the signal handling routine doesn't clear
the 'N' flag in the PSW. If the signal interrupts a thread when the 'N'
flag is set, the flag remains set at the beginning of the signal handler
and the first instruction of the signal handler is skipped.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Acked-by: Helge Deller <deller@gmx.de>
Cc: qemu-stable@nongnu.org
Signed-off-by: Helge Deller <deller@gmx.de>
(cherry picked from commit 2529497cb6)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reading the time more than once to perform an operation always increases
complexity and fragility due to introduced deltas. Simplify the
decrementer write by reading the clock once for the operation.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
(cherry picked from commit ea62f8a517)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Lower interrupts, delete timers, and set time facility registers
back to initial state on machine reset.
This is not so important for record-replay since timebase and
decrementer are migrated, but it gives a cleaner reset state.
Cc: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Cc: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[ clg: checkpatch.pl fixes ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
(cherry picked from commit 30d0647bcf)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
When writing a value to the decrementer that raises an exception, the
irq is raised, but the value is not stored so the store doesn't appear
to have changed the register when it is read again.
Always store the write value to the register.
Fixes: e81a982aa5 ("PPC: Clean up DECR implementation")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
(cherry picked from commit febb71d543)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
When storing a large decrementer value with the most significant
implemented bit set, it is to be treated as a negative and sign
extended.
This isn't hit for book3s DEC because of another bug, fixing it
in the next patch exposes this one and can cause additional
problems, so fix this first. It can be hit with HDECR and other
edge triggered types.
Fixes: a8dafa5251 ("target/ppc: Implement large decrementer support for TCG")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[ clg: removed extra cpu and pcc variables shadowing local variables ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
(cherry picked from commit c8fbc6b9f2)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The decrementer register contains a relative time in timebase units.
When writing to DECR this is converted and stored as an absolute value
in nanosecond units, reading DECR converts back to relative timebase.
The tb<->ns conversion of the relative part can cause rounding such that
a value writen to the decrementer can read back a different, with time
held constant. This is a particular problem for a deterministic icount
and record-replay trace.
Fix this by storing the absolute value in timebase units rather than
nanoseconds. The math before:
store: decr_next = now_ns + decr * ns_per_sec / tb_per_sec
load: decr = (decr_next - now_ns) * tb_per_sec / ns_per_sec
load(store): decr = decr * ns_per_sec / tb_per_sec * tb_per_sec /
ns_per_sec
After:
store: decr_next = now_ns * tb_per_sec / ns_per_sec + decr
load: decr = decr_next - now_ns * tb_per_sec / ns_per_sec
load(store): decr = decr
Fixes: 9fddaa0c0c ("PowerPC merge: real time TB and decrementer - faster and simpler exception handling (Jocelyn Mayer)")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
(cherry picked from commit 8e0a5ac878)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The rule of timers is typically that they should never expire before the
timeout, but some time afterward. Rounding timer intervals up when doing
conversion is the right thing to do.
Under most circumstances it is impossible observe the decrementer
interrupt before the dec register has triggered. However with icount
timing, problems can arise. For example setting DEC to 0 can schedule
the timer for now, causing it to fire before any more instructions
have been executed and DEC is still 0.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
(cherry picked from commit eab0888418)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This will be used for converting time intervals in different base units
to host units, for the purpose of scheduling timers to emulate target
timers. Timers typically must not fire before their requested expiry
time but may fire some time afterward, so rounding up is the right way
to implement these.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
[ clg: renamed __muldiv64() to muldiv64_rounding() ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
(cherry picked from commit 47de6c4c28)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
These calculations are repeated several times, and they will become
a little more complicated with subsequent changes.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
(cherry picked from commit 7798f5c576)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Replace select() with poll() to fix a crash when QEMU has a large number
of FDs. Also use RETRY_ON_EINTR to avoid unnecessary errors due to EINTR.
Cc: qemu-stable@nongnu.org
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2020133
Fixes: 56a3c24ffc ("tpm: Probe for connected TPM 1.2 or TPM 2")
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
(cherry picked from commit 8e32ddff69)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
qemu 8.1.0 breaks on illumos platforms due to _XOPEN_SOURCE and others no longer being set correctly, leading to breakage such as:
https://us-central.manta.mnx.io/pkgsrc/public/reports/trunk/tools/20230908.1404/qemu-8.1.0/build.log
This is a result of meson conversion which incorrectly matches against 'solaris' instead of 'sunos' for uname.
First time submitting a patch here, hope I did it correctly. Thanks.
Signed-off-by: Jonathan Perkin <jonathan@perkin.org.uk>
Message-ID: <ZPtdxtum9UVPy58J@perkin.org.uk>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit fb0a8b0e23)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: omit net/meson.build change before v8.1.0-279-g73258b3864, adjust context befor v8.1.0-288-g2fc36530de)
A subsystem reset contains a reset of AP resources which has been
missing. Adding the AP bridge to the list of device types that need
reset fixes this issue.
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Fixes: a51b3153 ("s390x/ap: base Adjunct Processor (AP) object model")
Message-ID: <20230823142219.1046522-2-seiden@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 297ec01f0b)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
0x0000555555888630 in dpy_ui_info_supported (con=0x0) at ../ui/console.c:812
812 return con->hw_ops->ui_info != NULL;
(gdb) bt
#0 0x0000555555888630 in dpy_ui_info_supported (con=0x0) at ../ui/console.c:812
#1 0x00005555558a44b1 in protocol_client_msg (vs=0x5555578c76c0, data=0x5555581e93f0 <incomplete sequence \373>, len=24) at ../ui/vnc.c:2585
#2 0x00005555558a19ac in vnc_client_read (vs=0x5555578c76c0) at ../ui/vnc.c:1607
#3 0x00005555558a1ac2 in vnc_client_io (ioc=0x5555581eb0e0, condition=G_IO_IN, opaque=0x5555578c76c0) at ../ui/vnc.c:1635
Fixes:
https://issues.redhat.com/browse/RHEL-2600
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Albert Esteve <aesteve@redhat.com>
(cherry picked from commit 48a35e12fa)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
riscv_trigger_init() had been called on reset events that can happen
several times for a CPU and it allocated timers for itrigger. If old
timers were present, they were simply overwritten by the new timers,
resulting in a memory leak.
Divide riscv_trigger_init() into two functions, namely
riscv_trigger_realize() and riscv_trigger_reset() and call them in
appropriate timing. The timer allocation will happen only once for a
CPU in riscv_trigger_realize().
Fixes: 5a4ae64cac ("target/riscv: Add itrigger support when icount is enabled")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20230818034059.9146-1-akihiko.odaki@daynix.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
(cherry picked from commit a7c272df82)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
When the rule-lock bypass (RLB) bit is set in the mseccfg CSR, the PMP
configuration lock bits must not apply. While this behavior is
implemented for the pmpcfgX CSRs, this bit is not respected for
changes to the pmpaddrX CSRs. This patch ensures that pmpaddrX CSR
writes work even on locked regions when the global rule-lock bypass is
enabled.
Signed-off-by: Leon Schuermann <leons@opentitan.org>
Reviewed-by: Mayuresh Chitale <mchitale@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20230829215046.1430463-1-leon@is.currently.online>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
(cherry picked from commit 4e3adce124)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
In the same emulated RISC-V host, the 'host' KVM CPU takes 4 times
longer to boot than the 'rv64' KVM CPU.
The reason is an unintended behavior of riscv_cpu_satp_mode_finalize()
when satp_mode.supported = 0, i.e. when cpu_init() does not set
satp_mode_max_supported(). satp_mode_max_from_map(map) does:
31 - __builtin_clz(map)
This means that, if satp_mode.supported = 0, satp_mode_supported_max
wil be '31 - 32'. But this is C, so satp_mode_supported_max will gladly
set it to UINT_MAX (4294967295). After that, if the user didn't set a
satp_mode, set_satp_mode_default_map(cpu) will make
cfg.satp_mode.map = cfg.satp_mode.supported
So satp_mode.map = 0. And then satp_mode_map_max will be set to
satp_mode_max_from_map(cpu->cfg.satp_mode.map), i.e. also UINT_MAX. The
guard "satp_mode_map_max > satp_mode_supported_max" doesn't protect us
here since both are UINT_MAX.
And finally we have 2 loops:
for (int i = satp_mode_map_max - 1; i >= 0; --i) {
Which are, in fact, 2 loops from UINT_MAX -1 to -1. This is where the
extra delay when booting the 'host' CPU is coming from.
Commit 43d1de32f8 already set a precedence for satp_mode.supported = 0
in a different manner. We're doing the same here. If supported == 0,
interpret as 'the CPU wants the OS to handle satp mode alone' and skip
satp_mode_finalize().
We'll also put a guard in satp_mode_max_from_map() to assert out if map
is 0 since the function is not ready to deal with it.
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Fixes: 6f23aaeb9b ("riscv: Allow user to set the satp mode")
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Message-ID: <20230817152903.694926-1-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
(cherry picked from commit 3a2fc23563)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
On a dtb dumped from the virt machine, dt-validate complains:
soc: pmu: {'riscv,event-to-mhpmcounters': [[1, 1, 524281], [2, 2, 524284], [65561, 65561, 524280], [65563, 65563, 524280], [65569, 65569, 524280]], 'compatible': ['riscv,pmu']} should not be valid under {'type': 'object'}
from schema $id: http://devicetree.org/schemas/simple-bus.yaml#
That's pretty cryptic, but running the dtb back through dtc produces
something a lot more reasonable:
Warning (simple_bus_reg): /soc/pmu: missing or empty reg/ranges property
Moving the riscv,pmu node out of the soc bus solves the problem.
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20230727-groom-decline-2c57ce42841c@spud>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
(cherry picked from commit 9ff3140631)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
We should not use types dependend on host arch for target_ucontext.
This bug is found when run rv32 applications.
Signed-off-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230811055438.1945-1-zhiwei_liu@linux.alibaba.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
(cherry picked from commit ae7d4d625c)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The variables whose values are given by cpu_riscv_read_rtc() should be named
"rtc". The variables whose value are given by cpu_riscv_read_rtc_raw()
should be named "rtc_r".
Signed-off-by: Jason Chien <jason.chien@sifive.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20230728082502.26439-2-jason.chien@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
(cherry picked from commit 9382a9eafc)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
When writing the upper mtime, we should keep the original lower mtime
whose value is given by cpu_riscv_read_rtc() instead of
cpu_riscv_read_rtc_raw(). The same logic applies to writes to lower mtime.
Signed-off-by: Jason Chien <jason.chien@sifive.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20230728082502.26439-1-jason.chien@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
(cherry picked from commit e0922b73ba)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Commit a47842d ("riscv: Add support for the Zfa extension") implemented the zfa extension.
However, it has some typos for fleq.d and fltq.d. Both of them misused the fltq.s
helper function.
Fixes: a47842d ("riscv: Add support for the Zfa extension")
Signed-off-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Weiwei Li <liweiwei@iscas.ac.cn>
Message-ID: <20230728003906.768-1-zhiwei_liu@linux.alibaba.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
(cherry picked from commit eda633a534)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Commit bef6f008b98(accel/tcg: Return bool from page_check_range) converts
integer return value to bool type. However, it wrongly converted the use
of the API in riscv fault-only-first, where page_check_range < = 0, should
be converted to !page_check_range.
Signed-off-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20230729031618.821-1-zhiwei_liu@linux.alibaba.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
(cherry picked from commit 4cc9f284d5)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
zmmul was promoted from experimental to ratified in commit 6d00ffad4e.
Add a riscv,isa string for it.
Fixes: 6d00ffad4e ("target/riscv: move zmmul out of the experimental properties")
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Weiwei Li <liweiwei@iscas.ac.cn>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20230720132424.371132-2-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
(cherry picked from commit 50f9464962)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Values that have been read via cpu_physical_memory_read() from the
guest's memory have to be swapped in case the host endianess differs
from the guest.
Fixes: a6e13e31d5 ("riscv_htif: Support console output via proxy syscall")
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Bin Meng <bmeng@tinylab.org>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-Id: <20230721094720.902454-3-thuth@redhat.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
(cherry picked from commit 058096f1c5)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The character that should be printed is stored in the 64 bit "payload"
variable. The code currently tries to print it by taking the address
of the variable and passing this pointer to qemu_chr_fe_write(). However,
this only works on little endian hosts where the least significant bits
are stored on the lowest address. To do this in a portable way, we have
to store the value in an uint8_t variable instead.
Fixes: 5033606780 ("RISC-V HTIF Console")
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Bin Meng <bmeng@tinylab.org>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230721094720.902454-2-thuth@redhat.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
(cherry picked from commit c255946e3d)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Due to recent KVM changes, QEMU is setting a ptimer offset resulting
in unintended trap and emulate access and a consequent performance
hit. Filter out the PTIMER_CNT register to restore trapless ptimer
access.
Quoting Andrew Jones:
Simply reading the CNT register and writing back the same value is
enough to set an offset, since the timer will have certainly moved
past whatever value was read by the time it's written. QEMU
frequently saves and restores all registers in the get-reg-list array,
unless they've been explicitly filtered out (with Linux commit
680232a94c12, KVM_REG_ARM_PTIMER_CNT is now in the array). So, to
restore trapless ptimer accesses, we need a QEMU patch to filter out
the register.
See
https://lore.kernel.org/kvmarm/gsntttsonus5.fsf@coltonlewis-kvm.c.googlers.com/T/#m0770023762a821db2a3f0dd0a7dc6aa54e0d0da9
for additional context.
Cc: qemu-stable@nongnu.org
Signed-off-by: Andrew Jones <andrew.jones@linux.dev>
Signed-off-by: Colton Lewis <coltonlewis@google.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Colton Lewis <coltonlewis@google.com>
Message-id: 20230831190052.129045-1-coltonlewis@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 682814e2a3)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
virtio_load() as a whole should run in coroutine context because it
reads from the migration stream and we don't want this to block.
However, it calls virtio_set_features_nocheck() and devices don't
expect their .set_features callback to run in a coroutine and therefore
call functions that may not be called in coroutine context. To fix this,
drop out of coroutine context for calling virtio_set_features_nocheck().
Without this fix, the following crash was reported:
#0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
#1 0x00007efc738c05d3 in __pthread_kill_internal (signo=6, threadid=<optimized out>) at pthread_kill.c:78
#2 0x00007efc73873d26 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3 0x00007efc738477f3 in __GI_abort () at abort.c:79
#4 0x00007efc7384771b in __assert_fail_base (fmt=0x7efc739dbcb8 "", assertion=assertion@entry=0x560aebfbf5cf "!qemu_in_coroutine()",
file=file@entry=0x560aebfcd2d4 "../block/graph-lock.c", line=line@entry=275, function=function@entry=0x560aebfcd34d "void bdrv_graph_rdlock_main_loop(void)") at assert.c:92
#5 0x00007efc7386ccc6 in __assert_fail (assertion=0x560aebfbf5cf "!qemu_in_coroutine()", file=0x560aebfcd2d4 "../block/graph-lock.c", line=275,
function=0x560aebfcd34d "void bdrv_graph_rdlock_main_loop(void)") at assert.c:101
#6 0x0000560aebcd8dd6 in bdrv_register_buf ()
#7 0x0000560aeb97ed97 in ram_block_added.llvm ()
#8 0x0000560aebb8303f in ram_block_add.llvm ()
#9 0x0000560aebb834fa in qemu_ram_alloc_internal.llvm ()
#10 0x0000560aebb2ac98 in vfio_region_mmap ()
#11 0x0000560aebb3ea0f in vfio_bars_register ()
#12 0x0000560aebb3c628 in vfio_realize ()
#13 0x0000560aeb90f0c2 in pci_qdev_realize ()
#14 0x0000560aebc40305 in device_set_realized ()
#15 0x0000560aebc48e07 in property_set_bool.llvm ()
#16 0x0000560aebc46582 in object_property_set ()
#17 0x0000560aebc4cd58 in object_property_set_qobject ()
#18 0x0000560aebc46ba7 in object_property_set_bool ()
#19 0x0000560aeb98b3ca in qdev_device_add_from_qdict ()
#20 0x0000560aebb1fbaf in virtio_net_set_features ()
#21 0x0000560aebb46b51 in virtio_set_features_nocheck ()
#22 0x0000560aebb47107 in virtio_load ()
#23 0x0000560aeb9ae7ce in vmstate_load_state ()
#24 0x0000560aeb9d2ee9 in qemu_loadvm_state_main ()
#25 0x0000560aeb9d45e1 in qemu_loadvm_state ()
#26 0x0000560aeb9bc32c in process_incoming_migration_co.llvm ()
#27 0x0000560aebeace56 in coroutine_trampoline.llvm ()
Cc: qemu-stable@nongnu.org
Buglink: https://issues.redhat.com/browse/RHEL-832
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230905145002.46391-3-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 92e2e6a867)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The assert() that checks for valid MTU sizes can be triggered by
the guest (e.g. with the reproducer code from the bug ticket
https://gitlab.com/qemu-project/qemu/-/issues/517 ). Let's avoid
this problem by simply logging the error and refusing to activate
the device instead.
Fixes: d05dcd94ae ("net: vmxnet3: validate configuration values during activate")
Signed-off-by: Thomas Huth <thuth@redhat.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
[Mjt: change format specifier from %d to %u for uint32_t argument]
(cherry picked from commit 90a0778421)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
docs/multi-thread-compression.txt uses parameter names with
underscores instead of dashes. Wrong since day one.
docs/rdma.txt, tests/qemu-iotests/181, and tests/qtest/test-hmp.c are
wrong the same way since commit cbde7be900 (v6.0.0). Hard to see,
as test-hmp doesn't check whether the commands work, and iotest 181
appears to be unaffected.
Fixes: 263170e679 (docs: Add a doc about multiple thread compression)
Fixes: cbde7be900 (migrate: remove QMP/HMP commands for speed, downtime and cache size)
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit b21a6e31a1)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The current description says that these options will create a device
on the IDE bus, which is only true on x86. So rephrase these sentences
a little bit to speak of "default bus" instead.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit bcd8e24308)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
According to the ast2600 datasheet and the linux aspeed i2c driver,
the TXBUF transmission start position should be TXBUF[0] instead
of TXBUF[1],so the arg pool_start is useless,and the address is not
included in TXBUF.So even if Tx Count equals zero,there is at least
1 byte data needs to be transmitted,and M_TX_CMD should not be cleared
at this condition.The driver url is:
https://github.com/AspeedTech-BMC/linux/blob/aspeed-master-v5.15/drivers/i2c/busses/i2c-ast2600.c
Signed-off-by: Hang Yu <francis_yuu@stu.pku.edu.cn>
Fixes: 6054fc73e8 ("aspeed/i2c: Add support for pool buffer transfers")
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
(cherry picked from commit 961faf3ddb)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Fixed inconsistency between the regisiter bit field definition header file
and the ast2600 datasheet. The reg name is I2CD1C:Pool Buffer Control
Register in old register mode and I2CC0C: Master/Slave Pool Buffer Control
Register in new register mode. They share bit field
[12:8]:Transmit Data Byte Count and bit field
[29:24]:Actual Received Pool Buffer Size according to the datasheet.
According to the ast2600 datasheet,the actual Tx count is
Transmit Data Byte Count plus 1, and the max Rx size is
Receive Pool Buffer Size plus 1, both in Pool Buffer Control Register.
The version before forgot to plus 1, and mistake Rx count for Rx size.
Signed-off-by: Hang Yu <francis_yuu@stu.pku.edu.cn>
Fixes: 3be3d6ccf2 ("aspeed: i2c: Migrate to registerfields API")
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
(cherry picked from commit 97b8aa5ae9)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
When encountering an NCQ error, you should not write the NCQ tag to the
SError register. This is completely wrong.
The SError register has a clear definition, where each bit represents a
different error, see PxSERR definition in AHCI 1.3.1.
If we write a random value (like the NCQ tag) in SError, e.g. Linux will
read SError, and will trigger arbitrary error handling depending on the
NCQ tag that happened to be executing.
In case of success, ncq_cb() will call ncq_finish().
In case of error, ncq_cb() will call ncq_err() (which will clear
ncq_tfs->used), and then call ncq_finish(), thus using ncq_tfs->used is
sufficient to tell if finished should get set or not.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20230609140844.202795-9-nks@flawful.org
Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit 9f89423537)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
When there is an error, we need to raise a TFES error irq, see AHCI 1.3.1,
5.3.13.1 SDB:Entry.
If ERR_STAT is set, we jump to state ERR:FatalTaskfile, which will raise
a TFES IRQ unconditionally, regardless if the I bit is set in the FIS or
not.
Thus, we should never raise a normal IRQ after having sent an error IRQ.
It is valid to signal successfully completed commands as finished in the
same SDB FIS that generates the error IRQ. The important thing is that
commands that did not complete successfully (e.g. commands that were
aborted, do not get the finished bit set).
Before this commit, there was never a TFES IRQ raised on NCQ error.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20230609140844.202795-8-nks@flawful.org
Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit 7e85cb0db4)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
For NCQ, PxCI is cleared on command queued successfully.
For non-NCQ, PxCI is cleared on command completed successfully.
Successfully means ERR_STAT, BUSY and DRQ are all cleared.
A command that has ERR_STAT set, does not get to clear PxCI.
See AHCI 1.3.1, section 5.3.8, states RegFIS:Entry and RegFIS:ClearCI,
and 5.3.16.5 ERR:FatalTaskfile.
In the case of non-NCQ commands, not clearing PxCI is needed in order
for host software to be able to see which command slot that failed.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Message-id: 20230609140844.202795-7-nks@flawful.org
Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit 1a16ce64fd)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
According to AHCI 1.3.1 definition of PxSACT:
This field is cleared when PxCMD.ST is written from a '1' to a '0' by
software. This field is not cleared by a COMRESET or a software reset.
According to AHCI 1.3.1 definition of PxCI:
This field is also cleared when PxCMD.ST is written from a '1' to a '0'
by software.
Clearing PxCMD.ST is part of the error recovery procedure, see
AHCI 1.3.1, section "6.2 Error Recovery".
If we don't clear PxCI on error recovery, the previous command will
incorrectly still be marked as pending after error recovery.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20230609140844.202795-6-nks@flawful.org
Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit d73b84d0b6)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The AHCI spec states that:
For NCQ, PxCI is cleared on command queued successfully.
For non-NCQ, PxCI is cleared on command completed successfully.
(A non-NCQ command that completes with error does not clear PxCI.)
The current QEMU implementation either clears PxCI in check_cmd(),
or in ahci_cmd_done().
check_cmd() will clear PxCI for a command if handle_cmd() returns 0.
handle_cmd() will return -1 if BUSY or DRQ is set.
The QEMU implementation for NCQ commands will currently not set BUSY
or DRQ, so they will always have PxCI cleared by handle_cmd().
ahci_cmd_done() will never even get called for NCQ commands.
Non-NCQ commands are executed by ide_bus_exec_cmd().
Non-NCQ commands in QEMU are implemented either in a sync or in an async
way.
For non-NCQ commands implemented in a sync way, the command handler will
return true, and when ide_bus_exec_cmd() sees that a command handler
returns true, it will call ide_cmd_done() (which will call
ahci_cmd_done()). For a command implemented in a sync way,
ahci_cmd_done() will do nothing (since busy_slot is not set). Instead,
after ide_bus_exec_cmd() has finished, check_cmd() will clear PxCI for
these commands.
For non-NCQ commands implemented in an async way (using either aiocb or
pio_aiocb), the command handler will return false, ide_bus_exec_cmd()
will not call ide_cmd_done(), instead it is expected that the async
callback function will call ide_cmd_done() once the async command is
done. handle_cmd() will set busy_slot, if and only if BUSY or DRQ is
set, and this is checked _after_ ide_bus_exec_cmd() has returned.
handle_cmd() will return -1, so check_cmd() will not clear PxCI.
When the async callback calls ide_cmd_done() (which will call
ahci_cmd_done()), it will see that busy_slot is set, and
ahci_cmd_done() will clear PxCI.
This seems racy, since busy_slot is set _after_ ide_bus_exec_cmd() has
returned. The callback might come before busy_slot gets set. And it is
quite confusing that ahci_cmd_done() will be called for all non-NCQ
commands when the command is done, but will only clear PxCI in certain
cases, even though it will always write a D2H FIS and raise an IRQ.
Even worse, in the case where ahci_cmd_done() does not clear PxCI, it
still raises an IRQ. Host software might thus read an old PxCI value,
since PxCI is cleared (by check_cmd()) after the IRQ has been raised.
Try to simplify this by always setting busy_slot for non-NCQ commands,
such that ahci_cmd_done() will always be responsible for clearing PxCI
for non-NCQ commands.
For NCQ commands, clear PxCI when we receive the D2H FIS, but before
raising the IRQ, see AHCI 1.3.1, section 5.3.8, states RegFIS:Entry and
RegFIS:ClearCI.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Message-id: 20230609140844.202795-5-nks@flawful.org
Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit e2a5d9b3d9)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The way that BUSY + PxCI is cleared for NCQ (FPDMA QUEUED) commands is
described in SATA 3.5a Gold:
11.15 FPDMA QUEUED command protocol
DFPDMAQ2: ClearInterfaceBsy
"Transmit Register Device to Host FIS with the BSY bit cleared to zero
and the DRQ bit cleared to zero and Interrupt bit cleared to zero to
mark interface ready for the next command."
PxCI is currently cleared by handle_cmd(), but we don't write the D2H
FIS to the FIS Receive Area that actually caused PxCI to be cleared.
Similar to how ahci_pio_transfer() calls ahci_write_fis_pio() with an
additional parameter to write a PIO Setup FIS without raising an IRQ,
add a parameter to ahci_write_fis_d2h() so that ahci_write_fis_d2h()
also can write the FIS to the FIS Receive Area without raising an IRQ.
Change process_ncq_command() to call ahci_write_fis_d2h() without
raising an IRQ (similar to ahci_pio_transfer()), such that the FIS
Receive Area is in sync with the PxTFD shadow register.
E.g. Linux reads status and error fields from the FIS Receive Area
directly, so it is wise to keep the FIS Receive Area and the PxTFD
shadow register in sync.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Message-id: 20230609140844.202795-4-nks@flawful.org
Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit 2967dc8209)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Currently, the first time sending an unsupported command
(e.g. READ LOG DMA EXT) will not have ERR_STAT set in the completion.
Sending the unsupported command again, will correctly have ERR_STAT set.
When ide_cmd_permitted() returns false, it calls ide_abort_command().
ide_abort_command() first calls ide_transfer_stop(), which will call
ide_transfer_halt() and ide_cmd_done(), after that ide_abort_command()
sets ERR_STAT in status.
ide_cmd_done() for AHCI will call ahci_write_fis_d2h() which writes the
current status in the FIS, and raises an IRQ. (The status here will not
have ERR_STAT set!).
Thus, we cannot call ide_transfer_stop() before setting ERR_STAT, as
ide_transfer_stop() will result in the FIS being written and an IRQ
being raised.
The reason why it works the second time, is that ERR_STAT will still
be set from the previous command, so when writing the FIS, the
completion will correctly have ERR_STAT set.
Set ERR_STAT before writing the FIS (calling cmd_done), so that we will
raise an error IRQ correctly when receiving an unsupported command.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20230609140844.202795-3-nks@flawful.org
Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit c3461c6264)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
LQ, STQ have the same register-pair ordering as LQARX/STQARX., which is
the even (lower) register contains the most significant bits. This is
not implemented correctly for big-endian.
do_ldst_quad() has variables low_addr_gpr and high_addr_gpr which is
confusing because they are low and high addresses, whereas LQARX/STQARX.
and most such things use the low and high values for lo/hi variables.
The conversion to native 128-bit memory access functions missed this
strangeness.
Fix this by changing the if condition, and change the variable names to
hi/lo to match convention.
Cc: qemu-stable@nongnu.org
Reported-by: Ivan Warren <ivan@vmfacility.fr>
Fixes: 57b38ffd0c ("target/ppc: Use tcg_gen_qemu_{ld,st}_i128 for LQARX, LQ, STQ")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1836
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
(cherry picked from commit 718209358f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1779
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
(cherry picked from commit af03aeb631)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
ppce500_reset_device_tree is registered for system reset, but after
c4b075318e this function rerandomizes rng-seed via
qemu_guest_getrandom_nofail. And when loading a snapshot, it tries to read
EVENT_RANDOM that doesn't exist, so we have an error:
qemu-system-ppc: Missing random event in the replay log
To fix this, use qemu_register_reset_nosnapshotload instead of
qemu_register_reset.
Reported-by: Vitaly Cheptsov <cheptsov@ispras.ru>
Fixes: c4b075318e ("hw/ppc: pass random seed to fdt ")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1634
Signed-off-by: Maksim Kostin <maksim.kostin@ispras.ru>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
(cherry picked from commit 6ec65b69ba)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Failing to reset the of_instance_last makes ihandle allocation continue
to increase, which causes record-replay replay fail to match the
recorded trace.
Not resetting claimed_base makes VOF eventually run out of memory after
some resets.
Cc: Alexey Kardashevskiy <aik@ozlabs.ru>
Fixes: fc8c745d50 ("spapr: Implement Open Firmware client interface")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
(cherry picked from commit 7b8589d7ce)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
PIE executables are usually linked at offset 0 and are
relocated somewhere during load. The hiaddr needs to
be adjusted to keep the brk next to the executable.
Cc: qemu-stable@nongnu.org
Fixes: 1f356e8c01 ("linux-user: Adjust initial brk when interpreter is close to executable")
Tested-by: Helge Deller <deller@gmx.de>
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit aec338d63b)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This is a mandatory feature for Armv8.1 architectures but we don't
state the feature clearly in our emulation list. Also include
FEAT_CRC32 comment in aarch64_max_tcg_initfn for ease of grepping.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20230824075406.1515566-1-alex.bennee@linaro.org
Cc: qemu-stable@nongnu.org
Message-Id: <20230222110104.3996971-1-alex.bennee@linaro.org>
[PMM: pluralize 'instructions' in docs]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 9e771a2fc6)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
We can fail the blk_insert_bs() at init_blk_migration(), leaving the
BlkMigDevState without a dirty_bitmap and BlockDriverState. Account
for the possibly missing elements when doing cleanup.
Fix the following crashes:
Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
0x0000555555ec83ef in bdrv_release_dirty_bitmap (bitmap=0x0) at ../block/dirty-bitmap.c:359
359 BlockDriverState *bs = bitmap->bs;
#0 0x0000555555ec83ef in bdrv_release_dirty_bitmap (bitmap=0x0) at ../block/dirty-bitmap.c:359
#1 0x0000555555bba331 in unset_dirty_tracking () at ../migration/block.c:371
#2 0x0000555555bbad98 in block_migration_cleanup_bmds () at ../migration/block.c:681
Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
0x0000555555e971ff in bdrv_op_unblock (bs=0x0, op=BLOCK_OP_TYPE_BACKUP_SOURCE, reason=0x0) at ../block.c:7073
7073 QLIST_FOREACH_SAFE(blocker, &bs->op_blockers[op], list, next) {
#0 0x0000555555e971ff in bdrv_op_unblock (bs=0x0, op=BLOCK_OP_TYPE_BACKUP_SOURCE, reason=0x0) at ../block.c:7073
#1 0x0000555555e9734a in bdrv_op_unblock_all (bs=0x0, reason=0x0) at ../block.c:7095
#2 0x0000555555bbae13 in block_migration_cleanup_bmds () at ../migration/block.c:690
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Message-id: 20230731203338.27581-1-farosas@suse.de
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit f187609f27)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Acked-by: Alex Bennée <alex.bennee@linaro.org>
Suggested-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit 86e4f93d82)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
In early 2021 (see commit 2ad784339e "docs: update README to use
GitLab repo URLs") almost all of the code base was converted to
point to GitLab instead of git.qemu.org. During 2023, git.qemu.org
switched from a git mirror to a http redirect to GitLab (see [1]).
Update the LICENSE URL to match its previous content, displaying
the file raw content similarly to gitweb 'blob_plain' format ([2]).
[1] https://lore.kernel.org/qemu-devel/CABgObfZu3mFc8tM20K-yXdt7F-7eV-uKZN4sKDarSeu7DYoRbA@mail.gmail.com/
[2] https://git-scm.com/docs/gitweb#Documentation/gitweb.txt-blobplain
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230822125716.55295-1-philmd@linaro.org>
(cherry picked from commit 09a3fffae0)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Before this change, the default KVM type, which is used for non-virt
machine models, was 0.
The kernel documentation says:
> On arm64, the physical address size for a VM (IPA Size limit) is
> limited to 40bits by default. The limit can be configured if the host
> supports the extension KVM_CAP_ARM_VM_IPA_SIZE. When supported, use
> KVM_VM_TYPE_ARM_IPA_SIZE(IPA_Bits) to set the size in the machine type
> identifier, where IPA_Bits is the maximum width of any physical
> address used by the VM. The IPA_Bits is encoded in bits[7-0] of the
> machine type identifier.
>
> e.g, to configure a guest to use 48bit physical address size::
>
> vm_fd = ioctl(dev_fd, KVM_CREATE_VM, KVM_VM_TYPE_ARM_IPA_SIZE(48));
>
> The requested size (IPA_Bits) must be:
>
> == =========================================================
> 0 Implies default size, 40bits (for backward compatibility)
> N Implies N bits, where N is a positive integer such that,
> 32 <= N <= Host_IPA_Limit
> == =========================================================
> Host_IPA_Limit is the maximum possible value for IPA_Bits on the host
> and is dependent on the CPU capability and the kernel configuration.
> The limit can be retrieved using KVM_CAP_ARM_VM_IPA_SIZE of the
> KVM_CHECK_EXTENSION ioctl() at run-time.
>
> Creation of the VM will fail if the requested IPA size (whether it is
> implicit or explicit) is unsupported on the host.
https://docs.kernel.org/virt/kvm/api.html#kvm-create-vm
So if Host_IPA_Limit < 40, specifying 0 as the type will fail. This
actually confused libvirt, which uses "none" machine model to probe the
KVM availability, on M2 MacBook Air.
Fix this by using Host_IPA_Limit as the default type when
KVM_CAP_ARM_VM_IPA_SIZE is available.
Cc: qemu-stable@nongnu.org
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-id: 20230727073134.134102-3-akihiko.odaki@daynix.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 1ab445af8c)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
kvm_arch_get_default_type() returns the default KVM type. This hook is
particularly useful to derive a KVM type that is valid for "none"
machine model, which is used by libvirt to probe the availability of
KVM.
For MIPS, the existing mips_kvm_type() is reused. This function ensures
the availability of VZ which is mandatory to use KVM on the current
QEMU.
Cc: qemu-stable@nongnu.org
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-id: 20230727073134.134102-2-akihiko.odaki@daynix.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: added doc comment for new function]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit 5e0d65909c)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Using "-device virtio-gpu,blob=true" currently does not work on big
endian hosts (like s390x). The guest kernel prints an error message
like:
[drm:virtio_gpu_dequeue_ctrl_func [virtio_gpu]] *ERROR* response 0x1200 (command 0x10c)
and the display stays black. When running QEMU with "-d guest_errors",
it shows an error message like this:
virtio_gpu_create_mapping_iov: nr_entries is too big (83886080 > 16384)
which indicates that this value has not been properly byte-swapped.
And indeed, the virtio_gpu_create_blob_bswap() function (that should
swap the fields in the related structure) fails to swap some of the
entries. After correctly swapping all missing values here, too, the
virtio-gpu device is now also working with blob=true on s390x hosts.
Fixes: e0933d91b1 ("virtio-gpu: Add virtio_gpu_resource_create_blob")
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2230469
Message-Id: <20230815122007.928049-1-thuth@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit d194362910)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
VFMIN and VFMAX should raise a specification exceptions when bits 1-3
of M5 are set.
Cc: qemu-stable@nongnu.org
Fixes: da4807527f ("s390x/tcg: Implement VECTOR FP (MAXIMUM|MINIMUM)")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20230804234621.252522-1-iii@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 6a2ea61518)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The length is always truncated to 16 bytes. Do not probe more than
that.
Cc: qemu-stable@nongnu.org
Fixes: 0e0a5b49ad ("s390x/tcg: Implement VECTOR STORE WITH LENGTH")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20230804235624.263260-1-iii@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 6db3518ba4)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Unlike most other instructions that contain an immediate element index,
VREP's one is 16-bit, and not 4-bit. The code uses only 8 bits, so
using, e.g., 0x101 does not lead to a specification exception.
Fix by checking all 16 bits.
Cc: qemu-stable@nongnu.org
Fixes: 28d08731b1 ("s390x/tcg: Implement VECTOR REPLICATE")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20230807163459.849766-1-iii@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 23e87d419f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Currently the emulation of VSTRS recognizes partial matches in presence
of \0 in the haystack, which, according to PoP, is not correct:
If the ZS flag is one and a zero byte was detected
in the second operand, then there can not be a
partial match ...
Add a check for this. While at it, fold a number of explicitly handled
special cases into the generic logic.
Cc: qemu-stable@nongnu.org
Reported-by: Claudio Fontana <cfontana@suse.de>
Closes: https://lists.gnu.org/archive/html/qemu-devel/2023-08/msg00633.html
Fixes: 1d706f3141 ("target/s390x: vxeh2: vector string search")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20230804233748.218935-3-iii@linux.ibm.com>
Tested-by: Claudio Fontana <cfontana@suse.de>
Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 791b2b6a93)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-08-23 16:55:40 +03:00
320 changed files with 3437 additions and 1654 deletions
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.