Bitmaps data is not critical, and we should not fail the migration (or
use postcopy recovering) because of dirty-bitmaps migration failure.
Instead we should just lose unfinished bitmaps.
Still we have to report io stream violation errors, as they affect the
whole migration stream.
While touching this, tighten code that was previously blindly calling
malloc on a size read from the migration stream, as a corrupted stream
(perhaps from a malicious user) should not be able to convince us to
allocate an inordinate amount of memory.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200727194236.19551-16-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: typo fixes, enhance commit message]
Signed-off-by: Eric Blake <eblake@redhat.com>
Keep bitmap state for disabled bitmaps too. Keep the state until the
end of the process. It's needed for the following commit to implement
bitmap postcopy canceling.
To clean-up the new list the following logic is used:
We need two events to consider bitmap migration finished:
1. chunk with DIRTY_BITMAP_MIG_FLAG_COMPLETE flag should be received
2. dirty_bitmap_mig_before_vm_start should be called
These two events may come in any order, so we understand which one is
last, and on the last of them we remove bitmap migration state from the
list.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Message-Id: <20200727194236.19551-15-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
bdrv_enable_dirty_bitmap_locked() call does nothing, as if we are in
postcopy, bitmap successor must be enabled, and reclaim operation will
enable the bitmap.
So, actually we need just call _reclaim_ in both if branches, and
making differences only to add an assertion seems not really good. The
logic becomes simple: on load complete we do reclaim and that's all.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Message-Id: <20200727194236.19551-14-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
finish_lock is bad name, as lock used not only on process end.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Message-Id: <20200727194236.19551-13-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Move all state variables into one global struct. Reduce global
variable usage, utilizing opaque pointer where possible.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Message-Id: <20200727194236.19551-12-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
No reasons to keep two public init functions.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20200727194236.19551-11-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Rename dirty_bitmap_mig_cleanup to dirty_bitmap_do_save_cleanup, to
stress that it is on save part.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200727194236.19551-10-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Rename types to be symmetrical for load/save part and shorter.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200727194236.19551-9-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Using the _locked version of bdrv_enable_dirty_bitmap to bypass locking
is wrong as we do not already own the mutex. Moreover, the adjacent
call to bdrv_dirty_bitmap_enable_successor grabs the mutex.
Fixes: 58f72b965e9e1q
Cc: qemu-stable@nongnu.org # v3.0
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200727194236.19551-8-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
The test wants to force a bitmap postcopy. Still, the resulting
postcopy period is very small. Let's increase it by adding more
bitmaps to migrate. Also, test disabled bitmaps migration.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Tested-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200727194236.19551-7-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
iotest 199 works too long because of many discard operations. At the
same time, postcopy period is very short, in spite of all these
efforts.
So, let's use less discards (and with more interesting patterns) to
reduce test timing. In the next commit we'll increase postcopy period.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Tested-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200727194236.19551-6-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Discard dirties dirty-bitmap as well as write, but works faster. Let's
use it instead.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Tested-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200727194236.19551-5-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
The test aims to test _postcopy_ migration, and wants to do some write
operations during postcopy time.
Test considers migrate status=complete event on source as start of
postcopy. This is completely wrong, completion is completion of the
whole migration process. Let's instead consider destination start as
start of postcopy, and use RESUME event for it.
Next, as migration finish, let's use migration status=complete event on
target, as such method is closer to what libvirt or another user will
do, than tracking number of dirty-bitmaps.
Finally, add a possibility to dump events for debug. And if
set debug to True, we see, that actual postcopy period is very small
relatively to the whole test duration time (~0.2 seconds to >40 seconds
for me). This means, that test is very inefficient in what it supposed
to do. Let's improve it in following commits.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Tested-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200727194236.19551-4-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
We don't need any specific format constraints here. Still keep qcow2
for two reasons:
1. No extra calls of format-unrelated test
2. Add some check around persistent bitmap in future (require qcow2)
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Tested-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200727194236.19551-3-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Make the capitalization of the hexadecimal numbers consistent for the
QCOW2 header extension constants in docs/interop/qcow2.txt.
Suggested-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <1594973699-781898-2-git-send-email-andrey.shinkevich@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
The glibc getcwd function returns different errors than the getcwd
syscall, which triggers an assertion failure in the glibc getcwd function
when running under the emulation.
When the syscall returns ENAMETOOLONG, the glibc wrapper uses a fallback
implementation that potentially handles an unlimited path length, and
returns with ERANGE if the provided buffer is too small. The qemu
emulation cannot distinguish the two cases, and thus always returns ERANGE.
This is unexpected by the glibc wrapper.
Signed-off-by: Andreas Schwab <schwab@suse.de>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <mvmmu3qplvi.fsf@suse.de>
[lv: updated description]
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Implementation of 'rt_sigtimedwait()' in 'syscall.c' uses the
function 'target_to_host_timespec()' to transfer the value of
'struct timespec' from target to host. However, the implementation
doesn't check whether this conversion succeeds and thus can cause
an unaproppriate error instead of the 'EFAULT (Bad address)' which
is supposed to be set if the conversion from target to host fails.
This was confirmed with the LTP test for rt_sigtimedwait:
"/testcases/kernel/syscalls/rt_sigtimedwait/rt_sigtimedwait01.c"
which causes an unapropriate error in test case "test_bad_adress3"
which is run with a bad adress for the 'struct timespec' argument:
FAIL: test_bad_address3 (349): Unexpected failure: EAGAIN/EWOULDBLOCK (11)
The test fails with an unexptected errno 'EAGAIN/EWOULDBLOCK' instead
of the expected EFAULT.
After the changes from this patch, the test case is executed successfully
along with the other LTP test cases for 'rt_sigtimedwait()':
PASS: test_bad_address3 (349): Test passed
Signed-off-by: Filip Bozuta <Filip.Bozuta@syrmia.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20200724181651.167819-1-Filip.Bozuta@syrmia.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
When the chroot does not have /proc mounted, we can read neither
/proc/sys/vm/mmap_min_addr nor /proc/sys/maps.
The enforcement of mmap_min_addr in the host kernel is done by
the security module, and so does not apply to processes owned
by root. Which leads pgd_find_hole_fallback to succeed in probing
a reservation at address 0. Which confuses pgb_reserved_va to
believe that guest_base has not actually been initialized.
We don't actually want NULL addresses to become accessible, so
make sure that mmap_min_addr is initialized with a non-zero value.
Buglink: https://bugs.launchpad.net/qemu/+bug/1888728
Reported-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Acked-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20200724212314.545877-1-richard.henderson@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Minor bugfixes all over the places, including one CVE.
Additionally, a fix for an ancient bug in migration -
one has to wonder how come no one noticed.
The fix is also non-trivial since we dare not break all
existing machine types with pci - we have a work around
in the works, for now we just skip the work-around for
old machine types.
Great job by Hogan Wang noticing, debugging and fixing it,
and thanks to Dr. David Alan Gilbert for reviewing the patches.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAl8e9CIPHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRpsAIH/2EEq9rLpjqMJdzRvjq3/UAHsvm42zeTnJl7
81cM887Mrg2Nd7MXFoxurLK5UEehTzlD2DRTvaDFfJaJlrtkPM2QEU2X/6c3syAS
GbmOQaljQtR4zEFE81t84mZQS025Gp0s+uble7KvtXakgp1A/vdu93OEvJkhtRY8
JBdRMlTt2T0eizvHn1obBKjaQN7tAUKl5KagHWxP1ApGU0YibUbrBadpJ18ZcKMl
vwB3dwmoi4f7AjuC0GnxYKp7kC/MMhUPFoDxQKI7d+wMGFnbsAF4sBIN9EZKeOkv
xT2InNSAzk/PTSuQpnDnZQjmrf4dPuL/GNJ8vQk27eaFfVchJyc=
=Bu6o
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
virtio,pci: bugfixes
Minor bugfixes all over the places, including one CVE.
Additionally, a fix for an ancient bug in migration -
one has to wonder how come no one noticed.
The fix is also non-trivial since we dare not break all
existing machine types with pci - we have a work around
in the works, for now we just skip the work-around for
old machine types.
Great job by Hogan Wang noticing, debugging and fixing it,
and thanks to Dr. David Alan Gilbert for reviewing the patches.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Mon 27 Jul 2020 16:34:58 BST
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream:
virtio-pci: fix virtio_pci_queue_enabled()
MAINTAINERS: Cover the firmware JSON schema
vhost-vdpa :Fix Coverity CID 1430270 / CID 1420267
libvhost-user: Report descriptor index on panic
Fix vhost-user buffer over-read on ram hot-unplug
hw/pci-host: save/restore pci host config register
virtio-mem-pci: force virtio version 1
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In legacy mode, virtio_pci_queue_enabled() falls back to
virtio_queue_enabled() to know if the queue is enabled.
But virtio_queue_enabled() calls again virtio_pci_queue_enabled()
if k->queue_enabled is set. This ends in a crash after a stack
overflow.
The problem can be reproduced with
"-device virtio-net-pci,disable-legacy=off,disable-modern=true
-net tap,vhost=on"
And a look to the backtrace is very explicit:
...
#4 0x000000010029a438 in virtio_queue_enabled ()
#5 0x0000000100497a9c in virtio_pci_queue_enabled ()
...
#130902 0x000000010029a460 in virtio_queue_enabled ()
#130903 0x0000000100497a9c in virtio_pci_queue_enabled ()
#130904 0x000000010029a460 in virtio_queue_enabled ()
#130905 0x0000000100454a20 in vhost_net_start ()
...
This patch fixes the problem by introducing a new function
for the legacy case and calls it from virtio_pci_queue_enabled().
It also calls it from virtio_queue_enabled() to avoid code duplication.
Fixes: f19bcdfedd ("virtio-pci: implement queue_enabled method")
Cc: Jason Wang <jasowang@redhat.com>
Cc: Cindy Lu <lulu@redhat.com>
CC: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20200727153319.43716-1-lvivier@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
When GCR_EL1.RRND==1, the choosing of the random value is IMPDEF,
and the kernel is not expected to have set RGSR_EL1. Force a
non-zero value into SEED, so that we do not continually return
the same tag.
Reported-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200724163853.504655-4-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When booting an EL3 cpu with -kernel, we set up EL3 and then
drop down to EL2. We need to enable access to v8.5-MemTag
tag allocation at EL3 before doing so.
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200724163853.504655-3-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When booting an EL3 cpu with -kernel, we set up EL3 and then
drop down to EL2. We need to enable access to v8.3-PAuth
keys and instructions at EL3 before doing so.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200724163853.504655-2-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Commit 6a0b7505f1 which added documentation of the virt board
crossed in the post with commit 6f4e1405b9 which added a new
'mte' machine option. Update the docs to include the new option.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
When we changed the interface of get_phys_addr_lpae to require
the cacheattr parameter, this spot was missed. The compiler is
unable to detect the use of NULL vs the nonnull attribute here.
Fixes: 7e98e21c09
Reported-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Jan Kiszka <jan.kiskza@siemens.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The SDRAM Memory Controller has a 32-bit address bus, thus
supports up to 4 GiB of DRAM. There is a signed to unsigned
conversion error with the AST2600 maximum memory size:
(uint64_t)(2048 << 20) = (uint64_t)(-2147483648)
= 0xffffffff40000000
= 16 EiB - 2 GiB
Fix by using the IEC suffixes which are usually safer, and add
an assertion check to verify the memory is valid. This would have
caught this bug:
$ qemu-system-arm -M ast2600-evb
qemu-system-arm: hw/misc/aspeed_sdmc.c:258: aspeed_sdmc_realize: Assertion `asc->max_ram_size < 4 * GiB' failed.
Aborted (core dumped)
Fixes: 1550d72679 ("aspeed/sdmc: Add AST2600 support")
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
data_length is a constant value, so we use assert instead of
condition check.
Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
Message-id: 20200622113146.33421-1-gengdongjiu@huawei.com
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Writing zeroes to a qcow2 v2 images without a backing file results in an
unallocated cluster as of 61b3043965. 197 has a test for COR-ing a
cluster on an image without a backing file, which means that the data
will be zero, so now on a v2 image that cluster will just stay
unallocated, and so the test fails. Just force compat=1.1 for that
particular case to enforce the cluster to get allocated.
Fixes: 61b3043965
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20200727135237.1096841-1-mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Add an entry to cover firmware.json (see commit 3a0adfc9bf:
schema that describes the different uses and properties of
virtual machine firmware).
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Kashyap Chamarthy <kchamart@redhat.com>
Cc: Daniel P. Berrange <berrange@redhat.com>
Suggested-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200703183450.32398-1-philmd@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Kashyap Chamarthy <kchamart@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In the function vhost_vdpa_dma_map/unmap, The struct msg was not initialized all its fields.
Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20200710064642.24505-1-lulu@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We want to report the index of the descriptor,
not its pointer.
Fixes: 7b2e5c65f4 ("contrib: add libvhost-user")
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200723171935.18535-1-philmd@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS vhost-user protocol
feature introduced a shadow-table, used by the backend to dynamically
determine how a vdev's memory regions have changed since the last
vhost_user_set_mem_table() call. On hot-remove, a memmove() operation
is used to overwrite the removed shadow region descriptor(s). The size
parameter of this memmove was off by 1 such that if a VM with a backend
supporting the VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS filled it's
shadow-table (by performing the maximum number of supported hot-add
operatons) and attempted to remove the last region, Qemu would read an
out of bounds value and potentially crash.
This change fixes the memmove() bounds such that this erroneous read can
never happen.
Signed-off-by: Peter Turschmid <peter.turschm@nutanix.com>
Signed-off-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Message-Id: <1594799958-31356-1-git-send-email-raphael.norwitz@nutanix.com>
Fixes: f1aeb14b08 ("Transmit vhost-user memory regions individually")
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The pci host config register is used to save PCI address for
read/write config data. If guest writes a value to config register,
and then QEMU pauses the vcpu to migrate, after the migration, the guest
will continue to write pci config data, and the write data will be ignored
because of new qemu process losing the config register state.
To trigger the bug:
1. guest is booting in seabios.
2. guest enables the SMRAM in seabios:piix4_apmc_smm_setup, and then
expects to disable the SMRAM by pci_config_writeb.
3. after guest writes the pci host config register, QEMU pauses vcpu
to finish migration.
4. guest write of config data(0x0A) fails to disable the SMRAM because
the config register state is lost.
5. guest continues to boot and crashes in ipxe option ROM due to SMRAM
in enabled state.
Example Reproducer:
step 1. Make modifications to seabios and qemu for increase reproduction
efficiency, write 0xf0 to 0x402 port notify qemu to stop vcpu after
0x0cf8 port wrote i440 configure register. qemu stop vcpu when catch
0x402 port wrote 0xf0.
seabios:/src/hw/pci.c
@@ -52,6 +52,11 @@ void pci_config_writeb(u16 bdf, u32 addr, u8 val)
writeb(mmconfig_addr(bdf, addr), val);
} else {
outl(ioconfig_cmd(bdf, addr), PORT_PCI_CMD);
+ if (bdf == 0 && addr == 0x72 && val == 0xa) {
+ dprintf(1, "stop vcpu\n");
+ outb(0xf0, 0x402); // notify qemu to stop vcpu
+ dprintf(1, "resume vcpu\n");
+ }
outb(val, PORT_PCI_DATA + (addr & 3));
}
}
qemu:hw/char/debugcon.c
@@ -60,6 +61,9 @@ static void debugcon_ioport_write(void *opaque, hwaddr addr, uint64_t val,
printf(" [debugcon: write addr=0x%04" HWADDR_PRIx " val=0x%02" PRIx64 "]\n", addr, val);
#endif
+ if (ch == 0xf0) {
+ vm_stop(RUN_STATE_PAUSED);
+ }
/* XXX this blocks entire thread. Rewrite to use
* qemu_chr_fe_write and background I/O callbacks */
qemu_chr_fe_write_all(&s->chr, &ch, 1);
step 2. start vm1 by the following command line, and then vm stopped.
$ qemu-system-x86_64 -machine pc-i440fx-5.0,accel=kvm\
-netdev tap,ifname=tap-test,id=hostnet0,vhost=on,downscript=no,script=no\
-device virtio-net-pci,netdev=hostnet0,id=net0,bus=pci.0,addr=0x13,bootindex=3\
-device cirrus-vga,id=video0,vgamem_mb=16,bus=pci.0,addr=0x2\
-chardev file,id=seabios,path=/var/log/test.seabios,append=on\
-device isa-debugcon,iobase=0x402,chardev=seabios\
-monitor stdio
step 3. start vm2 to accept vm1 state.
$ qemu-system-x86_64 -machine pc-i440fx-5.0,accel=kvm\
-netdev tap,ifname=tap-test1,id=hostnet0,vhost=on,downscript=no,script=no\
-device virtio-net-pci,netdev=hostnet0,id=net0,bus=pci.0,addr=0x13,bootindex=3\
-device cirrus-vga,id=video0,vgamem_mb=16,bus=pci.0,addr=0x2\
-chardev file,id=seabios,path=/var/log/test.seabios,append=on\
-device isa-debugcon,iobase=0x402,chardev=seabios\
-monitor stdio \
-incoming tcp:127.0.0.1:8000
step 4. execute the following qmp command in vm1 to migrate.
(qemu) migrate tcp:127.0.0.1:8000
step 5. execute the following qmp command in vm2 to resume vcpu.
(qemu) cont
Before this patch, we get KVM "emulation failure" error on vm2.
This patch fixes it.
Cc: qemu-stable@nongnu.org
Signed-off-by: Hogan Wang <hogan.wang@huawei.com>
Message-Id: <20200727084621.3279-1-hogan.wang@huawei.com>
Reported-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Trying to run simple virtio-mem-pci examples currently fails with
qemu-system-x86_64: -device virtio-mem-pci,id=vm0,memdev=mem0,node=0,
requested-size=300M: device is modern-only, use disable-legacy=on
due to the added safety checks in 9b3a35ec82 ("virtio: verify that legacy
support is not accidentally on").
As noted by Conny, we have to force virtio version 1. While at it, use
qdev_realize() to set the parent bus and realize - like most other
virtio-*-pci implementations.
Fixes: 0b9a2443a4 ("virtio-pci: Proxy for virtio-mem")
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20200727115905.129397-1-david@redhat.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
If you are building only with either the new rx-softmmu or avr-softmmu
target, "make check-block" fails a couple of tests since there is no
default machine defined in these new targets. We have to select a machine
in the "check" script for these, just like we already do for the arm- and
tricore-softmmu targets.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20200722161908.25383-1-thuth@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
We should check whether the user-specified node-name actually refers to
a node. The simplest way to do that is to use bdrv_lookup_bs() instead
of bdrv_find_node() (the former wraps the latter, and produces an error
message if necessary).
Reported-by: Coverity (CID 1430268)
Fixes: ced914d0ab
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20200710095037.10885-1-mreitz@redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
We are adding the shutdown timeout to solve an issue
we now see where the aarch64 VMs timeout on shutdown
under TCG.
There is a new 3 second timeout in machine.py,
which we override in basevm.py when shutting down.
Signed-off-by: Robert Foley <robert.foley@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20200717203041.9867-4-robert.foley@linaro.org>
Message-Id: <20200724064509.331-17-alex.bennee@linaro.org>
The primary purpose of this change is to clean up
machine.py's console_socket property to return a single type,
a ConsoleSocket.
ConsoleSocket now derives from a socket, which means that
in the default case (of not draining), machine.py
will see the same behavior as it did prior to ConsoleSocket.
Signed-off-by: Robert Foley <robert.foley@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20200717203041.9867-3-robert.foley@linaro.org>
Message-Id: <20200724064509.331-16-alex.bennee@linaro.org>
The changes to console_socket.py and machine.py are to
cleanup for pylint and flake8.
Signed-off-by: Robert Foley <robert.foley@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20200717203041.9867-2-robert.foley@linaro.org>
Message-Id: <20200724064509.331-15-alex.bennee@linaro.org>
Our safe_clock_nanosleep() returns -1 and updates errno.
We don't need to update the CRF bit in syscall.c because it will
be updated in ppc/cpu_loop.c as the return value is negative.
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20200722174612.2917566-3-laurent@vivier.eu>
Message-Id: <20200724064509.331-14-alex.bennee@linaro.org>
If the call is interrupted by a signal handler, it fails with error EINTR
and if "remain" is not NULL and "flags" is not TIMER_ABSTIME, it returns
the remaining unslept time in "remain".
Update linux-user to not overwrite the "remain" structure if there is no
error.
Found with "make check-tcg", linux-test fails on nanosleep test:
TEST linux-test on x86_64
.../tests/tcg/multiarch/linux-test.c:242: nanosleep
Reported-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20200722174612.2917566-2-laurent@vivier.eu>
Message-Id: <20200724064509.331-13-alex.bennee@linaro.org>
For installing stuff from sid or ports you may need to manually
specify the location of the keyring. You can even import keys into
your personal keyring and point it there, e.g.:
gpg --keyserver keyring.debian.org --recv-keys 84C573CD4E1AFD6C
make docker-binfmt-image-debian-sid-hppa DEB_TYPE=sid DEB_ARCH=hppa \
DEB_URL=http://ftp.ports.debian.org/debian-ports/ \
EXECUTABLE=./hppa-linux-user/qemu-hppa V=1 \
DEB_KEYRING=${HOME}/.gnupg/pubring.kbx
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20200724064509.331-12-alex.bennee@linaro.org>
When we updated the arguments for docker.py we missed a bit.
Fixes: dfae628459 ("docker.py/build: support -t and -f arguments")
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20200724064509.331-11-alex.bennee@linaro.org>
Does this seem convoluted to you? It feels a little complicated to me.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20200724064509.331-10-alex.bennee@linaro.org>
Plain MAP_FIXED has the undesirable behaviour of splatting exiting
maps so we don't actually achieve what we want when looking for gaps.
We should be using MAP_FIXED_NOREPLACE. As this isn't always available
we need to potentially check the returned address to see if the kernel
gave us what we asked for.
Fixes: ad592e37df ("linux-user: provide fallback pgd_find_hole for bare chroots")
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20200724064509.331-9-alex.bennee@linaro.org>
Quoting ISO C99 6.7.8p4, "All the expressions in an initializer for an
object that has static storage duration shall be constant expressions or
string literals".
The compound literal produced by the make_floatx80() macro is not such a
constant expression, per 6.6p7-9. (An implementation may accept it,
according to 6.6p10, but is not required to.)
Therefore using "floatx80_zero" and make_floatx80() for initializing
"f2xm1_table" and "fpatan_table" is not portable. And gcc-4.8 in RHEL-7.6
actually chokes on them:
> target/i386/fpu_helper.c:871:5: error: initializer element is not constant
> { make_floatx80(0xbfff, 0x8000000000000000ULL),
> ^
We've had the make_floatx80_init() macro for this purpose since commit
3bf7e40ab9 ("softfloat: fix for C99", 2012-03-17), so let's use that
macro again.
Fixes: eca30647fc ("target/i386: reimplement f2xm1 using floatx80 operations")
Fixes: ff57bb7b63 ("target/i386: reimplement fpatan using floatx80 operations")
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Joseph Myers <joseph@codesourcery.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Link: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg06566.html
Link: https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg04714.html
Message-Id: <20200716144251.23004-1-lersek@redhat.com>
Message-Id: <20200724064509.331-8-alex.bennee@linaro.org>