v->vq was not cleaned up in virtio_9p_device_unrealize(), which leaks
memory. The leak stack is as follows:
Direct leak of 14336 byte(s) in 2 object(s) allocated from:
#0 0x7f819ae43970 (/lib64/libasan.so.5+0xef970) ??:?
#1 0x7f819872f49d (/lib64/libglib-2.0.so.0+0x5249d) ??:?
#2 0x55a3a58da624 (./x86_64-softmmu/qemu-system-x86_64+0x2c14624) /mnt/sdb/qemu/hw/virtio/virtio.c:2327
#3 0x55a3a571bac7 (./x86_64-softmmu/qemu-system-x86_64+0x2a55ac7) /mnt/sdb/qemu/hw/9pfs/virtio-9p-device.c:209
#4 0x55a3a58e7bc6 (./x86_64-softmmu/qemu-system-x86_64+0x2c21bc6) /mnt/sdb/qemu/hw/virtio/virtio.c:3504
#5 0x55a3a5ebfb37 (./x86_64-softmmu/qemu-system-x86_64+0x31f9b37) /mnt/sdb/qemu/hw/core/qdev.c:876
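A minimal sketch of the shape of the fix (not the literal patch; it assumes
QEMU's virtio_delete_queue()/virtio_cleanup() helpers and elides the rest of
the teardown path):

    #include "qemu/osdep.h"
    #include "hw/virtio/virtio.h"
    #include "hw/9pfs/virtio-9p.h"

    static void virtio_9p_device_unrealize(DeviceState *dev, Error **errp)
    {
        VirtIODevice *vdev = VIRTIO_DEVICE(dev);
        V9fsVirtioState *v = VIRTIO_9P(dev);

        virtio_delete_queue(v->vq);   /* free the queue allocated in realize */
        virtio_cleanup(vdev);
        /* ... existing common 9p unrealize path continues here ... */
    }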
Reported-by: Euler Robot <euler.robot@huawei.com>
Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com>
Message-Id: <20200117060927.51996-2-pannengyuan@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Acked-by: Greg Kurz <groug@kaod.org>
Firmware can enumerate the APs present at boot by broadcasting a wakeup IPI,
so that the woken-up secondary CPUs can register themselves.
However, in the CPU hotplug case it needs to know the architecture-specific
CPU IDs of possible and hotplugged CPUs, so that it can
prepare the environment for a hotplugged AP and wake it up.
Reuse and extend the existing CPU hotplug interface to return the
architecture-specific ID of the currently selected CPU in two registers:
- lower 32 bits in ACPI_CPU_CMD_DATA_OFFSET_RW
- upper 32 bits in ACPI_CPU_CMD_DATA2_OFFSET_R
On x86, firmware will use CPHP_GET_CPU_ID_CMD to fetch the APIC ID
when handling the hotplug SMI.
Later, CPHP_GET_CPU_ID_CMD will be used on ARM to retrieve the MPIDR,
which serves a purpose similar to the APIC ID.
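As an illustration of the protocol described above, a hedged firmware-side
sketch follows. The io_read32/io_write* helpers, the selector/command
register names, the numeric offsets and the command value are placeholders
for this example, not definitions taken from QEMU or any firmware:

    #include <stdint.h>

    #define CPHP_GET_CPU_ID_CMD          0x3   /* assumed command value */
    #define ACPI_CPU_SELECTOR_OFFSET_WR  0x0   /* assumed offsets */
    #define ACPI_CPU_CMD_OFFSET_WR       0x5
    #define ACPI_CPU_CMD_DATA_OFFSET_RW  0x8
    #define ACPI_CPU_CMD_DATA2_OFFSET_R  0x0

    extern void     io_write32(unsigned offset, uint32_t val);  /* hypothetical */
    extern void     io_write8(unsigned offset, uint8_t val);    /* hypothetical */
    extern uint32_t io_read32(unsigned offset);                 /* hypothetical */

    static uint64_t cphp_get_arch_cpu_id(uint32_t selector)
    {
        io_write32(ACPI_CPU_SELECTOR_OFFSET_WR, selector);       /* pick the CPU */
        io_write8(ACPI_CPU_CMD_OFFSET_WR, CPHP_GET_CPU_ID_CMD);  /* issue command */

        uint64_t lo = io_read32(ACPI_CPU_CMD_DATA_OFFSET_RW);    /* lower 32 bits */
        uint64_t hi = io_read32(ACPI_CPU_CMD_DATA2_OFFSET_R);    /* upper 32 bits */
        return (hi << 32) | lo;
    }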
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1575896942-331151-10-git-send-email-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Document workflows for
* enabling/detecting the modern CPU hotplug interface
* finding a CPU with a pending 'insert/remove' event
* enumerating present and possible CPUs
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1575896942-331151-9-git-send-email-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
No functional change in practice; the patch only aims to properly
document (in the spec and in code) the intended usage of the reserved space.
The new field is to be used for two purposes:
- detection of the modern CPU hotplug interface using the
  CPHP_GET_NEXT_CPU_WITH_EVENT_CMD command.
  The procedure will be described in the follow-up patch:
  "acpi: cpuhp: spec: add typical usecases"
- returning the upper 32 bits of the architecture-specific CPU ID,
  for the new CPHP_GET_CPU_ID_CMD command added by the follow-up patch:
  "acpi: cpuhp: add CPHP_GET_CPU_ID_CMD command"
The change is backward compatible with 4.2 and older machines, as the field
was unconditionally reserved and always returned 0x0 when the modern CPU
hotplug interface was enabled.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1575896942-331151-8-git-send-email-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
The write section of the 'Command data' register should describe what
happens when it is written to. Correct the description for the case where
the last stored 'Command field' value equals 0, to reflect that this is
currently not supported.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Message-Id: <1575896942-331151-7-git-send-email-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Correct the description of the value returned when 'Command field' == 0x0:
it is not the PXM but the CPU selector value of a CPU with a pending event.
In addition, describe the blanket 0 value returned for unsupported
'Command field' values.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Message-Id: <1575896942-331151-6-git-send-email-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* Move reserved registers to the top of the section, so the reader is
  aware of their effects when reading the register descriptions.
* State the registers' endianness explicitly at the beginning of the section.
* Describe register behavior when the 'CPU selector' register contains a
  value that does not point to a possible CPU.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Message-Id: <1575896942-331151-5-git-send-email-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Test the lockable SMRAM at default SMBASE feature, introduced by the
patch "q35: implement 128K SMRAM at default SMBASE address".
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1575899217-333105-1-git-send-email-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This is not what real hardware does; implementing that would be overkill [**]
and would require complex cross-stack changes (QEMU + firmware) to make
it work.
So, considering that SMRAM is owned by the MCH, for simplicity (ab)use a
reserved Q35 register, which lets QEMU and firmware easily initialize the
RAM at SMBASE and make it available only from SMM context.
The patch takes commit 2f295167e0 ("q35/mch: implement extended TSEG sizes")
as inspiration and uses a reserved register in config space at offset 0x9c
[*] to extend the q35 pci-host with the ability to use 128K at
0x30000 as SMRAM and hide it (like TSEG) from non-SMM context.
Usage (a firmware-side sketch follows the reference links below):
1: write 0xff to the register
2: if the feature is supported, a follow-up read from the register
   returns 0x01. At this point RAM at 0x30000 is still
   available for SMI handler configuration from non-SMM context
3: writing 0x02 to the register locks the SMBASE area, making its contents
   available only from SMM context. In non-SMM context, reads return
   0xff and writes are ignored. Further writes to the register are
   ignored until system reset.
*) https://www.mail-archive.com/qemu-devel@nongnu.org/msg455991.html
**) https://www.mail-archive.com/qemu-devel@nongnu.org/msg646965.html
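A hedged firmware-side sketch of the probe/lock sequence above; the
pci_cfg_* helpers and the MCH_SMBASE_CTL_OFFSET name are placeholders,
and only the 0x9c offset and the 0xff/0x01/0x02 values come from this text:

    #include <stdint.h>

    #define MCH_SMBASE_CTL_OFFSET  0x9c   /* reserved Q35 config-space register */

    extern uint8_t pci_cfg_read8(uint8_t bus, uint8_t dev, uint8_t fn,
                                 uint8_t off);                     /* hypothetical */
    extern void    pci_cfg_write8(uint8_t bus, uint8_t dev, uint8_t fn,
                                  uint8_t off, uint8_t val);       /* hypothetical */

    /* Returns 1 if 128K SMRAM at 0x30000 is available and now locked to SMM. */
    static int probe_and_lock_default_smbase(void)
    {
        pci_cfg_write8(0, 0, 0, MCH_SMBASE_CTL_OFFSET, 0xff);      /* query */
        if (pci_cfg_read8(0, 0, 0, MCH_SMBASE_CTL_OFFSET) != 0x01) {
            return 0;                                /* feature not present */
        }
        /* ... firmware would set up the SMI handler at 0x30000 here ... */
        pci_cfg_write8(0, 0, 0, MCH_SMBASE_CTL_OFFSET, 0x02);      /* lock */
        return 1;
    }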
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1575896942-331151-3-git-send-email-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Laszlo Ersek <lersek@redhat.com>
Merge remote-tracking branch 'remotes/philmd-gitlab/tags/edk2-next-20200121' into staging
EDK2 firmware patches
Another set of build-sys patches, to help building the firmware
binaries we use for testing. We almost have reproducible builds.
# gpg: Signature made Tue 21 Jan 2020 15:14:09 GMT
# gpg: using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [full]
# Primary key fingerprint: FAAB E75E 1291 7221 DCFD 6BB2 E3E3 2C2C DEAD C0DE
* remotes/philmd-gitlab/tags/edk2-next-20200121:
gitlab-ci.yml: Add jobs to build EDK2 firmware binaries
roms/edk2-funcs: Force softfloat ARM toolchain prefix on Debian
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add two GitLab jobs to build the EDK2 firmware binaries.
The first job builds a Docker image with the packages required
to build EDK2, and stores this image in the GitLab registry.
The second job pulls the image from the registry and builds the
EDK2 firmware binaries.
The Docker image is only rebuilt if the GitLab YAML or the
Dockerfile is updated.
The second job is only run when the roms/edk2/ submodule is
updated, when a git ref starts with 'edk2', or when the last
commit message contains 'EDK2'. The generated files are archived in
the artifacts.zip file.
With edk2-stable201905, it took 2 minutes 52 seconds to build
the docker image, and 36 minutes 28 seconds to generate the
artifacts.zip with the firmware binaries (filesize: 10MiB).
See: https://gitlab.com/philmd/qemu/pipelines/107553178
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Debian(-based) distributions currently provide two ARM
toolchains, documented in [1]:
* The ARM EABI (armel) port targets a range of older 32-bit ARM
devices, particularly those used in NAS hardware and a variety
of *plug computers.
* The newer ARM hard-float (armhf) port supports newer, more
powerful 32-bit devices using version 7 of the ARM architecture
specification.
For various reasons documented in [2], the EDK2 project suggests
using the softfloat toolchain (named 'armel' by Debian).
Force the softfloat cross toolchain prefix on Debian distributions.
[1] https://www.debian.org/ports/arm/#status
[2] https://github.com/tianocore/edk2/commit/41203b9a
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
The refactor to TranslatorOps introduced a regression that
dropped two lines updating the PC when single-stepping is performed.
Fixes: 11ab74b01e ("target/m68k: Convert to TranslatorOps")
Reported-by: Lucien Murray-Pitts <lucienmp_antispam@yahoo.com>
Suggested-by: Lucien Murray-Pitts <lucienmp_antispam@yahoo.com>
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20200116165454.2076265-1-laurent@vivier.eu>
The MANUAL_BUILDDIR directory is automatically created by sphinx-build
for the other targets. The index.html target does not use sphinx-build
so we must manually create the directory to avoid the following error:
GEN docs/built/index.html
/bin/sh: docs/built/index.html: No such file or directory
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20200120163400.603449-1-stefanha@redhat.com
Reviewed-by: Miroslav Rezanina <mrezanin@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The 'out' label in v9fs_xattr_write() and the 'out_nofid' label in
v9fs_complete_rename() can be replaced by appropriate return
statements.
CC: Greg Kurz <groug@kaod.org>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Acked-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
The 'err_out' label can be removed and replaced by 'return -errno'
at its only use in the function.
CC: Greg Kurz <groug@kaod.org>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Acked-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
init_in_iov_from_pdu might not be able to allocate the full buffer size
requested, which comes from the client and could be larger than the
transport has available at the time of the request. Specifically, this
can happen with read operations, with the client requesting a read up to
the max allowed, which might be more than the transport has available at
the time.
Today both the Xen and Virtio implementations of init_in_iov_from_pdu
throw an error in that case.
Instead, change the V9fsTransport interface so that the size becomes a
pointer and can be limited by the implementation of
init_in_iov_from_pdu.
Change both the Xen and Virtio implementations to set the size to the
size of the buffer they managed to allocate, instead of throwing an
error. However, if the allocated buffer size is less than P9_IOHDRSZ
(the size of the header), still throw an error, as that case is unhandleable.
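A minimal sketch of the new contract, with an assumed P9_IOHDRSZ value and
illustrative names (not the actual QEMU code):

    #include <stdbool.h>
    #include <stddef.h>

    #define P9_IOHDRSZ 24   /* 9p read/write header size; value assumed here */

    /* The transport receives the requested size by pointer and may shrink
     * it to what it can actually supply; only a buffer smaller than the
     * header itself remains a hard error. */
    static bool clamp_in_iov_size(size_t *size, size_t available)
    {
        if (available < P9_IOHDRSZ) {
            return false;               /* unhandleable: still report an error */
        }
        if (*size > available) {
            *size = available;          /* shrink the request instead of failing */
        }
        return true;
    }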
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
CC: groug@kaod.org
CC: anthony.perard@citrix.com
CC: roman@zededa.com
CC: qemu_oss@crudebyte.com
[groug: fix 32-bit build]
Signed-off-by: Greg Kurz <groug@kaod.org>
local_unlinkat_common() is supposed to always return -1 on error.
This is done by jumping to the 'err_out' label, which does
'return ret', with 'ret' initialized to -1.
Unfortunately there is a condition in which the function will
return 0 on error: when flags == AT_REMOVEDIR, 'ret' will already
be 0 when reaching
map_dirfd = openat_dir(...)
and, if map_dirfd == -1 and errno != ENOENT, the existing 'err_out'
jump will execute 'return ret' while ret is still set to zero
at that point.
This patch fixes it by replacing all 'err_out' jumps with
'return -1' statements, ensuring that the function always
returns -1 on error conditions. 'ret' can be left uninitialized
since it is now used just to store the result of the 'unlinkat'
calls.
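A self-contained sketch of the bug pattern and of the fix, using
illustrative stand-ins rather than the QEMU code:

    #include <errno.h>
    #include <stdio.h>

    /* Illustrative stand-ins for the real syscall wrappers. */
    static int remove_dir_step(void) { return 0; }                    /* succeeds */
    static int open_map_dir(void)    { errno = EACCES; return -1; }   /* fails    */

    /* Bug pattern: 'ret' starts at -1 but is overwritten by an earlier
     * successful step, so the shared error label returns 0 on failure. */
    static int unlink_buggy(int removedir)
    {
        int ret = -1;

        if (removedir) {
            ret = remove_dir_step();            /* ret becomes 0 here */
        }
        if (open_map_dir() == -1 && errno != ENOENT) {
            goto err_out;                       /* bug: returns 0, not -1 */
        }
    err_out:
        return ret;
    }

    /* Fix: return -1 directly at each error site. */
    static int unlink_fixed(int removedir)
    {
        if (removedir) {
            remove_dir_step();
        }
        if (open_map_dir() == -1 && errno != ENOENT) {
            return -1;
        }
        return 0;
    }

    int main(void)
    {
        printf("buggy: %d, fixed: %d\n", unlink_buggy(1), unlink_fixed(1));
        return 0;
    }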
CC: Greg Kurz <groug@kaod.org>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
[groug: changed prefix in title to be "9p: local:"]
Signed-off-by: Greg Kurz <groug@kaod.org>
There is a possible memory leak when local_link() returns -1 without
freeing odirpath and oname.
Reported-by: Euler Robot <euler.robot@huawei.com>
Signed-off-by: Jaijun Chen <chenjiajun8@huawei.com>
Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Recent commit 3e7fb5811b ("qapi: Fix code generation for empty
modules") switched QAPISchema.visit() from
    for entity in self._entity_list:
effectively to
    for mod in self._module_dict.values():
        for entity in mod._entity_list:
This visits in the same order as long as .values() is in insertion order.
That's the case only for Python 3.6 and later. Before, it's in some
arbitrary order, which results in broken generated code.
Fix by making self._module_dict an OrderedDict rather than a dict.
Fixes: 3e7fb5811b
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Thomas Huth <thuth@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: BALATON Zoltan <balaton@eik.bme.hu>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20200116202558.31473-1-armbru@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Merge remote-tracking branch 'remotes/juanquintela/tags/migration-pull-pull-request' into staging
Migration pull request
# gpg: Signature made Mon 20 Jan 2020 10:29:53 GMT
# gpg: using RSA key 1899FF8EDEBF58CCEE034B82F487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>" [full]
# gpg: aka "Juan Quintela <quintela@trasno.org>" [full]
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03 4B82 F487 EF18 5872 D723
* remotes/juanquintela/tags/migration-pull-pull-request: (29 commits)
multifd: Be consistent about using uint64_t
migration: Support QLIST migration
apic: Use 32bit APIC ID for migration instance ID
migration: Change SaveStateEntry.instance_id into uint32_t
migration: Define VMSTATE_INSTANCE_ID_ANY
Bug #1829242 correction.
migration/multifd: fix destroyed mutex access in terminating multifd threads
migration/multifd: fix nullptr access in terminating multifd threads
migration/multifd: not use multifd during postcopy
migration/multifd: clean pages after filling packet
migration/postcopy: enable compress during postcopy
migration/postcopy: enable random order target page arrival
migration/postcopy: set all_zero to true on the first target page
migration/postcopy: count target page number to decide the place_needed
migration/postcopy: wait for decompress thread in precopy
migration/postcopy: reduce memset when it is zero page and matches_target_page_size
migration/ram: Yield periodically to the main loop
migration: savevm_state_handler_insert: constant-time element insertion
migration: add savevm_state_handler_remove()
misc: use QEMU_IS_ALIGNED
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We always transmit ram_addr_t as uint64_t. Be consistent in its
use (on 64-bit systems it is already uint64_t; the problem is on 32-bit systems).
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Support QLIST migration using the same principle as QTAILQ:
94869d5c52 ("migration: migrate QTAILQ").
The VMSTATE_QLIST_V macro has the same prototype as VMSTATE_QTAILQ_V.
The change mainly resides in the QLIST raw macros: QLIST_RAW_INSERT_HEAD
and QLIST_RAW_REVERSE.
Tests are also provided.
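A hedged usage sketch, relying only on the statement above that
VMSTATE_QLIST_V takes the same parameters as VMSTATE_QTAILQ_V
(field, containing state, version, element vmsd, element type, link field);
the types and names below are illustrative:

    #include "qemu/osdep.h"
    #include "qemu/queue.h"
    #include "migration/vmstate.h"

    typedef struct ExampleEntry {
        uint32_t value;
        QLIST_ENTRY(ExampleEntry) next;
    } ExampleEntry;

    static const VMStateDescription vmstate_example_entry = {
        .name = "example/entry",
        .version_id = 1,
        .fields = (VMStateField[]) {
            VMSTATE_UINT32(value, ExampleEntry),
            VMSTATE_END_OF_LIST()
        }
    };

    typedef struct ExampleState {
        QLIST_HEAD(, ExampleEntry) list;
    } ExampleState;

    static const VMStateDescription vmstate_example_state = {
        .name = "example/state",
        .version_id = 1,
        .fields = (VMStateField[]) {
            VMSTATE_QLIST_V(list, ExampleState, 1,
                            vmstate_example_entry, ExampleEntry, next),
            VMSTATE_END_OF_LIST()
        }
    };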
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Migration is currently silently broken with an x2apic config like this:
-smp 200,maxcpus=288,sockets=2,cores=72,threads=2 \
-device intel-iommu,intremap=on,eim=on
After migration, the guest kernel can hang anywhere, because the
x2apic bit in IA32_APIC_BASE is not migrated correctly on some vcpus, so
any operation related to x2apic can then be broken (e.g., RDMSR on
x2apic MSRs can fail because KVM thinks the vcpu hasn't
enabled x2apic at all).
The issue is that the x2apic bit was never applied correctly for vcpus
whose ID > 255 when migration completes, and that's because when we
migrate the APIC we use APICCommonState.id as the instance ID of the
migration stream, while that is too narrow for x2apic.
Let's use the newly introduced initial_apic_id for that.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
It was always used as a 32-bit value, so define it that way to make this
clear. Instead of using -1 as the auto-generation magic value, switch to
UINT32_MAX. Also make sure that we never auto-generate this value, so that
overflowed instance IDs cannot go unnoticed.
Suggested-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Define the new macro VMSTATE_INSTANCE_ID_ANY for callers that want to
auto-generate the vmstate instance ID. Previously this was hard-coded
as -1 instead of using a macro. This makes it easier to change the default
value in the follow-up patches. No functional change.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Add casts to ram_addr_t before all left shifts of page
indexes by TARGET_PAGE_BITS, to fix overflows when the page
address is 4 GiB or more.
Signed-off-by: Alexey Romko <nevilad@yahoo.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
One multifd thread will lock all the other multifd threads' IOChannel
mutexes to inform them to quit, by setting p->quit or shutting down p->c.
In this scenario, if some multifd threads have already been terminated and
multifd_load_cleanup/multifd_save_cleanup has destroyed their mutexes,
this can lead to accessing a destroyed mutex when trying to lock it.
Here is the coredump stack:
#0 0x00007f81a2794437 in raise () from /usr/lib64/libc.so.6
#1 0x00007f81a2795b28 in abort () from /usr/lib64/libc.so.6
#2 0x00007f81a278d1b6 in __assert_fail_base () from /usr/lib64/libc.so.6
#3 0x00007f81a278d262 in __assert_fail () from /usr/lib64/libc.so.6
#4 0x000055eb1bfadbd3 in qemu_mutex_lock_impl (mutex=0x55eb1e2d1988, file=<optimized out>, line=<optimized out>) at util/qemu-thread-posix.c:64
#5 0x000055eb1bb4564a in multifd_send_terminate_threads (err=<optimized out>) at migration/ram.c:1015
#6 0x000055eb1bb4bb7f in multifd_send_thread (opaque=0x55eb1e2d19f8) at migration/ram.c:1171
#7 0x000055eb1bfad628 in qemu_thread_start (args=0x55eb1e170450) at util/qemu-thread-posix.c:502
#8 0x00007f81a2b36df5 in start_thread () from /usr/lib64/libpthread.so.0
#9 0x00007f81a286048d in clone () from /usr/lib64/libc.so.6
To fix this, destroy the mutexes only after all the other multifd threads
have been terminated.
Signed-off-by: Jiahui Cen <cenjiahui@huawei.com>
Signed-off-by: Ying Fang <fangying1@huawei.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
One multifd channel will shut down all the other multifd channels'
IOChannels when it fails to receive an IOChannel. In this scenario, if some
multifd channels have not received their IOChannel yet, this shuts down a
NULL IOChannel, which causes a NULL pointer access in qio_channel_shutdown.
Here is the coredump stack:
#0 object_get_class (obj=obj@entry=0x0) at qom/object.c:908
#1 0x00005563fdbb8f4a in qio_channel_shutdown (ioc=0x0, how=QIO_CHANNEL_SHUTDOWN_BOTH, errp=0x0) at io/channel.c:355
#2 0x00005563fd7b4c5f in multifd_recv_terminate_threads (err=<optimized out>) at migration/ram.c:1280
#3 0x00005563fd7bc019 in multifd_recv_new_channel (ioc=ioc@entry=0x556400255610, errp=errp@entry=0x7ffec07dce00) at migration/ram.c:1478
#4 0x00005563fda82177 in migration_ioc_process_incoming (ioc=ioc@entry=0x556400255610, errp=errp@entry=0x7ffec07dce30) at migration/migration.c:605
#5 0x00005563fda8567d in migration_channel_process_incoming (ioc=0x556400255610) at migration/channel.c:44
#6 0x00005563fda83ee0 in socket_accept_incoming_migration (listener=0x5563fff6b920, cioc=0x556400255610, opaque=<optimized out>) at migration/socket.c:166
#7 0x00005563fdbc25cd in qio_net_listener_channel_func (ioc=<optimized out>, condition=<optimized out>, opaque=<optimized out>) at io/net-listener.c:54
#8 0x00007f895b6fe9a9 in g_main_context_dispatch () from /usr/lib64/libglib-2.0.so.0
#9 0x00005563fdc18136 in glib_pollfds_poll () at util/main-loop.c:218
#10 0x00005563fdc181b5 in os_host_main_loop_wait (timeout=1000000000) at util/main-loop.c:241
#11 0x00005563fdc183a2 in main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:517
#12 0x00005563fd8edb37 in main_loop () at vl.c:1791
#13 0x00005563fd74fd45 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4473
To fix this, check p->c before calling qio_channel_shutdown.
Signed-off-by: Jiahui Cen <cenjiahui@huawei.com>
Signed-off-by: Ying Fang <fangying1@huawei.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
We don't support multifd during postcopy, but the user can still enable
both multifd and postcopy. This leads to migration failure.
Skip multifd during postcopy.
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
This is a preparation for the next patch:
"not use multifd during postcopy".
Without postcopy enabled, everything looks good, but after enabling
postcopy, migration may fail even when multifd is not used during postcopy.
The reason is that the pages are not properly cleared, so *old* target pages
continue to be transferred.
After cleaning the pages, migration succeeds.
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Postcopy requires placing a whole host page, while the migration thread
migrates memory in target page size. This means postcopy needs to collect
all target pages of one host page before placing them via userfaultfd.
To enable compression during postcopy, there are two problems to solve:
1. Random order of target page arrival
2. Target pages of one host page must arrive without being interrupted by
target pages from another host page
The first one is handled by the previous cleanup patch.
This patch handles the second one by:
1. Flushing the compress threads for each host page
2. Waiting for the decompress threads before placing the host page
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
After using the number of target pages received to track one host page, we
gain the capability to handle random-order target page arrival within
one host page.
This is a preparation for enabling compression during postcopy.
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
For the first target page, all_zero is set to true for this round's check.
Now that target_pages has been introduced, we can leverage this variable
instead of checking the address offset.
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Postcopy requires placing a whole host page instead of individual target
pages.
Currently, it relies on the page offset to decide whether this is the
last target page. We can instead count the number of target pages during the
iteration (see the sketch below). When the count equals
(host page size / target page size), it is the last target
page in the host page.
This is a preparation for non-ordered target page transmission.
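A minimal sketch of that arithmetic (illustrative helper, not the QEMU code):

    #include <stdbool.h>
    #include <stdint.h>

    /* Count target pages as they arrive; the host page is complete (and can
     * be placed) once the count reaches host_page_size / target_page_size. */
    static bool host_page_complete(uint64_t *target_pages,
                                   uint64_t host_page_size,
                                   uint64_t target_page_size)
    {
        (*target_pages)++;
        return *target_pages == host_page_size / target_page_size;
    }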
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Compression is not supported with postcopy, so it is safe to wait for the
decompress threads only in precopy.
This is a preparation for a later patch.
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
In this case, the page_buffer content would not be used.
Skip the memset to save some time.
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Usually, the incoming migration coroutine yields to the main loop
while its IO channel is waiting for data to receive. But there is a case
when RAM migration and data receive run at the same speed: a VM with huge
zeroed RAM. In this case, the IO channel never has to wait for data, so the
main loop is stuck and, for instance, does not respond to QMP commands.
For this case, yield periodically, but not too often, so as not to
affect the speed of migration.
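A hedged sketch of the mechanism; the batch size and the helper name are
illustrative, not values taken from the patch:

    #include "qemu/osdep.h"
    #include "qemu/coroutine.h"
    #include "block/aio.h"

    #define YIELD_EVERY_N_PAGES 4096   /* illustrative batch size */

    /* Called from the load loop: every N pages, reschedule ourselves on the
     * current AioContext and yield so the main loop can run even when the
     * channel always has data available. */
    static void maybe_yield_to_main_loop(uint64_t pages_seen)
    {
        if (pages_seen % YIELD_EVERY_N_PAGES == 0) {
            aio_co_schedule(qemu_get_current_aio_context(), qemu_coroutine_self());
            qemu_coroutine_yield();
        }
    }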
Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
savevm_state's SaveStateEntry TAILQ is a priority queue. Priority
sorting is maintained by searching from head to tail for a suitable
insertion spot. Insertion is thus an O(n) operation.
If we instead keep track of the head of each priority's subqueue
within that larger queue we can reduce this operation to O(1) time.
savevm_state_handler_remove() becomes slightly more complex to
accommodate these gains: we need to replace the head of a priority's
subqueue when removing it.
With O(1) insertion, booting VMs with many SaveStateEntry objects is
more plausible. For example, a ppc64 VM with maxmem=8T has 40000 such
objects to insert.
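A conceptual, self-contained sketch of the idea (not QEMU's data
structures): keep a "tail of subqueue" pointer per priority so a new entry
can be linked in without scanning the whole list; the per-priority scan is
bounded by the fixed number of priorities, so insertion is constant-time.

    enum { PRI_MAX = 8 };   /* illustrative number of priorities */

    typedef struct Entry {
        int priority;               /* higher value sorts earlier in the list */
        struct Entry *next;
    } Entry;

    typedef struct Handlers {
        Entry *head;
        Entry *tail_of[PRI_MAX];    /* last entry of each priority, or NULL */
    } Handlers;                     /* zero-initialize before use */

    static void handler_insert(Handlers *h, Entry *e)
    {
        int p = e->priority;
        Entry **link = &h->head;

        if (h->tail_of[p]) {
            /* Append right after the last entry of the same priority. */
            link = &h->tail_of[p]->next;
        } else {
            /* First entry of this priority: hook in after the nearest
             * higher-priority subqueue, if any. */
            for (int q = p + 1; q < PRI_MAX; q++) {
                if (h->tail_of[q]) {
                    link = &h->tail_of[q]->next;
                    break;
                }
            }
        }
        e->next = *link;
        *link = e;
        h->tail_of[p] = e;
    }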
Signed-off-by: Scott Cheloha <cheloha@linux.vnet.ibm.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Create a function to abstract common logic needed when removing a
SaveStateEntry element from the savevm_state.handlers queue.
For now we just remove the element. Soon it will involve additional
cleanup.
Signed-off-by: Scott Cheloha <cheloha@linux.vnet.ibm.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
The current check sets an error but doesn't fail the command.
This may cause a problem if a new connection attempt with the same URI
affects the first connection.
Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Clang does not like the code in qmp_migrate_set_downtime() that clamps the
double @value to 0..INT64_MAX:
qemu/migration/migration.c:2038:24: error: implicit conversion from 'long' to 'double' changes value from 9223372036854775807 to 9223372036854775808 [-Werror,-Wimplicit-int-float-conversion]
The warning will be enabled by default in clang 10. It is not
available for clang <= 9.
The clamp is actually useless; @value is checked to be within
0..MAX_MIGRATE_DOWNTIME_SECONDS immediately before. Delete it.
While there, make the conversion from double to int64_t explicit.
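For illustration only (the actual variable and field names may differ, and
the seconds-to-milliseconds scaling is assumed here): with the range already
validated, the explicit cast just documents the narrowing.

    #include <stdint.h>

    /* value_seconds has already been checked against
     * 0..MAX_MIGRATE_DOWNTIME_SECONDS, so no clamping is needed. */
    static int64_t downtime_limit_ms(double value_seconds)
    {
        return (int64_t)(value_seconds * 1000.0);
    }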
Signed-off-by: Fangrui Song <i@maskray.me>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
[Patch split, commit message improved]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
When using hugepages, rate limiting is necessary within each huge
page, since a 1G huge page can take a significant time to send;
otherwise you end up with bursty behaviour.
Fixes: 4c011c37ec ("postcopy: Send whole huge pages")
Reported-by: Lin Ma <LMa@suse.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
ram_save_queue_pages() has an 'err' label that can be replaced by
'return -1'.
Same thing with ram_discard_range(), and in this case we can also
get rid of the 'ret' variable and return either '-1' on error
or the result of ram_block_discard_range().
CC: Juan Quintela <quintela@redhat.com>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Commit 1bd71dce4b tries to prevent a finishmigrate -> prelaunch
transition by exiting at the beginning of the main_loop_should_exit()
function if the state is already finishmigrate.
As the finishmigrate state is set in the migration thread it can
happen concurrently to the function. The migration thread and the
function are normally protected by the iothread mutex and thus the
state should no evolve between the start of the function and its end.
Unfortunately during the function life the lock is released by
pause_all_vcpus() just before the point we need to be sure we are
not in finishmigrate state and if the migration thread is waiting
for the lock it will take the opportunity to change the state
to finishmigrate.
The only way to be sure we are not in the finishmigrate state when
we need is to check the state after the pause_all_vcpus() function.
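A hedged sketch of that ordering (not the literal patch; headers assumed to
match the tree at this point in time):

    #include "qemu/osdep.h"
    #include "sysemu/cpus.h"
    #include "sysemu/runstate.h"

    /* pause_all_vcpus() drops and re-takes the iothread lock, so only a
     * finishmigrate check made after it returns is reliable. */
    static bool finish_migrate_won_the_race(void)
    {
        pause_all_vcpus();
        return runstate_check(RUN_STATE_FINISH_MIGRATE);
    }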
Fixes: 1bd71dce4b ("runstate: ignore exit request in finish migrate state")
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>