The BQL is acquired via qemu_mutex_lock_iothread(), which makes
the profiler assign the associated wait time (i.e. most of
BQL wait time) entirely to that function. This loses the original
call site information, which does not help diagnose BQL contention.
Fix it by tracking the callers explicitly.
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
I first implemented this by deleting all entries in the global
hash table. But doing that safely slows down profiling, since
we'd need to introduce rcu_read_lock/unlock in the fast path.
What's implemented here avoids messing with the thread-local
data in the global hash table. It achieves this by taking a snapshot
of the current state, so that subsequent reports present the delta
wrt to the snapshot.
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The goal of this module is to profile synchronization primitives (i.e.
mutexes, recursive mutexes and condition variables) so that scalability
issues can be quickly diagnosed.
Sync primitives are profiled by QSP based on the vaddr of the object accessed
as well as the call site (file:line_nr). That means the same object called
from two different call sites will be tracked in separate entries, which
might be reported together or separately (see subsequent commit on
call site coalescing).
Some perf numbers:
Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
Command: taskset -c 0 tests/atomic_add-bench -d 5 -m
- Before: 54.80 Mops/s
- After: 54.75 Mops/s
That is, a negligible slowdown due to the now indirect call to
qemu_mutex_lock. Note that using a branch instead of an indirect
call introduces a more severe slowdown (53.65 Mops/s, i.e. 2% slowdown).
Enabling the profiler (with -p, added in this series) is more interesting:
- No profiling: 54.75 Mops/s
- W/ profiling: 12.53 Mops/s
That is, a 4.36X slowdown.
We can break down this slowdown by removing the get_clock calls or
the entry lookup:
- No profiling: 54.75 Mops/s
- W/o get_clock: 25.37 Mops/s
- W/o entry lookup: 19.30 Mops/s
- W/ profiling: 12.53 Mops/s
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The counter is for qemu_lockcnt_inc/dec sections (read side),
qemu_lockcnt_lock/unlock is for the write side.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20180803063917.30292-1-famz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
An aio_notify() pairs with an aio_notify_accept(). The former should
happen in the main thread or a vCPU thread, and the latter should be
done in the IOThread.
There is one rare case that the main thread or vCPU thread may "steal"
the aio_notify() event just raised by itself, in bdrv_set_aio_context()
[1]. The sequence is like this:
main thread IO Thread
===============================================================
bdrv_drained_begin()
aio_disable_external(ctx)
aio_poll(ctx, true)
ctx->notify_me += 2
...
bdrv_drained_end()
...
aio_notify()
...
bdrv_set_aio_context()
aio_poll(ctx, false)
[1] aio_notify_accept(ctx)
ppoll() /* Hang! */
[1] is problematic. It will clear the ctx->notifier event so that
the blocked ppoll() will not return.
(For the curious, this bug was noticed when booting a number of VMs
simultaneously in RHV. One or two of the VMs will hit this race
condition, making the VIRTIO device unresponsive to I/O commands. When
it hangs, Seabios is busy waiting for a read request to complete (read
MBR), right after initializing the virtio-blk-pci device, using 100%
guest CPU. See also https://bugzilla.redhat.com/show_bug.cgi?id=1562750
for the original bug analysis.)
aio_notify() only injects an event when ctx->notify_me is set,
correspondingly aio_notify_accept() is only useful when ctx->notify_me
_was_ set. Move the call to it into the "blocking" branch. This will
effectively skip [1] and fix the hang.
Furthermore, blocking aio_poll is only allowed on home thread
(in_aio_context_home_thread), because otherwise two blocking
aio_poll()'s can steal each other's ctx->notifier event and cause
hanging just like described above.
Cc: qemu-stable@nongnu.org
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20180809132259.18402-3-famz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
The same logic exists in fd polling. This change is especially important
to avoid busy loop once we limit aio_notify_accept() to blocking
aio_poll().
Cc: qemu-stable@nongnu.org
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20180809132259.18402-2-famz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Ciro Santilli reported that commit a5ed352596
breaks the execution replay. It happens due to the probing the clock
for the new instances of iothread.
However, this probing was made in replay mode for the timer lists that
are empty.
This patch removes clock probing in replay mode.
It is an artifact of the old version with another thread model.
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Message-Id: <20180725121526.12867.17866.stgit@pasha-VirtualBox>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
No callers of get_opt_value() pass in a NULL for the "value" parameter,
so the check is redundant.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20180514171913.17664-4-berrange@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Tested-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The logic for parsing the multiboot initrd modules was messed up in
commit 950c4e6c94
Author: Daniel P. Berrangé <berrange@redhat.com>
Date: Mon Apr 16 12:17:43 2018 +0100
opts: don't silently truncate long option values
Causing the length to be undercounter, and the number of modules over
counted. It also passes NULL to get_opt_value() which was not robust
at accepting a NULL value.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20180514171913.17664-2-berrange@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Tested-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
soft-freeze, but I'd like these preparatory patches to be merged anyway.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEtIKLr5QxQM7yo0kQcdTV5YIvc9YFAls2DEgACgkQcdTV5YIv
c9ZShw/6AuDMeWpzFHAdf4lBuUdDFyAC7QFln8xdb+MDus5whAvtt7vDeY0OKbgo
w5VDTkstO7h4jQJDHkjzHK91ZdUbgu0Tj7C09x4oQpUueNWsTZGcPBvNGadjjOBt
70LdwyV2nSER3+QNjTNznrh0faxay4xuSTIY/mW6iudeWGobwXmseEeOE8gGM+w0
s1GwxMVKIfllKUmW2vx0mGfn02pKtTnan+Si+sp/AnY9xSquFfHWpZhXZlkZrfYd
mgtJOTY9IpSekr9jBBKgUlZ/QVYiliDzuh3ePDYKtsuHZZ7z2ype3DkXqYOnblOs
C+2gWUE/TC5BStjRX3RmPv21dpfkEdlxOZpgXbpP1VgKqbtnbnvcgTL89IPv9afl
Aj+q5uYR494kOL5rSDynVRdWhnUmMnkqHCZpKG+IRMHv6GlrXpxOQWenwCS/vYWK
swKqRwGj0CFugdt7qVZ+4XjXbbWEI21dHHG7nAXinfakKVOfJYIeGIQC7WfpIrxy
ApV0mHSceK0AMBJvlf1Zf0Qm0lJ7Ay7MRT/5XWDFV9Bogf+wxtGvf9Ukc2qQhwd8
mR9iN7rlWz3VSu5vS3bEdsiBXKibxIRfv7HhF5fa+mwkZA9gMbj33vVds1zA4ta1
Qw4doRq4xWui3uNO9jvtcXtW5Bq7N4p6wVFK76dLVHk5axLCIec=
=vVJr
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/gkurz/tags/for-upstream' into staging
The Darwin host support still needs some more work. It won't make it for
soft-freeze, but I'd like these preparatory patches to be merged anyway.
# gpg: Signature made Fri 29 Jun 2018 11:39:04 BST
# gpg: using RSA key 71D4D5E5822F73D6
# gpg: Good signature from "Greg Kurz <groug@kaod.org>"
# gpg: aka "Gregory Kurz <gregory.kurz@free.fr>"
# gpg: aka "[jpeg image of size 3330]"
# Primary key fingerprint: B482 8BAF 9431 40CE F2A3 4910 71D4 D5E5 822F 73D6
* remotes/gkurz/tags/for-upstream:
9p: darwin: Explicitly cast comparisons of mode_t with -1
cutils: Provide strchrnul
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This updates the minimum required glib version to 2.40
-----BEGIN PGP SIGNATURE-----
iQIcBAABCAAGBQJbNhcKAAoJEL6G67QVEE/f1gMP+wV4J+V9WJOxcXbNjP9bb1K4
gsvZrv2pKCVbcJW2o3Mc3POXExX1E2GsrAl639zpDHbNVlVFvQC51GaRO1Fc6lwh
7NRFHen0Ee6I7TvWlVXMy1YPXPZ/8EJ/1KuZerYUUyKO/R8ojEt/TbELwfl3P1LF
VZGWgs+GpwUYwiG4ZmYPQ0VsT/munZPlaK1mRQSDbGqX0KG1Vy8Q+mgbGAm+S6xh
39HtM7ecdfXbVEAnoMsp9+kRi3zXpD5zSAyVBZN2RDktMt/EHxF0pPJCqX7oq0Rn
DehXDkzaqF2ghzrSAIT3rbKBhzYIY/ny/vxnyQZ/Px0GrBdKwuUJ+pYcF2akS2hV
s98VWRw/tqt8wQKYLF/wmeL7eE/tKAHSUqFc3Ta0Rvgfq0z9Mp3b73vuswBpfjWL
+GiASRvg96n/1yqmywtCd7KtF5riOo1iyvfMFCdQ+nBHQok/6K/oWfZPyoZsCAIa
2GJBfmHzkkc98QeNO0Dp/+ckSyKIBuyTV1bDyq/8Yz7IwEd1PfRWooEgVFSHO7MU
6ddCECvKVvP0S03r5dlw4Imio39gPpeZBMmSE4ZOh/hRa89T4UqIUvSLwzat7OhA
DH1NhJsj7G1jaIxENfzBhpggg++rHCVcfc2cv28Tl7i7twyRsb7CveYA6TwD+UIg
7JT+KswUo7W9MrkKBHC+
=8tKK
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/berrange/tags/min-glib-pull-request' into staging
glib: update the min required version
This updates the minimum required glib version to 2.40
# gpg: Signature made Fri 29 Jun 2018 12:24:58 BST
# gpg: using RSA key BE86EBB415104FDF
# gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>"
# gpg: aka "Daniel P. Berrange <berrange@redhat.com>"
# Primary key fingerprint: DAF3 A6FD B26B 6291 2D0E 8E3F BE86 EBB4 1510 4FDF
* remotes/berrange/tags/min-glib-pull-request:
glib: enforce the minimum required version and warn about old APIs
glib: bump min required glib library version to 2.40
util: remove redundant include of glib.h and add osdep.h
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Per supported platforms doc[1], the various min glib on relevant distros is:
RHEL-7: 2.50.3
Debian (Stretch): 2.50.3
Debian (Jessie): 2.42.1
OpenBSD (Ports): 2.54.3
FreeBSD (Ports): 2.50.3
OpenSUSE Leap 15: 2.54.3
SLE12-SP2: 2.48.2
Ubuntu (Xenial): 2.48.0
macOS (Homebrew): 2.56.0
This suggests that a minimum glib of 2.42 is a reasonable target.
The GLibC compile farm, however, uses Ubuntu 14.04 (Trusty) which only
has glib 2.40.0, and this is needed for testing during merge. Thus an
exception is made to the documented platform support policy to allow for
all three current LTS releases to be supported.
Docker jobs that not longer satisfy this new min version are removed.
[1] https://qemu.weilnetz.de/doc/qemu-doc.html#Supported-build-platforms
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Code must only ever include glib.h indirectly via the glib-compat.h
header file, because we will need some macros set before glib.h is
pulled in. Adding extra includes of glib.h will (soon) cause compile
failures such as:
In file included from /home/berrange/src/virt/qemu/include/qemu/osdep.h:107,
from /home/berrange/src/virt/qemu/include/qemu/iova-tree.h:26,
from util/iova-tree.c:13:
/home/berrange/src/virt/qemu/include/glib-compat.h:22: error: "GLIB_VERSION_MIN_REQUIRED" redefined [-Werror]
#define GLIB_VERSION_MIN_REQUIRED GLIB_VERSION_2_40
In file included from /usr/include/glib-2.0/glib/gtypes.h:34,
from /usr/include/glib-2.0/glib/galloca.h:32,
from /usr/include/glib-2.0/glib.h:30,
from util/iova-tree.c:12:
/usr/include/glib-2.0/glib/gversionmacros.h:237: note: this is the location of the previous definition
# define GLIB_VERSION_MIN_REQUIRED (GLIB_VERSION_CUR_STABLE)
Furthermore, the osdep.h include should always be done directly from the
.c file rather than indirectly via any .h file.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
strchrnul is a GNU extension and thus unavailable on a number of targets.
In the review for a commit removing strchrnul from 9p, I was asked to
create a qemu_strchrnul helper to factor out this functionality.
Do so, and use it in a number of other places in the code base that inlined
the replacement pattern in a place where strchrnul could be used.
Signed-off-by: Keno Fischer <keno@juliacomputing.com>
Acked-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
We have had some tracing tools for mutex but it's not easy to use them
for e.g. dead locks. Let's provide "--enable-debug-mutex" parameter
when configure to allow QemuMutex to store the last owner that took
specific lock. It will be easy to use this tool to debug deadlocks
since we can directly know who took the lock then as long as we can have
a debugger attached to the process.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180425025459.5258-4-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Introduce some hooks for the shared part of qemu thread between POSIX
and Windows implementations. Note that in qemu_mutex_unlock_impl() we
moved the call before unlock operation which should make more sense.
And we don't need qemu_mutex_post_unlock() hook.
Put all these shared hooks into the header files. It should be internal
to qemu-thread but not for qemu-thread users, hence put into util/
directory.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180425025459.5258-3-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
laio_init() can fail for a couple of reasons, which will lead to a NULL
pointer dereference in laio_attach_aio_context().
To solve this, add a aio_setup_linux_aio() function which is called
early in raw_open_common. If this fails, propagate the error up. The
signature of aio_get_linux_aio() was not modified, because it seems
preferable to return the actual errno from the possible failing
initialization calls.
Additionally, when the AioContext changes, we need to associate a
LinuxAioState with the new AioContext. Use the bdrv_attach_aio_context
callback and call the new aio_setup_linux_aio(), which will allocate a
new AioContext if needed, and return errors on failures. If it fails for
any reason, fallback to threaded AIO with an error message, as the
device is already in-use by the guest.
Add an assert that aio_get_linux_aio() cannot return NULL.
Signed-off-by: Nishanth Aravamudan <naravamudan@digitalocean.com>
Message-id: 20180622193700.6523-1-naravamudan@digitalocean.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This new parameter allows the caller to just query the next dirty
position without moving the iterator.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20180613181823.13618-8-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Introduce a new global big lock for mon_fdsets. Take it where needed.
The monitor_fdset_get_fd() handling is a bit tricky: now we need to call
qemu_mutex_unlock() which might pollute errno, so we need to make sure
the correct errno be passed up to the callers. To make things simpler,
we let monitor_fdset_get_fd() return the -errno directly when error
happens, then in qemu_open() we move it back into errno.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180608035511.7439-8-peterx@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
The meaning of "existing" is now changed to "matches in hash and
ht->cmp result". This is saner than just checking the pointer value.
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
qht_lookup now uses the default cmp function. qht_lookup_custom is defined
to retain the old behaviour, that is a cmp function is explicitly provided.
qht_insert will gain use of the default cmp in the next patch.
Note that we move qht_lookup_custom's @func to be the last argument,
which makes the new qht_lookup as simple as possible.
Instead of this (i.e. keeping @func 2nd):
0000000000010750 <qht_lookup>:
10750: 89 d1 mov %edx,%ecx
10752: 48 89 f2 mov %rsi,%rdx
10755: 48 8b 77 08 mov 0x8(%rdi),%rsi
10759: e9 22 ff ff ff jmpq 10680 <qht_lookup_custom>
1075e: 66 90 xchg %ax,%ax
We get:
0000000000010740 <qht_lookup>:
10740: 48 8b 4f 08 mov 0x8(%rdi),%rcx
10744: e9 37 ff ff ff jmpq 10680 <qht_lookup_custom>
10749: 0f 1f 80 00 00 00 00 nopl 0x0(%rax)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
There are numerous QDict functions that have been introduced for and are
used only by the block layer. Move their declarations into an own
header file to reflect that.
While qdict_extract_subqdict() is in fact used outside of the block
layer (in util/qemu-config.c), it is still a function related very
closely to how the block layer works with nested QDicts, namely by
sometimes flattening them. Therefore, its declaration is put into this
header as well and util/qemu-config.c includes it with a comment stating
exactly which function it needs.
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20180509165530.29561-7-mreitz@redhat.com>
[Copyright note tweaked, superfluous includes dropped]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We banned use of certain g_assert_FOO() functions outside tests, and
made checkpatch.pl flag them (commit 6e9389563e). We neglected to
purge existing uses. Do that now.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180608170231.27912-1-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: John Snow <jsnow@redhat.com>
It really is up to the caller to decide what this list of options means.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20180509210023.20283-4-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
* Copy offloading for qemu-img convert (iSCSI, raw, and qcow2)
If the underlying storage supports copy offloading, qemu-img convert will
use it instead of performing reads and writes. This avoids data transfers
and thus frees up storage bandwidth for other purposes. SCSI EXTENDED COPY
and Linux copy_file_range(2) are used to implement this optimization.
* Drop spurious "WARNING: I\/O thread spun for 1000 iterations" warning
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJbFSBoAAoJEJykq7OBq3PISpEIAIcMao4/rzinAWXzS+ncK9LO
6FtRVpgutpHaWX2ayySaz5n2CdR3cNMrpCI7sjY2Kw0lrdkqxPgl5n0SWD+VCl4W
7+JLz/uF0iUV8X+99e7WGAjZbm9LSlxgn5AQKfrrwyPf0ZfzoYQ5nBMcQ6xjEeQP
48j2WqJqN9/u8RBD07o11yn0+CE5g56/f12xVjR5ASVodzsAmcZ2OQRMQbM01isU
1mBekJQkDxJkt5l13Rql8+t+vWz8/9BEW2c/eIDKvoayMqYJpdfKv4DqLloIuHnc
3RkquA0zUuKtl7xEnEkH/We7fi4QPGW/vyBN7ychS/zKzZFQrXmwqrAuFSw3dKU=
=vZp+
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
Pull request
* Copy offloading for qemu-img convert (iSCSI, raw, and qcow2)
If the underlying storage supports copy offloading, qemu-img convert will
use it instead of performing reads and writes. This avoids data transfers
and thus frees up storage bandwidth for other purposes. SCSI EXTENDED COPY
and Linux copy_file_range(2) are used to implement this optimization.
* Drop spurious "WARNING: I\/O thread spun for 1000 iterations" warning
# gpg: Signature made Mon 04 Jun 2018 12:20:08 BST
# gpg: using RSA key 9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8
* remotes/stefanha/tags/block-pull-request:
main-loop: drop spin_counter
qemu-img: Convert with copy offloading
block-backend: Add blk_co_copy_range
iscsi: Implement copy offloading
iscsi: Create and use iscsi_co_wait_for_task
iscsi: Query and save device designator when opening
file-posix: Implement bdrv_co_copy_range
qcow2: Implement copy offloading
raw: Implement copy offloading
raw: Check byte range uniformly
block: Introduce API for copy offloading
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Commit d759c951f3 ("replay: push
replay_mutex_lock up the call tree") removed the !timeout lock
optimization in the main loop.
The idea of the optimization was to avoid ping-pongs between threads by
keeping the Big QEMU Lock held across non-blocking (!timeout) main loop
iterations.
A warning is printed when the main loop spins without releasing BQL for
long periods of time. These warnings were supposed to aid debugging but
in practice they just alarm users. They are considered noise because
the cause of spinning is not shown and is hard to find.
Now that the lock optimization has been removed, there is no danger of
hogging the BQL. Drop the spin counter and the infamous warning.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Coverity complains about qemu_memfd_create() (CID 1385858) because
we calculate a bit position htsize which could be up to 63, but
then use it in "1 << htsize" which is a 32-bit integer calculation
and could push the 1 off the top of the value.
Silence the complaint bu using "1ULL"; this isn't a bug in
practice since a hugetlbsize of 4GB is not very plausible.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20180515172729.24564-1-peter.maydell@linaro.org>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Introduce a simplest iova tree implementation based on GTree.
CC: QEMU Stable <qemu-stable@nongnu.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently the minimal supported version of glib is 2.22.
Since testing is done with a glib that claims to be 2.22, but in fact
has APIs from newer version of glib, this bug was not caught during
submit of the patch referenced below.
Replace g_realloc_n, which is available only since 2.24, with g_renew.
Fixes commit 418026ca43 ("util: Introduce vfio helpers")
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
CC: qemu-stable@nongnu.org
When we call addIOThread, the epollfd created in aio_context_setup,
but not close it in the process of delIOThread, so the epollfd will leak.
Reorder the code in aio_epoll_disable and reuse it.
Signed-off-by: Jie Wang <wangjie88@huawei.com>
Message-Id: <1526517763-11108-1-git-send-email-wangjie88@huawei.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
[Mention change to aio_epoll_disable in commit message. - Fam]
Signed-off-by: Fam Zheng <famz@redhat.com>
Usually the logging of the CPU state produced by -d cpu is sufficient
to diagnose problems, but sometimes you want to see the state of
the floating point registers as well. We don't want to enable that
by default as it adds a lot of extra data to the log; instead,
allow it to be optionally enabled via -d fpu.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180510130024.31678-1-peter.maydell@linaro.org
We will conditionally have a wrapper layer depending on whether the host
has the PTHREAD_SETNAME capability. It complicates stuff. Let's keep
the wrapper there; we opt out the pthread_setname_np() call only.
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180412053444.17801-1-peterx@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Now that we can safely call QOBJECT() on QObject * as well as its
subtypes, we can have macros qobject_ref() / qobject_unref() that work
everywhere instead of having to use QINCREF() / QDECREF() for QObject
and qobject_incref() / qobject_decref() for its subtypes.
The replacement is mechanical, except I broke a long line, and added a
cast in monitor_qmp_cleanup_req_queue_locked(). Unlike
qobject_decref(), qobject_unref() doesn't accept void *.
Note that the new macros evaluate their argument exactly once, thus no
need to shout them.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180419150145.24795-4-marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Rebased, semantic conflict resolved, commit message improved]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
qemu_mempath_getpagesize() gets the effective (host side) page size for
a block of memory backed by an mmap()ed file on the host. It requires
the mem_path parameter to be non-NULL.
This ends up meaning all the callers need a different case for handling
anonymous memory (for memory-backend-ram or default memory with -mem-path
is not specified).
We can make all those callers a little simpler by having
qemu_mempath_getpagesize() accept NULL, and treat that as the anonymous
memory case.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Our rule right now is to use <> for external headers only.
util/sys_membarrier.c violates that. Fix it up.
Signed-off-by: Bruce Rogers <brogers@suse.com>
Message-Id: <20180329151018.15319-1-brogers@suse.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
qemu_aio_coroutine_enter() is (indirectly) called recursively when
processing co_queue_wakeup. This can lead to stack exhaustion.
This patch rewrites co_queue_wakeup in an iterative fashion (instead of
recursive) with bounded memory usage to prevent stack exhaustion.
qemu_co_queue_run_restart() is inlined into qemu_aio_coroutine_enter()
and the qemu_coroutine_enter() call is turned into a loop to avoid
recursion.
There is one change that is worth mentioning: Previously, when
coroutine A queued coroutine B, qemu_co_queue_run_restart() entered
coroutine B from coroutine A. If A was terminating then it would still
stay alive until B yielded. After this patch B is entered by A's parent
so that a A can be deleted immediately if it is terminating.
It is safe to make this change since B could never interact with A if it
was terminating anyway.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20180322152834.12656-3-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
OOB can enable iothread for parsing even on Windows. We need some tunes
to enable that on Windows otherwise it'll break Windows users. This
patch fixes the breakage on Windows with qemu-system-ppc.exe.
Reported-by: Howard Spoelstra <hsp.cat7@gmail.com>
Tested-by: Howard Spoelstra <hsp.cat7@gmail.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180322085630.23654-1-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch was generated using the following Coccinelle script:
@@
expression Obj;
@@
(
- qobject_to_qnum(Obj)
+ qobject_to(QNum, Obj)
|
- qobject_to_qstring(Obj)
+ qobject_to(QString, Obj)
|
- qobject_to_qdict(Obj)
+ qobject_to(QDict, Obj)
|
- qobject_to_qlist(Obj)
+ qobject_to(QList, Obj)
|
- qobject_to_qbool(Obj)
+ qobject_to(QBool, Obj)
)
and a bit of manual fix-up for overly long lines and three places in
tests/check-qjson.c that Coccinelle did not find.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-Id: <20180224154033.29559-4-mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: swap order from qobject_to(o, X), rebase to master, also a fix
to latent false-positive compiler complaint about hw/i386/acpi-build.c]
Signed-off-by: Eric Blake <eblake@redhat.com>
* SCSI fix to pass maximum transfer size (Daniel Barboza)
* chardev fixes and improved iothread support (Daniel Berrangé, Peter)
* checkpatch tweak (Eric)
* make help tweak (Marc-André)
* make more PCI NICs available with -net or -nic (myself)
* change default q35 NIC to e1000e (myself)
* SCSI support for NDOB bit (myself)
* membarrier system call support (myself)
* SuperIO refactoring (Philippe)
* miscellaneous cleanups and fixes (Thomas)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJapqaMAAoJEL/70l94x66DQoUH/Rvg+a8giz/SrEA4P8D3Cb2z
4GNbNUUoy4oU0ltD5IAMskMwpOsvl1batE0D+pKIlfO9NV4+Cj2kpgo0p9TxoYqM
VCby3wRtx27zb5nVytC6M++iIKXmeEMqXmFw61I6umddNPSl4IR3hiHEE0DM+7dV
UPIOvJeEiazyQaw3Iw+ZctNn8dDBKc/+6oxP9xRcYTaZ6hB4G9RZkqGNNSLcJkk7
R0UotdjzIZhyWMOkjIwlpTF4sWv8gsYUV4bPYKMYho5B0Obda2dBM3I1kpA8yDa/
xZ5lheOaAVBZvM5aMIcaQPa65MO9hLyXFmhMOgyfpJhLBBz6Qpa4OLLI6DeTN+0=
=UAgA
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* Record-replay lockstep execution, log dumper and fixes (Alex, Pavel)
* SCSI fix to pass maximum transfer size (Daniel Barboza)
* chardev fixes and improved iothread support (Daniel Berrangé, Peter)
* checkpatch tweak (Eric)
* make help tweak (Marc-André)
* make more PCI NICs available with -net or -nic (myself)
* change default q35 NIC to e1000e (myself)
* SCSI support for NDOB bit (myself)
* membarrier system call support (myself)
* SuperIO refactoring (Philippe)
* miscellaneous cleanups and fixes (Thomas)
# gpg: Signature made Mon 12 Mar 2018 16:10:52 GMT
# gpg: using RSA key BFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini/tags/for-upstream: (69 commits)
tcg: fix cpu_io_recompile
replay: update documentation
replay: save vmstate of the asynchronous events
replay: don't process async events when warping the clock
scripts/replay-dump.py: replay log dumper
replay: avoid recursive call of checkpoints
replay: check return values of fwrite
replay: push replay_mutex_lock up the call tree
replay: don't destroy mutex at exit
replay: make locking visible outside replay code
replay/replay-internal.c: track holding of replay_lock
replay/replay.c: bump REPLAY_VERSION again
replay: save prior value of the host clock
replay: added replay log format description
replay: fix save/load vm for non-empty queue
replay: fixed replay_enable_events
replay: fix processing async events
cpu-exec: fix exception_index handling
hw/i386/pc: Factor out the superio code
hw/alpha/dp264: Use the TYPE_SMC37C669_SUPERIO
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
# Conflicts:
# default-configs/i386-softmmu.mak
# default-configs/x86_64-softmmu.mak
The SocketAddress 'fd' kind accepts the name of a file descriptor passed
to the monitor with the 'getfd' command. This makes it impossible to use
the 'fd' kind in cases where a monitor is not available. This can apply in
handling command line argv at startup, or simply if internal code wants to
use SocketAddress and pass a numeric FD it has acquired from elsewhere.
Fortunately the 'getfd' command mandated that the FD names must not start
with a leading digit. We can thus safely extend semantics of the
SocketAddress 'fd' kind, to allow a purely numeric name to reference an
file descriptor that QEMU already has open. There will be restrictions on
when each kind can be used.
In codepaths where we are handling a monitor command (ie cur_mon != NULL),
we will only support use of named file descriptors as before. Use of FD
numbers is still not permitted for monitor commands.
In codepaths where we are not handling a monitor command (ie cur_mon ==
NULL), we will not support named file descriptors. Instead we can reference
FD numers explicitly. This allows the app spawning QEMU to intentionally
"leak" a pre-opened socket to QEMU and reference that in a SocketAddress
definition, or for code inside QEMU to pass pre-opened FDs around.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The SocketAddress struct has an "fd" type, which references the name of a
file descriptor passed over the monitor using the "getfd" command. We
currently blindly assume the FD is a socket, which can lead to hard to
diagnose errors later. This adds an explicit check that the FD is actually
a socket to improve the error diagnosis.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The fd_is_socket() helper method is useful in a few places, so put it in
the common sockets code. Make the code more compact while moving it.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
There are qemu_strtoNN functions for various sized integers. This adds two
more for plain int & unsigned int types, with suitable range checking.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This patch adds saving/restoring of the host clock field 'last'.
It is used in host clock calculation and therefore clock may
become incorrect when using restored vmstate.
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20180227095226.1060.50975.stgit@pasha-VirtualBox>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Actually enable the global memory barriers if supported by the OS.
Because only recent versions of Linux include the support, they
are disabled by default. Note that it also has to be disabled
for QEMU to run under Wine.
Before this patch, rcutorture reports 85 ns/read for my machine,
after the patch it reports 12.5 ns/read. On the other hand updates
go from 50 *micro*seconds to 20 *milli*seconds.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This new header file provides heavy-weight "global" memory barriers that
enforce memory ordering on each running thread belonging to the current
process. For now, use a dummy implementation that issues memory barriers
on both sides (matching what QEMU has been doing so far).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Prepare for introducing smp_mb_placeholder() and smp_mb_global().
The new smp_mb() in synchronize_rcu() is not strictly necessary, since
the first atomic_mb_set for rcu_gp_ctr provides the required ordering.
However, synchronize_rcu is not performance critical, and it *will* be
necessary to introduce a smp_mb_global before calling wait_for_readers().
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Nested BDRV_POLL_WHILE() calls can occur. Currently
assert(!wait_->wakeup) fails in AIO_WAIT_WHILE() when this happens.
This patch converts the bool wait_->need_kick flag to an unsigned
wait_->num_waiters counter.
Nesting works correctly because outer AIO_WAIT_WHILE() callers evaluate
the condition again after the inner caller completes (invoking the inner
caller counts as aio_poll() progress).
Reported-by: "fuweiwei (C)" <fuweiwei2@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20180307124619.6218-1-stefanha@redhat.com
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJaoqOgAAoJEH8JsnLIjy/W2MMP/1gj7CJgtSG9wIzyBHjSWQMy
ofXEgRJO9t/smfUMlH2NdrW8P2LvYcmqOsEBkLJzCtl48fPexwtI/cunzVjutXcf
VlpqKz/8uN4C9D6m8FN/5kKf65l+tnVqnCoJgwafY5uT7jmoC8LF1xO2jo8a+lJd
0Dv6RxJUQq/tDR6OvO6aW4EzbOUcD4wkLvi/uz8+ZjV1BLSLlpdudejr6W9TnJY/
EGFedbxqjPV7fIvMbodbFp0Ie8Aw0WEL8ttERboeR4jbA/o+PZVGpPtHsr/4V6QO
Pgh6vH2rGavxFzwuCWEGhlLKGx66CGqqdTknm6lNJchepCvcfoYxjOPZv9FCaMUs
enC/x43xSkCmkwBwKKxpXqu1vS5nGdMebAwRjstSIplypjv2YOwS1AiU5snaDwuk
t9Gjkw0Wka5nySuYi43H2RPXmlWbh4T8DfQ6pOyJGvXGjm8t+f5BTaMtSWn6Iq2W
F6r1UezQJBDnUbpFgsRg4AP+htPGDHgsOg7KzCCd/lBHwbjX7dkQlAYbBZZ2OBF+
wQN5olDR6jsKIy2IlARNgNweZHW5UQa1cc+7HlVNNE5tqtkjo7aWPk/LhEzBCIHg
sWG3VH2y3lQlaMzYh1v+jnGrFoq1ZJU4sbjaxvQX8czjmaQvPtbzKuZAovQ4pGwa
g0SrWP6p9yLo0LXLuXBP
=WDF4
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches
# gpg: Signature made Fri 09 Mar 2018 15:09:20 GMT
# gpg: using RSA key 7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream: (56 commits)
qemu-iotests: fix 203 migration completion race
iotests: Tweak 030 in order to trigger a race condition with parallel jobs
iotests: Skip test for ENOMEM error
iotests: Mark all tests executable
iotests: Test creating overlay when guest running
qemu-iotests: Test ssh image creation over QMP
qemu-iotests: Test qcow2 over file image creation with QMP
block: Fail bdrv_truncate() with negative size
file-posix: Fix no-op bdrv_truncate() with falloc preallocation
ssh: Support .bdrv_co_create
ssh: Pass BlockdevOptionsSsh to connect_to_ssh()
ssh: QAPIfy host-key-check option
ssh: Use QAPI BlockdevOptionsSsh object
sheepdog: Support .bdrv_co_create
sheepdog: QAPIfy "redundancy" create option
nfs: Support .bdrv_co_create
nfs: Use QAPI options in nfs_client_open()
rbd: Use qemu_rbd_connect() in qemu_rbd_do_create()
rbd: Assign s->snap/image_name in qemu_rbd_open()
rbd: Support .bdrv_co_create
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This patch disables the pragma diagnostic -Wunused-but-set-variable for
clang in util/coroutine-ucontext.c.
This in turn allows us to remove it from the configure check, so the
CONFIG_PRAGMA_DIAGNOSTIC_AVAILABLE will succeed for clang.
With that in place clang builds (linux) will use -Werror by default,
which breaks the build due to warning about unaligned struct members.
Just turning off this warning isn't a good idea as it indicates
portability problems. So make it a warning again, using
-Wno-error=address-of-packed-member. That way it doesn't break the
build but still shows up in the logs.
Now clang builds qemu without errors. Well, almost. There are some
left in the rdma code. Leaving that to the rdma people. All others can
use --disable-rdma to workarounds this.
Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-Id: <20180309135945.20436-1-kraxel@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
This allows, given a QemuOpts for a QemuOptsList that was merged from
multiple QemuOptsList, to only consider those options that exist in one
specific list. Block drivers need this to separate format-layer create
options from protocol-level options.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Sometimes it's necessary for the main loop thread to run a BH in an
IOThread and wait for its completion. This primitive is useful during
startup/shutdown to synchronize and avoid race conditions.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20180307144205.20619-2-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJanYJPAAoJEH8JsnLIjy/WxjUQAJA+DTOmGXvaNpMs65BrU79K
/r/iGVrzHv/RMLmrWMnqj96W9SnpMuiAP9hVLNsekqClY9q4ME4DpGcXhWfhSvF5
FC51ehvFJdfo8cPorsevcqNj60iWebjcx3lFfUq2606UOyYih3oijYxr6gSwWbRc
GAgdGMqsvGYpzgqAQVEWHUhaX0La49/OzY42aR+E+LCBNfTYvlydvyoc+tUTdIpW
1eM/ASGndGsN0Cf2vxlbKgJ0/P6v+cRZuuIDhKZqre+YG+yM+pq7yZb+o7nf/P36
TPR93BsT7FSVAizRK7VFRuPIynHpiaxYygrJERCXF0sxsV4OlKjpmt/uUPamWFh+
46Jx2NK1AuAx87BdErgmA119ObO3oAPxK0+2p981obb6SphTbbPxDj6SOlYCt4mJ
mhff4JtIiwCmDSckAwd2mkBI1Tvl9qqcELrpyd2t2eU4ec2vf7fPd85EsK/Mq6Kr
dbfqFvjNaaMxChoqFgkHAveYJ7zYqRFI2IY5o9c1QyZehCGPWjScxHXZZYdpDl59
YF9DkYQDOyvEX2jmMECaO1r/0nnO+BqQHu5ItJuTte9rjP9Q0do3iBISiIefewtf
yji6/QNn2hFrnr1HPAwLFFC3kPgc8Mq8mIUb53j8vG/01KhVRCcnJm2K6D4IUwLZ
S6ZnQJB97eE4y7YR5dNt
=2axz
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches
# gpg: Signature made Mon 05 Mar 2018 17:45:51 GMT
# gpg: using RSA key 7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream: (38 commits)
block: Fix NULL dereference on empty drive error
qcow2: Replace align_offset() with ROUND_UP()
block/ssh: Add basic .bdrv_truncate()
block/ssh: Make ssh_grow_file() blocking
block/ssh: Pull ssh_grow_file() from ssh_create()
qemu-img: Make resize error message more general
qcow2: make qcow2_co_create2() a coroutine_fn
block: rename .bdrv_create() to .bdrv_co_create_opts()
Revert "IDE: Do not flush empty CDROM drives"
block: test blk_aio_flush() with blk->root == NULL
block: add BlockBackend->in_flight counter
block: extract AIO_WAIT_WHILE() from BlockDriverState
aio: rename aio_context_in_iothread() to in_aio_context_home_thread()
docs: document how to use the l2-cache-entry-size parameter
specs/qcow2: Fix documentation of the compressed cluster descriptor
iotest 033: add misaligned write-zeroes test via truncate
block: fix write with zero flag set and iovector provided
block: Drop unused .bdrv_co_get_block_status()
vvfat: Switch to .bdrv_co_block_status()
vpc: Switch to .bdrv_co_block_status()
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
# Conflicts:
# include/block/block.h
Mostly patches that are only indirectly related to the block layer, but I've
reviewed them and there is no maintainer.
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJanRBmAAoJEJykq7OBq3PIzQUIAI7q2BFcNEKjNYQusYDA6gJv
7uYiEx1NkcXUu0DVWlCSENGfvQRbYa46oti3MIgu/imMaYwtahefvqA4R2JEy/GS
xRx+LnzVeQ2d3CYLHd1D3ce2BWAIa9bzRDd25Ux2HfDC46a7v5yDPdeRxQF9xgPQ
CwtPKtr5X/xZXUJR3PoC6QqrZUthN5DXOk4cCLJoks6iLpQlNO3jnoq+RXMl+8II
yj0L2sdvi2ebLsuwndXblEg0qR27X94nz0L1j0GDbHir6I13A6Y66a9qv8mTXz/L
poRWaNTHza6KrapseZ85ed6I/IfNODKR2PCfL/CClsHhi5HWRK54nKS3xaWiBUk=
=8zc8
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
Pull request
Mostly patches that are only indirectly related to the block layer, but I've
reviewed them and there is no maintainer.
# gpg: Signature made Mon 05 Mar 2018 09:39:50 GMT
# gpg: using RSA key 9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8
* remotes/stefanha/tags/block-pull-request:
README: Document 'git-publish' workflow
Add a git-publish configuration file
tests/libqos: Check for valid dev pointer when looking for PCI devices
util/uri.c: wrap single statement blocks with braces {}
util/uri.c: remove brackets that wrap `return` statement's content.
util/uri.c: Coding style check, Only whitespace involved
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
For this patch, using curly braces to wrap `if` `while` `else` statements,
which only hold single statement. For example:
'''
if (cond)
statement;
'''
to
'''
if (cond) {
statement;
}
'''
And using tricks that compare the disassemblies before and after
code changes, to make sure code logic isn't changed:
'''
git checkout master
make util/uri.o
strip util/uri.o
objdump -Drx util/uri.o > /tmp/uri-master.txt
git checkout cleanupbranch
make util/uri.o
strip util/uri.o
objdump -Drx util/uri.o > /tmp/uri-cleanup.txt
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
only remove brackets that wrap `return` statements' content.
use `perl -pi -e "s/return \((.*?)\);/return \1;/g" util/uri.c`
to remove pattern like this: "return (1);"
Signed-off-by: Su Hang <suhang16@mails.ucas.ac.cn>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 1519533358-13759-3-git-send-email-suhang16@mails.ucas.ac.cn
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Using `clang-format -i util/uri.c` first, then change back few code
manually, to make sure only whitespace involved.
Signed-off-by: Su Hang <suhang16@mails.ucas.ac.cn>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 1519533358-13759-2-git-send-email-suhang16@mails.ucas.ac.cn
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The previous commit improved compile time by including less of the
generated QAPI headers. This is impossible for stuff defined directly
in qapi-schema.json, because that ends up in headers that that pull in
everything.
Move everything but include directives from qapi-schema.json to new
sub-module qapi/misc.json, then include just the "misc" shard where
possible.
It's possible everywhere, except:
* monitor.c needs qmp-command.h to get qmp_init_marshal()
* monitor.c, ui/vnc.c and the generated qapi-event-FOO.c need
qapi-event.h to get enum QAPIEvent
Perhaps we'll get rid of those some other day.
Adding a type to qapi/migration.json now recompiles some 120 instead
of 2300 out of 5100 objects.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180211093607.27351-25-armbru@redhat.com>
[eblake: rebase to master]
Signed-off-by: Eric Blake <eblake@redhat.com>
In my "build everything" tree, a change to the types in
qapi-schema.json triggers a recompile of about 4800 out of 5100
objects.
The previous commit split up qmp-commands.h, qmp-event.h, qmp-visit.h,
qapi-types.h. Each of these headers still includes all its shards.
Reduce compile time by including just the shards we actually need.
To illustrate the benefits: adding a type to qapi/migration.json now
recompiles some 2300 instead of 4800 objects. The next commit will
improve it further.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180211093607.27351-24-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
[eblake: rebase to master]
Signed-off-by: Eric Blake <eblake@redhat.com>
BlockDriverState has the BDRV_POLL_WHILE() macro to wait on event loop
activity while a condition evaluates to true. This is used to implement
synchronous operations where it acts as a condvar between the IOThread
running the operation and the main loop waiting for the operation. It
can also be called from the thread that owns the AioContext and in that
case it's just a nested event loop.
BlockBackend needs this behavior but doesn't always have a
BlockDriverState it can use. This patch extracts BDRV_POLL_WHILE() into
the AioWait abstraction, which can be used with AioContext and isn't
tied to BlockDriverState anymore.
This feature could be built directly into AioContext but then all users
would kick the event loop even if they signal different conditions.
Imagine an AioContext with many BlockDriverStates, each time a request
completes any waiter would wake up and re-check their condition. It's
nicer to keep a separate AioWait object for each condition instead.
Please see "block/aio-wait.h" for details on the API.
The name AIO_WAIT_WHILE() avoids the confusion between AIO_POLL_WHILE()
and AioContext polling.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
1) string not null terminated in sysfs_find_group_file
2) NULL pointer dereference and dead local variable in nvme_init.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20180213015240.9352-1-famz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Currently only file backed memory backend can
be created with a "share" flag in order to allow
sharing guest RAM with other processes in the host.
Add the "share" flag also to RAM Memory Backend
in order to allow remapping parts of the guest RAM
to different host virtual addresses. This is needed
by the RDMA devices in order to remap non-contiguous
QEMU virtual addresses to a contiguous virtual address range.
Moved the "share" flag to the Host Memory base class,
modified phys_mem_alloc to include the new parameter
and a new interface memory_region_init_ram_shared_nomigrate.
There are no functional changes if the new flag is not used.
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Applied using the Coccinelle semantic patch scripts/coccinelle/use_osdep.cocci
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Check for the presence of posix_memalign() in the configure script,
not using "defined(_POSIX_C_SOURCE) && !defined(__sun__)". This
lets qemu use posix_memalign() on NetBSD versions that have it,
instead of falling back to valloc() which is wasteful when the
required alignment is smaller than a page.
Signed-off-by: Andreas Gustafsson <gson@gson.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Kamil Rytarowski <n54@gmx.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
This cleanup makes the number of objects depending on qapi/qmp/qdict.h
drop from 4550 (out of 4743) to 368 in my "build everything" tree.
For qapi/qmp/qobject.h, the number drops from 4552 to 390.
While there, separate #include from file comment with a blank line.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180201111846.21846-13-armbru@redhat.com>
This cleanup makes the number of objects depending on qapi/qmp/qlist.h
drop from 4551 (out of 4743) to 16 in my "build everything" tree.
While there, separate #include from file comment with a blank line.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180201111846.21846-12-armbru@redhat.com>
The macro expansions of qdict_put_TYPE() and qlist_append_TYPE() need
qbool.h, qnull.h, qnum.h and qstring.h to compile. We include qnull.h
and qnum.h in the headers, but not qbool.h and qstring.h. Works,
because we include those wherever the macros get used.
Open-coding these helpers is of dubious value. Turn them into
functions and drop the includes from the headers.
This cleanup makes the number of objects depending on qapi/qmp/qnum.h
from 4551 (out of 4743) to 46 in my "build everything" tree. For
qapi/qmp/qnull.h, the number drops from 4552 to 21.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180201111846.21846-10-armbru@redhat.com>
qapi/qmp/types.h is a convenience header to include a number of
qapi/qmp/ headers. Since we rarely need all of the headers
qapi/qmp/types.h includes, we bypass it most of the time. Most of the
places that use it don't need all the headers, either.
Include the necessary headers directly, and drop qapi/qmp/types.h.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180201111846.21846-9-armbru@redhat.com>
This cleanup makes the number of objects depending on qapi/error.h
drop from 1910 (out of 4743) to 1612 in my "build everything" tree.
While there, separate #include from file comment with a blank line,
and drop a useless comment on why qemu/osdep.h is included first.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180201111846.21846-5-armbru@redhat.com>
[Semantic conflict with commit 34e304e975 resolved, OSX breakage fixed]
This is a library to manage the host vfio interface, which could be used
to implement userspace device driver code in QEMU such as NVMe or net
controllers.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20180116060901.17413-3-famz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
qemu_co_queue_next does not need to release and re-acquire the mutex,
because the queued coroutine does not run immediately. However, this
does not hold for qemu_co_enter_next. Now that qemu_co_queue_wait
can synchronize (via QemuLockable) with code that is not running in
coroutine context, it's important that code using qemu_co_enter_next
can easily use a standardized locking idiom.
First of all, qemu_co_enter_next must use aio_co_wake to restart the
coroutine. Second, the function gains a second argument, a QemuLockable*,
and the comments of qemu_co_queue_next and qemu_co_queue_restart_all
are adjusted to clarify the difference.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20180203153935.8056-5-pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
There are cases in which a queued coroutine must be restarted from
non-coroutine context (with qemu_co_enter_next). In this cases,
qemu_co_enter_next also needs to be thread-safe, but it cannot use
a CoMutex and so cannot qemu_co_queue_wait. Use QemuLockable so
that the CoQueue can interchangeably use CoMutex or QemuMutex.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20180203153935.8056-4-pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
The .count of HBitmap is forgot to set in function
hbitmap_deserialize_finish, let's set it to the right value.
Cc: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Cc: Fam Zheng <famz@redhat.com>
Cc: Max Reitz <mreitz@redhat.com>
Cc: John Snow <jsnow@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Weiping Zhang <zhangweiping@didichuxing.com>
Signed-off-by: Liang Li <liliangleo@didichuxing.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20180118131308.GA2181@liangdeMacBook-Pro.local
Signed-off-by: John Snow <jsnow@redhat.com>
Linux commit 749df87bd7bee5a79cef073f5d032ddb2b211de8 (v4.14-rc1)
added a new flag MFD_HUGETLB to memfd_create() that specify the file
to be created resides in the hugetlbfs filesystem. This is the
generic hugetlbfs filesystem not associated with any specific mount
point.
hugetlbfs does not support sealing operations in v4.14, therefore
specifying MFD_ALLOW_SEALING with MFD_HUGETLB will result in EINVAL.
However, I added sealing support in "[PATCH v3 0/9] memfd: add sealing
to hugetlb-backed memory" series, queued in -mm tree for v4.16.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180201132757.23063-3-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This will allow callers to silence error report when the call is
allowed to failed.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180201132757.23063-2-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It helps ASAN to detect more leaks on coroutine stacks, and to get rid
of some extra warnings.
Before:
tests/test-coroutine -p
/basic/lifecycle
/basic/lifecycle: ==20781==WARNING: ASan doesn't fully support
makecontext/swapcontext functions and may produce false positives in
some cases!
==20781==WARNING: ASan is ignoring requested __asan_handle_no_return:
stack top: 0x7ffcb184d000; bottom 0x7ff6c4cfd000; size: 0x0005ecb50000
(25446121472)
False positive error reports may follow
For details see https://github.com/google/sanitizers/issues/189
OK
After:
tests/test-coroutine -p /basic/lifecycle
/basic/lifecycle: ==21110==WARNING: ASan doesn't fully support
makecontext/swapcontext functions and may produce false positives in
some cases!
OK
A similar work would need to be done for sigaltstack & windows fibers
to have similar coverage. Since ucontext is preferred, I didn't bother
checking the other coroutine implementations for now.
Update travis to fix the build with ASAN annotations.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180116151152.4040-4-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The inet_parse() function looks for 'ipv4' and 'ipv6' flags, but only
treats them as bare bool flags. The normal QemuOpts parsing would allow
on/off values to be set too.
This updates inet_parse() so that its handling of the 'ipv4' and 'ipv6'
flags matches that done by QemuOpts.
This impacts the NBD block driver parsing the legacy filename syntax and
the migration code parsing the socket scheme.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <20180125171412.21627-1-berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Since commit e5dc1a6c6c, QEMU aborts on exit if completion was used
in the monitor:
*** Error in `obj/ppc64-softmmu/qemu-system-ppc64': double free or
corruption (fasttop): 0x00000100331069d0 ***
/home/greg/Work/qemu/qemu-spapr/util/readline.c:514
/home/greg/Work/qemu/qemu-spapr/monitor.c:586
/home/greg/Work/qemu/qemu-spapr/monitor.c:4125
argv=<optimized out>, envp=<optimized out>) at
/home/greg/Work/qemu/qemu-spapr/vl.c:4795
Completion strings are not persistent accross completions (why would
they?). They are allocated under readline_completion(), which already
takes care of freeing them before returning.
Maybe all completion related bits should be moved out of ReadLineState
to a dedicated structure ?
In the meantime, let's drop the offending lines from readline_free()
to fix the crash.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <151627206353.4505.4602428849861610759.stgit@bahia.lan>
Fixes: e5dc1a6c6c
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJaZyzMAAoJEH8JsnLIjy/WSOYP/jHffS2fOHtdLQU42G76HN0K
jkhSUE8cuKFzgxuJDJhv10AGsGZvDIaRhuumPIFArQkcEwsFDfd0UqzC5GkCnhfn
6frVPsRSUp9BqXqha1+6vOgVRobdBXPpS25ERfanTsbu3aPEDRnGpxmpMvyyimft
BTUnWNCg2lM6bojXrC6oy7MqUdi9p9PviMcQAfnN07SmGa+s6tS2Jc9znvZwgL06
o+oPukWVTAiub5qcH18BLA3T8xcCXWANdY9pUnNj7mXHoxg3kYzzYBArYDh6Kyju
BkSEML1kNcUACFAZ+LSqQpnoc8/5cP+jY5cOBGtUUgjZSns/xnAZxALltds0I4m3
fqQM68oOTX7squAYAaKYVYMirime6aa2OAn2afxPJildPp8uH4lNust95yiUyyJQ
oqA3zfAnP5FfmTnzjLG7smYlRUlcHp8eMPyOKHxp3BuqTMbWY5KQETyDMk3QVnZr
7fSFIdT4sRTdroKXUKHHu3RLFyCo77EBovxY2oUtt6v43qxQhLx0IFwW6jHrcLK9
ifLOr1CqdgwH/OU7h6rzoLcGLX5/eOTxwcCbU0kP2cx4E60VBXmSaDq9TiwQhbeV
4HteS+EP6R0WpCiAvsFl2aUd6iwDRHeYt0aKpYyUuTVrW2mfmLPaQP+tJLjEoaHF
H5HlbNWy2gFAB2uQtmOd
=gjNZ
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches
# gpg: Signature made Tue 23 Jan 2018 12:38:36 GMT
# gpg: using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream: (29 commits)
iotests: Disable some tests for compat=0.10
iotests: Split 177 into two parts for compat=0.10
iotests: Make 059 pass on machines with little RAM
iotests: Filter compat-dependent info in 198
iotests: Make 191 work with qcow2 options
iotests: Make 184 image-less
iotests: Make 089 compatible with compat=0.10
iotests: Fix 067 for compat=0.10
iotests: Fix 059's reference output
iotests: Fix 051 for compat=0.10
iotests: Fix 020 for vmdk
iotests: Skip 103 for refcount_bits=1
iotests: Forbid 020 for non-file protocols
iotests: Drop format-specific in _filter_img_info
iotests: Fix _img_info for backslashes
block/vmdk: Add blkdebug events
block/qcow: Add blkdebug events
qcow2: No persistent dirty bitmaps for compat=0.10
block/vmdk: Fix , instead of ; at end of line
qemu-iotests: Fix locking issue in 102
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This reverts commit f87d72f5c5 as that is
part of a patchset reported to break cleanup and migration.
Cc: Gal Hammer <ghammer@redhat.com>
Cc: Sitong Liu <siliu@redhat.com>
Cc: Xiaoling Gao <xiagao@redhat.com>
Suggested-by: Greg Kurz <groug@kaod.org>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Reported-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Reported-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
We could hit lock failure if there is a signal that makes fcntl return
-1 and errno set to EINTR. In this case we should retry.
Cc: qemu-stable@nongnu.org
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add a function to only create a memfd, without mmap. The function is
used in the following memory backend.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20171023141815.17709-2-marcandre.lureau@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>