Their last user went away in commit f51074cdc6, "pci-hotplug-old: Has
been dead for five major releases, bury", v2.3.0. Remove them, as new
code should use QemuOpts or maybe keyval_parse() instead.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20171006131645.17729-1-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
opt was declared as a separate local inside the last loop,
shadowing the local at the top of the function.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20171005190725.18712-1-dgilbert@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
The AioContext pointer argument to co_aio_sleep_ns() is only used for
the sleep timer. It does not affect where the caller coroutine is
resumed.
Due to changes to coroutine and AIO APIs it is now possible to drop the
AioContext pointer argument. This is safe to do since no caller has
specific requirements for which AioContext the timer must run in.
This patch drops the AioContext pointer argument and renames the
function to simplify the API.
Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Reported-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20171109102652.6360-1-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The function searches for next zero bit.
Also add interface for BdrvDirtyBitmap and unit test.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20171012135313.227864-2-vsementsov@virtuozzo.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
exec: housekeeping (funny since 02d0e09503)
applied using ./scripts/clean-includes
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
SPARC Linux has an oddity that it insists that mmap()
of MAP_FIXED memory must be at an alignment defined by
SHMLBA, which is more aligned than the page size
(typically, SHMLBA alignment is to 16K, and pages are 8K).
This is a relic of ancient hardware that had cache
aliasing constraints, but even on modern hardware the
kernel still insists on the alignment.
To ensure that we get mmap() alignment sufficient to
make the kernel happy, change QEMU_VMALLOC_ALIGN,
qemu_fd_getpagesize() and qemu_mempath_getpagesize()
to use the maximum of getpagesize() and SHMLBA.
In particular, this allows 'make check' to pass on Sparc:
we were previously failing the ivshmem tests.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1512752248-17857-1-git-send-email-peter.maydell@linaro.org
If socket_listen_cleanup is passed an invalid FD, then querying the socket
local address will fail. We must thus be prepared for the returned addr to
be NULL
Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The previous patch fixed a race condition, in which there were
coroutines being executing doubly, or after coroutine deletion.
We can detect common scenarios when this happens, and print an error
message and abort before we corrupt memory / data, or segfault.
This patch will abort if an attempt to enter a coroutine is made while
it is currently pending execution, either in a specific AioContext bh,
or pending execution via a timer. It will also abort if a coroutine
is scheduled, before a prior scheduled run has occurred.
We cannot rely on the existing co->caller check for recursive re-entry
to catch this, as the coroutine may run and exit with
COROUTINE_TERMINATE before the scheduled coroutine executes.
(This is the scenario that was occurring and fixed in the previous
patch).
This patch also re-orders the Coroutine struct elements in an attempt to
optimize caching.
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
stat64_min_slow() and stat64_max_slow() compare the wrong way. This
makes iotest 136 fail with clang and -m32.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20171114232223.25207-1-mreitz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit b7a745d added a qemu_bh_cancel call to the completion function
as an optimization to prevent it from unnecessarily rescheduling itself.
This completion function is scheduled from worker_thread, after setting
the state of a ThreadPoolElement to THREAD_DONE.
This was considered to be safe, as the completion function restarts the
loop just after the call to qemu_bh_cancel. But, as this loop lacks a HW
memory barrier, the read of req->state may actually happen _before_ the
call, seeing it still as THREAD_QUEUED, and ending the completion
function without having processed a pending TPE linked at pool->head:
worker thread | I/O thread
------------------------------------------------------------------------
| speculatively read req->state
req->state = THREAD_DONE; |
qemu_bh_schedule(p->completion_bh) |
bh->scheduled = 1; |
| qemu_bh_cancel(p->completion_bh)
| bh->scheduled = 0;
| if (req->state == THREAD_DONE)
| // sees THREAD_QUEUED
The source of the misunderstanding was that qemu_bh_cancel is now being
used by the _consumer_ rather than the producer, and therefore now needs
to have acquire semantics just like e.g. aio_bh_poll.
In some situations, if there are no other independent requests in the
same aio context that could eventually trigger the scheduling of the
completion function, the omitted TPE and all operations pending on it
will get stuck forever.
[Added Sergio's updated wording about the HW memory barrier.
--Stefan]
Signed-off-by: Sergio Lopez <slp@redhat.com>
Message-id: 20171108063447.2842-1-slp@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
If we iterate over the full port range without successfully binding+listening
on the socket, we'll try the next address, whereupon we overwrite the slisten
file descriptor variable without closing it.
Rather than having two places where we open + close socket FDs on different
iterations of nested for loops, re-arrange the code to always open+close
within the same loop iteration.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This hunk should not have been merged but I forgot to remove it. Let's
remove it before it slips into a QEMU release.
¯\_(ツ)_/¯
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20171103154041.12617-1-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
NetBSD 8.0(beta) ships with KERN_PROC_PATHNAME in sysctl(2).
Older NetBSD versions can use argv[0] parsing fallback.
This code section is partly shared with FreeBSD.
Signed-off-by: Kamil Rytarowski <n54@gmx.com>
Message-id: 20171028194833.23858-1-n54@gmx.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
gcc warning:
/qemu/util/oslib-posix.c:304:11: error:
variable ‘addr’ might be clobbered by ‘longjmp’ or ‘vfork’
[-Werror=clobbered]
Fix also some related data types:
numpages, hpagesize are used as pointer offset.
Always use size_t for them and also for the derived
numpages_per_thread and size_per_thread.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Message-id: 20171016202912.1117-1-sw@weilnetz.de
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
If an offset of ports is specified to the inet_listen_saddr function(),
and two or more processes tries to bind from these ports at the same time,
occasionally more than one process may be able to bind to the same
port. The condition is detected by listen() but too late to avoid a failure.
This function is called by socket_listen() and used
by all socket listening code in QEMU, so all cases where any form of dynamic
port selection is used should be subject to this issue.
Add code to close and re-establish the socket when this
condition is observed, hiding the race condition from the user.
Also clean up some issues with error handling to allow more
accurate reporting of the cause of an error.
This has been developed and tested by means of the
test-listen unit test in the previous commit.
Enable the test for make check now that it passes.
Reviewed-by: Bhavesh Davda <bhavesh.davda@oracle.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Girish Moodalbail <girish.moodalbail@oracle.com>
Signed-off-by: Knut Omang <knut.omang@oracle.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Another refactoring step to prepare for fixing the problem
exposed with the test-listen test in the previous commit
Signed-off-by: Knut Omang <knut.omang@oracle.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
A refactoring step to prepare for the problem
exposed by the test-listen test in the previous commit.
Simplify and reorganize the IPv6 specific extra
measures and move it out of the for loop to increase
code readability. No semantic changes.
Signed-off-by: Knut Omang <knut.omang@oracle.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
These only depend on the host and therefore belong in the common
osdep, not in a target-dependent object.
While at it, query the host during an init constructor, which guarantees
the page size will be well-defined throughout the execution of the program.
Suggested-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Change qemu_config_parse() to return the number of config groups
in success and -EINVAL on error. This will allow callers of
qemu_config_parse() to check if something was really loaded from
the config file.
All existing callers of qemu_config_parse() and
qemu_read_config_file() only check if the return value was
negative, so the change shouldn't affect them.
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20171004025043.3788-2-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The only client of hbitmap_serialization_granularity() is dirty-bitmap's
bdrv_dirty_bitmap_serialization_align(). Keeping the two names consistent
is worthwhile, and the shorter name is more representative of what the
function returns (the required alignment to be used for start/count of
other serialization functions, where violating the alignment causes
assertion failures).
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
After iothread is enabled internally inside QEMU with GMainContext, we
may encounter this warning when destroying the iothread:
(qemu-system-x86_64:19925): GLib-CRITICAL **: g_source_remove_poll:
assertion '!SOURCE_DESTROYED (source)' failed
The problem is that g_source_remove_poll() does not allow to remove one
source from array if the source is detached from its owner
context. (peterx: which IMHO does not make much sense)
Fix it on QEMU side by avoid calling g_source_remove_poll() if we know
the object is during destruction, and we won't leak anything after all
since the array will be gone soon cleanly even with that fd.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-id: 20170928025958.1420-6-peterx@redhat.com
[peterx: write the commit message]
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJZylugAAoJEH8JsnLIjy/Wp1QP/jtwre8WwtlX3SZ82cyiwonc
fODcokF6iEsX7vs9vMr8lkbwF4mjxaU2Xf/knC98J/6rCxM8DRP/MW4BZttTxDI9
++U926cIHHzjtSDalcjfTKnJD7PikHbLXz3SJ+u9hPnIeGs56hybCU6iHrrIcOTs
YSMtD3eUQ8B8JaoegKgNyqkhvfhEdHlWFCq9/T2YEWwYMzGt1jvSTRqe3vr8vIu3
v7QKvh35gm965yaUTGpv7Ej7TOBTWOaYBo9D1TBKNEZOFTBjqyldciyBoDhCF2P9
+4EsNTkZ7u20Rko41dzsZPzhjG10gm/lNZNj3Cul9ta4kQ0hGeOjEujd9L9ANTGl
gwnPPHKwgax5O+ctCPmrU7yHG+XIQD3gckC69qQeRXnPYQ4Jeo/LqhwjU+FcZfHs
97Lz6CHQHgtBP9JJwBMtUp77HJ58KPnnWIxGkb9u2vm4CpvRFkMrx5ekmj//9klX
5niRiqkNdrkEUnu/FIXOXxSmwnlhedAGQNq7ALSoW95El1QCy8Mm0eKEvmHyvZzd
z2gSufLX6ynOaG4x5oY5eezKm6F4Hxwt+w8Svj9PHXIrmrEIop11LG5MVsDGDjyh
XKiLddEIVKTYGwffX0CGTLYA34m2uHPkVVrMOIEvni3G6byXkb10+4pBpuTu/O/h
wQPFraquH1I2B5YETsMa
=7Bt2
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches
# gpg: Signature made Tue 26 Sep 2017 14:52:32 BST
# gpg: using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream: (24 commits)
block/qcow2-bitmap: fix use of uninitialized pointer
qemu-iotests: add shrinking image test
qcow2: add shrink image support
qcow2: add qcow2_cache_discard
qemu-img: add --shrink flag for resize
iotests: fix 181: enable postcopy-ram capability on target
qemu-iotests: Test change-backing-file command
block: Fix permissions after bdrv_reopen()
block: reopen: Queue children after their parents
block: Base permissions on rw state after reopen
block: Add reopen queue to bdrv_check_perm()
block: Add reopen_queue to bdrv_child_perm()
qemu-io: Drop write permissions before read-only reopen
block: Clean up some bad code in the vvfat driver
block/throttle-groups.c: allocate RestartData on the heap
throttle: Assert that bkt->max is valid in throttle_compute_wait()
iotests: Print full path of bad output if mismatch
iotests: use virtio aliases for 067
iotests: use -ccw on s390x for 051
iotests: use -ccw on s390x for 040, 139, and 182
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
If bkt->max == 0 and bkt->burst_length > 1 then we could have a
division by 0 in throttle_do_compute_wait(). That configuration is
however not permitted and is already detected by throttle_is_valid(),
but let's assert it in throttle_compute_wait() to make it explicit.
Found by Coverity (CID: 1381016).
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In qemu-thread-posix.c we have two implementations of the
various qemu_sem_* functions, one of which uses native POSIX
sem_* and the other of which emulates them with pthread conditions.
This is necessary because not all our host OSes support
sem_timedwait().
Instead of a hard-coded list of OSes which don't implement
sem_timedwait(), which gets out of date, make configure
test for the presence of the function and set a new
CONFIG_HAVE_SEM_TIMEDWAIT appropriately.
In particular, newer NetBSDs have sem_timedwait(), so this
commit will switch them over to using it. OSX still does
not have an implementation.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Kamil Rytarowski <n54@gmx.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Provide helpers to convert bitmaps to little endian format. It can be
used when we want to send one bitmap via network to some other hosts.
One thing to mention is that, these helpers only solve the problem of
endianess, but it does not solve the problem of different word size on
machines (the bitmaps managing same count of bits may contains different
size when malloced). So we need to take care of the size alignment issue
on the callers for now.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Count how many bits set in the bitmap.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
We have BIT_WORD(). It's the same.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Convert any remaining uses of fprintf(stderr, "warning:"...
to use warn_report() instead. This helps standardise on a single
method of printing warnings to the user.
All of the warnings were changed using this command:
find ./* -type f -exec sed -i 's|fprintf(.*".*warning[,:] |warn_report("|Ig' {} +
The #include lines and chagnes to the test Makefile were manually
updated to allow the code to compile.
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Message-Id: <2c94ac3bb116cc6b8ebbcd66a254920a69665515.1503077821.git.alistair.francis@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Convert all the multi-line uses of fprintf(stderr, "warning:"..."\n"...
to use warn_report() instead. This helps standardise on a single
method of printing warnings to the user.
All of the warnings were changed using these commands:
find ./* -type f -exec sed -i \
'N; {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N; {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N; {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N; {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' \
{} +
Indentation fixed up manually afterwards.
Some of the lines were manually edited to reduce the line length to below
80 charecters. Some of the lines with newlines in the middle of the
string were also manually edit to avoid checkpatch errrors.
The #include lines were manually updated to allow the code to compile.
Several of the warning messages can be improved after this patch, to
keep this patch mechanical this has been moved into a later patch.
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Max Reitz <mreitz@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Anthony Perard <anthony.perard@citrix.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Yongbok Kim <yongbok.kim@imgtec.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Alexander Graf <agraf@suse.de>
Cc: Jason Wang <jasowang@redhat.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <5def63849ca8f551630c6f2b45bcb1c482f765a6.1505158760.git.alistair.francis@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
util/scsi.c includes some SCSI code that is shared by block/iscsi.c and
hw/scsi, but the introduction of the persistent reservation helper
will add many more instances of this. There is also include/block/scsi.h,
which actually is not part of the core block layer.
The persistent reservation manager will also need a home. A scsi/
directory provides one for both the aforementioned shared code and
the PR manager code.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This recognizes the "fixed" and "descriptor" format sense data, extracts
the sense key/asc/ascq fields then converts them to an errno.
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170821141008.19383-4-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Tweak the errno mapping to return more accurate/appropriate values.
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170821141008.19383-3-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
So that it can be reused outside of iscsi.c.
Also update MAINTAINERS to include the new files in SCSI section.
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170821141008.19383-2-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Nobody has mentioned AIX host support on the mailing list for years,
and we have no test systems for it so it is most likely broken.
We've advertised in configure for two releases now that we plan
to drop support for this host OS, and have had no complaints.
Drop the AIX host support code.
We can also drop the now-unused AIX version of sys_cache_info().
Note that the _CALL_AIX define used in the PPC tcg backend is
also used for Linux PPC64, and so that code should not be removed.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1504545540-8002-1-git-send-email-peter.maydell@linaro.org
ThrottleGroup is converted to an object. This will allow the future
throttle block filter drive easy creation and configuration of throttle
groups in QMP and cli.
A new QAPI struct, ThrottleLimits, is introduced to provide a shared
struct for all throttle configuration needs in QMP.
ThrottleGroups can be created via CLI as
-object throttle-group,id=foo,x-iops-total=100,x-..
where x-* are individual limit properties. Since we can't add non-scalar
properties in -object this interface must be used instead. However,
setting these properties must be disabled after initialization because
certain combinations of limits are forbidden and thus configuration
changes should be done in one transaction. The individual properties
will go away when support for non-scalar values in CLI is implemented
and thus are marked as experimental.
ThrottleGroup also has a `limits` property that uses the ThrottleLimits
struct. It can be used to create ThrottleGroups or set the
configuration in existing groups as follows:
{ "execute": "object-add",
"arguments": {
"qom-type": "throttle-group",
"id": "foo",
"props" : {
"limits": {
"iops-total": 100
}
}
}
}
{ "execute" : "qom-set",
"arguments" : {
"path" : "foo",
"property" : "limits",
"value" : {
"iops-total" : 99
}
}
}
This also means a group's configuration can be fetched with qom-get.
Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The non-blocking connect mechanism is obsolete, and it doesn't
work well in inet connection, because it will call getaddrinfo
first and getaddrinfo will blocks on DNS lookups. Since commit
e65c67e4 & d984464e, the non-blocking connect of migration goes
through QIOChannel in a different manner(using a thread), and
nobody use this old non-blocking connect anymore.
Any newly written code which needs a non-blocking connect should
use the QIOChannel code, so we can drop NonBlockingConnectHandler
as a concept entirely.
Suggested-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The next commit will put it to use. May look pointless now, but we're
going to change the FOO_lookup's type, and then it'll help.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1503564371-26090-13-git-send-email-armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
If QEMU is running on a system that's out of memory and mmap()
fails, QEMU aborts with no error message at all, making it hard
to debug the reason for the failure.
Add perror() calls that will print error information before
aborting.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20170829212053.6003-1-ehabkost@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
LeakyBucket.burst_length is defined as an unsigned integer but the
code never checks for overflows and it only makes sure that the value
is not 0.
In practice this means that the user can set something like
throttling.iops-total-max-length=4294967300 despite being larger than
UINT_MAX and the final value after casting to unsigned int will be 4.
This patch changes the data type to uint64_t. This does not increase
the storage size of LeakyBucket, and allows us to assign the value
directly from qemu_opt_get_number() or BlockIOThrottle and then do the
checks directly in throttle_is_valid().
The value of burst_length does not have a specific upper limit,
but since the bucket size is defined by max * burst_length we have
to prevent overflows. Instead of going for UINT64_MAX or something
similar this patch reuses THROTTLE_VALUE_MAX, which allows I/O bursts
of 1 GiB/s for 10 days in a row.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: 1b2e3049803f71cafb2e1fa1be4fb47147a0d398.1503580370.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Both the throttling limits set with the throttling.iops-* and
throttling.bps-* options and their QMP equivalents defined in the
BlockIOThrottle struct are integer values.
Those limits are also reported in the BlockDeviceInfo struct and they
are integers there as well.
Therefore there's no reason to store them internally as double and do
the conversion everytime we're setting or querying them, so this patch
uses uint64_t for those types. Let's also use an unsigned type because
we don't allow negative values anyway.
LeakyBucket.level and LeakyBucket.burst_level do however remain double
because their value changes depending on the fraction of time elapsed
since the previous I/O operation.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: f29b840422767b5be2c41c2dfdbbbf6c5f8fedf8.1503580370.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The throttling code can change internally the value of bkt->max if it
hasn't been set by the user. The problem with this is that if we want
to retrieve the original value we have to undo this change first. This
is ugly and unnecessary: this patch removes the throttle_fix_bucket()
and throttle_unfix_bucket() functions completely and moves the logic
to throttle_compute_wait().
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
Message-id: 5b0b9e1ac6eb208d709eddc7b09e7669a523bff3.1503580370.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Use a pointer to the bucket instead of repeating cfg->buckets[i] all
the time. This makes the code more concise and will help us expand the
checks later and save a few line breaks.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: 763ffc40a26b17d54cf93f5a999e4656049fcf0c.1503580370.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The way the throttling algorithm works is that requests start being
throttled once the bucket level exceeds the burst limit. When we get
there the bucket leaks at the level set by the user (bkt->avg), and
that leak rate is what prevents guest I/O from exceeding the desired
limit.
If we don't allow bursts (i.e. bkt->max == 0) then we can start
throttling requests immediately. The problem with keeping the
threshold at 0 is that it only allows one request at a time, and as
soon as there's a bit of I/O from the guest every other request will
be throttled and performance will suffer considerably. That can even
make the guest unable to reach the throttle limit if that limit is
high enough, and that happens regardless of the block scheduler used
by the guest.
Increasing that threshold gives flexibility to the guest, allowing it
to perform short bursts of I/O before being throttled. Increasing the
threshold too much does not make a difference in the long run (because
it's the leak rate what defines the actual throughput) but it does
allow the guest to perform longer initial bursts and exceed the
throttle limit for a short while.
A burst value of bkt->avg / 10 allows the guest to perform 100ms'
worth of I/O at the target rate without being throttled.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: 31aae6645f0d1fbf3860fb2b528b757236f0c0a7.1503580370.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Build time check of OFD lock is not sufficient and can cause image open
errors when the runtime environment doesn't support it.
Add a helper function to probe it at runtime, additionally. Also provide
a qemu_has_ofd_lock() for callers to check the status.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This reverts commit a59629fcc6.
This is not needed anymore because the IOThread mutex is not
"magic" anymore (need not kick the CPU thread)and also because
fork callbacks are only enabled at the very beginning of
QEMU's execution.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Because of -daemonize, system mode QEMU sometimes needs to fork() and
keep RCU enabled in the child. However, there is a possible deadlock
with synchronize_rcu:
- the CPU thread is inside a RCU critical section and wants to take
the BQL in order to do MMIO
- the monitor thread, which is owning the BQL, calls rcu_init_lock
which tries to take the rcu_sync_lock
- the call_rcu thread has taken rcu_sync_lock in synchronize_rcu, but
synchronize_rcu needs the CPU thread to end the critical section
before returning.
This cannot happen for user-mode emulation, because it does not have
a BQL.
To fix it, assume that system mode QEMU only forks in preparation for
exec (except when daemonizing) and disable pthread_atfork as soon as
the double fork has happened.
Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Tested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
With the move of some docs/ to docs/devel/ on ac06724a71,
no references were updated.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Commit 97f4030 changed warning messages from
timestamp-if-enabled progname ":" location "warning: " message
to
"warning: " timestamp-if-enabled progname ":" location message
This regressed qemu-iotests 051. Put "warning: " right back where it
was, along with "info: ".
Reported-by: Kevin Wolf <kwolf@redhat.com>
Cc: Alistair Francis <alistair.francis@xilinx.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1500449614-16811-1-git-send-email-armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Clang 3.9 passes the CONFIG_AVX2_OPT configure test. However, the
supplied <cpuid.h> does not contain the bit_AVX2 define that we use
when detecting whether the routine can be enabled.
Introduce a qemu-specific header that uses the compiler's definition
of __cpuid et al, but supplies any missing bit_* definitions needed.
This avoids introducing any extra ifdefs to util/bufferiszero.c, and
allows quite a few to be removed from tcg/i386/tcg-target.inc.c.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20170719044018.18063-1-rth@twiddle.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
On NetBSD the compiler warns:
util/oslib-posix.c: In function 'sigaction_invoke':
util/oslib-posix.c:589:5: warning: missing braces around initializer [-Wmissing-braces]
siginfo_t si = { 0 };
^
util/oslib-posix.c:589:5: warning: (near initialization for 'si.si_pad') [-Wmissing-braces]
because on this platform siginfo_t is defined as
typedef union siginfo {
char si_pad[128]; /* Total size; for future expansion */
struct _ksiginfo _info;
} siginfo_t;
Avoid this warning by initializing the struct with {} instead;
this is a GCC extension but we use it all over the codebase already.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1500568341-8389-1-git-send-email-peter.maydell@linaro.org
This include was forgotten when splitting cacheinfo.c out of
tcg/ppc/tcg-target.inc.c (see commit b255b2c8).
For a Centos7 host, the include path
<signal.h>
<bits/sigcontext.h>
<asm/sigcontext.h>
<asm/elf.h>
<asm/auxvec.h>
implicitly pulls in the desired AT_* defines.
Not so for Debian Jessie.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20170711015524.22936-1-f4bug@amsat.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
throttle_config() cancels the timers of the calling BlockBackend. This
doesn't make sense because other BlockBackends in the group remain
untouched. There's no need to cancel the timers in the one specific
BlockBackend so let's not do that. Throttled requests will run as
scheduled and future requests will follow the new configuration. This
also allows a throttle group's configuration to be changed even when it
has no members.
Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Clock type in throttling is currently inferred by the ThrottleTimer's
clock type even though it is a per-ThrottleGroup property; it doesn't
make sense to have different clock types in the same group. Moving this
to a field in ThrottleGroup can simplify some of the throttle functions.
Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
These functions are more efficient in the presence of contention.
qemu_co_rwlock_downgrade also guarantees not to block, which may
be useful in some algorithms too.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170629132749.997-3-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Currently if you disable listening on IPv4 addresses, via the
CLI flag ipv4=off, we still mistakenly accept IPv4 clients via
the IPv6 listener socket due to IPV6_V6ONLY flag being unset.
We must ensure IPV6_V6ONLY is always set if ipv4=off
This fixes the following scenarios
-incoming tcp::9000,ipv6=on
-incoming tcp:[::]:9000,ipv6=on
-chardev socket,id=cdev0,host=,port=9000,server,nowait,ipv4=off
-chardev socket,id=cdev0,host=,port=9000,server,nowait,ipv6=on
-chardev socket,id=cdev0,host=::,port=9000,server,nowait,ipv4=off
-chardev socket,id=cdev0,host=::,port=9000,server,nowait,ipv6=on
which all mistakenly accepted IPv4 clients
Acked-by: Gerd Hoffmann <kraxel@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When inet_parse() parses the hostname, it is forcing the
has_ipv6 && ipv6 flags if the address contains a ":". This
means that if the user had set the ipv4=on flag, to try to
restrict the listener to just ipv4, an error would not have
been raised. eg
-incoming tcp:[::]:9000,ipv4
should have raised an error because listening for IPv4
on "::" is a non-sensical combination. With this removed,
we now call getaddrinfo() on "::" passing PF_INET and
so getaddrinfo reports an error about the hostname being
incompatible with the requested protocol:
qemu-system-x86_64: -incoming tcp:[::]:9000,ipv4: address resolution
failed for :::9000: Address family for hostname not supported
Likewise it is explicitly setting the has_ipv4 & ipv4
flags when the address contains only digits + '.'. This
has no ill-effect, but also has no benefit, so is removed.
Acked-by: Gerd Hoffmann <kraxel@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When binding to an IPv6 socket we currently force the
IPV6_V6ONLY flag to off. This means that the IPv6 socket
will accept both IPv4 & IPv6 sockets when QEMU is launched
with something like
-vnc :::1
While this is good for that case, it is bad for other
cases. For example if an empty hostname is given,
getaddrinfo resolves it to 2 addresses 0.0.0.0 and ::,
in that order. We will thus bind to 0.0.0.0 first, and
then fail to bind to :: on the same port. The same
problem can happen if any other hostname lookup causes
the IPv4 address to be reported before the IPv6 address.
When we get an IPv6 bind failure, we should re-try the
same port, but with IPV6_V6ONLY turned on again, to
avoid clash with any IPv4 listener.
This ensures that
-vnc :1
will bind successfully to both 0.0.0.0 and ::, and also
avoid
-vnc :1,to=2
from mistakenly using a 2nd port for the :: listener.
This is a regression due to commit 396f935 "ui: add ability to
specify multiple VNC listen addresses".
Acked-by: Gerd Hoffmann <kraxel@gmail.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Implement warn_report_err() and warn_reportf_err() functions which
are the same as the error_report_err() and error_reportf_err()
functions except report a warning instead of an error.
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <276ff93eadc0b01b8243cc61ffc331f77922c0d0.1499866456.git.alistair.francis@xilinx.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Add warn_report(), warn_vreport() for reporting warnings, and
info_report(), info_vreport() for informational messages.
These are implemented them with a helper function factored out of
error_vreport(), suitably generalized. This patch makes no changes
to the output of the original error_report() function.
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <c89e9980019f296ec9aa38d7689ac4d5c369296d.1499866456.git.alistair.francis@xilinx.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Rename the error_print_loc() function in preparation for using it to
print warnings as well.
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <661b215695db878a0aef8401b506fb3da50e981a.1499866456.git.alistair.francis@xilinx.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Add bdrv_dirty_bitmap_deserialize_ones() function, which is needed for
qcow2 bitmap loading, to handle unallocated bitmap parts, marked as
all-ones.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20170628120530.31251-7-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Make dirty iter resistant to resetting bits in corresponding HBitmap.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20170628120530.31251-4-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Now that qcow & qcow2 are wired up to get encryption keys
via the QCryptoSecret object, nothing is relying on the
interactive prompting for passwords. All the code related
to password prompting can thus be ripped out.
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170623162419.26068-17-berrange@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Clang generates the following warning on aarch64 host:
CC util/cacheinfo.o
/home/pranith/qemu/util/cacheinfo.c:121:48: warning: value size does not match register size specified by the constraint and modifier [-Wasm-operand-widths]
asm volatile("mrs\t%0, ctr_el0" : "=r"(ctr));
^
/home/pranith/qemu/util/cacheinfo.c:121:28: note: use constraint modifier "w"
asm volatile("mrs\t%0, ctr_el0" : "=r"(ctr));
^~
%w0
Constraint modifier 'w' is not (yet?) accepted by gcc. Fix this by increasing the ctr size.
Tested-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Message-Id: <20170630153946.11997-1-bobby.prani@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Not all platforms check whether a lock is initialized before used. In
particular Linux seems to be more permissive than OSX.
Check initialization state explicitly in our code to catch such bugs
earlier.
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170704122325.25634-1-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The original ready < nhandles - 1 can be re-written as ready + 1 <
nhandles. The check was actually incorrect because
WAIT_OBJECT_0 was not subtracted from ready; it worked because
WAIT_OBJECT_0 is zero. After subtracting WAIT_OBJECT_0,
the result is the same condition that we are checking on the first
itteration of the for loop. This means we can remove the if statement
and let the for loop check the code.
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Message-Id: <a14083d681951f3999a0e9314605cb706381ae8d.1498756113.git.alistair.francis@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The 'sun_path' field in the sockaddr_un struct is not required
to be NUL termianted, so when reporting an error, we must use
the separate 'path' variable which is guaranteed terminated.
Fixes a bug spotted by coverity that was introduced in
commit ad9579aaa1
Author: Daniel P. Berrange <berrange@redhat.com>
Date: Thu May 25 16:53:00 2017 +0100
sockets: improve error reporting if UNIX socket path is too long
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <20170626103756.22974-1-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJZSRWrAAoJEDhwtADrkYZTVOQP/RK8br2A1Cn7LeVG6jnKz5hJ
OqyII77x8I2RachWvnwxQeHPMEVDuz3y2WjL+80s6peXlpR6y/13w7oX0f6aiJuo
+T9khTqMv2I7HsM5UCsXAJpFPHT7r90b4x8nstY80YLGe7lA7L6yk6PGyCxHThwA
mOiTKDw6/Xb/yZGrS2Favrun7juNpAs0Ec1IAkaA8xsEgVkd6tDv281rmHqvibl/
//90VfJp3nHFZ12FCQ1HzA42Eigtmo/fIk9LnAzBoYG0zw0cnzjuv0BNzs/JwuUZ
/VskeD1cViQ4yzFnPpjOavjYjTN854/JTJzm7gZ7dTQ6/l3ykoY6NDE8p1BLuHlC
p2RKkg20EeZlpOEtMQ4g6iyG6EUxaKcEiXmQ31LqN/LJwxTYbo5B5nCHMjrt4gxe
MqFBJQSNsJ7QjZ7Qa7pADMCi/G0m7/0dN8vBqSr4vcbLVvdbw/yb/9s33wXGrUj1
PyXM2ymi+vvSqcXtNXKshsJLxJSJxO1tm2tRIANDTabQ00yxs8dOYnQnbQFR94fp
6nrE2PnjZqgqk69aNDJEbngj6Tgx44nyTr1+Q17juZf9nTCE5QmBE1J0IRoykCJn
E8+T63ZxtIxVV2yLi5xBjmZaZtPyJRGGeUXunA10SuWrHzupEcBuhFhFYd2MFM5L
fsojALN2K3Gdx2+CmAo2
=O9Vv
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2017-06-09-v2' into staging
QAPI patches for 2017-06-09
# gpg: Signature made Tue 20 Jun 2017 13:31:39 BST
# gpg: using RSA key 0x3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg: aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653
* remotes/armbru/tags/pull-qapi-2017-06-09-v2: (41 commits)
tests/qdict: check more get_try_int() cases
console: use get_uint() for "head" property
i386/cpu: use get_uint() for "min-level"/"min-xlevel" properties
numa: use get_uint() for "size" property
pnv-core: use get_uint() for "core-pir" property
pvpanic: use get_uint() for "ioport" property
auxbus: use get_uint() for "addr" property
arm: use get_uint() for "mp-affinity" property
xen: use get_uint() for "max-ram-below-4g" property
pc: use get_uint() for "hpet-intcap" property
pc: use get_uint() for "apic-id" property
pc: use get_uint() for "iobase" property
acpi: use get_uint() for "pci-hole*" properties
acpi: use get_uint() for various acpi properties
acpi: use get_uint() for "acpi-pcihp-io*" properties
platform-bus: use get_uint() for "addr" property
bcm2835_fb: use {get, set}_uint() for "vcram-size" and "vcram-base"
aspeed: use {set, get}_uint() for "ram-size" property
pcihp: use get_uint() for "bsel" property
pc-dimm: make "size" property uint64
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We would like to use a same QObject type to represent numbers, whether
they are int, uint, or floats. Getters will allow some compatibility
between the various types if the number fits other representations.
Add a few more tests while at it.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20170607163635.17635-7-marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[parse_stats_intervals() simplified a bit, comment in
test_visitor_in_int_overflow() tidied up, suppress bogus warnings]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Add helpers to gather cache info from the host at init-time.
For now, only export the host's I/D cache line sizes, which we
will use to improve cache locality to avoid false sharing.
Suggested-by: Richard Henderson <rth@twiddle.net>
Suggested-by: Geert Martin Ijewski <gm.ijewski@web.de>
Tested-by: Geert Martin Ijewski <gm.ijewski@web.de>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1496794624-4083-1-git-send-email-cota@braap.org>
[rth: Move all implementations from tcg/ppc/]
Signed-off-by: Richard Henderson <rth@twiddle.net>
This module provides fast paths for 64-bit atomic operations on machines
that only have 32-bit atomic access.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170605123908.18777-11-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
The 'struct sockaddr_un' only allows 108 bytes for the socket
path.
If the user supplies a path, QEMU uses snprintf() to silently
truncate it when too long. This is undesirable because the user
will then be unable to connect to the path they asked for.
If the user doesn't supply a path, QEMU builds one based on
TMPDIR, but if that leads to an overlong path, it mistakenly
uses error_setg_errno() with a stale errno value, because
snprintf() does not set errno on truncation.
In solving this the code needed some refactoring to ensure we
don't pass 'un.sun_path' directly to any APIs which expect
NUL-terminated strings, because the path is not required to
be terminated.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <20170525155300.22743-1-berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Submission of requests on linux aio is a bit tricky and can lead to
requests completions on submission path:
44713c9e85 ("linux-aio: Handle io_submit() failure gracefully")
0ed93d84ed ("linux-aio: process completions from ioq_submit()")
That means that any coroutine which has been yielded in order to wait
for completion can be resumed from submission path and be eventually
terminated (freed).
The following use-after-free crash was observed when IO throttling
was enabled:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f5813dff700 (LWP 56417)]
virtqueue_unmap_sg (elem=0x7f5804009a30, len=1, vq=<optimized out>) at virtio.c:252
(gdb) bt
#0 virtqueue_unmap_sg (elem=0x7f5804009a30, len=1, vq=<optimized out>) at virtio.c:252
^^^^^^^^^^^^^^
remember the address
#1 virtqueue_fill (vq=0x5598b20d21b0, elem=0x7f5804009a30, len=1, idx=0) at virtio.c:282
#2 virtqueue_push (vq=0x5598b20d21b0, elem=elem@entry=0x7f5804009a30, len=<optimized out>) at virtio.c:308
#3 virtio_blk_req_complete (req=req@entry=0x7f5804009a30, status=status@entry=0 '\000') at virtio-blk.c:61
#4 virtio_blk_rw_complete (opaque=<optimized out>, ret=0) at virtio-blk.c:126
#5 blk_aio_complete (acb=0x7f58040068d0) at block-backend.c:923
#6 coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at coroutine-ucontext.c:78
(gdb) p * elem
$8 = {index = 77, out_num = 2, in_num = 1,
in_addr = 0x7f5804009ad8, out_addr = 0x7f5804009ae0,
in_sg = 0x0, out_sg = 0x7f5804009a50}
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
'in_sg' and 'out_sg' are invalid.
e.g. it is impossible that 'in_sg' is zero,
instead its value must be equal to:
(gdb) p/x 0x7f5804009ad8 + sizeof(elem->in_addr[0]) + 2 * sizeof(elem->out_addr[0])
$26 = 0x7f5804009af0
Seems 'elem' was corrupted. Meanwhile another thread raised an abort:
Thread 12 (Thread 0x7f57f2ffd700 (LWP 56426)):
#0 raise () from /lib/x86_64-linux-gnu/libc.so.6
#1 abort () from /lib/x86_64-linux-gnu/libc.so.6
#2 qemu_coroutine_enter (co=0x7f5804009af0) at qemu-coroutine.c:113
#3 qemu_co_queue_run_restart (co=0x7f5804009a30) at qemu-coroutine-lock.c:60
#4 qemu_coroutine_enter (co=0x7f5804009a30) at qemu-coroutine.c:119
^^^^^^^^^^^^^^^^^^
WTF?? this is equal to elem from crashed thread
#5 qemu_co_queue_run_restart (co=0x7f57e7f16ae0) at qemu-coroutine-lock.c:60
#6 qemu_coroutine_enter (co=0x7f57e7f16ae0) at qemu-coroutine.c:119
#7 qemu_co_queue_run_restart (co=0x7f5807e112a0) at qemu-coroutine-lock.c:60
#8 qemu_coroutine_enter (co=0x7f5807e112a0) at qemu-coroutine.c:119
#9 qemu_co_queue_run_restart (co=0x7f5807f17820) at qemu-coroutine-lock.c:60
#10 qemu_coroutine_enter (co=0x7f5807f17820) at qemu-coroutine.c:119
#11 qemu_co_queue_run_restart (co=0x7f57e7f18e10) at qemu-coroutine-lock.c:60
#12 qemu_coroutine_enter (co=0x7f57e7f18e10) at qemu-coroutine.c:119
#13 qemu_co_enter_next (queue=queue@entry=0x5598b1e742d0) at qemu-coroutine-lock.c:106
#14 timer_cb (blk=0x5598b1e74280, is_write=<optimized out>) at throttle-groups.c:419
Crash can be explained by access of 'co' object from the loop inside
qemu_co_queue_run_restart():
while ((next = QSIMPLEQ_FIRST(&co->co_queue_wakeup))) {
QSIMPLEQ_REMOVE_HEAD(&co->co_queue_wakeup, co_queue_next);
^^^^^^^^^^^^^^^^^^^^
on each iteration 'co' is accessed,
but 'co' can be already freed
qemu_coroutine_enter(next);
}
When 'next' coroutine is resumed (entered) it can in its turn resume
'co', and eventually free it. That's why we see 'co' (which was freed)
has the same address as 'elem' from the first backtrace.
The fix is obvious: use temporary queue and do not touch coroutine after
first qemu_coroutine_enter() is invoked.
The issue is quite rare and happens every ~12 hours on very high IO
and CPU load (building linux kernel with -j512 inside guest) when IO
throttling is enabled. With the fix applied guest is running ~35 hours
and is still alive so far.
Signed-off-by: Roman Pen <roman.penyaev@profitbricks.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170601160847.23720-1-roman.penyaev@profitbricks.com
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Fam Zheng <famz@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: qemu-devel@nongnu.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Assert that the return value is not an error. This issue was found by
Coverity.
CID: 1374831
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
CC: groug@kaod.org
CC: pbonzini@redhat.com
CC: Eric Blake <eblake@redhat.com>
Message-Id: <1494356693-13190-2-git-send-email-sstabellini@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Those are apparently unnecessary includes.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
mapped-file security mode (especially for the virtfs root).
-----BEGIN PGP SIGNATURE-----
iEYEABECAAYFAlktdVYACgkQAvw66wEB28LaTgCfS/bunOy4Wp+I+DO/Gx/5bfNp
7/IAn3qfJYBqRnpfz8KNKRZJ8QG/Bu5X
=KTTf
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/gkurz/tags/for-upstream' into staging
Various bugfixes and code cleanups. Most notably, it fixes metadata handling in
mapped-file security mode (especially for the virtfs root).
# gpg: Signature made Tue 30 May 2017 14:36:22 BST
# gpg: using DSA key 0x02FC3AEB0101DBC2
# gpg: Good signature from "Greg Kurz <groug@kaod.org>"
# gpg: aka "Greg Kurz <groug@free.fr>"
# gpg: aka "Greg Kurz <gkurz@linux.vnet.ibm.com>"
# gpg: aka "Gregory Kurz (Groug) <groug@free.fr>"
# gpg: aka "[jpeg image of size 3330]"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 2BD4 3B44 535E C0A7 9894 DBA2 02FC 3AEB 0101 DBC2
* remotes/gkurz/tags/for-upstream:
9pfs: local: metadata file for the VirtFS root
9pfs: local: simplify file opening
9pfs: local: resolve special directories in paths
9pfs: check return value of v9fs_co_name_to_path()
util: drop old utimensat() compat code
9pfs: assume utimensat() and futimens() are present
fsdev: fix virtfs-proxy-helper cwd
9pfs: local: fix unlink of alien files in mapped-file mode
9pfs: drop pdu_push_and_notify()
fsdev: don't allow unknown format in marshal/unmarshal
virtio-9p/xen-9p: move 9p specific bits to core 9p code
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Alternates are sum types like unions, but use the JSON type on the
wire / QType in QObject instead of an explicit tag. That's why we
require alternate members to have distinct QTypes.
The recently introduced keyval_parse() (commit d454dbe) can only
produce string scalars. The qobject_input_visitor_new_keyval() input
visitor mostly hides the difference, so code using a QObject input
visitor doesn't have to care whether its input was parsed from JSON or
KEY=VALUE,... The difference leaks for alternates, as noted in commit
0ee9ae7: a non-string, non-enum scalar alternate value can't currently
be expressed.
In part, this is just our insufficiently sophisticated implementation.
Consider alternate type 'GuestFileWhence'. It has an integer member
and a 'QGASeek' member. The latter is an enumeration with values
'set', 'cur', 'end'. The meaning of b=set, b=cur, b=end, b=0, b=1 and
so forth is perfectly obvious. However, our current implementation
falls apart at run time for b=0, b=1, and so forth. Fixable, but not
today; add a test case and a TODO comment.
Now consider an alternate type with a string and an integer member.
What's the meaning of a=42? Is it the string "42" or the integer 42?
Whichever meaning you pick makes the other inexpressible. This isn't
just an implementation problem, it's fundamental. Our current
implementation will pick string.
So far, we haven't needed such alternates. To make sure we stop and
think before we add one that cannot sanely work with keyval_parse(),
let's require alternate members to have sufficiently distinct
representation in KEY=VALUE,... syntax:
* A string member clashes with any other scalar member
* An enumeration member clashes with bool members when it has value
'on' or 'off'.
* An enumeration member clashes with numeric members when it has a
value that starts with '-', '+', or a decimal digit. This is a
rather lazy approximation of the actual number syntax accepted by
the visitor.
Note that enumeration values starting with '-' and '+' are rejected
elsewhere already, but better safe than sorry.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1495471335-23707-5-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Now that 9pfs and virtfs-proxy-helper have been converted to utimensat(),
we don't need to keep qemu_utimens() anymore.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
socket_address_flatten() leaks a SocketAddress when its argument is
null. Happens when opening a ChardevBackend of type 'udp' that is
configured without a local address. Screwed up in commit bd269ebc due
to last minute semantic conflict resolution. Spotted by Coverity.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1494866344-11013-1-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Moving the algorithm from print_type_size() into size_to_str() so that
other component can also leverage it. With that, refactor
print_type_size().
The assert() in that logic is removed though, since even UINT64_MAX
would not overflow.
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1494562661-9063-3-git-send-email-peterx@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>