This makes sense when the next commit starts using the extra buffer space
to perform many I/O operations asynchronously.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The desired granularity may be very different depending on the kind of
operation (e.g. continuous replication vs. collapse-to-raw) and whether
the VM is expected to perform lots of I/O while mirroring is in progress.
Allow the user to customize it, while providing a sane default so that
in general there will be no extra allocated space in the target compared
to the source.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This is needed in the following patch.
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This actually uses the dirty bitmap in the block layer, and converts
mirroring to use an HBitmapIter.
Reviewed-by: Laszlo Ersek <lersek@redhat.com> (except block/mirror.c parts)
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
HBitmaps provides an array of bits. The bits are stored as usual in an
array of unsigned longs, but HBitmap is also optimized to provide fast
iteration over set bits; going from one bit to the next is O(logB n)
worst case, with B = sizeof(long) * CHAR_BIT: the result is low enough
that the number of levels is in fact fixed.
In order to do this, it stacks multiple bitmaps with progressively coarser
granularity; in all levels except the last, bit N is set iff the N-th
unsigned long is nonzero in the immediately next level. When iteration
completes on the last level it can examine the 2nd-last level to quickly
skip entire words, and even do so recursively to skip blocks of 64 words or
powers thereof (32 on 32-bit machines).
Given an index in the bitmap, it can be split in group of bits like
this (for the 64-bit case):
bits 0-57 => word in the last bitmap | bits 58-63 => bit in the word
bits 0-51 => word in the 2nd-last bitmap | bits 52-57 => bit in the word
bits 0-45 => word in the 3rd-last bitmap | bits 46-51 => bit in the word
So it is easy to move up simply by shifting the index right by
log2(BITS_PER_LONG) bits. To move down, you shift the index left
similarly, and add the word index within the group. Iteration uses
ffs (find first set bit) to find the next word to examine; this
operation can be done in constant time in most current architectures.
Setting or clearing a range of m bits on all levels, the work to perform
is O(m + m/W + m/W^2 + ...), which is O(m) like on a regular bitmap.
When iterating on a bitmap, each bit (on any level) is only visited
once. Hence, The total cost of visiting a bitmap with m bits in it is
the number of bits that are set in all bitmaps. Unless the bitmap is
extremely sparse, this is also O(m + m/W + m/W^2 + ...), so the amortized
cost of advancing from one bit to the next is usually constant.
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We can provide fast versions based on the other functions defined
by host-utils.h. Some care is required on glibc, which provides
ffsl already.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
# By Juan Quintela (7) and Paolo Bonzini (6)
# Via Juan Quintela
* quintela/thread.next:
migration: remove argument to qemu_savevm_state_cancel
migration: Only go to the iterate stage if there is anything to send
migration: unfold rest of migrate_fd_put_ready() into thread
migration: move exit condition to migration thread
migration: Add buffered_flush error handling
migration: move beginning stage to the migration thread
qemu-file: Only set last_error if it is not already set
migration: fix off-by-one in buffered_rate_limit
migration: remove double call to migrate_fd_close
migration: make function static
use XFER_LIMIT_RATIO consistently
Protect migration_bitmap_sync() with the ramlist lock
Unlock ramlist lock also in error case
# By Kevin Wolf (4) and others
# Via Stefan Hajnoczi
* stefanha/block:
dataplane: support viostor virtio-pci status bit setting
dataplane: avoid reentrancy during virtio_blk_data_plane_stop()
win32-aio: use iov utility functions instead of open-coding them
win32-aio: Fix memory leak
win32-aio: Fix vectored reads
aio: Fix return value of aio_poll()
ide: Remove wrong assertion
block: fix null-pointer bug on error case in block commit
s390x-linux-user now also uses GETPC. Instead of adding it to the list of
targets which use GETPC, the macro is now defined unconditionally.
This avoids future build regressions like this one:
CC s390x-linux-user/target-s390x/int_helper.o
cc1: warnings being treated as errors
qemu/target-s390x/int_helper.c: In function ‘helper_divs32’:
qemu/target-s390x/int_helper.c:47: error: implicit declaration of function ‘GETPC’
qemu/target-s390x/int_helper.c:47: error: nested extern declaration of ‘GETPC’
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Commit c64ca8140e (cpu: Move
queued_work_{first,last} to CPUState) moved the qemu_work_item fields
away. Clean up the now unused prototype.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Code mixes uint32_t, int and size_t. Very unlikely to go wrong in
practice, but clean it up anyway.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
# By Wenchao Xia
# Via Luiz Capitulino
* luiz/queue/qmp:
HMP: add sub command table to info
HMP: move define of mon_cmds
HMP: add infrastructure for sub command
HMP: delete info handler
HMP: add QDict to info callback handler
Add a documentation section "Methods" and discuss among others how to
handle overriding virtual methods.
Clarify DeviceClass::realize documentation and refer to the above.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Reviewed-by: Eric Blake <eblake@redhat.com>
This patch change all info call back function to take
additional QDict * parameter, which allow those command
take parameter. Now it is set to NULL at default case.
Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
aio_poll() must return true if any work is still pending, even if it
didn't make progress, so that bdrv_drain_all() doesn't stop waiting too
early. The possibility of stopping early occasionally lead to a failed
assertion in bdrv_drain_all(), when some in-flight request was missed
and the function didn't really drain all requests.
In order to make that change, the return value as specified in the
function comment must change for blocking = false; fortunately, the
return value of blocking = false callers is only used in test cases, so
this change shouldn't cause any trouble.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
OpenBSD system compiler (gcc 4.2.1) has problems with concatenation
of macro arguments in macro functions:
CC aes.o
In file included from /src/qemu/include/qemu-common.h:126,
from /src/qemu/aes.c:30:
/src/qemu/include/qemu/bswap.h: In function 'leul_to_cpu':
/src/qemu/include/qemu/bswap.h:461: warning: implicit declaration of function 'bswapHOST_LONG_BITS'
/src/qemu/include/qemu/bswap.h:461: warning: nested extern declaration of 'bswapHOST_LONG_BITS'
Function leul_to_cpu() is only used in kvm-all.c, so the warnings
are not fatal on OpenBSD without -Werror.
Fix by applying glue(). Also add do {} while(0) wrapping and fix
semicolon use while at it.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
qemu_chr_new_from_opts handles QemuOpts release now, so callers don't
have to worry. It will either be saved in CharDriverState, then
released in qemu_chr_delete, or in the error case released instantly.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
A usage with a hardcoded partial path such as
object_resolve_path_component(obj, "foo")
is totally valid but currently leads to a compilation error. Fix this.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Any KVM-specific code that use these constants must check if
kvm_enabled() is true before using them.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Move the declaration to qemu/cpu.h and add documentation.
The implementation still depends on CPUArchState for CPU iteration.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Note that target-alpha accesses this field from TCG, now using a
negative offset. Therefore the field is placed last in CPUState.
Pass PowerPCCPU to [kvm]ppc_fixup_cpu() to facilitate this change.
Move common parts of mips cpu_state_reset() to mips_cpu_reset().
Acked-by: Richard Henderson <rth@twiddle.net> (for alpha)
[AF: Rebased onto ppc CPU subclasses and openpic changes]
Signed-off-by: Andreas Färber <afaerber@suse.de>
To facilitate the field movements, pass MIPSCPU to malta_mips_config();
avoid that for mips_cpu_map_tc() since callers only access MIPS Thread
Contexts, inside TCG helpers.
Signed-off-by: Andreas Färber <afaerber@suse.de>
The qiov_is_aligned() function checks whether a QEMUIOVector meets a
BlockDriverState's alignment requirements. This is needed by
virtio-blk-data-plane so:
1. Move the function from block/raw-posix.c to block/block.c.
2. Make it public in block/block.h.
3. Rename to bdrv_qiov_is_aligned().
4. Change return type from int to bool.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We've now optimized the ld/st versions; reuse that for the "legacy"
versions. Always use inlines so that we get the type checking that
we expect.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Use the new host endian unaligned access functions instead of
open coding byte-by-byte references. Remove assembly special
cases for i386 and ppc -- we've now exposed the operation to
the compiler sufficiently for these to be optimized automatically.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Move the bswap_N -> bswapN wrappers inside CONFIG_BYTESWAP_H.
Change the ultimate fallback defintions from macros to inline functions.
The proper types recieved by the function arguments means we can remove
unnecessary casts, making the code more readable.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Fixes the libfdt enabled build for hosts that have <machine/bswap.h>.
The code at the beginning of qemu/bswap.h is attempting to standardize
on bswapN. In the case of CONFIG_MACHINE_BSWAP_H, this is all we get.
In the case of CONFIG_BYTESWAP_H, we get bswap_N from the system header
and then wrap these with inline functions to get bswapN.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Since 39bffca203 (qdev: register all
types natively through QEMU Object Model), TypeInfo as used in
the common, non-iterative pattern is no longer amended with information
and should therefore be const.
Fix the documented QOM examples:
sed -i 's/static TypeInfo/static const TypeInfo/g' include/qom/object.h
Since frequently the wrong examples are being copied by contributors of
new devices, fix all types in the tree:
sed -i 's/^static TypeInfo/static const TypeInfo/g' */*.c
sed -i 's/^static TypeInfo/static const TypeInfo/g' */*/*.c
This also avoids to piggy-back these changes onto real functional
changes or other refactorings.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Turn the *-user macro into a no-op inline function to avoid
unused-variable warnings and band-aiding #ifdef'ery.
This allows to drop an #ifdef for alpha and avoids more for unicore32
and other upcoming trivial realizefn implementations.
Suggested-by: Lluís Vilanova <vilanova@ac.upc.edu>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
This finally makes the CPU class a subclass of the Device class,
allowing us to start using DeviceState properties on CPU subclasses.
It has no_user=1, as creating CPUs using -device doesn't work yet.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
The qemu_iovec_concat() function copies a subset of a QEMUIOVector. The
new qemu_iovec_concat_iov() function does the same for a iov/cnt pair.
It is easy to define qemu_iovec_concat() in terms of
qemu_iovec_concat_iov(). The existing code is mostly unchanged, except
for the assertion src->size >= soffset, which cannot be efficiently
checked upfront on a iov/cnt pair. Instead we assert upon hitting the
end of src with an unsatisfied soffset.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The iov_discard_front/back() functions remove data from the front or
back of the vector. This is useful when peeling off header/footer
structs.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The raw_get_aio_fd() function allows virtio-blk-data-plane to get the
file descriptor of a raw image file with Linux AIO enabled. This
interface is really a layering violation that can be resolved once the
block layer is able to run outside the global mutex - at that point
virtio-blk-data-plane will switch from custom Linux AIO code to using
the block layer.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Disable the semaphores fallback code for OpenBSD as modern OpenBSD
releases now have sem_timedwait().
Signed-off-by: Brad Smith <brad@comstyle.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Avoid splitting the state of outgoing migration, more or less arbitrarily,
between two data structures. QEMUFileBuffered anyway is used only during
migration.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
This only moves the code (also from buffered_file.h to migration.h).
Fix whitespace until checkpatch is happy.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Code just now does (simplified for clarity)
if (qemu_savevm_state_iterate(s->file) == 1) {
vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
qemu_savevm_state_complete(s->file);
}
Problem here is that qemu_savevm_state_iterate() returns 1 when it
knows that remaining memory to sent takes less than max downtime.
But this means that we could end spending 2x max_downtime, one
downtime in qemu_savevm_iterate, and the other in
qemu_savevm_state_complete.
Changed code to:
pending_size = qemu_savevm_state_pending(s->file, max_size);
DPRINTF("pending size %lu max %lu\n", pending_size, max_size);
if (pending_size >= max_size) {
ret = qemu_savevm_state_iterate(s->file);
} else {
vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
qemu_savevm_state_complete(s->file);
}
So what we do is: at current network speed, we calculate the maximum
number of bytes we can sent: max_size.
Then we ask every save_live section how much they have pending. If
they are less than max_size, we move to complete phase, otherwise we
do an iterate one.
This makes things much simpler, because now individual sections don't
have to caluclate the bandwidth (it was implossible to do right from
there).
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Now that we have a thread, and blocking writes, we don't need it.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Move all the writes to the migration_thread, and make writings
blocking. Notice that are still using the iothread for everything
that we do.
Signed-off-by: Juan Quintela <quintela@redhat.com>
We want the file assignment to happen before the thread is created to
avoid locking, so we just do it before creating the thread.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Add the new mutex that protects shared state between ram_save_live
and the iothread. If the iothread mutex has to be taken together
with the ramlist mutex, the iothread shall always be _outside_.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Umesh Deshpande <udeshpan@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
This will be used to detect if last_block might have become invalid
across different calls to ram_save_live.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Umesh Deshpande <udeshpan@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Most of the time, only 2 items will be active (from/to for a string operation,
or code/data). But TCG guests likely won't have gigabytes of memory, so
this actually goes down to 1 item.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
* bonzini/header-dirs: (45 commits)
janitor: move remaining public headers to include/
hw: move executable format header files to hw/
fpu: move public header file to include/fpu
softmmu: move remaining include files to include/ subdirectories
softmmu: move include files to include/sysemu/
misc: move include files to include/qemu/
qom: move include files to include/qom/
migration: move include files to include/migration/
monitor: move include files to include/monitor/
exec: move include files to include/exec/
block: move include files to include/block/
qapi: move include files to include/qobject/
janitor: add guards to headers
qapi: make struct Visitor opaque
qapi: remove qapi/qapi-types-core.h
qapi: move inclusions of qemu-common.h from headers to .c files
ui: move files to ui/ and include/ui/
qemu-ga: move qemu-ga files to qga/
net: reorganize headers
net: move net.c to net/
...
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Move public headers to include/net, and leave private headers in net/.
Put the virtio headers in include/net/tap.h, removing the multiple copies
that existed. Leave include/net/tap.h as the interface for NICs, and
net/tap_int.h as the interface for OS-specific parts of the tap backend.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The formula to compute slice_quota was wrong since commit 6ef228fc.
Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Fix typos, whitespace and update comments to match current
implementation.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
It is not used anymore, and there is no need to make it public.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Store in the object the freeing function that will be used at deletion
time. This makes it possible to use object_delete on statically-allocated
(embedded) objects. Dually, it makes it possible to use object_unparent
and object_unref without leaking memory, when the lifetime of object
might extend until after the call to object_delete.
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Add an ObjectClass method that is done at object_unparent time. It
should remove any backlinks to the object in the composition tree,
so that object_delete will be able to drop the last reference and
free the object.
Use it for qdev buses.
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
The filename can be overridden but it expects a non-blocking source of entropy.
A typical invocation would be:
qemu -object rng-random,id=rng0 -device virtio-rng-pci,rng=rng0
This can also be used with /dev/urandom by using the command line:
qemu -object rng-random,filename=/dev/urandom,id=rng0 \
-device virtio-rng-pci,rng=rng0
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
---
v1 -> v2
- merged header split patch into this one
v2 -> v3
- bug fix in rng-random (Paolo)
For target-mips also change the return type to bool.
Make include paths for cpu-qom.h consistent for alpha and unicore32.
Signed-off-by: Andreas Färber <afaerber@suse.de>
[AF: Updated new target-openrisc function accordingly]
Acked-by: Richard Henderson <rth@twiddle.net> (for alpha)