Commit Graph

509 Commits

Author SHA1 Message Date
Fam Zheng
efd7556708 blockdev-backup: Don't move target AioContext if it's attached
If the BDS is attached, it will want to stay on the AioContext where its
BlockBackend is. Don't call bdrv_set_aio_context in this case.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1463969978-24970-3-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-07 14:40:50 +01:00
Fam Zheng
0d97891312 blockdev-backup: Use bdrv_lookup_bs on target
This allows backing up to a BDS that has not been attached to any BB.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1463969978-24970-2-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-07 14:40:50 +01:00
Kevin Wolf
5c438bc68c backup: Use BlockBackend for I/O
This changes the backup block job to use the job's BlockBackend for
performing its I/O. job->bs isn't used by the backup code any more
afterwards.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-05-25 19:04:21 +02:00
Kevin Wolf
e253f4b897 mirror: Use BlockBackend for I/O
This changes the mirror block job to use the job's BlockBackend for
performing its I/O. job->bs isn't used by the mirroring code any more
afterwards.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-05-25 19:04:21 +02:00
Kevin Wolf
b880481579 mirror: Allow target that already has a BlockBackend
We had to forbid mirroring to a target BDS that already had a BB
attached because the node swapping at job completion would add a second
BB and we didn't support multiple BBs on a single BDS at the time. Now
we do, so we can lift the restriction.

As we allow additional BlockBackends for the target, we must expect
other users to be sending requests. There may no requests be in flight
during the graph modification, so we have to drain those users now.

The core part of this patch is a revert of commit 40365552.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-05-25 19:04:21 +02:00
Max Reitz
109525ad6a block: Drop errp parameter from blk_new()
blk_new() cannot fail so its Error ** parameter has become superfluous.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-25 19:04:10 +02:00
Max Reitz
5b3639371c block: Make bdrv_open() return a BDS
There are no callers to bdrv_open() or bdrv_open_inherit() left that
pass a pointer to a non-NULL BDS pointer as the first argument of these
functions, so we can finally drop that parameter and just make them
return the new BDS.

Generally, the following pattern is applied:

    bs = NULL;
    ret = bdrv_open(&bs, ..., &local_err);
    if (ret < 0) {
        error_propagate(errp, local_err);
        ...
    }

by

    bs = bdrv_open(..., errp);
    if (!bs) {
        ret = -EINVAL;
        ...
    }

Of course, there are only a few instances where the pattern is really
pure.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-25 19:04:10 +02:00
Kevin Wolf
88be7b4be4 block: Fix bdrv_next() memory leak
The bdrv_next() users all leaked the BdrvNextIterator after completing
the iteration. Simply changing bdrv_next() to free the iterator before
returning NULL at the end of list doesn't work because some callers exit
the loop before looking at all BDSes.

This patch moves the BdrvNextIterator from the heap to the stack of
the caller and switches to a bdrv_first()/bdrv_next() interface for
initialising the iterator.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
2016-05-25 19:04:10 +02:00
John Snow
3a3086b72a block: clarify error message for qmp-eject
If you use HMP's eject but the CDROM tray is locked, you may get a
confusing error message informing you that the "tray isn't open."

As this is the point of eject, we can do a little better and help
clarify that the tray was locked and that it (might) open up later,
so try again.

It's not ideal, but it makes the semantics of the (legacy) eject
command more understandable to end users when they try to use it.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-19 16:45:31 +02:00
Kevin Wolf
1f0c461b82 block: Remove BlockDriverState.blk
This patch removes the remaining users of bs->blk, which will allow us
to have multiple BBs on top of a single BDS. In the meantime, all checks
that are currently in place to prevent the user from creating such
setups can be switched to bdrv_has_blk() instead of accessing BDS.blk.

Future patches can allow them and e.g. enable users to mirror to a block
device that already has a BlockBackend on it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-05-19 16:45:31 +02:00
Kevin Wolf
7c8eece45b block: Avoid bs->blk in bdrv_next()
We need to introduce a separate BdrvNextIterator struct that can keep
more state than just the current BDS in order to avoid using the bs->blk
pointer.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-05-19 16:45:31 +02:00
Kevin Wolf
b26ded9a7d Revert "block: Forbid I/O throttling on nodes with multiple parents for 2.6"
This reverts commit 76b223200e.

Now that I/O throttling is fully done on the BlockBackend level, there
is no reason any more to block I/O throttling for nodes with multiple
parents as the parents don't influence each other any more.

Conflicts:
	block.c

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-05-19 16:45:31 +02:00
Kevin Wolf
7ca7f0f6db block: Decouple throttling from BlockDriverState
This moves the throttling related part of the BDS life cycle management
to BlockBackend. The throttling group reference is now kept even when no
medium is inserted.

With this commit, throttling isn't disabled and then re-enabled any more
during graph reconfiguration. This fixes the temporary breakage of I/O
throttling when used with live snapshots or block jobs that manipulate
the graph.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-05-19 16:45:30 +02:00
Kevin Wolf
97148076e8 block: Move I/O throttling configuration functions to BlockBackend
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-05-19 16:45:30 +02:00
Kevin Wolf
27ccdd5259 block: Move throttling fields from BDS to BB
This patch changes where the throttling state is stored (used to be the
BlockDriverState, now it is the BlockBackend), but it doesn't actually
make it a BB level feature yet. For example, throttling is still
disabled when the BDS is detached from the BB.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-05-19 16:45:30 +02:00
Kevin Wolf
a5614993d7 block: Make sure throttled BDSes always have a BB
It was already true in principle that a throttled BDS always has a BB
attached, except that the order of operations while attaching or
detaching a BDS to/from a BB wasn't careful enough.

This commit breaks graph manipulations while I/O throttling is enabled.
It would have been possible to keep things working with some temporary
hacks, but quite cumbersome, so it's not worth the hassle. We'll fix
things again in a minute.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-05-19 16:45:29 +02:00
Wen Congyang
7f82159769 qmp: add monitor command to add/remove a child
The new QMP command name is x-blockdev-change. It's just for adding/removing
quorum's child now, and doesn't support all kinds of children, all kinds of
operations, nor all block drivers. So it is experimental now.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 1462865799-19402-4-git-send-email-xiecl.fnst@cn.fujitsu.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-12 15:33:23 +02:00
Wei Jiangang
547cb1574e block: Fix typo in comment
s/imlement/implement/

Signed-off-by: Wei Jiangang <weijg.fnst@cn.fujitsu.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 15:22:08 +02:00
Kevin Wolf
76b223200e block: Forbid I/O throttling on nodes with multiple parents for 2.6
As the patches to move I/O throttling to BlockBackend didn't make it in
time for the 2.6 release, but the release adds new ways of configuring
VMs whose behaviour would change once the move is done, we need to
outlaw such configurations temporarily.

The problem exists whenever a BDS has more users than just its BB, for
example it is used as a backing file for another node. (This wasn't
possible in 2.5 yet as we introduced node references to specify a
backing file only recently.) In these cases, the throttling would
apply to these other users now, but after moving throttling to the
BlockBackend the other users wouldn't be throttled any more.

This patch prevents making new references to a throttled node as well as
using monitor commands to throttle a node with multiple parents.

Compared to 2.5 this changes behaviour in some corner cases where
references were allowed before, like bs->file or Quorum children. It
seems reasonable to assume that users didn't use I/O throttling on such
low level nodes. With the upcoming move of throttling into BlockBackend,
such configurations won't be possible anyway.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-04-05 09:22:28 +02:00
Paolo Bonzini
5cf87fd68e block: forbid x-blockdev-del from acting on DriveInfo
Failing on -drive/drive_add created BlockBackends was a
requirement for x-blockdev-del, but it sneaked through
the patch review.  Let's fix it now.

Example:

$ x86_64-softmmu/qemu-system-x86_64 -drive if=none,file=null-co://,id=null -qmp stdio
>> {'execute':'qmp_capabilities'}
<< {"return": {}}
>> {'execute':'x-blockdev-del','arguments':{'id':'null'}}
<< {"error": {"class": "GenericError", "desc": "Deleting block backend added with drive-add is not supported"}}

And without a DriveInfo:

>> { "execute": "blockdev-add", "arguments": { "options": { "driver":"null-co", "id":"null2"}}}
<< {"return": {}}
>> {'execute':'x-blockdev-del','arguments':{'id':'null2'}}
<< {"return": {}}

Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-04-05 09:22:28 +02:00
Kevin Wolf
61de4c6808 block: Remove BDRV_O_CACHE_WB
The previous patches have successively made blk->enable_write_cache the
true source for the information whether a writethrough mode must be
implemented. The corresponding BDRV_O_CACHE_WB is only useless baggage
we're carrying around, so now's the time to remove it.

At the same time, we remove the 'cache.writeback' option parsing on the
BDS level as the only effect was setting the BDRV_O_CACHE_WB flag.

This change requires test cases that explicitly enabled the option to
drop it. Other than that and the change of the error message when
writethrough is enabled on the BDS level (from "Can't set writethrough
mode" to "doesn't support the option"), there should be no change in
behaviour.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-03-30 12:16:03 +02:00
Kevin Wolf
04feb4a507 block: Use bdrv_parse_cache_mode() in drive_init()
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-03-30 12:16:02 +02:00
Kevin Wolf
72e775c7d9 block: Always set writeback mode in blk_new_open()
All callers of blk_new_open() either don't rely on the WCE bit set after
blk_new_open() because they explicitly set it anyway, or they pass
BDRV_O_CACHE_WB unconditionally.

This patch changes blk_new_open() so that it always enables writeback
mode and asserts that BDRV_O_CACHE_WB is clear. For those callers that
used to pass BDRV_O_CACHE_WB unconditionally, the flag is removed now.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-03-30 12:16:01 +02:00
Kevin Wolf
e4b24b497e block: blockdev_init(): Call blk_set_enable_write_cache() explicitly
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-03-30 12:16:01 +02:00
Kevin Wolf
73ac451f34 block: Reject writethrough mode except at the root
Writethrough mode is going to become a BlockBackend feature rather than
a BDS one, so forbid it in places where we won't be able to support it
when the code finally matches the envisioned design.

We only allowed setting the cache mode of non-root nodes after the 2.5
release, so we're still free to make this change.

The target of block jobs is now always opened in a writeback mode
because it doesn't have a BlockBackend attached. This makes more sense
anyway because block jobs know when to flush. If the graph is modified
on job completion, the original cache mode moves to the new root, so
for the guest device writethough always stays enabled if it was
configured this way.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-03-30 11:59:32 +02:00
Kevin Wolf
4c8449832c block: Remove copy-on-read from bdrv_move_feature_fields()
Ever since we first introduced bdrv_append() in commit 8802d1fd ('qapi:
Introduce blockdev-group-snapshot-sync command'), the copy-on-read flag
was moved to the new top layer when taking a snapshot. The only problem
is that it doesn't make a whole lot of sense.

The use case for manually enabled CoR is to avoid reading data twice
from a slow remote image, so we want to save it to a local overlay, say
an ISO image accessed via HTTP to a local qcow2 overlay. When taking a
snapshot, we end up with a backing chain like this:

    http <- local.qcow2 <- snap_overlay.qcow2

There is no point in doing CoR from local.qcow2 into snap_overlay.qcow2,
we just want to keep copying data from the remote source into
local.qcow2.

The other use case of CoR is in the context of streaming, which isn't
very interesting for bdrv_move_feature_fields() because op blockers
prevent this combination.

This patch makes the copy-on-read flag stay on the image for which it
was originally set and prevents it from being propagated to the new
overlay. It is no longer intended to move CoR to the BlockBackend level.
In order for this to make sense, we also need to keep the respective
image read-write.

As a side effect of these changes, creating a live snapshot image (as
opposed to using an existing externally created one) on top of a COR
block device works now. It used to fail because it tried to open its
backing file both read-only and with COR.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-03-30 11:59:32 +02:00
Kevin Wolf
63eaaae08c block: Remove bdrv_make_anon()
The call in hmp_drive_del() is dead code because blk_remove_bs() is
called a few lines above. The only other remaining user is
bdrv_delete(), which only abuses bdrv_make_anon() to remove it from the
named nodes list. This path inlines the list entry removal into
bdrv_delete() and removes bdrv_make_anon().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-03-30 11:59:32 +02:00
Veronia Bahaa
f348b6d1a5 util: move declarations out of qemu-common.h
Move declarations out of qemu-common.h for functions declared in
utils/ files: e.g. include/qemu/path.h for utils/path.c.
Move inline functions out of qemu-common.h and into new files (e.g.
include/qemu/bcd.h)

Signed-off-by: Veronia Bahaa <veroniabahaa@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-22 22:20:17 +01:00
Eric Blake
32bafa8fdd qapi: Don't special-case simple union wrappers
Simple unions were carrying a special case that hid their 'data'
QMP member from the resulting C struct, via the hack method
QAPISchemaObjectTypeVariant.simple_union_type().  But by using
the work we started by unboxing flat union and alternate
branches, coupled with the ability to visit the members of an
implicit type, we can now expose the simple union's implicit
type in qapi-types.h:

| struct q_obj_ImageInfoSpecificQCow2_wrapper {
|     ImageInfoSpecificQCow2 *data;
| };
|
| struct q_obj_ImageInfoSpecificVmdk_wrapper {
|     ImageInfoSpecificVmdk *data;
| };
...
| struct ImageInfoSpecific {
|     ImageInfoSpecificKind type;
|     union { /* union tag is @type */
|         void *data;
|-        ImageInfoSpecificQCow2 *qcow2;
|-        ImageInfoSpecificVmdk *vmdk;
|+        q_obj_ImageInfoSpecificQCow2_wrapper qcow2;
|+        q_obj_ImageInfoSpecificVmdk_wrapper vmdk;
|     } u;
| };

Doing this removes asymmetry between QAPI's QMP side and its
C side (both sides now expose 'data'), and means that the
treatment of a simple union as sugar for a flat union is now
equivalent in both languages (previously the two approaches used
a different layer of dereferencing, where the simple union could
be converted to a flat union with equivalent C layout but
different {} on the wire, or to an equivalent QMP wire form
but with different C representation).  Using the implicit type
also lets us get rid of the simple_union_type() hack.

Of course, now all clients of simple unions have to adjust from
using su->u.member to using su->u.member.data; while this touches
a number of files in the tree, some earlier cleanup patches
helped minimize the change to the initialization of a temporary
variable rather than every single member access.  The generated
qapi-visit.c code is also affected by the layout change:

|@@ -7393,10 +7393,10 @@ void visit_type_ImageInfoSpecific_member
|     }
|     switch (obj->type) {
|     case IMAGE_INFO_SPECIFIC_KIND_QCOW2:
|-        visit_type_ImageInfoSpecificQCow2(v, "data", &obj->u.qcow2, &err);
|+        visit_type_q_obj_ImageInfoSpecificQCow2_wrapper_members(v, &obj->u.qcow2, &err);
|         break;
|     case IMAGE_INFO_SPECIFIC_KIND_VMDK:
|-        visit_type_ImageInfoSpecificVmdk(v, "data", &obj->u.vmdk, &err);
|+        visit_type_q_obj_ImageInfoSpecificVmdk_wrapper_members(v, &obj->u.vmdk, &err);
|         break;
|     default:
|         abort();

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1458254921-17042-13-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-03-18 10:29:26 +01:00
Max Reitz
9aaf28c61d block: Remove bdrv_states list
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-17 15:47:57 +01:00
Max Reitz
262b4e8f74 block: Add bdrv_next_monitor_owned()
Add a function for iterating over all monitor-owned BlockDriverStates so
the generic block layer can do so.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-17 15:47:56 +01:00
Max Reitz
7c735873d9 blockdev: Remove blk_hide_on_behalf_of_hmp_drive_del()
We can basically inline it in hmp_drive_del(); monitor_remove_blk() is
called already, so we just need to call bdrv_make_anon(), too.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-17 15:47:56 +01:00
Max Reitz
efaa7c4eeb blockdev: Split monitor reference from BB creation
Before this patch, blk_new() automatically assigned a name to the new
BlockBackend and considered it referenced by the monitor. This patch
removes the implicit monitor_add_blk() call from blk_new() (and
consequently the monitor_remove_blk() call from blk_delete(), too) and
thus blk_new() (and related functions) no longer take a BB name
argument.

In fact, there is only a single point where blk_new()/blk_new_open() is
called and the new BB is monitor-owned, and that is in blockdev_init().
Besides thus relieving us from having to invent names for all of the BBs
we use in qemu-img, this fixes a bug where qemu cannot create a new
image if there already is a monitor-owned BB named "image".

If a BB and its BDS tree are created in a single operation, as of this
patch the BDS tree will be created before the BB is given a name
(whereas it was the other way around before). This results in minor
change to the output of iotest 087, whose reference output is amended
accordingly.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-17 15:47:56 +01:00
Max Reitz
da31d594cf block: Use blk_{commit,flush}_all() consistently
Replace bdrv_commmit_all() and bdrv_flush_all() by their BlockBackend
equivalents.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-17 15:47:56 +01:00
Kevin Wolf
f8746fb804 block: Fix memory leak in hmp_drive_add_node()
hmp_drive_add_node() leaked qdict in the error path when no node-name is
specified.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
2016-03-17 15:47:56 +01:00
Kevin Wolf
23f7fcb295 block: Fix qemu_root_bds_opts.head initialisation
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-03-17 15:47:56 +01:00
Kevin Wolf
2073d410ce hmp: Extend drive_del to delete nodes without BB
Now that we can use drive_add to create new nodes without a BB, we also
want to be able to delete such nodes again.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-03-14 16:46:43 +01:00
Kevin Wolf
abb21ac3e6 hmp: 'drive_add -n' for creating a node without BB
This patch adds an option to the drive_add HMP command to create only a
BlockDriverState without a BlockBackend on top.

The motivation for this is that libvirt needs to specify options to a
migration target (specifically, detect-zeroes). drive-mirror doesn't
allow specifying options, and the proper way to do this is to create the
target BDS separately with blockdev-add (where you can specify options)
and then use blockdev-mirror to that BDS.

However, libvirt can't use blockdev-add as long as it is still
experimental, and we're expecting that it will still take some time, so
we need to resort to drive_add.

The problem with drive_add is that so far it always created a BB, and
BDSes with a BB can't be used as a mirroring target as long as we don't
support multiple BBs per BDS - and while we're working towards that
goal, it's another thing that will still take some time.

So to achieve the goal, the simplest solution to provide the
functionality now without adding one-off options to the mirror QMP
commands is to extend drive_add to create nodes without BBs.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-14 16:46:43 +01:00
Kevin Wolf
a81d616437 block: Fix cache mode defaults in bds_tree_init()
Without setting explicit defaults in the options, blockdev-add without
an ID ended up defaulting to writethrough. It should be writeback as
documented.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-03-14 16:46:43 +01:00
Kevin Wolf
73176bee99 block: Fix snapshot=on cache modes
Since commit 91a097e, we end up with a somewhat weird cache mode
configuration with snapshot=on: The commit broke the cache mode
inheritance for the snapshot overlay so that it is opened as
writethrough instead of unsafe now. The following bdrv_append() call to
put it on top of the tree swaps the WCE flag with the snapshot's backing
file (i.e. the originally given file), so what we eventually get is
cache=writeback on the temporary overlay and
cache=writethrough,cache.no-flush=on on the real image file.

This patch changes things so that the temporary overlay gets
cache=unsafe again like it used to, and the real images get whatever the
user specified. This means that cache.direct is now respected even with
snapshot=on, and in the case of committing changes, the final flush is
no longer ignored except explicitly requested by the user.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-03-14 16:46:43 +01:00
Kevin Wolf
f86b8b584b blockdev: Snapshotting must not open second instance of old top
Calling bdrv_img_create() with a size of -1 means that it determines the
size automatically by opening the backing file. However, in the case of
live snapshots, the backing file is already opened and we must avoid
opening the same image twice at the same time. Apart from that, just
getting the size from the already existing BDS is a lot less overhead
than opening a new instance.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
2016-03-14 16:46:43 +01:00
Eric Blake
10f759079e qapi: Avoid use of 'data' member of QAPI unions
QAPI code generators currently create a 'void *data' member as
part of the anonymous union embedded in the C struct corresponding
to a QAPI union.  However, directly assigning to this member of
the union feels a bit fishy, when we can assign to another member
of the struct instead.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1457021813-10704-9-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-03-05 10:41:58 +01:00
Alyssa Milburn
156abc2f90 blockdev: unset inappropriate flags when changing medium
Most importantly, this removes BDRV_O_TEMPORARY, to avoid unlink()ing an
image which replaces a snapshotted one.

Signed-off-by: Alyssa Milburn <fuzzie@fuzzie.org>
Message-id: 20160206133618.GA16635@li141-249.members.linode.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-02-22 16:54:14 +01:00
Alberto Garcia
dce13204a0 qapi: Add burst length parameters to block_set_io_throttle
This patch adds the new bps_*_max_length and iops_*_max_length
parameters to the block_set_io_throttle command.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-22 14:08:06 +01:00
Alberto Garcia
8a0fc18d88 throttle: Add command-line settings to define the burst periods
This patch adds all the throttling.*-max-length command-line
parameters to define the length of the burst periods.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-22 14:08:05 +01:00
Alberto Garcia
1588ab5d0b throttle: Use throttle_config_init() to initialize ThrottleConfig
We can currently initialize ThrottleConfig by zeroing all its fields,
but this will change with the new fields to define the length of the
burst periods.

This patch introduces a new throttle_config_init() function and uses it
to replace all memset() calls that initialize ThrottleConfig directly.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-22 14:08:05 +01:00
Alberto Garcia
d5851089a8 throttle: Merge all functions that check the configuration into one
There's no need to keep throttle_conflicting(), throttle_is_valid()
and throttle_max_is_missing_limit() as separate functions, so this
patch merges all three into one.

As a consequence, check_throttle_config() becomes redundant and can be
replaced with throttle_is_valid().

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-22 14:08:05 +01:00
Alberto Garcia
03ba36c83d throttle: Make throttle_is_valid() set errp
The caller does not need to set it, and this will allow us to refactor
this function later.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-22 14:08:04 +01:00
Alberto Garcia
45b2d418e0 throttle: Make throttle_max_is_missing_limit() set errp
The caller does not need to set it, and this will allow us to refactor
this function later.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-22 14:08:04 +01:00
Alberto Garcia
6921b18095 throttle: Make throttle_conflicting() set errp
The caller does not need to set it, and this will allow us to refactor
this function later.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-22 14:08:04 +01:00
Kevin Wolf
12d5ee3a7e block: Fix -incoming with snapshot=on
The BDRV_O_INACTIVE flag should only be set for images explicitly opened
by the user. snapshot=on needs to create a new qcow2 image and write
some metadata to it. This is not a problem because it can't come from
the source, so there's no reason to mark it as BDRV_O_INACTIVE, even
though it is opened while waiting for the migration to complete.

This fixes an assertion failure when -incoming and snapshot=on are
combined.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-22 09:49:46 +01:00
Eric Blake
51e72bc1dd qapi: Swap visit_* arguments for consistent 'name' placement
JSON uses "name":value, but many of our visitor interfaces were
called with visit_type_FOO(v, &value, name, errp).  This can be
a bit confusing to have to mentally swap the parameter order to
match JSON order.  It's particularly bad for visit_start_struct(),
where the 'name' parameter is smack in the middle of the
otherwise-related group of 'obj, kind, size' parameters! It's
time to do a global swap of the parameter ordering, so that the
'name' parameter is always immediately after the Visitor argument.

Additional reason in favor of the swap: the existing include/qjson.h
prefers listing 'name' first in json_prop_*(), and I have plans to
unify that file with the qapi visitors; listing 'name' first in
qapi will minimize churn to the (admittedly few) qjson.h clients.

Later patches will then fix docs, object.h, visitor-impl.h, and
those clients to match.

Done by first patching scripts/qapi*.py by hand to make generated
files do what I want, then by running the following Coccinelle
script to affect the rest of the code base:
 $ spatch --sp-file script `git grep -l '\bvisit_' -- '**/*.[ch]'`
I then had to apply some touchups (Coccinelle insisted on TAB
indentation in visitor.h, and botched the signature of
visit_type_enum() by rewriting 'const char *const strings[]' to
the syntactically invalid 'const char*const[] strings').  The
movement of parameters is sufficient to provoke compiler errors
if any callers were missed.

    // Part 1: Swap declaration order
    @@
    type TV, TErr, TObj, T1, T2;
    identifier OBJ, ARG1, ARG2;
    @@
     void visit_start_struct
    -(TV v, TObj OBJ, T1 ARG1, const char *name, T2 ARG2, TErr errp)
    +(TV v, const char *name, TObj OBJ, T1 ARG1, T2 ARG2, TErr errp)
     { ... }

    @@
    type bool, TV, T1;
    identifier ARG1;
    @@
     bool visit_optional
    -(TV v, T1 ARG1, const char *name)
    +(TV v, const char *name, T1 ARG1)
     { ... }

    @@
    type TV, TErr, TObj, T1;
    identifier OBJ, ARG1;
    @@
     void visit_get_next_type
    -(TV v, TObj OBJ, T1 ARG1, const char *name, TErr errp)
    +(TV v, const char *name, TObj OBJ, T1 ARG1, TErr errp)
     { ... }

    @@
    type TV, TErr, TObj, T1, T2;
    identifier OBJ, ARG1, ARG2;
    @@
     void visit_type_enum
    -(TV v, TObj OBJ, T1 ARG1, T2 ARG2, const char *name, TErr errp)
    +(TV v, const char *name, TObj OBJ, T1 ARG1, T2 ARG2, TErr errp)
     { ... }

    @@
    type TV, TErr, TObj;
    identifier OBJ;
    identifier VISIT_TYPE =~ "^visit_type_";
    @@
     void VISIT_TYPE
    -(TV v, TObj OBJ, const char *name, TErr errp)
    +(TV v, const char *name, TObj OBJ, TErr errp)
     { ... }

    // Part 2: swap caller order
    @@
    expression V, NAME, OBJ, ARG1, ARG2, ERR;
    identifier VISIT_TYPE =~ "^visit_type_";
    @@
    (
    -visit_start_struct(V, OBJ, ARG1, NAME, ARG2, ERR)
    +visit_start_struct(V, NAME, OBJ, ARG1, ARG2, ERR)
    |
    -visit_optional(V, ARG1, NAME)
    +visit_optional(V, NAME, ARG1)
    |
    -visit_get_next_type(V, OBJ, ARG1, NAME, ERR)
    +visit_get_next_type(V, NAME, OBJ, ARG1, ERR)
    |
    -visit_type_enum(V, OBJ, ARG1, ARG2, NAME, ERR)
    +visit_type_enum(V, NAME, OBJ, ARG1, ARG2, ERR)
    |
    -VISIT_TYPE(V, OBJ, NAME, ERR)
    +VISIT_TYPE(V, NAME, OBJ, ERR)
    )

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1454075341-13658-19-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-08 17:29:56 +01:00
Peter Maydell
d38ea87ac5 all: Clean up includes
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1454089805-5470-16-git-send-email-peter.maydell@linaro.org
2016-02-04 17:41:30 +00:00
Jeff Cody
f8aa905a4f block: set device_list.tqe_prev to NULL on BDS removal
This fixes a regression introduced with commit 3f09bfbc7.  Multiple
bugs arise in conjunction with live snapshots and mirroring operations
(which include active layer commit).

After a live snapshot occurs, the active layer and the base layer both
have a non-NULL tqe_prev field in the device_list, although the base
node's tqe_prev field points to a NULL entry.  This non-NULL tqe_prev
field occurs after the bdrv_append() in the external snapshot calls
change_parent_backing_link().

In change_parent_backing_link(), when the previous active layer is
removed from device_list, the device_list.tqe_prev pointer is not
set to NULL.

The operating scheme in the block layer is to indicate that a BDS belongs
in the bdrv_states device_list iff the device_list.tqe_prev pointer
is non-NULL.

This patch does two things:

1.) Introduces a new block layer helper bdrv_device_remove() to remove a
    BDS from the device_list, and
2.) uses that new API, which also fixes the regression once used in
    change_parent_backing_link().

Signed-off-by: Jeff Cody <jcody@redhat.com>
Message-id: 0cd51e11c0666c04ddb7c05293fe94afeb551e89.1454376655.git.jcody@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-02-02 18:04:47 +01:00
Max Reitz
9c4218e957 blockdev: Keep track of monitor-owned BDS
As a side effect, we can now make x-blockdev-del's check whether a BDS
is actually owned by the monitor explicit.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-02 17:50:46 +01:00
Max Reitz
938abd4325 blockdev: Use blk_remove_bs() in do_drive_del()
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-02 17:50:46 +01:00
Max Reitz
12c7ec87a7 blockdev: Fix 'change' for slot devices
'change' and related operations did not work when used on guest devices
featuring removable media but no actual tray, because
blk_dev_is_tray_open() always returned false for them and the
blockdev-{insert,remove}-medium commands required it to return true.

Fix this by making blockdev-{insert,remove}-medium work on tray-less
devices. Also, blockdev-{open,close}-tray are now explicitly no-ops when
invoked on such devices, and blk_dev_change_media_cb() is instead
called by blockdev-{insert,remove}-medium (for tray-less devices only).

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Cc: qemu-stable <qemu-stable@nongnu.org>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 1454096953-31773-3-git-send-email-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-02-02 17:47:00 +01:00
Fam Zheng
972606c4db blockdev: Error out on negative throttling option values
extract_common_blockdev_options() uses qemu_opt_get_number() to parse
the bps/iops numbers to uint64_t, then converts to double and stores in
ThrottleConfig.  The actual parsing is done by strtoull() in
parse_option_number().  Negative numbers are wrapped to large positive
ones, and stored.

We used to reject negative numbers since 7d81c1413c, but this regressed
when the option parsing code was changed later. Now fix this again.

This time, define an arbitrary large upper limit (1e15),  and check the
values so both negative and impractically big numbers are caught and
reported.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-01-20 13:37:37 +01:00
Markus Armbruster
c29b77f955 error: Use error_reportf_err() where it makes obvious sense
Done with this Coccinelle semantic patch

    @@
    expression FMT, E, S;
    expression list ARGS;
    @@
    -    error_report(FMT, ARGS, error_get_pretty(E));
    +    error_reportf_err(E, FMT/*@@@*/, ARGS);
    (
    -    error_free(E);
    |
	 exit(S);
    |
	 abort();
    )

followed by a replace of '%s"/*@@@*/' by '"' and some line rewrapping,
because I can't figure out how to make Coccinelle transform strings.

We now use the error whole instead of just its message obtained with
error_get_pretty().  This avoids suppressing its hint (see commit
50b7b00), but I can't see how the errors touched in this commit could
come with hints.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1450452927-8346-12-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-01-13 15:16:17 +01:00
Fam Zheng
df92562e68 qmp: Add blockdev-mirror command
This will start a mirror job from a named device to another named
device, its relation with drive-mirror is similar with blockdev-backup
to drive-backup.

In blockdev-mirror, the target node should be prepared by blockdev-add,
which will be responsible for assigning a name to the new node, so
we don't have 'node-name' parameter.

Signed-off-by: Fam Zheng <famz@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1450932306-13717-5-git-send-email-famz@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-01-07 21:30:18 +01:00
Fam Zheng
e40e5027f6 block: Add check on mirror target
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1450932306-13717-4-git-send-email-famz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-01-07 21:30:18 +01:00
Fam Zheng
4193cdd771 block: Extract blockdev part of qmp_drive_mirror
This is the part that will be reused by blockdev-mirror.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1450932306-13717-3-git-send-email-famz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-01-07 21:30:18 +01:00
Fam Zheng
05e4d14bf3 block: Rename BLOCK_OP_TYPE_MIRROR to BLOCK_OP_TYPE_MIRROR_SOURCE
It's necessary to distinguish source and target before we can add
blockdev-mirror, because we would want a concrete type of operation to
check on target bs before starting.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1450932306-13717-2-git-send-email-famz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-01-07 21:30:17 +01:00
Kevin Wolf
91a097e747 block: Move cache options into options QDict
This adds the cache mode options to the QDict, so that they can be
specified for child nodes (e.g. backing.cache.direct=off).

The cache modes are not removed from the flags at this point; instead,
options and flags are kept in sync. If the user specifies both flags and
options, the options take precedence.

Child node inherit cache modes as options now, they don't use flags any
more.

Note that this forbids specifying the cache mode for empty drives. It
didn't make sense anyway to specify it there, because it didn't have any
effect. blockdev_init() considers the cache options now bdrv_open()
options and therefore doesn't create an empty drive any more but calls
into bdrv_open(). This in turn will fail with no driver and filename
specified.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-12-18 14:34:43 +01:00
Kevin Wolf
39c4ae941e blockdev: Set 'format' indicates non-empty drive
Creating an empty drive while specifying 'format' doesn't make sense.
The specified format driver would simply be ignored.

Make a set 'format' option an indication that a non-empty drive should
be created. This makes 'format' consistent with 'driver' and allows
using it with a block driver that doesn't need any other options (like
null-co/null-aio).

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-12-18 14:34:43 +01:00
Eric Blake
7fb1cf1606 qapi: Don't let implicit enum MAX member collide
Now that we guarantee the user doesn't have any enum values
beginning with a single underscore, we can use that for our
own purposes.  Renaming ENUM_MAX to ENUM__MAX makes it obvious
that the sentinel is generated.

This patch was mostly generated by applying a temporary patch:

|diff --git a/scripts/qapi.py b/scripts/qapi.py
|index e6d014b..b862ec9 100644
|--- a/scripts/qapi.py
|+++ b/scripts/qapi.py
|@@ -1570,6 +1570,7 @@ const char *const %(c_name)s_lookup[] = {
|     max_index = c_enum_const(name, 'MAX', prefix)
|     ret += mcgen('''
|     [%(max_index)s] = NULL,
|+// %(max_index)s
| };
| ''',
|                max_index=max_index)

then running:

$ cat qapi-{types,event}.c tests/test-qapi-types.c |
    sed -n 's,^// \(.*\)MAX,s|\1MAX|\1_MAX|g,p' > list
$ git grep -l _MAX | xargs sed -i -f list

The only things not generated are the changes in scripts/qapi.py.

Rejecting enum members named 'MAX' is now useless, and will be dropped
in the next patch.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1447836791-369-23-git-send-email-eblake@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
[Rebased to current master, commit message tweaked]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2015-12-17 08:21:28 +01:00
Max Reitz
6e0abc251d blockdev: Mark {insert, remove}-medium experimental
While in the long term we want throttling to be its own block filter
BDS, in the short term we want it to be part of the BB instead of a BDS;
even in the long term we may want legacy throttling to be automatically
tied to the BB.

blockdev-insert-medium and blockdev-remove-medium do not retain
throttling information in the BB (deliberately so). Therefore, using
them means tying this information to a BDS, which would break the model
described above. (The same applies to other flags such as
detect_zeroes.) We probably want to move this information to the BB or
its own filter BDS before blockdev-{insert,remove}-medium can be
considered completely stable.

Therefore, mark these functions experimental for the time being.

Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1449847385-13986-2-git-send-email-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
[PMM: fixed format nit (underlining) in qmp-commands.hx]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-12-11 15:39:29 +00:00
Alberto Garcia
4ad6f3db7d block: Call external_snapshot_clean after blockdev-snapshot
Otherwise the AioContext will never be released.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: 1447419624-21918-1-git-send-email-berto@igalia.com
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2015-11-18 16:26:15 +01:00
Max Reitz
0702d3d88c blockdev: Add missing bdrv_unref() in drive-backup
All error paths after a successful bdrv_open() of target_bs should
contain a bdrv_unref(target_bs). This one did not yet, so add it.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-11-18 16:05:56 +01:00
Alberto Garcia
40119effc5 block: make 'stats-interval' an array of ints instead of a string
This is the natural JSON representation and prevents us from having to
decode the list manually.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: 0e3da8fa206f4ab534ae3ce6086e75fe84f1557e.1447665472.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-11-17 18:35:57 +08:00
Alberto Garcia
2be5506fc8 block: New option to define the intervals for collecting I/O statistics
The BlockAcctStats structure contains a list of BlockAcctTimedStats.
Each one of these collects statistics about the minimum, maximum and
average latencies of all I/O operations in a certain interval of time.

This patch adds a new "stats-intervals" option that allows defining
these intervals.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: 41cbcd334a61c6157f0f495cdfd21eff6c156f2a.1446044837.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-11-12 16:22:46 +01:00
Alberto Garcia
362e9299b3 block: Allow configuring whether to account failed and invalid ops
This patch adds two options, "stats-account-invalid" and
"stats-account-failed", that can be used to decide whether invalid and
failed I/O operations must be used when collecting statistics for
latency and last access time.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: ebc7e5966511a342cad428a392c5f5ad56b15213.1446044837.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-11-12 16:22:45 +01:00
John Snow
94d16a640a block: add transactional properties
Add both transactional properties to the QMP transactional interface,
and add the BlockJobTxn that we create as a result of the err-cancel
property to the BlkActionState structure.

[split up from a patch originally by Stefan and Fam. --js]
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>

Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1446765200-3054-13-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-11-12 16:22:44 +01:00
John Snow
78f51fde88 block: Add BlockJobTxn support to backup_run
Allow a BlockJobTxn to be passed into backup_run, which
will allow the job to join a transactional group if present.

Propagate this new parameter outward into new QMP helper
functions in blockdev.c to allow transaction commands to
pass forward their BlockJobTxn object in a forthcoming patch.

[split up from a patch originally by Stefan and Fam. --js]
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>

Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1446765200-3054-12-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-11-12 16:22:44 +01:00
Fam Zheng
18930ba3d1 blockjob: Introduce reference count and fix reference to job->bs
Add reference count to block job, meanwhile move the ownership of the
reference to job->bs from the caller (which is released in two
completion callbacks) to the block job itself. It is necessary for
block_job_complete_sync to work, because block job shouldn't live longer
than its bs, as asserted in bdrv_delete.

Now block_job_complete_sync can be simplified.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1446765200-3054-6-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-11-12 16:22:43 +01:00
John Snow
50f43f0ff9 block: rename BlkTransactionState and BdrvActionOps
These structures are misnomers, somewhat.

(1) BlockTransactionState is not state for a transaction,
    but is rather state for a single transaction action.
    Rename it "BlkActionState" to be more accurate.

(2) The BdrvActionOps describes operations for the BlkActionState,
    above. This name might imply a 'BdrvAction' or a 'BdrvActionState',
    which there isn't.
    Rename this to 'BlkActionOps' to match 'BlkActionState'.

Lastly, update the surrounding in-line documentation and comments
to reflect the current nature of how Transactions operate.

This patch changes only comments and names, and should not affect
behavior in any way.

Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1446765200-3054-4-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-11-12 16:22:43 +01:00
Fam Zheng
df9a681dc9 qed: Implement .bdrv_drain
The "need_check_timer" is used to clear the "NEED_CHECK" flag in the
image header after a grace period once metadata update has finished. In
compliance to the bdrv_drain semantics we should make sure it remains
deleted once .bdrv_drain is called.

We cannot reuse qed_need_check_timer_cb because here it doesn't satisfy
the assertion.  Do the "plug" and "flush" calls manually.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1447064214-29930-10-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-11-12 16:22:43 +01:00
Alberto Garcia
81b936ae70 block: Add 'x-blockdev-del' QMP command
This command is still experimental, hence the name.

This is the companion to 'blockdev-add'. It allows deleting a
BlockBackend with its associated BlockDriverState tree, or a
BlockDriverState that is not attached to any backend.

In either case, the command fails if the reference count is greater
than 1 or the BlockDriverState has any parents.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 6cfc148c77aca1da942b094d811bfa3fcf7ac7bb.1446475331.git.berto@igalia.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2015-11-11 16:55:28 +01:00
Alberto Garcia
08b24cfe37 block: Disallow snapshots if the overlay doesn't support backing files
This addresses scenarios like this one:

  { 'execute': 'blockdev-add', 'arguments':
    { 'options': { 'driver': 'qcow2',
                   'node-name': 'new0',
                   'file': { 'driver': 'file',
                             'filename': 'new.qcow2',
                             'node-name': 'file0' } } } }

  { 'execute': 'blockdev-snapshot', 'arguments':
    { 'node': 'virtio0',
      'overlay': 'file0' } }

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-11-11 16:25:48 +01:00
Alberto Garcia
a0d64a61db throttle: Use bs->throttle_state instead of bs->io_limits_enabled
There are two ways to check for I/O limits in a BlockDriverState:

- bs->throttle_state: if this pointer is not NULL, it means that this
  BDS is member of a throttling group, its ThrottleTimers structure
  has been initialized and its I/O limits are ready to be applied.

- bs->io_limits_enabled: if true it means that the throttle_state
  pointer is valid _and_ the limits are currently enabled.

The latter is used in several places to check whether a BDS has I/O
limits configured, but what it really checks is whether requests
are being throttled or not. For example, io_limits_enabled can be
temporarily set to false in cases like bdrv_read_unthrottled() without
otherwise touching the throtting configuration of that BDS.

This patch replaces bs->io_limits_enabled with bs->throttle_state in
all cases where what we really want to check is the existence of I/O
limits, not whether they are currently enabled or not.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-11-11 16:25:47 +01:00
Alberto Garcia
43de7e2de0 block: add a 'blockdev-snapshot' QMP command
One of the limitations of the 'blockdev-snapshot-sync' command is that
it does not allow passing BlockdevOptions to the newly created
snapshots, so they are always opened using the default values.

Extending the command to allow passing options is not a practical
solution because there is overlap between those options and some of
the existing parameters of the command.

This patch introduces a new 'blockdev-snapshot' command with a simpler
interface: it just takes two references to existing block devices that
will be used as the source and target for the snapshot.

Since the main difference between the two commands is that one of them
creates and opens the target image, while the other uses an already
opened one, the bulk of the implementation is shared.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-11-11 16:25:47 +01:00
Alberto Garcia
a911e6ae7c block: rename BlockdevSnapshot to BlockdevSnapshotSync
We will introduce the 'blockdev-snapshot' command that will require
its own struct for the parameters, so we need to rename this one in
order to avoid name clashes.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-11-11 16:25:47 +01:00
Alberto Garcia
a39a24fbb0 block: check for existing device IDs in external_snapshot_prepare()
The 'snapshot-node-name' parameter of blockdev-snapshot-sync allows
setting the node name of the image that is going to be created.

Before creating the image, external_snapshot_prepare() checks that the
name is not already being used. The check is however incomplete since
it only considers existing node names, but node names must not clash
with device IDs either because they share the same namespace.

If the user attempts to create a snapshot using the name of an
existing device for the 'snapshot-node-name' parameter the operation
will eventually fail, but only after the new image has been created.

This patch replaces bdrv_find_node() with bdrv_lookup_bs() to extend
the check to existing device IDs, and thus detect possible name
clashes before the new image is created.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-11-11 16:25:47 +01:00
Max Reitz
39ff43d9e1 blockdev: read-only-mode for blockdev-change-medium
Add an option to qmp_blockdev_change_medium() which allows changing the
read-only status of the block device whose medium is changed.

Some drives do not have a inherently fixed read-only status; for
instance, floppy disks can be set read-only or writable independently of
the drive. Some users may find it useful to be able to therefore change
the read-only status of a block device when changing the medium.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-11-11 16:23:34 +01:00
Max Reitz
24fb413300 qmp: Introduce blockdev-change-medium
Introduce a new QMP command 'blockdev-change-medium' which is intended
to replace the 'change' command for block devices. The existing function
qmp_change_blockdev() is accordingly renamed to
qmp_blockdev_change_medium().

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-11-11 16:22:47 +01:00
Max Reitz
de2c6c0536 blockdev: Implement change with basic operations
Implement 'change' on block devices by calling blockdev-open-tray,
blockdev-remove-medium, blockdev-insert-medium (a variation of that
which does not need a node-name) and blockdev-close-tray.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-11-11 16:22:47 +01:00
Max Reitz
38f54bd1ee blockdev: Implement eject with basic operations
Implement 'eject' by calling blockdev-open-tray and
blockdev-remove-medium.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-11-11 16:22:47 +01:00
Max Reitz
d129988289 blockdev: Add blockdev-insert-medium
And a helper function for that, which directly takes a pointer to the
BDS to be inserted instead of its node-name (which will be used for
implementing 'change' using blockdev-insert-medium).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-11-11 16:22:47 +01:00
Max Reitz
2814f67271 blockdev: Add blockdev-remove-medium
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-11-11 16:22:47 +01:00
Max Reitz
abaaf59d24 blockdev: Add blockdev-close-tray
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-11-11 16:22:47 +01:00
Max Reitz
7d8a9f71b9 blockdev: Add blockdev-open-tray
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-11-11 16:22:47 +01:00
Stefan Hajnoczi
84aa0140dd blockdev: acquire AioContext in hmp_commit()
This one slipped through.  Although we acquire AioContext when
committing all devices we don't for just a single device.

AioContext must be acquired before calling bdrv_*() functions to
synchronize access with other threads that may be using the AioContext.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-11-09 10:07:10 +00:00
Eric Blake
6a8f9661dc block: Convert to new qapi union layout
We have two issues with our qapi union layout:
1) Even though the QMP wire format spells the tag 'type', the
C code spells it 'kind', requiring some hacks in the generator.
2) The C struct uses an anonymous union, which places all tag
values in the same namespace as all non-variant members. This
leads to spurious collisions if a tag value matches a non-variant
member's name.

Make the conversion to the new layout for block-related code.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1445898903-12082-16-git-send-email-eblake@redhat.com>
[Commit message tweaked slightly]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2015-11-02 08:30:27 +01:00
Fam Zheng
507306cc8e block: Add "drained begin/end" for internal snapshot
This ensures the atomicity of the transaction by avoiding processing of
external requests such as those from ioeventfd.

state->bs is assigned right after bdrv_drained_begin. Because it was
used as the flag for deletion or not in abort, now we need a separate
flag - InternalSnapshotState.created.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-10-23 18:18:24 +02:00
Fam Zheng
ff52bf36a3 block: Add "drained begin/end" for transactional blockdev-backup
Similar to the previous patch, make sure that external events are not
dispatched during transaction operations.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-10-23 18:18:24 +02:00
Fam Zheng
1fdd4b7be3 block: Add "drained begin/end" for transactional backup
This ensures the atomicity of the transaction by avoiding processing of
external requests such as those from ioeventfd.

Move the assignment to state->bs up right after bdrv_drained_begin, so
that we can use it in the clean callback. The abort callback will still
check bs->job and state->job, so it's OK.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-10-23 18:18:24 +02:00
Fam Zheng
da763e8301 block: Add "drained begin/end" for transactional external snapshot
This ensures the atomicity of the transaction by avoiding processing of
external requests such as those from ioeventfd.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-10-23 18:18:24 +02:00
Max Reitz
bd745e238b blockdev: Allow more options for BB-less BDS tree
Most of the options which blockdev_init() parses for both the
BlockBackend and the root BDS are valid for just the root BDS as well
(e.g. read-only). This patch allows specifying these options even if not
creating a BlockBackend.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-10-23 18:18:23 +02:00
Max Reitz
fbf8175eac blockdev: Pull out blockdev option extraction
Extract some of the blockdev option extraction code from blockdev_init()
into its own function. This simplifies blockdev_init() and will allow
reusing the code in a different function added in a follow-up patch.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-10-23 18:18:23 +02:00
Max Reitz
5ec18f8c83 blockdev: Do not create BDS for empty drive
Do not use "rudimentary" BDSs for empty drives any longer (for
freshly created drives).

After a follow-up patch, empty drives will generally use a NULL BDS, not
only the freshly created drives.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-10-23 18:18:23 +02:00