-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJanYJPAAoJEH8JsnLIjy/WxjUQAJA+DTOmGXvaNpMs65BrU79K
/r/iGVrzHv/RMLmrWMnqj96W9SnpMuiAP9hVLNsekqClY9q4ME4DpGcXhWfhSvF5
FC51ehvFJdfo8cPorsevcqNj60iWebjcx3lFfUq2606UOyYih3oijYxr6gSwWbRc
GAgdGMqsvGYpzgqAQVEWHUhaX0La49/OzY42aR+E+LCBNfTYvlydvyoc+tUTdIpW
1eM/ASGndGsN0Cf2vxlbKgJ0/P6v+cRZuuIDhKZqre+YG+yM+pq7yZb+o7nf/P36
TPR93BsT7FSVAizRK7VFRuPIynHpiaxYygrJERCXF0sxsV4OlKjpmt/uUPamWFh+
46Jx2NK1AuAx87BdErgmA119ObO3oAPxK0+2p981obb6SphTbbPxDj6SOlYCt4mJ
mhff4JtIiwCmDSckAwd2mkBI1Tvl9qqcELrpyd2t2eU4ec2vf7fPd85EsK/Mq6Kr
dbfqFvjNaaMxChoqFgkHAveYJ7zYqRFI2IY5o9c1QyZehCGPWjScxHXZZYdpDl59
YF9DkYQDOyvEX2jmMECaO1r/0nnO+BqQHu5ItJuTte9rjP9Q0do3iBISiIefewtf
yji6/QNn2hFrnr1HPAwLFFC3kPgc8Mq8mIUb53j8vG/01KhVRCcnJm2K6D4IUwLZ
S6ZnQJB97eE4y7YR5dNt
=2axz
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches
# gpg: Signature made Mon 05 Mar 2018 17:45:51 GMT
# gpg: using RSA key 7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream: (38 commits)
block: Fix NULL dereference on empty drive error
qcow2: Replace align_offset() with ROUND_UP()
block/ssh: Add basic .bdrv_truncate()
block/ssh: Make ssh_grow_file() blocking
block/ssh: Pull ssh_grow_file() from ssh_create()
qemu-img: Make resize error message more general
qcow2: make qcow2_co_create2() a coroutine_fn
block: rename .bdrv_create() to .bdrv_co_create_opts()
Revert "IDE: Do not flush empty CDROM drives"
block: test blk_aio_flush() with blk->root == NULL
block: add BlockBackend->in_flight counter
block: extract AIO_WAIT_WHILE() from BlockDriverState
aio: rename aio_context_in_iothread() to in_aio_context_home_thread()
docs: document how to use the l2-cache-entry-size parameter
specs/qcow2: Fix documentation of the compressed cluster descriptor
iotest 033: add misaligned write-zeroes test via truncate
block: fix write with zero flag set and iovector provided
block: Drop unused .bdrv_co_get_block_status()
vvfat: Switch to .bdrv_co_block_status()
vpc: Switch to .bdrv_co_block_status()
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
# Conflicts:
# include/block/block.h
In my "build everything" tree, a change to the types in
qapi-schema.json triggers a recompile of about 4800 out of 5100
objects.
The previous commit split up qmp-commands.h, qmp-event.h, qmp-visit.h,
qapi-types.h. Each of these headers still includes all its shards.
Reduce compile time by including just the shards we actually need.
To illustrate the benefits: adding a type to qapi/migration.json now
recompiles some 2300 instead of 4800 objects. The next commit will
improve it further.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180211093607.27351-24-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
[eblake: rebase to master]
Signed-off-by: Eric Blake <eblake@redhat.com>
BlockDriver->bdrv_create() has been called from coroutine context since
commit 5b7e1542cf ("block: make
bdrv_create adopt coroutine").
Make this explicit by renaming to .bdrv_co_create_opts() and add the
coroutine_fn annotation. This makes it obvious to block driver authors
that they may yield, use CoMutex, or other coroutine_fn APIs.
bdrv_co_create is reserved for the QAPI-based version that Kevin is
working on.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20170705102231.20711-2-stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
BlockBackend currently relies on BlockDriverState->in_flight to track
requests for blk_drain(). There is a corner case where
BlockDriverState->in_flight cannot be used though: blk->root can be NULL
when there is no medium. This results in a segfault when the NULL
pointer is dereferenced.
Introduce a BlockBackend->in_flight counter for aio requests so it works
even when blk->root == NULL.
Based on a patch by Kevin Wolf <kwolf@redhat.com>.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
BlockDriverState has the BDRV_POLL_WHILE() macro to wait on event loop
activity while a condition evaluates to true. This is used to implement
synchronous operations where it acts as a condvar between the IOThread
running the operation and the main loop waiting for the operation. It
can also be called from the thread that owns the AioContext and in that
case it's just a nested event loop.
BlockBackend needs this behavior but doesn't always have a
BlockDriverState it can use. This patch extracts BDRV_POLL_WHILE() into
the AioWait abstraction, which can be used with AioContext and isn't
tied to BlockDriverState anymore.
This feature could be built directly into AioContext but then all users
would kick the event loop even if they signal different conditions.
Imagine an AioContext with many BlockDriverStates, each time a request
completes any waiter would wake up and re-check their condition. It's
nicer to keep a separate AioWait object for each condition instead.
Please see "block/aio-wait.h" for details on the API.
The name AIO_WAIT_WHILE() avoids the confusion between AIO_POLL_WHILE()
and AioContext polling.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We don't need the can_write_zeroes_with_unmap field in
BlockDriverInfo, because it is redundant information with
supported_zero_flags & BDRV_REQ_MAY_UNMAP. Note that
BlockDriverInfo and supported_zero_flags are both per-device
settings, rather than global state about the driver as a
whole, which means one or both of these bits of information
can already be conditional. Let's audit how they were set:
crypto: always setting can_write_ to false is pointless (the
struct starts life zero-initialized), no use of supported_
nbd: just recently fixed to set can_write_ if supported_
includes MAY_UNMAP (thus this commit effectively reverts
bca80059e and solves the problem mentioned there in a more
global way)
file-posix, iscsi, qcow2: can_write_ is conditional, while
supported_ was unconditional; but passing MAY_UNMAP would
fail with ENOTSUP if the condition wasn't met
qed: can_write_ is unconditional, but pwrite_zeroes lacks
support for MAY_UNMAP and supported_ is not set. Perhaps
support can be added later (since it would be similar to
qcow2), but for now claiming false is no real loss
all other drivers: can_write_ is not set, and supported_ is
either unset or a passthrough
Simplify the code by moving the conditional into
supported_zero_flags for all drivers, then dropping the
now-unused BDI field. For callers that relied on
bdrv_can_write_zeroes_with_unmap(), we return the same
per-device settings for drivers that had conditions (no
observable change in behavior there); and can now return
true (instead of false) for drivers that support passthrough
(for example, the commit driver) which gives those drivers
the same fix as nbd just got in bca80059e. For callers that
relied on supported_zero_flags, we now have a few more places
that can avoid a wasted call to pwrite_zeroes() that will
just fail with ENOTSUP.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180126193439.20219-1-eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
qemu-common.h includes qemu/option.h, but most places that include the
former don't actually need the latter. Drop the include, and add it
to the places that actually need it.
While there, drop superfluous includes of both headers, and
separate #include from file comment with a blank line.
This cleanup makes the number of objects depending on qemu/option.h
drop from 4545 (out of 4743) to 284 in my "build everything" tree.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180201111846.21846-20-armbru@redhat.com>
[Semantic conflict with commit bdd6a90a9e in block/nvme.c resolved]
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180201111846.21846-15-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180201111846.21846-14-armbru@redhat.com>
This cleanup makes the number of objects depending on qapi/qmp/qdict.h
drop from 4550 (out of 4743) to 368 in my "build everything" tree.
For qapi/qmp/qobject.h, the number drops from 4552 to 390.
While there, separate #include from file comment with a blank line.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180201111846.21846-13-armbru@redhat.com>
This cleanup makes the number of objects depending on qapi/error.h
drop from 1910 (out of 4743) to 1612 in my "build everything" tree.
While there, separate #include from file comment with a blank line,
and drop a useless comment on why qemu/osdep.h is included first.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180201111846.21846-5-armbru@redhat.com>
[Semantic conflict with commit 34e304e975 resolved, OSX breakage fixed]
The bdrv_reopen*() implementation doesn't like it if the graph is
changed between queuing nodes for reopen and actually reopening them
(one of the reasons is that queuing can be recursive).
So instead of draining the device only in bdrv_reopen_multiple(),
require that callers already drained all affected nodes, and assert this
in bdrv_reopen_queue().
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
We need to remember how many of the drain sections in which a node is
were recursive (i.e. subtree drain rather than node drain), so that they
can be correctly applied when children are added or removed during the
drained section.
With this change, it is safe to modify the graph even inside a
bdrv_subtree_drained_begin/end() section.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This is in preparation for subtree drains, i.e. drained sections that
affect not only a single node, but recursively all child nodes, too.
Calling the parent callbacks for drain is pointless when we just came
from that parent node recursively and leads to multiple increases of
bs->quiesce_counter in a single drain call. Don't do it.
In order for this to work correctly, the parent callback must be called
for every bdrv_drain_begin/end() call, not only for the outermost one:
If we have a node N with two parents A and B, recursive draining of A
should cause the quiesce_counter of B to increase because its child N is
drained independently of B. If now B is recursively drained, too, A must
increase its quiesce_counter because N is drained independently of A
only now, even if N is going from quiesce_counter 1 to 2.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Management tools create overlays of running guests with qemu-img:
$ qemu-img create -b /image/in/use.qcow2 -f qcow2 /overlay/image.qcow2
but this doesn't work anymore due to image locking:
qemu-img: /overlay/image.qcow2: Failed to get shared "write" lock
Is another process using the image?
Could not open backing image to determine size.
Use the force share option to allow this use case again.
Cc: qemu-stable@nongnu.org
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Commit 1f4ad7d fixed 'qemu-img info' for raw images that are currently
in use as a mirror target. It is not enough for image formats, though,
as these still unconditionally request BLK_PERM_CONSISTENT_READ.
As this permission is geared towards whether the guest-visible data is
consistent, and has no impact on whether the metadata is sane, and
'qemu-img info' does not read guest-visible data (except for the raw
format), it makes sense to not require BLK_PERM_CONSISTENT_READ if there
is not going to be any guest I/O performed, regardless of image format.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
BDRV_POLL_WHILE() does not support recursive AioContext locking. It
only releases the AioContext lock once regardless of how many times the
caller has acquired it. This results in a hang since the IOThread does
not make progress while the AioContext is still locked.
The following steps trigger the hang:
$ qemu-system-x86_64 -M accel=kvm -m 1G -cpu host \
-object iothread,id=iothread0 \
-device virtio-scsi-pci,iothread=iothread0 \
-drive if=none,id=drive0,file=test.img,format=raw \
-device scsi-hd,drive=drive0 \
-drive if=none,id=drive1,file=test.img,format=raw \
-device scsi-hd,drive=drive1
$ qemu-system-x86_64 ...same options... \
-incoming tcp::1234
(qemu) migrate tcp:127.0.0.1:1234
...hang...
Tested-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20171207201320.19284-2-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
bdrv_close() skips much of its logic when bs->drv is NULL. This is
fine when we're closing a BlockDriverState that has just been created
(because e.g the initialization process failed), but it's not enough
in other cases.
For example, when a valid qcow2 image is found to be corrupted then
QEMU marks it as such in the file header and then sets bs->drv to
NULL in order to make the BlockDriverState unusable. When that BDS is
later closed then many of its data structures are not freed (leaking
their memory) and none of its children are detached. This results in
bdrv_close_all() failing to close all BDSs and making this assertion
fail when QEMU is being shut down:
bdrv_close_all: Assertion `QTAILQ_EMPTY(&all_bdrv_states)' failed.
This patch makes bdrv_close() do the full uninitialization process
in all cases. This fixes the problem with corrupted images and still
works fine with freshly created BDSs.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: 20171106145345.12038-1-berto@igalia.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
For format probing, we don't really care whether all of the image
content is consistent. The only thing we're looking at is the image
header, and specifically the magic numbers that are expected to never
change, no matter how inconsistent the guest visible disk content is.
Therefore, don't request BLK_PERM_CONSISTENT_READ. This allows to use
format probing, e.g. in the context of 'qemu-img info', even while the
guest visible data in the image is inconsistent during a running block
job.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
On one hand, it is a good idea for bdrv_next() to return a strong
reference because ideally nearly every pointer should be refcounted.
This fixes intermittent failure of iotest 194.
On the other, it is absolutely necessary for bdrv_next() itself to keep
a strong reference to both the BB (in its first phase) and the BDS (at
least in the second phase) because when called the next time, it will
dereference those objects to get a link to the next one. Therefore, it
needs these objects to stay around until then. Just storing the pointer
to the next in the iterator is not really viable because that pointer
might become invalid as well.
Both arguments taken together means we should probably just invoke
bdrv_ref() and blk_ref() in bdrv_next(). This means we have to assert
that bdrv_next() is always called from the main loop, but that was
probably necessary already before this patch and judging from the
callers, it also looks to actually be the case.
Keeping these strong references means however that callers need to give
them up if they decide to abort the iteration early. They can do so
through the new bdrv_next_cleanup() function.
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20171110172545.32609-1-mreitz@redhat.com
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
We currently do not guard everywhere against a NULL bs->drv where we
should be doing so. Most of the places fixed here just do not care
about that case at all.
Some care implicitly, e.g. through a prior function call to
bdrv_getlength() which would always fail for an ejected BDS. Add an
assert there to make it more obvious.
Other places seem to care, but do so insufficiently: Freeing clusters in
a qcow2 image is an error-free operation, but it may leave the image in
an unusable state anyway. Giving qcow2_free_clusters() an error code is
not really viable, it is much easier to note that bs->drv may be NULL
even after a successful driver call. This concerns bdrv_co_flush(), and
the way the check is added to bdrv_co_pdiscard() (in every iteration
instead of only once).
Finally, some places employ at least an assert(bs->drv); somewhere, that
may be reasonable (such as in the reopen code), but in
bdrv_has_zero_init(), it is definitely not. Returning 0 there in case
of an ejected BDS saves us much headache instead.
Reported-by: R. Nageswara Sastry <nasastry@in.ibm.com>
Buglink: https://bugs.launchpad.net/qemu/+bug/1728660
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20171110203111.7666-4-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Currently, bdrv_reopen_prepare() assumes that all BDS options are
strings. However, this is not the case if the BDS has been created
through the json: pseudo-protocol or blockdev-add.
Note that the user-invokable reopen command is an HMP command, so you
can only specify strings there. Therefore, specifying a non-string
option with the "same" value as it was when originally created will now
return an error because the values are supposedly similar (and there is
no way for the user to circumvent this but to just not specify the
option again -- however, this is still strictly better than just
crashing).
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20171114180128.17076-5-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Inactive images generally request less permissions for their image files
than they would if they were active (in particular, write permissions).
Activating the image involves extending the permissions, therefore.
drv->bdrv_invalidate_cache() can already require write access to the
image file, so we have to update the permissions earlier than that.
The current code does it only later, so we have to move up this part.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
bdrv_set_read_only() is used by some block drivers to override the
read-only option given by the user. This is not how read-only images
generally work in QEMU: Instead of second guessing what the user really
meant (which currently includes making an image read-only even if the
user didn't only use the default, but explicitly said read-only=off), we
should error out if we can't provide what the user requested.
This adds deprecation warnings to all callers of bdrv_set_read_only() so
that the behaviour can be corrected after the usual deprecation period.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
error_setg_errno() takes a positive errno code. Spotted by Coverity
(CID 1381628).
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
When referring to a backing file of an image via node name
bdrv_open_backing_file would add the 'driver' option to the option list
filling it with the backing format driver. This breaks construction of
the backing chain via -blockdev, as bdrv_open_inherit reports an error
if both 'reference' and 'options' are provided.
$ qemu-img create -f raw /tmp/backing.raw 64M
$ qemu-img create -f qcow2 -F raw -b /tmp/backing.raw /tmp/test.qcow2
$ qemu-system-x86_64 \
-blockdev driver=file,filename=/tmp/backing.raw,node-name=backing \
-blockdev driver=qcow2,file.driver=file,file.filename=/tmp/test.qcow2,node-name=root,backing=backing
qemu-system-x86_64: -blockdev driver=qcow2,file.driver=file,file.filename=/tmp/test.qcow2,node-name=root,backing=backing: Could not open backing file: Cannot reference an existing block device with additional options or a new filename
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We don't need to make any assumptions about the graph layout above the
top node of the commit operation any more. Remove the use of
bdrv_find_overlay() and related variables from the commit job code.
bdrv_drop_intermediate() doesn't use the 'active' parameter any more, so
we can just drop it.
The overlay node was previously added to the block job to get a
BLK_PERM_GRAPH_MOD. We really need to respect those permissions in
bdrv_drop_intermediate() now, but as long as we haven't figured out yet
how BLK_PERM_GRAPH_MOD is actually supposed to work, just leave a TODO
comment there.
With this change, it is now possible to perform another block job on an
overlay node without conflicts. qemu-iotests 030 is changed accordingly.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
This changes the commit block job to support operation in a graph where
there is more than a single active layer that references the top node.
This involves inserting the commit filter node not only on the path
between the given active node and the top node, but between the top node
and all of its parents.
On completion, bdrv_drop_intermediate() must consider all parents for
updating the backing file link. These parents may be backing files
themselves and as such read-only; reopen them temporarily if necessary.
Previously this was achieved by the bdrv_reopen() calls in the commit
block job that made overlay_bs read-write for the whole duration of the
block job, even though write access is only needed on completion.
Now that we consider all parents, overlay_bs is meaningless. It is left
in place in this commit, but we'll remove it soon.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
There is no good reason for bdrv_drop_intermediate() to know the active
layer above the subchain it is operating on - even more so, because
the assumption that there is a single active layer above it is not
generally true.
In order to prepare removal of the active parameter, use a BdrvChildRole
callback to update the backing file string in the overlay image instead
of directly calling bdrv_change_backing_file().
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
We've previously fixed several places where we failed to account
for possible errors from bdrv_nb_sectors(). Fix another one by
making bdrv_dirty_bitmap_truncate() take the new size from the
caller instead of querying itself; then adjust the sole caller
bdrv_truncate() to pass the size just determined by a successful
resize, or to reuse the size given to the original truncate
operation when refresh_total_sectors() was not able to confirm the
actual size (the two sizes can potentially differ according to
rounding constraints), thus avoiding sizing the bitmaps to -1.
This also fixes a bug where not all failure paths in
bdrv_truncate() would set errp.
Note that bdrv_truncate() is still a bit awkward. We may want
to revisit it later and clean up things to better guarantee that
a resize attempt either fails cleanly up front, or cannot fail
after guest-visible changes have been made (if temporary changes
are made, then they need to be cleanly rolled back). But that
is a task for another day; for now, the goal is the bare minimum
fix to ensure that just bdrv_dirty_bitmap_truncate() cannot fail.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
All callers of bdrv_img_create() pass in a size, or -1 to read the
size from the backing file. We then set that size as the QemuOpt
default, which means we will reuse that default rather than the
final parameter to qemu_opt_get_size() several lines later. But
it is rather confusing to read subsequent checks of 'size == -1'
when it looks (without seeing the full context) like size defaults
to 0; it also doesn't help that a size of 0 is valid (for some
formats).
Rework the logic to make things more legible.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If we switch between read-only and read-write, the permissions that
image format drivers need on bs->file change, too. Make sure to update
the permissions during bdrv_reopen().
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
We will calculate the required new permissions in the prepare stage of a
reopen. Required permissions of children can be influenced by the
changes made to their parents, but parents are independent from their
children. This means that permissions need to be calculated top-down. In
order to achieve this, queue parents before their children rather than
queuing the children first.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
When new permissions are calculated during bdrv_reopen(), they need to
be based on the state of the graph as it will be after the reopen has
completed, not on the current state of the involved nodes.
This patch makes bdrv_is_writable() optionally accept a BlockReopenQueue
from which the new flags are taken. This is then used for determining
the new bs->file permissions of format drivers as soon as we add the
code to actually pass a non-NULL reopen queue to the .bdrv_child_perm
callbacks.
While moving bdrv_is_writable(), make it static. It isn't used outside
block.c.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
In the context of bdrv_reopen(), we'll have to look at the state of the
graph as it will be after the reopen. This interface addition is in
preparation for the change.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
In the context of bdrv_reopen(), we'll have to look at the state of the
graph as it will be after the reopen. This interface addition is in
preparation for the change.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
This function is not used anywhere, so remove it.
Markus Armbruster adds:
The i82078 floppy device model used to call bdrv_media_changed() to
implement its media change bit when backed by a host floppy. This
went away in 21fcf36 "fdc: simplify media change handling".
Probably broke host floppy media change. Host floppy pass-through
was dropped in commit f709623. bdrv_media_changed() has never been
used for anything else. Remove it.
(Source is Message-ID: <87y3ruaypm.fsf@dusky.pond.sub.org>)
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The following functions fail if bs->drv is a filter and does not
implement them:
bdrv_probe_blocksizes
bdrv_probe_geometry
bdrv_truncate
bdrv_has_zero_init
bdrv_get_info
Instead, the call should be passed to bs->file if it exists, to allow
filter drivers to support those methods without implementing them. This
commit makes `drv->is_filter = true` imply that these callbacks will be
forwarded to bs->file by default, so disabling support for these
functions must be done explicitly.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Currently, a FOO_lookup is an array of strings terminated by a NULL
sentinel.
A future patch will generate enums with "holes". NULL-termination
will cease to work then.
To prepare for that, store the length in the FOO_lookup by wrapping it
in a struct and adding a member for the length.
The sentinel will be dropped next.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20170822132255.23945-13-marcandre.lureau@redhat.com>
[Basically redone]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1503564371-26090-16-git-send-email-armbru@redhat.com>
[Rebased]
The next commit will put it to use. May look pointless now, but we're
going to change the FOO_lookup's type, and then it'll help.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1503564371-26090-13-git-send-email-armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
The lookup tables have a sentinel, no need to make callers pass their
size.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1503564371-26090-3-git-send-email-armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[Rebased, commit message corrected]
In the ->inactivate() callbacks, permissions are updated, which
typically involves a recursive check of the whole graph. Setting
BDRV_O_INACTIVE right before doing that creates a state that
bdrv_is_writable() returns false, which causes permission update
failure.
Reorder them so the flag is updated after calling the function. Note
that this doesn't break the assert in bdrv_child_cb_inactivate() because
for any specific BDS, we still update its flags first before calling
->inactivate() on it one level deeper in the recursion.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170823134242.12080-5-famz@redhat.com>
Tested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Reopening an image should be consistent with opening it, so we should
set BDRV_O_ALLOW_RDWR for any image that is reopened read-write like in
bdrv_open_inherit().
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
BDRV_O_ALLOW_RDWR is a flag that tells whether qemu can internally
reopen a node read-write temporarily because the user requested
read-write for the top-level image, but qemu decided that read-only is
enough for this node (a backing file).
bdrv_reopen() is different, it is also used for cases where the user
changed their mind and wants to update the options. There is no reason
to forbid making a node read-write in that case.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Commit 8ee03995 refactored the code incorrectly and broke the release of
permissions on the old BDS. Instead of changing the permissions to the
new required values after removing the old BDS from the list of
children, it only re-obtains the permissions it already had.
Change the order of operations so that the old BDS is removed again
before calculating the new required permissions.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
bdrv_open_driver() is called in two places, bdrv_new_open_driver() and
bdrv_open_common(). In the latter, failure cleanup in is in its caller,
bdrv_open_inherit(), which unrefs the bs->file of the failed driver open
if it exists.
Let's move the bs->file cleanup to bdrv_open_driver() to take care of
all callers and do not set bs->drv to NULL unless the driver's open
function failed. When bs is destroyed by removing its last reference, it
calls bdrv_close() which checks bs->drv to perform the needed cleanups
and also call the driver's close function. Since it cleans up options
and opaque we must take care not leave dangling pointers.
The error paths in bdrv_open_driver() are now two:
If open fails, drv->bdrv_close() should not be called. Unref the child
if it exists, free what we allocated and set bs->drv to NULL. Return the
error and let callers free their stuff.
If open succeeds but we fail after, return the error and let callers
unref and delete their bs, while cleaning up their allocations.
Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In some error paths it is possible to QDECREF a freed dangling
explicit_options, resulting in a heap overflow crash. For example
bdrv_open_inherit()'s fail unrefs it, then calls bdrv_unref which calls
bdrv_close which also unrefs it.
Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Commits 0db832f and 6cdbceb introduced the automatic insertion of filter
nodes above the top layer of mirror and commit block jobs. The
assumption made there was that since libvirt doesn't do node-level
management of the block layer yet, it shouldn't be affected by added
nodes.
This is true as far as commands issued by libvirt are concerned. It only
uses BlockBackend names to address nodes, so any operations it performs
still operate on the root of the tree as intended.
However, the assumption breaks down when you consider query commands,
which return data for the wrong node now. These commands also return
information on some child nodes (bs->file and/or bs->backing), which
libvirt does make use of, and which refer to the wrong nodes, too.
One of the consequences is that oVirt gets wrong information about the
image size and stops the VM in response as long as a mirror or commit
job is running:
https://bugzilla.redhat.com/show_bug.cgi?id=1470634
This patch fixes the problem by hiding the implicit nodes created
automatically by the mirror and commit block jobs in the output of
query-block and BlockBackend-based query-blockstats as long as the user
doesn't indicate that they are aware of those nodes by providing a node
name for them in the QMP command to start the block job.
The node-based commands query-named-block-nodes and query-blockstats
with query-nodes=true still show all nodes, including implicit ones.
This ensures that users that are capable of node-level management can
still access the full information; users that only know BlockBackends
won't use these commands.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Tested-by: Eric Blake <eblake@redhat.com>