The new functions handles the data that is taken from the
BlockDriverState.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
The new functions handles the data that is taken from the BlockBackend.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Now that QEMU wraps the Win32 sockets methods to automatically
set errno upon failure, there is no reason for callers to use
the socket_error() method. They can rely on accessing errno
even on Win32. Remove all use of socket_error() from general
code, leaving it as a static method in oslib-win32.c only.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
An upcoming patch will alter how simple unions, like SocketAddress,
are laid out, which will impact all lines of the form 'addr->u.XXX'
(expanding it to the longer 'addr->u.XXX.data'). For better
legibility in that patch, and less need for line wrapping, it's better
to use a temporary variable to reduce the effect of a layout change to
just the variable initializations, rather than every reference within
a SocketAddress. Also, take advantage of some C99 initialization where
it makes sense (simplifying g_new0() to g_new()).
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1457021813-10704-7-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
During incremental backups, if the target has a cluster size that is
larger than the backup cluster size and we are backing up to a target
that cannot (for whichever reason) pull clusters up from a backing image,
we may inadvertantly create unusable incremental backup images.
For example:
If the bitmap tracks changes at a 64KB granularity and we transmit 64KB
of data at a time but the target uses a 128KB cluster size, it is
possible that only half of a target cluster will be recognized as dirty
by the backup block job. When the cluster is allocated on the target
image but only half populated with data, we lose the ability to
distinguish between zero padding and uninitialized data.
This does not happen if the target image has a backing file that points
to the last known good backup.
Even if we have a backing file, though, it's likely going to be faster
to just buffer the redundant data ourselves from the live image than
fetching it from the backing file, so let's just always round up to the
target granularity.
The same logic applies to backup modes top, none, and full. Copying
fractional clusters without the guarantee of COW is dangerous, but even
if we can rely on COW, it's likely better to just re-copy the data.
Reported-by: Fam Zheng <famz@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1456433911-24718-3-git-send-email-jsnow@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
64K might not always be appropriate, make this a runtime value.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1456433911-24718-2-git-send-email-jsnow@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
The three lines are duplicated a number of times now, refactor a
function.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1454637630-10585-3-git-send-email-famz@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
The "pnum < nb_sectors" condition in deciding whether to actually copy
data is unnecessarily strict, and the qiov initialization is
unnecessarily for bdrv_aio_write_zeroes and bdrv_aio_discard.
Rewrite mirror_iteration to fix both flaws.
The output of iotests 109 is updated because we now report the offset
and len slightly differently in mirroring progress.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1454637630-10585-2-git-send-email-famz@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
For values which are powers of two (and we do assume all of these to
be), sizeof(x) * 8 - 1 - clz(x) == ctz(x). Therefore, use ctz().
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 1450451066-13335-3-git-send-email-mreitz@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
We have DIV_ROUND_UP(), so we can use it to produce more easily readable
code. It may be slower than the bit shifting currently performed
(because it actually performs a division), but since
vhdx_calc_bat_entries() is never used in a hot path, this is completely
fine.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 1450451066-13335-2-git-send-email-mreitz@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
The iSCSI driver currently accepts the CHAP password in plain text
as a block driver property. This change adds a new "password-secret"
property that accepts the ID of a QCryptoSecret instance.
$QEMU \
-object secret,id=sec0,filename=/home/berrange/example.pw \
-drive driver=iscsi,url=iscsi://example.com/target-foo/lun1,\
user=dan,password-secret=sec0
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1453385961-10718-4-git-send-email-berrange@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
If connecting to a web server which has authentication
turned on, QEMU gets a 401 as curl has not been configured
with any authentication credentials.
This adds 4 new parameters to the curl block driver
options 'username', 'password-secret', 'proxy-username'
and 'proxy-password-secret'. Passwords are provided using
the recently added 'secret' object type
$QEMU \
-object secret,id=sec0,filename=/home/berrange/example.pw \
-object secret,id=sec1,filename=/home/berrange/proxy.pw \
-drive driver=http,url=http://example.com/some.img,\
username=dan,password-secret=sec0,\
proxy-username=dan,proxy-password-secret=sec1
Of course it is possible to use the same secret for both the
proxy & server passwords if desired, or omit the proxy auth
details, or the server auth details as required.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1453385961-10718-3-git-send-email-berrange@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
Currently RBD passwords must be provided on the command line
via
$QEMU -drive file=rbd:pool/image:id=myname:\
key=QVFDVm41aE82SHpGQWhBQXEwTkN2OGp0SmNJY0UrSE9CbE1RMUE=:\
auth_supported=cephx
This is insecure because the key is visible in the OS process
listing.
This adds support for an 'password-secret' parameter in the RBD
parameters that can be used with the QCryptoSecret object to
provide the password via a file:
echo "QVFDVm41aE82SHpGQWhBQXEwTkN2OGp0SmNJY0UrSE9CbE1RMUE=" > poolkey.b64
$QEMU -object secret,id=secret0,file=poolkey.b64,format=base64 \
-drive driver=rbd,filename=rbd:pool/image:id=myname:\
auth_supported=cephx,password-secret=secret0
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1453385961-10718-2-git-send-email-berrange@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
This patch implements a blockdriver function bdrv_snapshot_delete() in
the sheepdog driver. With the new function, snapshots of sheepdog can
be deleted from libvirt.
Cc: Jeff Cody <jcody@redhat.com>
Signed-off-by: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp>
Signed-off-by: Vasiliy Tolstov <v.tolstov@selfip.ru>
Message-id: 1450873346-22334-1-git-send-email-mitake.hitoshi@lab.ntt.co.jp
Signed-off-by: Jeff Cody <jcody@redhat.com>
recent libnfs versions support logging debug messages. Add
support for it in qemu through an URL parameter.
Example:
qemu -cdrom nfs://127.0.0.1/iso/my.iso?debug=2
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1447052973-14513-1-git-send-email-pl@kamp.de
Signed-off-by: Jeff Cody <jcody@redhat.com>
This patch adds the new bps_*_max_length and iops_*_max_length
parameters to the BlockDeviceInfo struct.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This modifies the NBD driver so that it is possible to request
use of TLS. This is done by providing the 'tls-creds' parameter
with the ID of a previously created QCryptoTLSCreds object.
For example
$QEMU -object tls-creds-x509,id=tls0,endpoint=client,\
dir=/home/berrange/security/qemutls \
-drive driver=nbd,host=localhost,port=9000,tls-creds=tls0
The client will drop the connection if the NBD server does not
provide TLS.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1455129674-17255-15-git-send-email-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This extends the NBD protocol handling code so that it is capable
of negotiating TLS support during the connection setup. This involves
requesting the STARTTLS protocol option before any other NBD options.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1455129674-17255-14-git-send-email-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Now that all callers are converted to use I/O channels for
initial connection setup, it is possible to switch the core
NBD protocol handling core over to use QIOChannel APIs for
actual sockets I/O.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1455129674-17255-7-git-send-email-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This converts the NBD block driver client to use the QIOChannelSocket
class for initial connection setup. The NbdClientSession struct has
two pointers, one to the master QIOChannelSocket providing the raw
data channel, and one to a QIOChannel which is the current channel
used for I/O. Initially the two point to the same object, but when
TLS support is added, they will point to different objects.
The qemu-img & qemu-io tools now need to use MODULE_INIT_QOM to
ensure the QIOChannel object classes are registered. The qemu-nbd
tool already did this.
In this initial conversion though, all I/O is still actually done
using the raw POSIX sockets APIs.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1455129674-17255-4-git-send-email-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This is also needed in bdrv_drain_all, not just in bdrv_drain.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1450867706-19860-3-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
JSON uses "name":value, but many of our visitor interfaces were
called with visit_type_FOO(v, &value, name, errp). This can be
a bit confusing to have to mentally swap the parameter order to
match JSON order. It's particularly bad for visit_start_struct(),
where the 'name' parameter is smack in the middle of the
otherwise-related group of 'obj, kind, size' parameters! It's
time to do a global swap of the parameter ordering, so that the
'name' parameter is always immediately after the Visitor argument.
Additional reason in favor of the swap: the existing include/qjson.h
prefers listing 'name' first in json_prop_*(), and I have plans to
unify that file with the qapi visitors; listing 'name' first in
qapi will minimize churn to the (admittedly few) qjson.h clients.
Later patches will then fix docs, object.h, visitor-impl.h, and
those clients to match.
Done by first patching scripts/qapi*.py by hand to make generated
files do what I want, then by running the following Coccinelle
script to affect the rest of the code base:
$ spatch --sp-file script `git grep -l '\bvisit_' -- '**/*.[ch]'`
I then had to apply some touchups (Coccinelle insisted on TAB
indentation in visitor.h, and botched the signature of
visit_type_enum() by rewriting 'const char *const strings[]' to
the syntactically invalid 'const char*const[] strings'). The
movement of parameters is sufficient to provoke compiler errors
if any callers were missed.
// Part 1: Swap declaration order
@@
type TV, TErr, TObj, T1, T2;
identifier OBJ, ARG1, ARG2;
@@
void visit_start_struct
-(TV v, TObj OBJ, T1 ARG1, const char *name, T2 ARG2, TErr errp)
+(TV v, const char *name, TObj OBJ, T1 ARG1, T2 ARG2, TErr errp)
{ ... }
@@
type bool, TV, T1;
identifier ARG1;
@@
bool visit_optional
-(TV v, T1 ARG1, const char *name)
+(TV v, const char *name, T1 ARG1)
{ ... }
@@
type TV, TErr, TObj, T1;
identifier OBJ, ARG1;
@@
void visit_get_next_type
-(TV v, TObj OBJ, T1 ARG1, const char *name, TErr errp)
+(TV v, const char *name, TObj OBJ, T1 ARG1, TErr errp)
{ ... }
@@
type TV, TErr, TObj, T1, T2;
identifier OBJ, ARG1, ARG2;
@@
void visit_type_enum
-(TV v, TObj OBJ, T1 ARG1, T2 ARG2, const char *name, TErr errp)
+(TV v, const char *name, TObj OBJ, T1 ARG1, T2 ARG2, TErr errp)
{ ... }
@@
type TV, TErr, TObj;
identifier OBJ;
identifier VISIT_TYPE =~ "^visit_type_";
@@
void VISIT_TYPE
-(TV v, TObj OBJ, const char *name, TErr errp)
+(TV v, const char *name, TObj OBJ, TErr errp)
{ ... }
// Part 2: swap caller order
@@
expression V, NAME, OBJ, ARG1, ARG2, ERR;
identifier VISIT_TYPE =~ "^visit_type_";
@@
(
-visit_start_struct(V, OBJ, ARG1, NAME, ARG2, ERR)
+visit_start_struct(V, NAME, OBJ, ARG1, ARG2, ERR)
|
-visit_optional(V, ARG1, NAME)
+visit_optional(V, NAME, ARG1)
|
-visit_get_next_type(V, OBJ, ARG1, NAME, ERR)
+visit_get_next_type(V, NAME, OBJ, ARG1, ERR)
|
-visit_type_enum(V, OBJ, ARG1, ARG2, NAME, ERR)
+visit_type_enum(V, NAME, OBJ, ARG1, ARG2, ERR)
|
-VISIT_TYPE(V, OBJ, NAME, ERR)
+VISIT_TYPE(V, NAME, OBJ, ERR)
)
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1454075341-13658-19-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Now that all drivers return the right "file" pointer, we can use it.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1453780743-16806-14-git-send-email-famz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
"offset" is the offset of cluster and sector_num doesn't necessarily
refer to the start of it, it should add index_in_cluster.
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1453780743-16806-12-git-send-email-famz@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
The added parameter can be used to return the BDS pointer which the
valid offset is referring to. Its value should be ignored unless
BDRV_BLOCK_OFFSET_VALID in ret is set.
Until block drivers fill in the right value, let's clear it explicitly
right before calling .bdrv_get_block_status.
The "bs->file" condition in bdrv_co_get_block_status is kept now to keep iotest
case 102 passing, and will be fixed once all drivers return the right file
pointer.
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1453780743-16806-2-git-send-email-famz@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
When bdrv_close_all() is called, instead of force-closing all root
BlockDriverStates, it is better to just drop the reference from all
BlockBackends and let them be closed automatically. This prevents BDS
from getting closed that are still referenced by other BDS, which may
result in loss of cached data.
This patch adds a function for doing that, but does not yet incorporate
it in bdrv_close_all().
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
It is unused now, so we can remove it.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
bdrv_close() no longer signifies ejection of a medium, this is now done
by removing the BDS from the BB. Therefore, we want to have a notifier
for that in the BB instead of a close notifier in the BDS. The former is
added now, the latter is removed later.
Symmetrically, another notifier list is added that is invoked whenever a
BDS is inserted. We will need that for virtio-blk and virtio-scsi, which
can then remove their op blockers on BDS ejection and set them up on
insertion.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Commit d62d9dc4b8 lifted streamOptimized images's version to 3, but we
now refuse to open version 3 images read-write. We need to make
streamOptimized an exception to allow converting to it. This fixes the
accidentally broken iotests case 059 for the same reason.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Pull out the check whether a block device has a tray from
blk_dev_is_tray_open() into its own function so both attributes (whether
there is a tray vs. whether that tray is open) can be queried
independently.
Cc: qemu-stable <qemu-stable@nongnu.org>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 1454096953-31773-2-git-send-email-mreitz@redhat.com
VMware products accept only version 3 for streamOptimized, let's bump
the version.
Reported-by: Radoslav Gerganov <rgerganov@vmware.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If qcow2_invalidate_cache() fails, we are in a state where qcow2_close()
has already been completed, but the image hasn't been reopened yet.
Calling into any qcow2 function for an image in this state will cause
crashes.
The real solution would be to get rid of the close/open pair and instead
do an atomic reset of the involved data structures, but this isn't
trivial, so let's just make the image inaccessible for now.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
What qcow2_invalidate_cache() should do is close the image with
BDRV_O_INACTIVE set and reopen it with the flag cleared. In fact, it
used to do exactly the opposite: qcow2_close() relied on bs->open_flags,
which is already updated to have cleared BDRV_O_INACTIVE at this point,
whereas qcow2_open() was called with s->flags, which has the flag still
set. Fix this.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
The callback has to ensure that closing or flushing the image afterwards
wouldn't cause a write access to the image files. This means that just
the caches have to be written out, which is part of the existing
.bdrv_close implementation.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Instead of covering only the state of images on the migration
destination before the migration is completed, the flag will also cover
the state of images on the migration source after completion. This
common state implies that the image is technically still open, but no
writes will happen and any cached contents will be reloaded from disk if
and when the image leaves this state.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
As long as BDRV_O_INCOMING is set, the image file is only opened so we
have a file descriptor for it. We're definitely not supposed to modify
the image, it's still owned by the migration source.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
When creating a qcow2 image, we didn't necessarily call
qcow2_update_header(), but could end up with the basic header that
qcow2_create2() created manually. One thing that this basic header
lacks is the feature table. Let's make sure that it's always present.
This requires a few updates to test cases as well.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Version 2 images don't have feature bits, so writing a feature table to
those images is kind of pointless.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.
This commit was created with scripts/clean-includes.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
large volume DASD that have > 64k cylinders do claim to have
0xFFFE cylinders as special value in the old 16 bit field. We
want to pass this "token" along to the guest, instead of
calculating the real number. Otherwise qemu might fail with
"cyls must be between 1 and 65535"
Cc: qemu-stable@nongnu.org
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The amount of memory allocated in nbd_co_receive_request is driven by the
NBD client (possibly a virtual machine). Parallel I/O can cause the
server to allocate a large amount of memory; check for failures and
return ENOMEM in that case.
Cc: qemu-block@nongnu.org
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When play with Dell MD3000 target, for sure it
is a TYPE_DISK, but readcapacity16 would fail.
Then we find that readcapacity10 succeeded. It
looks like the target just support readcapacity10
even through it is a TYPE_DISK or have some
TYPE_ROM characteristics.
This patch can give a chance to send
readcapacity16 when readcapacity10 failed.
This patch is not harmful to original pathes
Signed-off-by: Zhu Lingshan <lszhu@suse.com>
Message-Id: <1451359934-9236-1-git-send-email-lszhu@suse.com>
[Don't fall through on UNIT ATTENTION. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The arguments of error_setg_errno() should yield a short error string
without newlines.
Here, we try to append additional help to the error message by
embedding newlines in the error string. That's nice, but it's doesn't
play nicely with the errno part. tests/qemu-iotests/070.out shows the
resulting mess:
can't open device TEST_DIR/iotest-dirtylog-10G-4M.vhdx: VHDX image file 'TEST_DIR/iotest-dirtylog-10G-4M.vhdx' opened read-only, but contains a log that needs to be replayed. To replay the log, execute:
qemu-img check -r all 'TEST_DIR/iotest-dirtylog-10G-4M.vhdx': Operation not permitted
Switch to error_setg() and error_append_hint(). Result:
can't open device TEST_DIR/iotest-dirtylog-10G-4M.vhdx: VHDX image file 'TEST_DIR/iotest-dirtylog-10G-4M.vhdx' opened read-only, but contains a log that needs to be replayed
To replay the log, run:
qemu-img check -r all 'TEST_DIR/iotest-dirtylog-10G-4M.vhdx'
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1450452927-8346-21-git-send-email-armbru@redhat.com>
vmdk_parse_extents() reports parse errors like this:
error_setg(errp, "Invalid extent lines:\n%s", p);
where p points to the beginning of the malformed line in the image
descriptor. This results in a multi-line error message
Invalid extent lines:
<first line that doesn't parse>
<remaining text that may or may not parse, if any>
Error messages should not have newlines embedded. Since the remaining
text is not helpful, we can simply report:
Invalid extent line: <first line that doesn't parse>
Cc: Fam Zheng <famz@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1450452927-8346-19-git-send-email-armbru@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Factor out loop stepping to turn a while-loop with goto into a
for-loop with continue.
Cc: Fam Zheng <famz@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-Id: <1450452927-8346-18-git-send-email-armbru@redhat.com>
Commit 6daf194d, be62a2eb and 312fd5f got rid of a bunch, but they
keep coming back. Tracked down with the Coccinelle semantic patch
from commit 312fd5f.
Cc: Fam Zheng <famz@redhat.com>
Cc: Peter Crosthwaite <crosthwaitepeter@gmail.com>
Cc: Bharata B Rao <bharata@linux.vnet.ibm.com>
Cc: Dominik Dingel <dingel@linux.vnet.ibm.com>
Cc: David Hildenbrand <dahi@linux.vnet.ibm.com>
Cc: Jason J. Herne <jjherne@linux.vnet.ibm.com>
Cc: Stefan Berger <stefanb@linux.vnet.ibm.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Changchun Ouyang <changchun.ouyang@intel.com>
Cc: zhanghailiang <zhang.zhanghailiang@huawei.com>
Cc: Pavel Fedin <p.fedin@samsung.com>
Signed-off-by: Markus Armbruster <armbru@pond.sub.org>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Acked-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1450452927-8346-17-git-send-email-armbru@redhat.com>
Done with this Coccinelle semantic patch
@@
expression FMT, E1, E2;
expression list ARGS;
@@
- error_setg(E1, FMT, ARGS, error_get_pretty(E2));
+ error_propagate(E1, E2);/*###*/
+ error_prepend(E1, FMT/*@@@*/, ARGS);
followed by manual cleanup, first because I can't figure out how to
make Coccinelle transform strings, and second to get rid of now
superfluous error_propagate().
We now use or propagate the original error whole instead of just its
message obtained with error_get_pretty(). This avoids suppressing its
hint (see commit 50b7b00), but I can't see how the errors touched in
this commit could come with hints. It also improves the message
printed with &error_abort when we screw up (see commit 1e9b65b).
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Done with this Coccinelle semantic patch
@@
expression FMT, E, S;
expression list ARGS;
@@
- error_report(FMT, ARGS, error_get_pretty(E));
+ error_reportf_err(E, FMT/*@@@*/, ARGS);
(
- error_free(E);
|
exit(S);
|
abort();
)
followed by a replace of '%s"/*@@@*/' by '"' and some line rewrapping,
because I can't figure out how to make Coccinelle transform strings.
We now use the error whole instead of just its message obtained with
error_get_pretty(). This avoids suppressing its hint (see commit
50b7b00), but I can't see how the errors touched in this commit could
come with hints.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1450452927-8346-12-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Same Coccinelle semantic patch as in commit 565f65d.
We now use the original error whole instead of just its message
obtained with error_get_pretty(). This avoids suppressing its hint
(see commit 50b7b00), but I don't think the errors touched in this
commit can come with hints.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1450452927-8346-3-git-send-email-armbru@redhat.com>
Since a5002d5 (block/qapi: allow best-effort query) we don't return at
this error, however err must be cleared before passing to
bdrv_query_snapshot_info_list below, as required by error API.
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1450779107-26765-1-git-send-email-famz@redhat.com
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Request merging must not result in a huge request that exceeds the
maximum number of iovec elements. Use BlockLimits.max_iov instead of
hardcoding IOV_MAX.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The maximum number of struct iovec elements depends on the
BlockDriverState. The raw-posix and iSCSI protocols have a maximum of
IOV_MAX but others could have different values.
Cc: Peter Lieven <pl@kamp.de>
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
For more complex BDS trees that can be created under normal circumstances,
we lose the ability to issue query commands because of our inability to
re-construct the absolute filename.
Instead, omit this field when it is a problem and present as much information
as we can.
This will change the expected output in iotest 110, where we will now see a
json filename and the lack of an absolute filename instead of an error.
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1450122916-4706-6-git-send-email-jsnow@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Disambiguate "Backing filename and full backing filename are equivalent"
from "full backing filename could not be determined."
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1450122916-4706-4-git-send-email-jsnow@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Always report full_backing_filename, even if it's the same as
backing_filename. In the next patch, full_backing_filename may be
omitted if it cannot be generated instead of allowing e.g. drive_query
to abort if it runs into this scenario.
The presence or absence of the "full" field becomes useful information.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1450122916-4706-3-git-send-email-jsnow@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
If it happens to match the backing path, that was the actual path.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1450122916-4706-2-git-send-email-jsnow@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
When called from a coroutine, bdrv_ioctl must be asynchronous just like
e.g. bdrv_flush. The code was incorrectly making it synchronous, fix
it.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Do not use bdrv_drain, since by itself it does not guarantee
anything.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Traditionally, aio=native was treated as an advice that could simply be
ignored if an error occurs while initialising Linux AIO or the feature
wasn't compiled in. This behaviour was deprecated in commit 96518254
(qemu 2.3; error during init) and commit 1501ecc1 (qemu 2.5; not
compiled in).
This patch changes raw-posix to error out in these cases instead of
printing a deprecation warning.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
s->qcow_version is always set to 2 or 3. Let's assert if this is wrong.
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: Max Reitz <mreitz@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If a reference count is not representable with the current refcount
order, the image check should point to qemu-img amend for increasing the
refcount order. However, qemu-img amend needs write access to the image
which cannot be provided if the image is marked corrupt; and the image
check will not mark the image consistent unless everything actually is
consistent.
Therefore, if an image is marked corrupt and the image check encounters
a reference count overflow, it cannot be fixed by using qemu-img amend
to increase the refcount order. Instead, one has to use qemu-img convert
to create a completely new copy of the image in this case.
Alternatively, we may want to give the user a way of manually removing
the corrupt flag, maybe through qemu-img amend, but this is not part of
this patch.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Make use of qcow2_change_refcount_order() to support changing the
refcount order with qemu-img amend.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add a function qcow2_change_refcount_order() which allows changing the
refcount order of a qcow2 image.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If there is more than one time-consuming operation to be performed for
qcow2_amend_options(), we need an intermediate CB which coordinates the
progress of the individual operations and passes the result to the
original status callback.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If the image version should be upgraded, that is the first we should do;
if it should be downgraded, that is the last we should do. So split the
version change block into an upgrade part at the start and a downgrade
part at the end.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add an opaque value which is to be passed to the bdrv_amend_options()
status callback.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Just reopening the children (as block.c does now) is enough.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
In order to decide whether a blkdebug: filename can be produced or a
json: one is necessary, blkdebug checked whether bs->options had more
options than just "config", "x-image" or "image" (the latter including
nested options). That doesn't work well when generic block layer options
are present.
This patch passes an option QDict to the driver that contains only
driver-specific options, i.e. the options for the general block layer as
well as child nodes are already filtered out. Works much better this
way.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
For bs->file, using references to existing BDSes has been possible for a
while already. This patch enables the same for bs->backing.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
bdrv_replace_in_backing_chain() asserts that not both old and new
BlockDdriverState have a BlockBackend attached to them because both
would have to end up pointing to the new BDS and we don't support more
than one BB per BDS yet.
Before we can safely allow references to existing nodes as backing
files, we need to make sure that even if a backing file has a BB on it,
this doesn't crash qemu.
There are probably also some cases with the 'replaces' option set where
drive-mirror could fail this assertion today. They are fixed with this
error check as well.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
qcow2 accepts a few driver-specific options that overlap semantically
(e.g. "overlap-check" is an alias of "overlap-check.template", and any
missing cache size option is derived from the given ones).
When bdrv_reopen() merges the set of updated options with left out
options that should be kept at their old value, we need to consider this
and filter out any duplicates (which would generally cause errors
because new and old value would contradict each other).
This patch adds a .bdrv_join_options callback to BlockDriver and
implements it for qcow2.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
The name QType matches our CODING_STYLE conventions for type names
in CamelCase. It also matches the fact that we are already naming
all the enum members with a prefix of QTYPE, not QTYPE_CODE. And
doing the rename will also make it easier for the next patch to use
QAPI for providing the enum, which also wants CamelCase type names.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1449033659-25497-3-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Now that we guarantee the user doesn't have any enum values
beginning with a single underscore, we can use that for our
own purposes. Renaming ENUM_MAX to ENUM__MAX makes it obvious
that the sentinel is generated.
This patch was mostly generated by applying a temporary patch:
|diff --git a/scripts/qapi.py b/scripts/qapi.py
|index e6d014b..b862ec9 100644
|--- a/scripts/qapi.py
|+++ b/scripts/qapi.py
|@@ -1570,6 +1570,7 @@ const char *const %(c_name)s_lookup[] = {
| max_index = c_enum_const(name, 'MAX', prefix)
| ret += mcgen('''
| [%(max_index)s] = NULL,
|+// %(max_index)s
| };
| ''',
| max_index=max_index)
then running:
$ cat qapi-{types,event}.c tests/test-qapi-types.c |
sed -n 's,^// \(.*\)MAX,s|\1MAX|\1_MAX|g,p' > list
$ git grep -l _MAX | xargs sed -i -f list
The only things not generated are the changes in scripts/qapi.py.
Rejecting enum members named 'MAX' is now useless, and will be dropped
in the next patch.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1447836791-369-23-git-send-email-eblake@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
[Rebased to current master, commit message tweaked]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
No need to keep two separate enums, where editing one is likely
to forget the other. Now that we can specify a qapi enum prefix,
we don't even have to change the bulk of the uses.
get_event_by_name() could perhaps be replaced by qapi_enum_parse(),
but I left that for another day.
CC: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1447836791-369-20-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
The assertion problem was noticed in 06c3916b35, but it wasn't
completely fixed, because even though the req is not marked as
serialising, it still gets serialised by wait_serialising_requests
against other serialising requests, which could lead to the same
assertion failure.
Fix it by even more explicitly skipping the serialising for this
specific case.
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1448962590-2842-2-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
With dataplane, the ioeventfd events could be dispatched after
mirror_run releases the dirty bitmap, but before mirror_exit actually
does the device switch, because the iothread will still be running, and
it will cause silent data loss.
Fix this by adding a bdrv_drained_begin/end pair around the window, so
that no new external request will be handled.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
make check always outputs warnings, this
is not nice. Disable blkdebug warnings under qtest.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1448883874-17933-1-git-send-email-mst@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
This crash was caught with qemu-iotests test case 138.
Commit b6d36de already fixed a few 32 bit truncation bugs that could
cause qemu-img check to allocate too little memory and consequently
it would segfault. On 32 bit hosts, there is one more place that needs
to be fixed because size_t was involved in the calculation and is a
32 bit type there.
Cc: qemu-stable@nongnu.org
Reported-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Tested-by: Michael S. Tsirkin <mst@redhat.com>
Spotted by Coverity.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add support for caching options that can be specified from the command
line.
The CD-ROM raw char device bypasses the host page cache and therefore
has alignment requirements. Alignment probing is necessary so only use
the raw char device if BDRV_O_NOCACHE is set.
This patch fixes -cdrom /dev/cdrom on Mac OS X hosts, where bdrv_read()
used to fail due to misaligned requests during image format probing.
Signed-off-by: John Arbuckle <programmingkidx@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This patch marks part of the BAT dirty properly. There is a possibility that
multy-block allocation could have one block allocated on one BAT page and
next block on the next page. The code without the patch could not save
updated position to the file.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Message-id: 1447779778-26062-1-git-send-email-den@openvz.org
CC: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The patch also ensures proper locking for the operation.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
Tested-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
to create snapshot for all loaded block drivers.
The patch also ensures proper locking.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
Tested-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
to check that snapshot is available for all loaded block drivers.
The check bs != bs1 in hmp_info_snapshots is an optimization. The check
for availability of this snapshot will return always true as the list
of snapshots was collected from that image.
The patch also ensures proper locking.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
Tested-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
to switch to snapshot on all loaded block drivers.
The patch also ensures proper locking.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
Tested-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
to delete snapshots from all loaded block drivers.
The patch also ensures proper locking.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
Tested-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
this will make code better in the next patch
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
Tested-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
The patch enforces proper locking for this operation.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
Tested-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
This patch switches to QEMU_CLOCK_VIRTUAL for the accounting code in
qtest mode, and makes the latency of the operation constant. This way we
can perform tests on the accounting code with reproducible results.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: 35ed0501450fa572684e9b5e92c361ab6cce565b.1446044838.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This patch adds two new fields to BlockDeviceTimedStats that track the
average number of pending read and write requests for a block device.
The values are calculated for the period of time defined for that
interval.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: fd31fef53e2714f2f30d59ed58ca2f67ec9ab926.1446044837.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This patch keeps track of the minimum, maximum and average latencies
of I/O operations during a certain interval of time.
The values are exposed in the BlockDeviceTimedStats structure.
An option to define the intervals to collect these statistics will be
added in a separate patch.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: c7382dc89622c64f918d09f32815827772628f8e.1446044837.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This patch adds two options, "stats-account-invalid" and
"stats-account-failed", that can be used to decide whether invalid and
failed I/O operations must be used when collecting statistics for
latency and last access time.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: ebc7e5966511a342cad428a392c5f5ad56b15213.1446044837.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This patch adds the block_acct_failed() and block_acct_invalid()
functions to allow keeping track of failed and invalid I/O operations.
The number of failed and invalid operations is exposed in
BlockDeviceStats.
We don't keep track of the time spent on invalid operations because
they are cancelled immediately when they are started.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: a7256ccb883a86356b1c6c46b5a29ed5448546a5.1446044837.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This patch adds the new field 'idle_time_ns' to the BlockDeviceStats
structure, indicating the time that has passed since the previous I/O
operation.
It also adds the block_acct_idle_time_ns() call, to ensure that all
references to the clock type used for accounting are in the same
place. This will later allow us to use a different clock for iotests.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: 7d8cfcf931453e1a2443e6626e8c1edc347c7c8a.1446044837.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Its value is still QEMU_CLOCK_REALTIME, but having it in a variable will
allow us to change its value easily in the future when running in qtest
mode.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 547485eb841cf9e3b2770c96539ae9ae5996e214.1446044837.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Allow a BlockJobTxn to be passed into backup_run, which
will allow the job to join a transactional group if present.
Propagate this new parameter outward into new QMP helper
functions in blockdev.c to allow transaction commands to
pass forward their BlockJobTxn object in a forthcoming patch.
[split up from a patch originally by Stefan and Fam. --js]
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1446765200-3054-12-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Switch over to the new .commit/.abort handlers for
cleaning up incremental bitmaps.
[split up from a patch originally by Stefan and Fam. --js]
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1446765200-3054-11-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add reference count to block job, meanwhile move the ownership of the
reference to job->bs from the caller (which is released in two
completion callbacks) to the block job itself. It is necessary for
block_job_complete_sync to work, because block job shouldn't live longer
than its bs, as asserted in bdrv_delete.
Now block_job_complete_sync can be simplified.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1446765200-3054-6-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This will be reused by the coming new transactional completion code.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1446765200-3054-5-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The "need_check_timer" is used to clear the "NEED_CHECK" flag in the
image header after a grace period once metadata update has finished. In
compliance to the bdrv_drain semantics we should make sure it remains
deleted once .bdrv_drain is called.
We cannot reuse qed_need_check_timer_cb because here it doesn't satisfy
the assertion. Do the "plug" and "flush" calls manually.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1447064214-29930-10-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Drivers can have internal request sources that generate IO, like the
need_check_timer in QED. Since we want quiesced periods that contain
nested event loops in block layer, we need to have a way to disable such
event sources.
Block drivers must implement the "bdrv_drain" callback if it has any
internal sources that can generate I/O activity, like a timer or a
worker thread (even in a library) that can schedule QEMUBH in an
asynchronous callback.
Update the comments of bdrv_drain and bdrv_drained_begin accordingly.
Like bdrv_requests_pending(), we should consider all the children of bs.
Before, the while loop just works, as bdrv_requests_pending() already
tracks its children; now we mustn't miss the callback, so recurse down
explicitly.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1447064214-29930-9-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Now the callback is not used any more, drop the field along with all
implementations in block drivers, which are iscsi and raw.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1447064214-29930-8-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Currently all drivers that support .bdrv_aio_ioctl also implement
.bdrv_ioctl redundantly. To track ioctl requests in block layer it is
easier if we unify the two paths, because we'll need to run it in a
coroutine, as required by tracked_request_begin. While we're at it, use
.bdrv_aio_ioctl plus aio_poll() to emulate bdrv_ioctl().
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1447064214-29930-7-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
iscsi_ioctl emulates SG_GET_VERSION_NUM and SG_GET_SCSI_ID. Now that
bdrv_ioctl() will be emulated with .bdrv_aio_ioctl, replicate the logic
into iscsi_aio_ioctl to make them consistent.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1447064214-29930-5-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Both bdrv_discard and bdrv_aio_discard will call into bdrv_co_discard,
so add tracked_request_begin/end calls around the loop.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1447064214-29930-4-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Both bdrv_flush and bdrv_aio_flush eventually call bdrv_co_flush, add
tracked_request_begin and tracked_request_end pair in that function so
that all flush requests are now tracked.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1447064214-29930-3-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We'll track more request types besides read and write, change the
boolean field to an enum.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1447064214-29930-2-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When searching for contiguous zero clusters, we only need to check the
cluster type. Before this patch, an increasing offset (L2E_OFFSET_MASK)
was expected, so that the function never returned more than a single
zero cluster in practice. This patch fixes it to actually return as many
contiguous zero clusters as it can.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1446657384-5907-1-git-send-email-kwolf@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
This function returns the reference count of a given BlockBackend.
For convenience, it returns 0 if the BlockBackend pointer is NULL.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: dfdd8a17dbe3288842840636d2cfe5bb895abcb0.1446475331.git.berto@igalia.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
There's nothing preventing the target image from being used by other
operations during the 'drive-mirror' job, so we should block them all
until the job is done.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 82b88fd04cde918a08a6f9a4ab85626d7fd7b502.1446475331.git.berto@igalia.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
This is simpler now that the driver has been converted to coroutines.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
There are two ways to check for I/O limits in a BlockDriverState:
- bs->throttle_state: if this pointer is not NULL, it means that this
BDS is member of a throttling group, its ThrottleTimers structure
has been initialized and its I/O limits are ready to be applied.
- bs->io_limits_enabled: if true it means that the throttle_state
pointer is valid _and_ the limits are currently enabled.
The latter is used in several places to check whether a BDS has I/O
limits configured, but what it really checks is whether requests
are being throttled or not. For example, io_limits_enabled can be
temporarily set to false in cases like bdrv_read_unthrottled() without
otherwise touching the throtting configuration of that BDS.
This patch replaces bs->io_limits_enabled with bs->throttle_state in
all cases where what we really want to check is the existence of I/O
limits, not whether they are currently enabled or not.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
throttle_group_unregister_bs() removes a BlockDriverState from its
throttling group and destroys the timers. This means that there must
be no pending throttled requests at that point (because it would be
impossible to complete them), so the caller has to drain them first.
At the moment throttle_group_unregister_bs() is only called from
bdrv_io_limits_disable(), which already takes care of draining the
requests, so there's nothing to worry about, but this patch makes
this invariant explicit in the documentation and adds the relevant
assertions.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If we create a buffer directly on the stack by using 12 bytes, there's
no guarantee the 64bit value we want to swap will be aligned, which
could cause errors with undefined behavior.
Spotted with clang -fsanitize=undefined and observed in iotests 15, 26,
44, 115 and 121.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
'block-commit' needs write access to two different nodes of the chain:
- 'base', because that's where the data is written to.
- the overlay of 'top', because it needs to update the backing file
string to point to 'base' after the operation.
Both images have to be opened in read-write mode, and commit_start()
takes care of reopening them if necessary.
With the current implementation, however, when overlay_bs is reopened
in read-write mode it has the side effect of making 'base' read-only
again, eventually making 'block-commit' fail.
This needs to be fixed in bdrv_reopen(), but until we get to that it
can be worked around simply by swapping the order of base and
overlay_bs in the reopen queue.
In order to reproduce this bug, overlay_bs needs to be initially in
read-only mode. That is: the 'top' parameter of 'block-commit' cannot
be the active layer nor its immediate backing chain.
Cc: qemu-stable@nongnu.org
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
blk_dev_change_media_cb() is called for all potential tray movements;
however, it is possible to request closing the tray but nothing actually
happening (on a floppy disk drive without a medium).
Thus, the actual tray status should be inquired before sending a
tray-moved event (and an event should be sent whenever the status
changed).
Checking @load is now superfluous; it was necessary because it was
possible to change a medium without having explicitly opened the tray
and closed it again (or it might have been possible, at least). This is
no longer possible, though.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In order to open a BDS which inherits a BB's root state,
blk_get_open_flags_from_root_state() is used to inquire the flags to be
passed to bdrv_open(), and blk_apply_root_state() is used to apply the
remaining state after the BDS has been opened.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This function removes the BlockDriverState associated with the given
BlockBackend from that BB and sets the BDS pointer in the BB to NULL.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Previously we return -EIO blindly when anything goes wrong. Add a helper
function to parse sense fields and try to make the return code more
meaningful.
This also fixes the default werror configuration (enospc) when we're
using qcow2 on an iscsi lun. The old -EIO not being treated as out of
space error failed to trigger vm stop.
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <1446699609-11376-1-git-send-email-famz@redhat.com>
[libiscsi 1.9 compatibility - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We have two issues with our qapi union layout:
1) Even though the QMP wire format spells the tag 'type', the
C code spells it 'kind', requiring some hacks in the generator.
2) The C struct uses an anonymous union, which places all tag
values in the same namespace as all non-variant members. This
leads to spurious collisions if a tag value matches a non-variant
member's name.
Make the conversion to the new layout for block-related code.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1445898903-12082-16-git-send-email-eblake@redhat.com>
[Commit message tweaked slightly]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
The function manually recursed into bs->file and bs->backing to check
whether there were any requests pending, but it ignored other children.
There's no need to special case file and backing here, so just replace
these two explicit recursions by a loop recursing for all child nodes.
Reported-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1446029211-27148-1-git-send-email-kwolf@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The semantics is that after bdrv_drained_begin(bs), bs will not get new external
requests until the matching bdrv_drained_end(bs).
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
All callers pass in false, and the real external ones will switch to
true in coming patches.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The group throttling code was always meant to handle its locking
internally. However, bdrv_swap() was touching the ThrottleGroup
structure directly and therefore needed an API for that.
Now that bdrv_swap() no longer exists there's no need for the
throttle_group_lock() API anymore.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
blk_bs() will not necessarily return a non-NULL value any more (unless
blk_is_available() is true or it can be assumed to otherwise, e.g.
because it is called immediately after a successful blk_new_with_bs() or
blk_new_open()).
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>