This will be needed to check some restrictions before making bitmap
persistent in qmp-block-dirty-bitmap-add (this functionality will be
added by future patch)
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20170628120530.31251-22-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20170628120530.31251-19-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
New field BdrvDirtyBitmap.persistent means, that bitmap should be saved
by format driver in .bdrv_close and .bdrv_inactivate. No format driver
supports it for now.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20170628120530.31251-18-vsementsov@virtuozzo.com
[mreitz: Fixed indentation]
Signed-off-by: Max Reitz <mreitz@redhat.com>
Mirror AUTO flag from Qcow2 bitmap in BdrvDirtyBitmap. This will be
needed in future, to save this flag back to Qcow2 for persistent
bitmaps.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20170628120530.31251-16-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Add format driver handler, which should mark loaded read-only
bitmaps as 'IN_USE' in the image and unset read_only field in
corresponding BdrvDirtyBitmap's.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170628120530.31251-14-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
It will be needed in following commits for persistent bitmaps.
If bitmap is loaded from read-only storage (and we can't mark it
"in use" in this storage) corresponding BdrvDirtyBitmap should be
read-only.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20170628120530.31251-11-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Add bdrv_dirty_bitmap_deserialize_ones() function, which is needed for
qcow2 bitmap loading, to handle unallocated bitmap parts, marked as
all-ones.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20170628120530.31251-7-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Make getter signature const-correct. This allows other functions with
const dirty bitmap parameter use bdrv_dirty_bitmap_granularity().
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 20170628120530.31251-6-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Now that all encryption keys must be provided upfront via
the QCryptoSecret API and associated block driver properties
there is no need for any explicit encryption handling APIs
in the block layer. Encryption can be handled transparently
within the block driver. We only retain an API for querying
whether an image is encrypted or not, since that is a
potentially useful piece of metadata to report to the user.
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170623162419.26068-18-berrange@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Historically the qcow & qcow2 image formats supported a property
"encryption=on" to enable their built-in AES encryption. We'll
soon be supporting LUKS for qcow2, so need a more general purpose
way to enable encryption, with a choice of formats.
This introduces an "encrypt.format" option, which will later be
joined by a number of other "encrypt.XXX" options. The use of
a "encrypt." prefix instead of "encrypt-" is done to facilitate
mapping to a nested QAPI schema at later date.
e.g. the preferred syntax is now
qemu-img create -f qcow2 -o encrypt.format=aes demo.qcow2
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170623162419.26068-8-berrange@redhat.com
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
We are gradually moving away from sector-based interfaces, towards
byte-based. In the common case, allocation is unlikely to ever use
values that are not naturally sector-aligned, but it is possible
that byte-based values will let us be more precise about allocation
at the end of an unaligned file that can do byte-based access.
Changing the signature of the function to use int64_t *pnum ensures
that the compiler enforces that all callers are updated. For now,
the io.c layer still assert()s that all callers are sector-aligned,
but that can be relaxed when a later patch implements byte-based
block status. Therefore, for the most part this patch is just the
addition of scaling at the callers followed by inverse scaling at
bdrv_is_allocated(). But some code, particularly stream_run(),
gets a lot simpler because it no longer has to mess with sectors.
Leave comments where we can further simplify by switching to
byte-based iterations, once later patches eliminate the need for
sector-aligned operations.
For ease of review, bdrv_is_allocated() was tackled separately.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We are gradually moving away from sector-based interfaces, towards
byte-based. In the common case, allocation is unlikely to ever use
values that are not naturally sector-aligned, but it is possible
that byte-based values will let us be more precise about allocation
at the end of an unaligned file that can do byte-based access.
Changing the signature of the function to use int64_t *pnum ensures
that the compiler enforces that all callers are updated. For now,
the io.c layer still assert()s that all callers are sector-aligned
on input and that *pnum is sector-aligned on return to the caller,
but that can be relaxed when a later patch implements byte-based
block status. Therefore, this code adds usages like
DIV_ROUND_UP(,BDRV_SECTOR_SIZE) to callers that still want aligned
values, where the call might reasonbly give non-aligned results
in the future; on the other hand, no rounding is needed for callers
that should just continue to work with byte alignment.
For the most part this patch is just the addition of scaling at the
callers followed by inverse scaling at bdrv_is_allocated(). But
some code, particularly bdrv_commit(), gets a lot simpler because it
no longer has to mess with sectors; also, it is now possible to pass
NULL if the caller does not care how much of the image is allocated
beyond the initial offset. Leave comments where we can further
simplify once a later patch eliminates the need for sector-aligned
requests through bdrv_is_allocated().
For ease of review, bdrv_is_allocated_above() will be tackled
separately.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based. Continue by converting
the public interface to backup jobs (no semantic change), including
a change to CowRequest to track by bytes instead of cluster indices.
Note that this does not change the difference between the public
interface (starting point, and size of the subsequent range) and
the internal interface (starting and end points).
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Xie Changlong <xiechanglong@cmss.chinamobile.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Now that the last user [mirror_iteration()] has converted to using
bytes, we no longer need a function to round sectors to clusters.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The lone caller that cares about a return of BDRV_BLOCK_RAW
(namely, io.c:bdrv_co_get_block_status) completely replaces the
return value, so there is no point in passing BDRV_BLOCK_DATA.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Just as the block layer already sets BDRV_BLOCK_ALLOCATED as a
shortcut for subsequent operations, there are also some optimizations
that are made easier if we can quickly tell that *pnum will advance
us to the end of a file, via a new BDRV_BLOCK_EOF which gets set
by the block layer.
This just plumbs up the new bit; subsequent patches will make use
of it.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20170505021500.19315-2-eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
img_commit could fall into an infinite loop calling run_block_job() if
its blockjob fails on any I/O error, fix this already known problem.
Signed-off-by: sochin.jiang <sochin.jiang@huawei.com>
Message-id: 1497509253-28941-1-git-send-email-sochin.jiang@huawei.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Change the 'int count' parameter in *pwrite_zeros, *pdiscard related
functions (and some others) to 'int bytes', as they both refer to bytes.
This helps with code legibility.
Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
Message-id: 20170609101808.13506-1-el13635@mail.ntua.gr
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Version: GnuPG v2
iQEtBAABCAAXBQJZQyPmEBxmYW16QHJlZGhhdC5jb20ACgkQyjViTGqRccaWrgf/
SCAHpi4gzWbr7AN03jP16Qy/kqNik6F7LTNSqrRbvBPb3TNchDd4z44SAghK5m/r
+IlYQc20sBZ60tRHIHAUSF2WNcea2pj1v3ZVgjrI7hiJ3DXPiqqt/dAR/W/BLIDO
tAHAVF6Pnrjm9DC4d2zATLDHvcHMzWOsnePh7XcOm44REbwUr3GDg6bf2+j+5yfS
9ewmXfh8z4w1IvSn+f5B+IeCvGvJNA1D55dqcGo8Ivlg9PnElziXFaXO2s7UiLIM
mF3eTSIbJQNNN+E+0lpRpnqQiq+Txxggu61Q4f8bOTBhEOPa3etj1ydnXMVbvX25
6SUuBfGh51tyOIZOJz3GtA==
=9b+J
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/famz/tags/docker-and-block-pull-request' into staging
# gpg: Signature made Fri 16 Jun 2017 01:18:46 BST
# gpg: using RSA key 0xCA35624C6A9171C6
# gpg: Good signature from "Fam Zheng <famz@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 5003 7CB7 9706 0F76 F021 AD56 CA35 624C 6A91 71C6
* remotes/famz/tags/docker-and-block-pull-request: (23 commits)
block: make accounting thread-safe
block: split BlockAcctStats creation and setup
block: introduce block_account_one_io
block: protect modification of dirty bitmaps with a mutex
migration/block: reset dirty bitmap before reading
block: introduce dirty_bitmap_mutex
block: protect tracked_requests and flush_queue with reqs_lock
block: access write_gen with atomics
block: use Stat64 for wr_highest_offset
util: add stats64 module
throttle-groups: protect throttled requests with a CoMutex
throttle-groups: do not use qemu_co_enter_next
throttle-groups: only start one coroutine from drained_begin
block: access io_plugged with atomic ops
block: access wakeup with atomic ops
block: access serialising_in_flight with atomic ops
block: access io_limits_disabled with atomic ops
block: access quiesce_counter with atomic ops
block: access copy_on_read with atomic ops
docker: Add flex and bison to centos6 image
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
I'm not trying too hard yet. Later, with multiqueue support,
this may cause mutex contention or cacheline bouncing.
Cc: Alberto Garcia <berto@igalia.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170605123908.18777-20-pbonzini@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
block_acct_destroy is called unconditionally in blk_delete, but there is
no BlockAcctStats function that is called unconditionally in blk_new.
Split block_acct_init in two, so that it will be possible to create a
QemuMutex in block_acct_init and destroy it in block_acct_cleanup.
Cc: Alberto Garcia <berto@igalia.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170605123908.18777-19-pbonzini@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
It protects only the list of dirty bitmaps; in the next patch we will
also protect their content.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170605123908.18777-15-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Rename
nbd_wr_syncv -> nbd_rwv
read_sync -> nbd_read
read_sync_eof -> nbd_read_eof
write_sync -> nbd_write
drop_sync -> nbd_drop
1. nbd_ prefix
read_sync and write_sync are already shared, so it is good to have a
namespace prefix. drop_sync will be shared, and read_sync_eof is
related to read_sync, so let's rename them all.
2. _sync suffix
_sync is related to the fact that nbd_wr_syncv doesn't return if a
write to socket returns EAGAIN. The first implementation of
nbd_wr_syncv (was wr_sync in 7a5ca8648b) just loops while getting
EAGAIN, the current implementation yields in this case.
Why we want to get rid of it:
- it is normal for r/w functions to be synchronous, so having an
additional suffix for it looks redundant (contrariwise, we have
_aio suffix for async functions)
- _sync suffix in block layer is used when function does flush (so
using it for other thing is confusing a bit)
- keep function names short after adding nbd_ prefix
3. for nbd_wr_syncv let's use more common notation 'rw'
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20170602150150.258222-2-vsementsov@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Back in qemu 2.5, qemu-nbd was immune to port probes (a transient
server would not quit, regardless of how many probe connections
came and went, until a connection actually negotiated). But we
broke that in commit ee7d7aa when removing the return value to
nbd_client_new(), although that patch also introduced a bug causing
an assertion failure on a client that fails negotiation. We then
made it worse during refactoring in commit 1a6245a (a segfault
before we could even assert); the (masked) assertion was cleaned
up in d3780c2 (still in 2.6), and just recently we finally fixed
the segfault ("nbd: Fully intialize client in case of failed
negotiation"). But that still means that ever since we added
TLS support to qemu-nbd, we have been vulnerable to an ill-timed
port-scan being able to cause a denial of service by taking down
qemu-nbd before a real client has a chance to connect.
Since negotiation is now handled asynchronously via coroutines,
we no longer have a synchronous point of return by re-adding a
return value to nbd_client_new(). So this patch instead wires
things up to pass the negotiation status through the close_fn
callback function.
Simple test across two terminals:
$ qemu-nbd -f raw -p 30001 file
$ nmap 127.0.0.1 -p 30001 && \
qemu-io -c 'r 0 512' -f raw nbd://localhost:30001
Note that this patch does not change what constitutes successful
negotiation (thus, a client must enter transmission phase before
that client can be considered as a reason to terminate the server
when the connection ends). Perhaps we may want to tweak things
in a later patch to also treat a client that uses NBD_OPT_ABORT
as being a 'successful' negotiation (the client correctly talked
the NBD protocol, and informed us it was not going to use our
export after all), but that's a discussion for another day.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1451614
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20170608222617.20376-1-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move to modern errp scheme from just LOGging errors.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20170526110913.89098-1-vsementsov@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Will be used in following patch to provide actual error message in
some cases.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20170516094533.6160-4-vsementsov@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Start removing migration code from sysemu/sysemu.h.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJZLDGTAAoJEH8JsnLIjy/WjY0P/jtXF+SQKs9H695Uy+U+EUr5
9w+lYTfjHXTUx9F5AKT4OAMww/uWSG5ywG8FZD8AJZyt1B8piDrSA7sW5YYgWCV4
XD1HMzN6pasVGrchzgfBSW/K9BgUlvBEjqFrLCNVii0osa68J7vX3uJsLyRA+yPo
ijFseJaB/b0KnAPG9JqHXc4DVQgU2IhZH3ZsgE6Utb+KUrm8/veiORhWQboBbfCW
eyTF+YWee/rI+up4nKn9+Le57EUXWr3Y50BBxFZw1/4wQbv0WEhOIbl/Z4ggxNRR
VoyXGMqOIZFlGQ7bGxDawuEwGKfcOFFEHj3rhPrs7RPnod/6X8Vxm0rAr2GNZoOW
9tLMQKe4HLo3kVwX65AhzWd/WMukAd0MC/0oaULOjiAEdlWjflyEhx9H9uYefl5x
kn77ZqQqIEL4aCYRBLJn9oJV3CIZbSd6cuKC9UPuXAwD8XDWTXUzT/aDMJUnfij7
vhS1qks1Wyk37wIjTuuERwrr0K6y8e3De30NeU6LqQuzXvJ1MYSp49BA1FCgHfmT
x5RWbTSjLdjZiIo7biE2dSwdOtsxsnuxX19utqfGqOmegNU7LykRtu6mFh9kJhNe
5ladt1Mu//puZLQ/q4RH/J0pwkuS/OVrS3LBobcggPGHhUIiHhdNUI73bb7BK0+9
ME5DGLm4/c/REB3Lidr9
=kkU0
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'kwolf/tags/for-upstream' into staging
Block layer patches
# gpg: Signature made Mon 29 May 2017 03:34:59 PM BST
# gpg: using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* kwolf/tags/for-upstream:
block/file-*: *_parse_filename() and colons
block: Fix backing paths for filenames with colons
block: Tweak error message related to qemu-img amend
qemu-img: Fix leakage of options on error
qemu-img: copy *key-secret opts when opening newly created files
qemu-img: introduce --target-image-opts for 'convert' command
qemu-img: fix --image-opts usage with dd command
qemu-img: add support for --object with 'dd' command
qemu-img: Fix documentation of convert
qcow2: remove extra local_error variable
mirror: Drop permissions on s->target on completion
nvme: Add support for Controller Memory Buffers
iotests: 147: Don't test inet6 if not available
qemu-iotests: Test streaming with missing job ID
stream: fix crash in stream_start() when block_job_create() fails
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The file drivers' *_parse_filename() implementations just strip the
optional protocol prefix off the filename. However, for e.g.
"file:foo:bar", this would lead to "foo:bar" being stored as the BDS's
filename which looks like it should be managed using the "foo" protocol.
This is especially troublesome if you then try to resolve a backing
filename based on "foo:bar".
This issue can only occur if the stripped part is a relative filename
("file:/foo:bar" will be shortened to "/foo:bar" and having a slash
before the first colon means that "/foo" is not recognized as a protocol
part). Therefore, we can easily fix it by prepending "./" to such
filenames.
Before this patch:
$ ./qemu-img create -f qcow2 backing.qcow2 64M
Formatting 'backing.qcow2', fmt=qcow2 size=67108864 encryption=off
cluster_size=65536 lazy_refcounts=off refcount_bits=16
$ ./qemu-img create -f qcow2 -b backing.qcow2 file🔝image.qcow2
Formatting 'file🔝image.qcow2', fmt=qcow2 size=67108864
backing_file=backing.qcow2 encryption=off cluster_size=65536
lazy_refcounts=off refcount_bits=16
$ ./qemu-io file🔝image.qcow2
can't open device file🔝image.qcow2: Could not open backing file:
Unknown protocol 'top'
After this patch:
$ ./qemu-io file🔝image.qcow2
[no error]
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170522195217.12991-3-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
All block jobs are using block_job_defer_to_main_loop as the final
step just before the coroutine terminates. At this point,
block_job_enter should do nothing, but currently it restarts
the freed coroutine.
Now, the job->co states should probably be changed to an enum
(e.g. BEFORE_START, STARTED, YIELDED, COMPLETED) subsuming
block_job_started, job->deferred_to_main_loop and job->busy.
For now, this patch eliminates the problematic reenter by
removing the reset of job->deferred_to_main_loop (which served
no purpose, as far as I could see) and checking the flag in
block_job_enter.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20170508141310.8674-12-pbonzini@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
Remove use of block_job_pause/resume from outside blockjob.c, thus
making them static. The new functions are used by the block layer,
so place them in blockjob_int.h.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 20170508141310.8674-5-pbonzini@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
Outside blockjob.c, block_job_unref is only used when a block job fails
to start, and block_job_ref is not used at all. The reference counting
thus is pretty well hidden. Introduce a separate function to be used
by block jobs; because block_job_ref and block_job_unref now become
static, move them earlier in blockjob.c.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 20170508141310.8674-4-pbonzini@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
This is unused since commit 66a0fae ("blockjob: Don't touch BDS iostatus",
2016-05-19).
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 20170508141310.8674-3-pbonzini@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
The main loop uses aio_disable_external()/aio_enable_external() to
temporarily disable processing of external AioContext clients like
device emulation.
This allows monitor commands to quiesce I/O and prevent the guest from
submitting new requests while a monitor command is in progress.
The aio_enable_external() API is currently broken when an IOThread is in
aio_poll() waiting for fd activity when the main loop re-enables
external clients. Incrementing ctx->external_disable_cnt does not wake
the IOThread from ppoll(2) so fd processing remains suspended and leads
to unresponsive emulated devices.
This patch adds an aio_notify() call to aio_enable_external() so the
IOThread is kicked out of ppoll(2) and will re-arm the file descriptors.
The bug can be reproduced as follows:
$ qemu -M accel=kvm -m 1024 \
-object iothread,id=iothread0 \
-device virtio-scsi-pci,iothread=iothread0,id=virtio-scsi-pci0 \
-drive if=none,id=drive0,aio=native,cache=none,format=raw,file=test.img \
-device scsi-hd,id=scsi-hd0,drive=drive0 \
-qmp tcp::5555,server,nowait
$ scripts/qmp/qmp-shell localhost:5555
(qemu) blockdev-snapshot-sync device=drive0 snapshot-file=sn1.qcow2
mode=absolute-paths format=qcow2
After blockdev-snapshot-sync completes the SCSI disk will be
unresponsive. This leads to request timeouts inside the guest.
Reported-by: Qianqian Zhu <qizhu@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170508180705.20609-1-stefanha@redhat.com
Suggested-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
We had some conflicting documentation: a nice 8-way table that
described all possible combinations of DATA, ZERO, and
OFFSET_VALID, contrasted with text that implied that OFFSET_VALID
always meant raw data could be read directly. Furthermore, the
text refers a lot to bs->file, even though the interface was
updated back in 67a0fd2a to let the driver pass back a specific
BDS (not necessarily bs->file). As the 8-way table is the
intended semantics, simplify the rest of the text to get rid of
the confusion.
ALLOCATED is always set by the block layer for convenience (drivers
do not have to worry about it). RAW is used only internally, but
by more than the raw driver. Document these additional items on
the driver callback.
Suggested-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170507000552.20847-4-eblake@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Format drivers for inactive nodes don't need write/resize permissions on
their bs->file and can share write/resize with another VM (in fact, this
is the whole point of keeping images inactive). Represent this fact in
the op blocker system, so that image locking does the right thing
without special-casing inactive images.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
With image locking, permissions affect other qemu processes as well. We
want to be sure that the destination can run, so let's drop permissions
on the source when migration completes.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Instead of manually calling blk_resume_after_migration() in migration
code after doing bdrv_invalidate_cache_all(), integrate the BlockBackend
activation with cache invalidation into a single function. This is
achieved with a new callback in BdrvChildRole that is called by
bdrv_invalidate_cache_all().
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>