qemu/block
Eric Blake 7da537f70d nbd/client: Lower min_block for block-status, unaligned size
We have a latent bug in our NBD client code, tickled by the brand new
nbdkit 1.11.10 block status support:

$ nbdkit --filter=log --filter=truncate -U - \
           data data="1" size=511 truncate=64K logfile=/dev/stdout \
           --run 'qemu-img convert $nbd /var/tmp/out'
...
qemu-img: block/io.c:2122: bdrv_co_block_status: Assertion `*pnum && QEMU_IS_ALIGNED(*pnum, align) && align > offset - aligned_offset' failed.

The culprit? Our implementation of .bdrv_co_block_status can return
unaligned block status for any server that operates with a lower
actual alignment than what we tell the block layer in
request_alignment, in violation of the block layer's constraints. To
date, we've been unable to trip the bug, because qemu as NBD server
always advertises block sizing (at which point it is a server bug if
the server sends unaligned status - although qemu 3.1 is such a server
and I've sent separate patches for 4.0 both to get the server to obey
the spec, and to let the client tolerate server oddities at EOF).
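
For illustration, the condition in the failed assertion can be restated
as a small standalone predicate. This is only a sketch: the helper name
block_status_reply_ok() and the local IS_ALIGNED() macro are
hypothetical stand-ins, and the 511-byte extent used in the note below
is an illustrative value, not one taken from the log above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for QEMU_IS_ALIGNED(): true if n is a multiple of m. */
#define IS_ALIGNED(n, m) (((n) % (m)) == 0)

/*
 * Hypothetical helper restating the assertion from block/io.c: a
 * driver's block status answer is acceptable only if it is non-empty,
 * its length is a multiple of the alignment, and the queried offset
 * lies less than one alignment unit past the aligned start.
 */
static bool block_status_reply_ok(uint64_t pnum, uint64_t align,
                                  uint64_t offset, uint64_t aligned_offset)
{
    return pnum && IS_ALIGNED(pnum, align) &&
           align > offset - aligned_offset;
}
```

With a request alignment of 512, a 511-byte answer fails this check;
with an alignment of 1, the same answer passes.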

But nbdkit does not (yet) advertise block sizing, and therefore is not
in violation of the spec for returning block status at whatever
boundaries it wants, and those unaligned results can occur anywhere
rather than just at EOF. While we are still wise to avoid sending
sub-sector read/write requests to a server of unknown origin, we MUST
consider that a server telling us block status without an advertised
block size is correct.  So, we either have to munge unaligned answers
from the server into aligned ones that we hand back to the block
layer, or we have to tell the block layer about a smaller alignment.

Similarly, if the server advertises an image size that is not
sector-aligned, we might as well assume that the server intends to let
us access those tail bytes, and therefore supports a minimum block
size of 1, regardless of whether the server supports block status
(although we still need more patches to fix the problem that with an
unaligned image, we can send read or block status requests that exceed
EOF to the server). Again, qemu as server cannot trip this problem
(because it rounds images to sector alignment), but nbdkit advertised
unaligned size even before it gained block status support.

Solve both alignment problems at once by using better heuristics on
what alignment to report to the block layer when the server did not
give us something to work with. Note that very few NBD servers
implement block status (to date, only qemu and nbdkit are known to do
so); and as the NBD spec mentioned block sizing constraints prior to
documenting block status, it can be assumed that any future
implementations of block status are aware that they must advertise
block size if they want a minimum size other than 1.
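
The heuristic described above can be sketched roughly as follows. This
is a simplified standalone illustration, not the code of the patch
itself: pick_min_block(), its parameters, and SECTOR_SIZE are
hypothetical names chosen for the example.

```c
#include <stdbool.h>
#include <stdint.h>

#define SECTOR_SIZE 512u  /* assumed stand-in for the 512-byte sector */

/*
 * Hypothetical helper: choose the minimum block size to report to the
 * block layer.
 *  - advertised_min: block size the server advertised, 0 if none
 *  - size: image size advertised by the server, in bytes
 *  - has_block_status: whether block status was negotiated
 */
static uint32_t pick_min_block(uint32_t advertised_min, uint64_t size,
                               bool has_block_status)
{
    if (advertised_min) {
        /* The server told us what it wants; trust it. */
        return advertised_min;
    }
    if (size % SECTOR_SIZE != 0 || has_block_status) {
        /* Unaligned tail bytes or block status imply byte granularity. */
        return 1;
    }
    /* Otherwise stay conservative and avoid sub-sector requests. */
    return SECTOR_SIZE;
}
```

Under these assumptions, a server with no advertised block size and a
511-byte image yields 1, while a server with a sector-aligned image and
no block status support keeps the forced 512-byte alignment.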

We've had a long history of struggles with picking the right alignment
to use in the block layer, as evidenced by the commit message of
fd8d372d (v2.12) that introduced the current choice of forced 512-byte
alignment.

There is no iotest coverage for this fix, because qemu can't provoke
it, and I didn't want to make test 241 dependent on nbdkit.

Fixes: fd8d372d
Reported-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20190329042750.14704-3-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
2019-03-30 20:52:19 -05:00
accounting.c block/accounting: introduce latency histogram 2018-03-19 14:58:37 -05:00
backup.c block/backup: use qemu_iovec_init_buf 2019-02-22 09:42:13 +00:00
blkdebug.c block: Advertise BDRV_REQ_NO_FALLBACK in filter drivers 2019-03-26 11:37:51 +01:00
blklogwrites.c block: Purify .bdrv_refresh_filename() 2019-02-25 15:11:27 +01:00
blkreplay.c trivial: Make bios files and source files non-executable 2018-09-25 17:26:18 +02:00
blkverify.c block: Purify .bdrv_refresh_filename() 2019-02-25 15:11:27 +01:00
block-backend.c Block layer patches: 2019-02-26 19:04:47 +00:00
bochs.c avoid TABs in files that only contain a few 2019-01-11 15:46:56 +01:00
cloop.c block: Require auto-read-only for existing fallbacks 2018-11-05 15:09:55 +01:00
commit.c block: Freeze the backing chain for the duration of the commit job 2019-03-12 20:30:14 +01:00
copy-on-read.c block: Advertise BDRV_REQ_NO_FALLBACK in filter drivers 2019-03-26 11:37:51 +01:00
create.c jobs: utilize job_exit shim 2018-08-31 16:28:33 +02:00
crypto.c block: Make bdrv_{copy_on_read,crypto_luks,replication} static 2019-03-19 15:49:29 +01:00
crypto.h block/crypto: Simplify block_crypto_{open,create}_opts_init() 2018-06-29 14:20:56 +02:00
curl.c block/curl: Implement bdrv_refresh_filename() 2019-02-25 15:11:27 +01:00
dirty-bitmap.c bitmaps: Fix typo in function name 2019-03-12 12:05:49 -04:00
dmg-bz2.c dmg: Move libbz2 code to dmg-bz2.so 2016-10-07 14:14:06 +02:00
dmg-lzfse.c block: adding lzfse decompressing support as a module. 2018-12-14 11:52:40 +01:00
dmg.c dmg: don't skip zero chunk 2019-01-04 11:15:09 +00:00
dmg.h dmg: including dmg-lzfse module inside dmg block driver. 2018-12-14 11:52:40 +01:00
file-posix.c file-posix: Support BDRV_REQ_NO_FALLBACK for zero writes 2019-03-26 11:37:51 +01:00
file-win32.c avoid TABs in files that only contain a few 2019-01-11 15:46:56 +01:00
gluster.c gluster: the glfs_io_cbk callback function pointer adds pre/post stat args 2019-03-12 14:26:49 +01:00
io.c block: Add BDRV_REQ_NO_FALLBACK 2019-03-26 11:37:51 +01:00
iscsi-opts.c Move include qemu/option.h from qemu-common.h to actual users 2018-02-09 13:52:16 +01:00
iscsi.c block/iscsi: Restrict Linux-specific code 2019-03-11 16:33:49 +01:00
linux-aio.c avoid TABs in files that only contain a few 2019-01-11 15:46:56 +01:00
Makefile.objs configure: adding support to lzfse library. 2018-12-14 11:52:40 +01:00
mirror.c block: Advertise BDRV_REQ_NO_FALLBACK in filter drivers 2019-03-26 11:37:51 +01:00
nbd-client.c nbd-client: Work around server BLOCK_STATUS misalignment at EOF 2019-03-30 10:06:08 -05:00
nbd-client.h nbd: Restrict connection_co reentrance 2019-02-25 15:03:19 +01:00
nbd.c nbd/client: Lower min_block for block-status, unaligned size 2019-03-30 20:52:19 -05:00
nfs.c block: Purify .bdrv_refresh_filename() 2019-02-25 15:11:27 +01:00
null.c block/null: Generate filename even with latency-ns 2019-02-25 15:11:27 +01:00
nvme.c block/nvme: Fix bdrv_refresh_filename() 2019-02-25 15:11:27 +01:00
parallels.c block/parallels: use QEMU_IOVEC_INIT_BUF 2019-02-22 09:42:13 +00:00
parallels.h Clean up includes 2018-02-09 05:05:11 +01:00
qapi.c qapi: drop x- from x-block-latency-histogram-set 2019-03-12 20:30:08 +01:00
qcow2-bitmap.c block/qcow2-bitmap: Allow resizes with persistent bitmaps 2019-03-12 14:57:38 -04:00
qcow2-cache.c qcow2: External file I/O 2019-03-08 12:26:46 +01:00
qcow2-cluster.c qcow2: Implement data-file-raw create option 2019-03-08 12:26:46 +01:00
qcow2-refcount.c qcow2: Support external data file in qemu-img check 2019-03-08 12:26:46 +01:00
qcow2-snapshot.c qcow2: Return error for snapshot operation with data file 2019-03-08 12:26:46 +01:00
qcow2.c qcow2: Fix data file error condition in qcow2_co_create() 2019-03-19 15:49:29 +01:00
qcow2.h block/qcow2-bitmap: Allow resizes with persistent bitmaps 2019-03-12 14:57:38 -04:00
qcow.c Block layer patches: 2019-02-26 19:04:47 +00:00
qed-check.c block: convert bdrv_check callback to coroutine_fn 2018-03-09 15:17:47 +01:00
qed-cluster.c qed: protect table cache with CoMutex 2017-07-17 11:34:11 +08:00
qed-l2-cache.c qed: protect table cache with CoMutex 2017-07-17 11:34:11 +08:00
qed-table.c block/qed: use qemu_iovec_init_buf 2019-02-22 09:42:13 +00:00
qed.c Block layer patches: 2019-02-26 19:04:47 +00:00
qed.h qed: protect table cache with CoMutex 2017-07-17 11:34:11 +08:00
quorum.c block: Purify .bdrv_refresh_filename() 2019-02-25 15:11:27 +01:00
raw-format.c block: Advertise BDRV_REQ_NO_FALLBACK in filter drivers 2019-03-26 11:37:51 +01:00
rbd.c block: Add strong_runtime_opts to BlockDriver 2019-02-25 15:11:27 +01:00
replication.c block: Make bdrv_{copy_on_read,crypto_luks,replication} static 2019-03-19 15:49:29 +01:00
sheepdog.c block: Add strong_runtime_opts to BlockDriver 2019-02-25 15:11:27 +01:00
snapshot.c block/snapshot: remove bdrv_snapshot_delete_by_id_or_name 2019-02-25 15:03:18 +01:00
ssh.c block: Add strong_runtime_opts to BlockDriver 2019-02-25 15:11:27 +01:00
stream.c block: Freeze the backing chain for the duration of the stream job 2019-03-12 20:30:14 +01:00
throttle-groups.c throttle-groups: fix restart coroutine iothread race 2019-01-24 10:02:28 +00:00
throttle.c block: Add strong_runtime_opts to BlockDriver 2019-02-25 15:11:27 +01:00
trace-events nbd: Tolerate some server non-compliance in NBD_CMD_BLOCK_STATUS 2019-03-30 10:06:08 -05:00
vdi.c block: Eliminate the S_1KiB, S_2KiB, ... macros 2019-02-01 13:46:45 +01:00
vhdx-endian.c block/vhdx: Don't take address of fields in packed structs 2018-11-05 15:09:54 +01:00
vhdx-log.c block: Use bdrv_refresh_filename() to pull 2019-02-25 15:11:25 +01:00
vhdx.c block/vhdx: Don't take address of fields in packed structs 2018-11-05 15:09:54 +01:00
vhdx.h qemu/queue.h: leave head structs anonymous unless necessary 2019-01-11 15:46:55 +01:00
vmdk.c vmdk: Support version=3 in VMDK descriptor files 2019-03-19 15:49:29 +01:00
vpc.c block: Add strong_runtime_opts to BlockDriver 2019-02-25 15:11:27 +01:00
vvfat.c block: Add strong_runtime_opts to BlockDriver 2019-02-25 15:11:27 +01:00
vxhs.c block: Add strong_runtime_opts to BlockDriver 2019-02-25 15:11:27 +01:00
win32-aio.c file-win32: Switch to byte-based callbacks 2018-05-15 16:11:41 +02:00
write-threshold.c qapi: Drop qapi_event_send_FOO()'s Error ** argument 2018-08-28 18:21:38 +02:00