mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Vladimir Sementsov-Ogievskiy	bbfb7c2f35	block/nbd: safer transition to receiving request req->receiving is a flag of request being in one concrete yield point in nbd_co_do_receive_one_chunk(). Such kind of boolean flag is always better to unset before scheduling the coroutine, to avoid double scheduling. So, let's be more careful. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20210610100802.5888-33-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2021-06-18 12:21:22 -05:00
Vladimir Sementsov-Ogievskiy	91e0998f5a	block/nbd: add nbd_client_connected() helper We already have two similar helpers for other state. Let's add another one for convenience. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20210610100802.5888-32-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2021-06-18 12:21:22 -05:00
Vladimir Sementsov-Ogievskiy	a71d597b98	block/nbd: reuse nbd_co_do_establish_connection() in nbd_open() The only last step we need to reuse the function is coroutine-wrapper. nbd_open() may be called from non-coroutine context. So, generate the wrapper and use it. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20210610100802.5888-31-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2021-06-18 12:21:22 -05:00
Vladimir Sementsov-Ogievskiy	97cf89259e	nbd/client-connection: add option for non-blocking connection attempt We'll need a possibility of non-blocking nbd_co_establish_connection(), so that it returns immediately, and it returns success only if a connections was previously established in background. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20210610100802.5888-30-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2021-06-18 12:21:22 -05:00
Vladimir Sementsov-Ogievskiy	51edbf537d	block/nbd: split nbd_co_do_establish_connection out of nbd_reconnect_attempt Split out the part that we want to reuse for nbd_open(). Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20210610100802.5888-29-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2021-06-18 12:21:21 -05:00
Vladimir Sementsov-Ogievskiy	43cb34dede	nbd/client-connection: return only one io channel block/nbd doesn't need underlying sioc channel anymore. So, we can update nbd/client-connection interface to return only one top-most io channel, which is more straight forward. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20210610100802.5888-27-vsementsov@virtuozzo.com> [eblake: squash in Vladimir's fixes for uninit usage caught by clang] Signed-off-by: Eric Blake <eblake@redhat.com>	2021-06-18 12:20:53 -05:00
Vladimir Sementsov-Ogievskiy	95a078ea3e	block/nbd: drop BDRVNBDState::sioc Currently sioc pointer is used just to pass from socket-connection to nbd negotiation. Drop the field, and use local variables instead. With next commit we'll update nbd/client-connection.c to behave appropriately (return only top-most ioc, not two channels). Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20210610100802.5888-26-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2021-06-18 10:59:54 -05:00
Vladimir Sementsov-Ogievskiy	c2405af0e4	block/nbd: don't touch s->sioc in nbd_teardown_connection() Negotiation during reconnect is now done in a thread, and s->sioc is not available during negotiation. Negotiation in thread will be cancelled by nbd_client_connection_release() called from nbd_clear_bdrvstate(). So, we don't need this code chunk anymore. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20210610100802.5888-25-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2021-06-18 10:59:54 -05:00
Vladimir Sementsov-Ogievskiy	6d2b0332d3	block/nbd: use negotiation of NBDClientConnection Now that we can opt in to negotiation as part of the client connection thread, use that to simplify connection_co. This is another step on the way to moving all reconnect code into NBDClientConnection. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20210610100802.5888-24-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2021-06-18 10:59:54 -05:00
Vladimir Sementsov-Ogievskiy	e9ba7788b0	block/nbd: split nbd_handle_updated_info out of nbd_client_handshake() To be reused in the following patch. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Roman Kagan <rvkagan@yandex-team.ru> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20210610100802.5888-23-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2021-06-18 10:59:54 -05:00
Vladimir Sementsov-Ogievskiy	130d49baa5	nbd/client-connection: add possibility of negotiation Add arguments and logic to support nbd negotiation in the same thread after successful connection. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20210610100802.5888-20-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2021-06-18 10:59:53 -05:00
Vladimir Sementsov-Ogievskiy	5276c87c12	nbd: move connection code from block/nbd to nbd/client-connection We now have bs-independent connection API, which consists of four functions: nbd_client_connection_new() nbd_client_connection_release() nbd_co_establish_connection() nbd_co_establish_connection_cancel() Move them to a separate file together with NBDClientConnection structure which becomes private to the new API. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20210610100802.5888-18-vsementsov@virtuozzo.com> [eblake: comment tweaks] Signed-off-by: Eric Blake <eblake@redhat.com>	2021-06-18 10:59:53 -05:00
Vladimir Sementsov-Ogievskiy	248d470198	block/nbd: introduce nbd_client_connection_release() This is a last step of creating bs-independent nbd connection interface. With next commit we can finally move it to separate file. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20210610100802.5888-17-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2021-06-18 10:59:53 -05:00
Vladimir Sementsov-Ogievskiy	f68729747d	block/nbd: introduce nbd_client_connection_new() This is a step of creating bs-independent nbd connection interface. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Roman Kagan <rvkagan@yandex-team.ru> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20210610100802.5888-16-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2021-06-18 10:59:53 -05:00
Vladimir Sementsov-Ogievskiy	90ddc64fb2	block/nbd: rename NBDConnectThread to NBDClientConnection We are going to move the connection code to its own file, and want clear names and APIs first. The structure is shared between user and (possibly) several runs of connect-thread. So it's wrong to call it "thread". Let's rename to something more generic. Appropriately rename connect_thread and thr variables to conn. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Roman Kagan <rvkagan@yandex-team.ru> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20210610100802.5888-15-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2021-06-18 10:59:53 -05:00
Vladimir Sementsov-Ogievskiy	c3e7730485	block/nbd: make nbd_co_establish_connection_cancel() bs-independent nbd_co_establish_connection_cancel() actually needs only pointer to NBDConnectThread. So, make it clean. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Roman Kagan <rvkagan@yandex-team.ru> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20210610100802.5888-14-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2021-06-18 10:59:53 -05:00
Vladimir Sementsov-Ogievskiy	d33833d7af	block/nbd: bs-independent interface for nbd_co_establish_connection() We are going to split connection code to a separate file. Now we are ready to give nbd_co_establish_connection() clean and bs-independent interface. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Roman Kagan <rvkagan@yandex-team.ru> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20210610100802.5888-13-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2021-06-18 10:59:53 -05:00
Vladimir Sementsov-Ogievskiy	b8e8a3d116	block/nbd: drop thr->state We don't need all these states. The code refactored to use two boolean variables looks simpler. While moving the comment in nbd_co_establish_connection() rework it to give better information. Also, we are going to move the connection code to separate file and mentioning drained section would be confusing. Improve also the comment in NBDConnectThread, while dropping removed state names from it. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20210610100802.5888-12-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> [eblake: comment tweak] Signed-off-by: Eric Blake <eblake@redhat.com>	2021-06-18 10:59:53 -05:00
Vladimir Sementsov-Ogievskiy	08ea55d068	block/nbd: simplify waking of nbd_co_establish_connection() Instead of managing connect_bh, bh_ctx, and wait_connect fields, we can use a single link to the waiting coroutine with proper mutex protection. So new logic is: nbd_co_establish_connection() sets wait_co under the mutex, releases the mutex, then yield()s. Note that wait_co may be scheduled by the thread immediately after unlocking the mutex. Still, the main thread (or iothread) will not reach the code for entering the coroutine until the yield(), so we are safe. connect_thread_func() and nbd_co_establish_connection_cancel() do the following to handle wait_co: Under the mutex, if thr->wait_co is not NULL, make it NULL and schedule it. This way, we avoid scheduling the coroutine twice. Still scheduling is a bit different: In connect_thread_func() we can just call aio_co_wake under mutex, after commit [async: the main AioContext is only "current" if under the BQL] we are sure that aio_co_wake() will not try to acquire the aio context and do qemu_aio_coroutine_enter() but simply schedule the coroutine by aio_co_schedule(). nbd_co_establish_connection_cancel() will be called from non-coroutine context in further patch and will be able to go through qemu_aio_coroutine_enter() path of aio_co_wake(). So keep current behavior of waking the coroutine after the critical section. Also, this commit reduces the dependence of nbd_co_establish_connection() on the internals of bs (we now use a generic pointer to the coroutine, instead of direct use of s->connection_co). This is a step towards splitting the connection API out of nbd.c. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20210610100802.5888-11-vsementsov@virtuozzo.com> Reviewied-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2021-06-18 10:59:53 -05:00
Vladimir Sementsov-Ogievskiy	2def3edb4b	block/nbd: BDRVNBDState: drop unused connect_err and connect_status These fields are write-only. Drop them. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Roman Kagan <rvkagan@yandex-team.ru> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20210610100802.5888-10-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2021-06-18 10:59:53 -05:00
Vladimir Sementsov-Ogievskiy	2a25def4be	block/nbd: nbd_client_handshake(): fix leak of s->ioc Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Roman Kagan <rvkagan@yandex-team.ru> Message-Id: <20210610100802.5888-9-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2021-06-18 10:59:53 -05:00
Roman Kagan	e8b35bf5dc	block/nbd: ensure ->connection_thread is always valid Simplify lifetime management of BDRVNBDState->connect_thread by delaying the possible cleanup of it until the BDRVNBDState itself goes away. This also reverts `0267101af6` "block/nbd: fix possible use after free of s->connect_thread" as now s->connect_thread can't be cleared until the very end. Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru> [vsementsov: rebase, revert `0267101af6` changes] Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> [eblake: tweak comment] Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20210610100802.5888-8-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2021-06-18 10:59:53 -05:00
Vladimir Sementsov-Ogievskiy	6cc702beac	block/nbd: call socket_address_parse_named_fd() in advance Detecting monitor by current coroutine works bad when we are not in coroutine context. And that's exactly so in nbd reconnect code, where qio_channel_socket_connect_sync() is called from thread. Monitor is needed only to parse named file descriptor. So, let's just parse it during nbd_open(), so that all further users of s->saddr don't need to access monitor. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20210610100802.5888-7-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2021-06-18 10:59:53 -05:00
Vladimir Sementsov-Ogievskiy	fb392b548e	block/nbd: connect_thread_func(): do qio_channel_set_delay(false) nbd_open() does it (through nbd_establish_connection()). Actually we lost that call on reconnect path in `1dc4718d84` "block/nbd: use non-blocking connect: fix vm hang on connect()" when we have introduced reconnect thread. Fixes: `1dc4718d84` Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20210610100802.5888-5-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2021-06-18 10:59:53 -05:00
Vladimir Sementsov-Ogievskiy	bbba1c376b	block/nbd: fix how state is cleared on nbd_open() failure paths We have two "return error" paths in nbd_open() after nbd_process_options(). Actually we should call nbd_clear_bdrvstate() on these paths. Interesting that nbd_process_options() calls nbd_clear_bdrvstate() by itself. Let's fix leaks and refactor things to be more obvious: - intialize yank at top of nbd_open() - move yank cleanup to nbd_clear_bdrvstate() - refactor nbd_open() so that all failure paths except for yank-register goes through nbd_clear_bdrvstate() Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Roman Kagan <rvkagan@yandex-team.ru> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20210610100802.5888-4-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2021-06-18 10:59:53 -05:00
Roman Kagan	3687ad4903	block/nbd: fix channel object leak nbd_free_connect_thread leaks the channel object if it hasn't been stolen. Unref it and fix the leak. Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20210610100802.5888-3-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2021-06-18 10:59:52 -05:00
Paolo Bonzini	29a6ea24eb	coroutine-sleep: replace QemuCoSleepState pointer with struct in the API Right now, users of qemu_co_sleep_ns_wakeable are simply passing a pointer to QemuCoSleepState by reference to the function. But QemuCoSleepState really is just a Coroutine*; making the content of the struct public is just as efficient and lets us skip the user_state_pointer indirection. Since the usage is changed, take the occasion to rename the struct to QemuCoSleep. Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 20210517100548.28806-6-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2021-05-21 18:22:33 +01:00
Paolo Bonzini	eaee072085	coroutine-sleep: allow qemu_co_sleep_wake that wakes nothing All callers of qemu_co_sleep_wake are checking whether they are passing a NULL argument inside the pointer-to-pointer: do the check in qemu_co_sleep_wake itself. As a side effect, qemu_co_sleep_wake can be called more than once and it will only wake the coroutine once; after the first time, the argument will be set to NULL via sleep_state->user_state_pointer. However, this would not be safe unless co_sleep_cb keeps using the QemuCoSleepState directly, so make it go through the pointer-to-pointer instead. Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 20210517100548.28806-4-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2021-05-21 18:22:33 +01:00
Vladimir Sementsov-Ogievskiy	0267101af6	block/nbd: fix possible use after free of s->connect_thread If on nbd_close() we detach the thread (in nbd_co_establish_connection_cancel() thr->state becomes CONNECT_THREAD_RUNNING_DETACHED), after that point we should not use s->connect_thread (which is set to NULL), as running thread may free it at any time. Still nbd_co_establish_connection() does exactly this: it saves s->connect_thread to local variable (just for better code style) and use it even after yield point, when thread may be already detached. Fix that. Also check thr to be non-NULL on nbd_co_establish_connection() start for safety. After this patch "case CONNECT_THREAD_RUNNING_DETACHED" becomes impossible in the second switch in nbd_co_establish_connection(). Still, don't add extra abort() just before the release. If it somehow possible to reach this "case:" it won't hurt. Anyway, good refactoring of all this reconnect mess will come soon. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20210406155114.1057355-1-vsementsov@virtuozzo.com> Reviewed-by: Roman Kagan <rvkagan@yandex-team.ru> Signed-off-by: Max Reitz <mreitz@redhat.com>	2021-04-13 15:35:12 +02:00
Vladimir Sementsov-Ogievskiy	c4f7f24e1f	block/nbd: implement .bdrv_cancel_in_flight Just stop waiting for connection in existing requests. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20210205163720.887197-3-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2021-02-12 09:45:18 -06:00
Roman Kagan	ddde5ee769	block/nbd: only enter connection coroutine if it's present When an NBD block driver state is moved from one aio_context to another (e.g. when doing a drain in a migration thread), nbd_client_attach_aio_context_bh is executed that enters the connection coroutine. However, the assumption that ->connection_co is always present here appears incorrect: the connection may have encountered an error other than -EIO in the underlying transport, and thus may have decided to quit rather than keep trying to reconnect, and therefore it may have terminated the connection coroutine. As a result an attempt to reassign the client in this state (NBD_CLIENT_QUIT) to a different aio_context leads to a null pointer dereference: #0 qio_channel_detach_aio_context (ioc=0x0) at /build/qemu-gYtjVn/qemu-5.0.1/io/channel.c:452 #1 0x0000562a242824b3 in bdrv_detach_aio_context (bs=0x562a268d6a00) at /build/qemu-gYtjVn/qemu-5.0.1/block.c:6151 #2 bdrv_set_aio_context_ignore (bs=bs@entry=0x562a268d6a00, new_context=new_context@entry=0x562a260c9580, ignore=ignore@entry=0x7feeadc9b780) at /build/qemu-gYtjVn/qemu-5.0.1/block.c:6230 #3 0x0000562a24282969 in bdrv_child_try_set_aio_context (bs=bs@entry=0x562a268d6a00, ctx=0x562a260c9580, ignore_child=<optimized out>, errp=<optimized out>) at /build/qemu-gYtjVn/qemu-5.0.1/block.c:6332 #4 0x0000562a242bb7db in blk_do_set_aio_context (blk=0x562a2735d0d0, new_context=0x562a260c9580, update_root_node=update_root_node@entry=true, errp=errp@entry=0x0) at /build/qemu-gYtjVn/qemu-5.0.1/block/block-backend.c:1989 #5 0x0000562a242be0bd in blk_set_aio_context (blk=<optimized out>, new_context=<optimized out>, errp=errp@entry=0x0) at /build/qemu-gYtjVn/qemu-5.0.1/block/block-backend.c:2010 #6 0x0000562a23fbd953 in virtio_blk_data_plane_stop (vdev=<optimized out>) at /build/qemu-gYtjVn/qemu-5.0.1/hw/block/dataplane/virtio-blk.c:292 #7 0x0000562a241fc7bf in virtio_bus_stop_ioeventfd (bus=0x562a260dbf08) at /build/qemu-gYtjVn/qemu-5.0.1/hw/virtio/virtio-bus.c:245 #8 0x0000562a23fefb2e in virtio_vmstate_change (opaque=0x562a260dbf90, running=0, state=<optimized out>) at /build/qemu-gYtjVn/qemu-5.0.1/hw/virtio/virtio.c:3220 #9 0x0000562a2402ebfd in vm_state_notify (running=running@entry=0, state=state@entry=RUN_STATE_FINISH_MIGRATE) at /build/qemu-gYtjVn/qemu-5.0.1/softmmu/vl.c:1275 #10 0x0000562a23f7bc02 in do_vm_stop (state=RUN_STATE_FINISH_MIGRATE, send_stop=<optimized out>) at /build/qemu-gYtjVn/qemu-5.0.1/cpus.c:1032 #11 0x0000562a24209765 in migration_completion (s=0x562a260e83a0) at /build/qemu-gYtjVn/qemu-5.0.1/migration/migration.c:2914 #12 migration_iteration_run (s=0x562a260e83a0) at /build/qemu-gYtjVn/qemu-5.0.1/migration/migration.c:3275 #13 migration_thread (opaque=opaque@entry=0x562a260e83a0) at /build/qemu-gYtjVn/qemu-5.0.1/migration/migration.c:3439 #14 0x0000562a2435ca96 in qemu_thread_start (args=<optimized out>) at /build/qemu-gYtjVn/qemu-5.0.1/util/qemu-thread-posix.c:519 #15 0x00007feed31466ba in start_thread (arg=0x7feeadc9c700) at pthread_create.c:333 #16 0x00007feed2e7c41d in __GI___sysctl (name=0x0, nlen=608471908, oldval=0x562a2452b138, oldlenp=0x0, newval=0x562a2452c5e0 <__func__.28102>, newlen=0) at ../sysdeps/unix/sysv/linux/sysctl.c:30 #17 0x0000000000000000 in ?? () Fix it by checking that the connection coroutine is non-null before trying to enter it. If it is null, no entering is needed, as the connection is probably going down anyway. Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20210129073859.683063-3-rvkagan@yandex-team.ru> Signed-off-by: Eric Blake <eblake@redhat.com>	2021-02-03 08:17:12 -06:00
Roman Kagan	3b5e4db673	block/nbd: only detach existing iochannel from aio_context When the reconnect in NBD client is in progress, the iochannel used for NBD connection doesn't exist. Therefore an attempt to detach it from the aio_context of the parent BlockDriverState results in a NULL pointer dereference. The problem is triggerable, in particular, when an outgoing migration is about to finish, and stopping the dataplane tries to move the BlockDriverState from the iothread aio_context to the main loop. If the NBD connection is lost before this point, and the NBD client has entered the reconnect procedure, QEMU crashes: #0 qemu_aio_coroutine_enter (ctx=0x5618056c7580, co=0x0) at /build/qemu-6MF7tq/qemu-5.0.1/util/qemu-coroutine.c:109 #1 0x00005618034b1b68 in nbd_client_attach_aio_context_bh ( opaque=0x561805ed4c00) at /build/qemu-6MF7tq/qemu-5.0.1/block/nbd.c:164 #2 0x000056180353116b in aio_wait_bh (opaque=0x7f60e1e63700) at /build/qemu-6MF7tq/qemu-5.0.1/util/aio-wait.c:55 #3 0x0000561803530633 in aio_bh_call (bh=0x7f60d40a7e80) at /build/qemu-6MF7tq/qemu-5.0.1/util/async.c:136 #4 aio_bh_poll (ctx=ctx@entry=0x5618056c7580) at /build/qemu-6MF7tq/qemu-5.0.1/util/async.c:164 #5 0x0000561803533e5a in aio_poll (ctx=ctx@entry=0x5618056c7580, blocking=blocking@entry=true) at /build/qemu-6MF7tq/qemu-5.0.1/util/aio-posix.c:650 #6 0x000056180353128d in aio_wait_bh_oneshot (ctx=0x5618056c7580, cb=<optimized out>, opaque=<optimized out>) at /build/qemu-6MF7tq/qemu-5.0.1/util/aio-wait.c:71 #7 0x000056180345c50a in bdrv_attach_aio_context (new_context=0x5618056c7580, bs=0x561805ed4c00) at /build/qemu-6MF7tq/qemu-5.0.1/block.c:6172 #8 bdrv_set_aio_context_ignore (bs=bs@entry=0x561805ed4c00, new_context=new_context@entry=0x5618056c7580, ignore=ignore@entry=0x7f60e1e63780) at /build/qemu-6MF7tq/qemu-5.0.1/block.c:6237 #9 0x000056180345c969 in bdrv_child_try_set_aio_context ( bs=bs@entry=0x561805ed4c00, ctx=0x5618056c7580, ignore_child=<optimized out>, errp=<optimized out>) at /build/qemu-6MF7tq/qemu-5.0.1/block.c:6332 #10 0x00005618034957db in blk_do_set_aio_context (blk=0x56180695b3f0, new_context=0x5618056c7580, update_root_node=update_root_node@entry=true, errp=errp@entry=0x0) at /build/qemu-6MF7tq/qemu-5.0.1/block/block-backend.c:1989 #11 0x00005618034980bd in blk_set_aio_context (blk=<optimized out>, new_context=<optimized out>, errp=errp@entry=0x0) at /build/qemu-6MF7tq/qemu-5.0.1/block/block-backend.c:2010 #12 0x0000561803197953 in virtio_blk_data_plane_stop (vdev=<optimized out>) at /build/qemu-6MF7tq/qemu-5.0.1/hw/block/dataplane/virtio-blk.c:292 #13 0x00005618033d67bf in virtio_bus_stop_ioeventfd (bus=0x5618056d9f08) at /build/qemu-6MF7tq/qemu-5.0.1/hw/virtio/virtio-bus.c:245 #14 0x00005618031c9b2e in virtio_vmstate_change (opaque=0x5618056d9f90, running=0, state=<optimized out>) at /build/qemu-6MF7tq/qemu-5.0.1/hw/virtio/virtio.c:3220 #15 0x0000561803208bfd in vm_state_notify (running=running@entry=0, state=state@entry=RUN_STATE_FINISH_MIGRATE) at /build/qemu-6MF7tq/qemu-5.0.1/softmmu/vl.c:1275 #16 0x0000561803155c02 in do_vm_stop (state=RUN_STATE_FINISH_MIGRATE, send_stop=<optimized out>) at /build/qemu-6MF7tq/qemu-5.0.1/cpus.c:1032 #17 0x00005618033e3765 in migration_completion (s=0x5618056e6960) at /build/qemu-6MF7tq/qemu-5.0.1/migration/migration.c:2914 #18 migration_iteration_run (s=0x5618056e6960) at /build/qemu-6MF7tq/qemu-5.0.1/migration/migration.c:3275 #19 migration_thread (opaque=opaque@entry=0x5618056e6960) at /build/qemu-6MF7tq/qemu-5.0.1/migration/migration.c:3439 #20 0x0000561803536ad6 in qemu_thread_start (args=<optimized out>) at /build/qemu-6MF7tq/qemu-5.0.1/util/qemu-thread-posix.c:519 #21 0x00007f61085d06ba in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0 #22 0x00007f610830641d in sysctl () from /lib/x86_64-linux-gnu/libc.so.6 #23 0x0000000000000000 in ?? () Fix it by checking that the iochannel is non-null before trying to detach it from the aio_context. If it is null, no detaching is needed, and it will get reattached in the proper aio_context once the connection is reestablished. Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20210129073859.683063-2-rvkagan@yandex-team.ru> Signed-off-by: Eric Blake <eblake@redhat.com>	2021-02-03 08:17:12 -06:00
Lukas Straub	fee091cdff	block/nbd.c: Add yank feature Register a yank function which shuts down the socket and sets s->state = NBD_CLIENT_QUIT. This is the same behaviour as if an error occured. Signed-off-by: Lukas Straub <lukasstraub2@web.de> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <b73eb07db6d1fcd00667beb13ae6117260f002c3.1609167865.git.lukasstraub2@web.de> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2021-01-13 10:21:17 +01:00
Peter Maydell	729cc68373	Remove superfluous timer_del() calls This commit is the result of running the timer-del-timer-free.cocci script on the whole source tree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Acked-by: Corey Minyard <cminyard@mvista.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20201215154107.3255-4-peter.maydell@linaro.org	2021-01-08 15:13:38 +00:00
Eric Blake	dbc7b01492	nbd: Add 'qemu-nbd -A' to expose allocation depth Allow the server to expose an additional metacontext to be requested by savvy clients. qemu-nbd adds a new option -A to expose the qemu:allocation-depth metacontext through NBD_CMD_BLOCK_STATUS; this can also be set via QMP when using block-export-add. qemu as client is hacked into viewing the key aspects of this new context by abusing the already-experimental x-dirty-bitmap option to collapse all depths greater than 2, which results in a tri-state value visible in the output of 'qemu-img map --output=json' (yes, that means x-dirty-bitmap is now a bit of a misnomer, but I didn't feel like renaming it as it would introduce a needless break of back-compat, even though we make no compat guarantees with x- members): unallocated (depth 0) => "zero":false, "data":true local (depth 1) => "zero":false, "data":false backing (depth 2+) => "zero":true, "data":true libnbd as client is probably a nicer way to get at the information without having to decipher such hacks in qemu as client. ;) Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20201027050556.269064-11-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>	2020-10-30 15:22:00 -05:00
Vladimir Sementsov-Ogievskiy	99d72dba1c	block/nbd: nbd_co_reconnect_loop(): don't connect if drained In a recent commit `12c75e20a2` we've improved nbd_co_reconnect_loop() to not make drain wait for additional sleep. Similarly, we shouldn't try to connect, if previous sleep was interrupted by drain begin, otherwise drain_begin will have to wait for the whole connection attempt. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200903190301.367620-5-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2020-10-09 15:04:32 -05:00
Vladimir Sementsov-Ogievskiy	46f56631b5	block/nbd: fix reconnect-delay reconnect-delay has a design flaw: we handle it in the same loop where we do connection attempt. So, reconnect-delay may be exceeded by unpredictable time of connection attempt. Let's instead use separate timer. How to reproduce the bug: 1. Create an image on node1: qemu-img create -f qcow2 xx 100M 2. Start NBD server on node1: qemu-nbd xx 3. On node2 start qemu-io: ./build/qemu-io --image-opts \ driver=nbd,server.type=inet,server.host=192.168.100.5,server.port=10809,reconnect-delay=15 4. Type 'read 0 512' in qemu-io interface to check that connection works Be careful: you should make steps 5-7 in a short time, less than 15 seconds. 5. Kill nbd server on node1 6. Run 'read 0 512' in qemu-io interface again, to be sure that nbd client goes to reconnect loop. 7. On node1 run the following command sudo iptables -A INPUT -p tcp --dport 10809 -j DROP This will make the connect() call of qemu-io at node2 take a long time. And you'll see that read command in qemu-io will hang for a long time, more than 15 seconds specified by reconnect-delay parameter. It's the bug. 8. Don't forget to drop iptables rule on node1: sudo iptables -D INPUT -p tcp --dport 10809 -j DROP Important note: Step [5] is necessary to reproduce _this_ bug. If we miss step [5], the read command (step 6) will hang for a long time and this commit doesn't help, because there will be not long connect() to unreachable host, but long sendmsg() to unreachable host, which should be fixed by enabling and adjusting keep-alive on the socket, which is a thing for further patch set. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200903190301.367620-4-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2020-10-09 15:04:32 -05:00
Vladimir Sementsov-Ogievskiy	8a509afd72	block/nbd: correctly use qio_channel_detach_aio_context when needed Don't use nbd_client_detach_aio_context() driver handler where we want to finalize the connection. We should directly use qio_channel_detach_aio_context() in such cases. Driver handler may (and will) contain another things, unrelated to the qio channel. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200903190301.367620-3-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2020-10-09 15:04:32 -05:00
Vladimir Sementsov-Ogievskiy	8c517de24a	block/nbd: fix drain dead-lock because of nbd reconnect-delay We pause reconnect process during drained section. So, if we have some requests, waiting for reconnect we should cancel them, otherwise they deadlock the drained section. How to reproduce: 1. Create an image: qemu-img create -f qcow2 xx 100M 2. Start NBD server: qemu-nbd xx 3. Start vm with second nbd disk on node2, like this: ./build/x86_64-softmmu/qemu-system-x86_64 -nodefaults -drive \ file=/work/images/cent7.qcow2 -drive \ driver=nbd,server.type=inet,server.host=192.168.100.5,server.port=10809,reconnect-delay=60 \ -vnc :0 -m 2G -enable-kvm -vga std 4. Access the vm through vnc (or some other way?), and check that NBD drive works: dd if=/dev/sdb of=/dev/null bs=1M count=10 - the command should succeed. 5. Now, kill the nbd server, and run dd in the guest again: dd if=/dev/sdb of=/dev/null bs=1M count=10 Now Qemu is trying to reconnect, and dd-generated requests are waiting for the connection (they will wait up to 60 seconds (see reconnect-delay option above) and than fail). But suddenly, vm may totally hang in the deadlock. You may need to increase reconnect-delay period to catch the dead-lock. VM doesn't respond because drain dead-lock happens in cpu thread with global mutex taken. That's not good thing by itself and is not fixed by this commit (true way is using iothreads). Still this commit fixes drain dead-lock itself. Note: probably, we can instead continue to reconnect during drained section. To achieve this, we may move negotiation to the connect thread to make it independent of bs aio context. But expanding drained section doesn't seem good anyway. So, let's now fix the bug the simplest way. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200903190301.367620-2-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2020-10-09 15:04:32 -05:00
Vladimir Sementsov-Ogievskiy	1dc4718d84	block/nbd: use non-blocking connect: fix vm hang on connect() This makes nbd's connection_co yield during reconnects, so that reconnect doesn't block the main thread. This is very important in case of an unavailable nbd server host: connect() call may take a long time, blocking the main thread (and due to reconnect, it will hang again and again with small gaps of working time during pauses between connection attempts). Realization notes: - We don't want to implement non-blocking connect() over non-blocking socket, because getaddrinfo() doesn't have portable non-blocking realization anyway, so let's just use a thread for both getaddrinfo() and connect(). - We can't use qio_channel_socket_connect_async (which behaves similarly and starts a thread to execute connect() call), as it's relying on someone iterating main loop (g_main_loop_run() or something like this), which is not always the case. - We can't use thread_pool_submit_co API, as thread pool waits for all threads to finish (but we don't want to wait for blocking reconnect attempt on shutdown. So, we just create the thread by hand. Some additional difficulties are: - We want our connect to avoid blocking drained sections and aio context switches. To achieve this, we make it possible to "cancel" synchronous wait for the connect (which is a coroutine yield actually), still, the thread continues in background, and if successful, its result may be reused on next reconnect attempt. - We don't want to wait for reconnect on shutdown, so there is CONNECT_THREAD_RUNNING_DETACHED thread state, which means that the block layer is no longer interested in a result, and thread should close new connected socket on finish and free the state. How to reproduce the bug, fixed with this commit: 1. Create an image on node1: qemu-img create -f qcow2 xx 100M 2. Start NBD server on node1: qemu-nbd xx 3. Start vm with second nbd disk on node2, like this: ./x86_64-softmmu/qemu-system-x86_64 -nodefaults -drive \ file=/work/images/cent7.qcow2 -drive file=nbd+tcp://192.168.100.2 \ -vnc :0 -qmp stdio -m 2G -enable-kvm -vga std 4. Access the vm through vnc (or some other way?), and check that NBD drive works: dd if=/dev/sdb of=/dev/null bs=1M count=10 - the command should succeed. 5. Now, let's trigger nbd-reconnect loop in Qemu process. For this: 5.1 Kill NBD server on node1 5.2 run "dd if=/dev/sdb of=/dev/null bs=1M count=10" in the guest again. The command should fail and a lot of error messages about failing disk may appear as well. Now NBD client driver in Qemu tries to reconnect. Still, VM works well. 6. Make node1 unavailable on NBD port, so connect() from node2 will last for a long time: On node1 (Note, that 10809 is just a default NBD port): sudo iptables -A INPUT -p tcp --dport 10809 -j DROP After some time the guest hangs, and you may check in gdb that Qemu hangs in connect() call, issued from the main thread. This is the BUG. 7. Don't forget to drop iptables rule from your node1: sudo iptables -D INPUT -p tcp --dport 10809 -j DROP Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200812145237.4396-1-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> [eblake: minor wording and formatting tweaks] Signed-off-by: Eric Blake <eblake@redhat.com>	2020-09-02 16:47:23 -05:00
Vladimir Sementsov-Ogievskiy	12c75e20a2	block/nbd: nbd_co_reconnect_loop(): don't sleep if drained We try to go to wakeable sleep, so that, if drain begins it will break the sleep. But what if nbd_client_co_drain_begin() already called and s->drained is already true? We'll go to sleep, and drain will have to wait for the whole timeout. Let's improve it. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200727184751.15704-5-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2020-07-28 09:54:43 -05:00
Vladimir Sementsov-Ogievskiy	fbeb3e63b3	block/nbd: on shutdown terminate connection attempt On shutdown nbd driver may be in a connecting state. We should shutdown it as well, otherwise we may hang in nbd_teardown_connection, waiting for conneciton_co to finish in BDRV_POLL_WHILE(bs, s->connection_co) loop if remote server is down. How to reproduce the dead lock: 1. Create nbd-fault-injector.conf with the following contents: [inject-error "mega1"] event=data io=readwrite when=before 2. In one terminal run nbd-fault-injector in a loop, like this: n=1; while true; do echo $n; ((n++)); ./nbd-fault-injector.py 127.0.0.1:10000 nbd-fault-injector.conf; done 3. In another terminal run qemu-io in a loop, like this: n=1; while true; do echo $n; ((n++)); ./qemu-io -c 'read 0 512' nbd://127.0.0.1:10000; done After some time, qemu-io will hang. Note, that this hang may be triggered by another bug, so the whole case is fixed only together with commit "block/nbd: allow drain during reconnect attempt". Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200727184751.15704-4-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2020-07-28 09:54:43 -05:00
Vladimir Sementsov-Ogievskiy	dd1ec1a4af	block/nbd: allow drain during reconnect attempt It should be safe to reenter qio_channel_yield() on io/channel read/write path, so it's safe to reduce in_flight and allow attaching new aio context. And no problem to allow drain itself: connection attempt is not a guest request. Moreover, if remote server is down, we can hang in negotiation, blocking drain section and provoking a dead lock. How to reproduce the dead lock: 1. Create nbd-fault-injector.conf with the following contents: [inject-error "mega1"] event=data io=readwrite when=before 2. In one terminal run nbd-fault-injector in a loop, like this: n=1; while true; do echo $n; ((n++)); ./nbd-fault-injector.py 127.0.0.1:10000 nbd-fault-injector.conf; done 3. In another terminal run qemu-io in a loop, like this: n=1; while true; do echo $n; ((n++)); ./qemu-io -c 'read 0 512' nbd://127.0.0.1:10000; done After some time, qemu-io will hang trying to drain, for example, like this: #3 aio_poll (ctx=0x55f006bdd890, blocking=true) at util/aio-posix.c:600 #4 bdrv_do_drained_begin (bs=0x55f006bea710, recursive=false, parent=0x0, ignore_bds_parents=false, poll=true) at block/io.c:427 #5 bdrv_drained_begin (bs=0x55f006bea710) at block/io.c:433 #6 blk_drain (blk=0x55f006befc80) at block/block-backend.c:1710 #7 blk_unref (blk=0x55f006befc80) at block/block-backend.c:498 #8 bdrv_open_inherit (filename=0x7fffba1563bc "nbd+tcp://127.0.0.1:10000", reference=0x0, options=0x55f006be86d0, flags=24578, parent=0x0, child_class=0x0, child_role=0, errp=0x7fffba154620) at block.c:3491 #9 bdrv_open (filename=0x7fffba1563bc "nbd+tcp://127.0.0.1:10000", reference=0x0, options=0x0, flags=16386, errp=0x7fffba154620) at block.c:3513 #10 blk_new_open (filename=0x7fffba1563bc "nbd+tcp://127.0.0.1:10000", reference=0x0, options=0x0, flags=16386, errp=0x7fffba154620) at block/block-backend.c:421 And connection_co stack like this: #0 qemu_coroutine_switch (from_=0x55f006bf2650, to_=0x7fe96e07d918, action=COROUTINE_YIELD) at util/coroutine-ucontext.c:302 #1 qemu_coroutine_yield () at util/qemu-coroutine.c:193 #2 qio_channel_yield (ioc=0x55f006bb3c20, condition=G_IO_IN) at io/channel.c:472 #3 qio_channel_readv_all_eof (ioc=0x55f006bb3c20, iov=0x7fe96d729bf0, niov=1, errp=0x7fe96d729eb0) at io/channel.c:110 #4 qio_channel_readv_all (ioc=0x55f006bb3c20, iov=0x7fe96d729bf0, niov=1, errp=0x7fe96d729eb0) at io/channel.c:143 #5 qio_channel_read_all (ioc=0x55f006bb3c20, buf=0x7fe96d729d28 "\300.\366\004\360U", buflen=8, errp=0x7fe96d729eb0) at io/channel.c:247 #6 nbd_read (ioc=0x55f006bb3c20, buffer=0x7fe96d729d28, size=8, desc=0x55f004f69644 "initial magic", errp=0x7fe96d729eb0) at /work/src/qemu/master/include/block/nbd.h:365 #7 nbd_read64 (ioc=0x55f006bb3c20, val=0x7fe96d729d28, desc=0x55f004f69644 "initial magic", errp=0x7fe96d729eb0) at /work/src/qemu/master/include/block/nbd.h:391 #8 nbd_start_negotiate (aio_context=0x55f006bdd890, ioc=0x55f006bb3c20, tlscreds=0x0, hostname=0x0, outioc=0x55f006bf19f8, structured_reply=true, zeroes=0x7fe96d729dca, errp=0x7fe96d729eb0) at nbd/client.c:904 #9 nbd_receive_negotiate (aio_context=0x55f006bdd890, ioc=0x55f006bb3c20, tlscreds=0x0, hostname=0x0, outioc=0x55f006bf19f8, info=0x55f006bf1a00, errp=0x7fe96d729eb0) at nbd/client.c:1032 #10 nbd_client_connect (bs=0x55f006bea710, errp=0x7fe96d729eb0) at block/nbd.c:1460 #11 nbd_reconnect_attempt (s=0x55f006bf19f0) at block/nbd.c:287 #12 nbd_co_reconnect_loop (s=0x55f006bf19f0) at block/nbd.c:309 #13 nbd_connection_entry (opaque=0x55f006bf19f0) at block/nbd.c:360 #14 coroutine_trampoline (i0=113190480, i1=22000) at util/coroutine-ucontext.c:173 Note, that the hang may be triggered by another bug, so the whole case is fixed only together with commit "block/nbd: on shutdown terminate connection attempt". Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200727184751.15704-3-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2020-07-28 09:54:43 -05:00
Vladimir Sementsov-Ogievskiy	fa35591b9c	block/nbd: split nbd_establish_connection out of nbd_client_connect We are going to implement non-blocking version of nbd_establish_connection, which for a while will be used only for nbd_reconnect_attempt, not for nbd_open, so we need to call it separately. Refactor nbd_reconnect_attempt in a way which makes next commit simpler. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200727184751.15704-2-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2020-07-28 09:54:43 -05:00
Nir Soffer	a2b333c018	block: nbd: Fix convert qcow2 compressed to nbd When converting to qcow2 compressed format, the last step is a special zero length compressed write, ending in a call to bdrv_co_truncate(). This call always fails for the nbd driver since it does not implement bdrv_co_truncate(). For block devices, which have the same limits, the call succeeds since the file driver implements bdrv_co_truncate(). If the caller asked to truncate to the same or smaller size with exact=false, the truncate succeeds. Implement the same logic for nbd. Example failing without this change: In one shell start qemu-nbd: $ truncate -s 1g test.tar $ qemu-nbd --socket=/tmp/nbd.sock --persistent --format=raw --offset 1536 test.tar In another shell convert an image to qcow2 compressed via NBD: $ echo "disk data" > disk.raw $ truncate -s 1g disk.raw $ qemu-img convert -f raw -O qcow2 -c disk1.raw nbd+unix:///?socket=/tmp/nbd.sock; echo $? 1 qemu-img failed, but the conversion was successful: $ qemu-img info nbd+unix:///?socket=/tmp/nbd.sock image: nbd+unix://?socket=/tmp/nbd.sock file format: qcow2 virtual size: 1 GiB (1073741824 bytes) ... $ qemu-img check nbd+unix:///?socket=/tmp/nbd.sock No errors were found on the image. 1/16384 = 0.01% allocated, 100.00% fragmented, 100.00% compressed clusters Image end offset: 393216 $ qemu-img compare disk.raw nbd+unix:///?socket=/tmp/nbd.sock Images are identical. Fixes: https://bugzilla.redhat.com/1860627 Signed-off-by: Nir Soffer <nsoffer@redhat.com> Message-Id: <20200727215846.395443-2-nsoffer@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> [eblake: typo fixes] Signed-off-by: Eric Blake <eblake@redhat.com>	2020-07-28 09:54:19 -05:00
Eric Blake	00d69986da	nbd: Avoid off-by-one in long export name truncation When snprintf returns the same value as the buffer size, the final byte was truncated to ensure a NUL terminator. Fortunately, such long export names are unusual enough, with no real impact other than what is displayed to the user. Fixes: `5c86bdf120` Reported-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20200622210355.414941-1-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>	2020-07-13 09:01:01 -05:00
Vladimir Sementsov-Ogievskiy	795d946d07	nbd: Use ERRP_GUARD() If we want to check error after errp-function call, we need to introduce local_err and then propagate it to errp. Instead, use the ERRP_GUARD() macro, benefits are: 1. No need of explicit error_propagate call 2. No need of explicit local_err variable: use errp directly 3. ERRP_GUARD() leaves errp as is if it's not NULL or &error_fatal, this means that we don't break error_abort (we'll abort on error_set, not on error_propagate) If we want to add some info to errp (by error_prepend() or error_append_hint()), we must use the ERRP_GUARD() macro. Otherwise, this info will not be added when errp == &error_fatal (the program will exit prior to the error_append_hint() or error_prepend() call). Fix several such cases, e.g. in nbd_read(). This commit is generated by command sed -n '/^Network Block Device (NBD)$/,/^$/{s/^F: //p}' \ MAINTAINERS \| \ xargs git ls-files \| grep '\.[hc]$' \| \ xargs spatch \ --sp-file scripts/coccinelle/errp-guard.cocci \ --macro-file scripts/cocci-macro-file.h \ --in-place --no-show-diff --max-width 80 Reported-by: Kevin Wolf <kwolf@redhat.com> Reported-by: Greg Kurz <groug@kaod.org> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> [Commit message tweaked] Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20200707165037.1026246-8-armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> [ERRP_AUTO_PROPAGATE() renamed to ERRP_GUARD(), and auto-propagated-errp.cocci to errp-guard.cocci. Commit message tweaked again.]	2020-07-10 15:18:09 +02:00
Markus Armbruster	af175e85f9	error: Eliminate error_propagate() with Coccinelle, part 2 When all we do with an Error we receive into a local variable is propagating to somewhere else, we can just as well receive it there right away. The previous commit did that with a Coccinelle script I consider fairly trustworthy. This commit uses the same script with the matching of return taken out, i.e. we convert if (!foo(..., &err)) { ... error_propagate(errp, err); ... } to if (!foo(..., errp)) { ... ... } This is unsound: @err could still be read between afterwards. I don't know how to express "no read of @err without an intervening write" in Coccinelle. Instead, I manually double-checked for uses of @err. Suboptimal line breaks tweaked manually. qdev_realize() simplified further to placate scripts/checkpatch.pl. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200707160613.848843-36-armbru@redhat.com>	2020-07-10 15:18:08 +02:00
Markus Armbruster	62a35aaa31	qapi: Use returned bool to check for failure, Coccinelle part The previous commit enables conversion of visit_foo(..., &err); if (err) { ... } to if (!visit_foo(..., errp)) { ... } for visitor functions that now return true / false on success / error. Coccinelle script: @@ identifier fun =~ "check_list\|input_type_enum\|lv_start_struct\|lv_type_bool\|lv_type_int64\|lv_type_str\|lv_type_uint64\|output_type_enum\|parse_type_bool\|parse_type_int64\|parse_type_null\|parse_type_number\|parse_type_size\|parse_type_str\|parse_type_uint64\|print_type_bool\|print_type_int64\|print_type_null\|print_type_number\|print_type_size\|print_type_str\|print_type_uint64\|qapi_clone_start_alternate\|qapi_clone_start_list\|qapi_clone_start_struct\|qapi_clone_type_bool\|qapi_clone_type_int64\|qapi_clone_type_null\|qapi_clone_type_number\|qapi_clone_type_str\|qapi_clone_type_uint64\|qapi_dealloc_start_list\|qapi_dealloc_start_struct\|qapi_dealloc_type_anything\|qapi_dealloc_type_bool\|qapi_dealloc_type_int64\|qapi_dealloc_type_null\|qapi_dealloc_type_number\|qapi_dealloc_type_str\|qapi_dealloc_type_uint64\|qobject_input_check_list\|qobject_input_check_struct\|qobject_input_start_alternate\|qobject_input_start_list\|qobject_input_start_struct\|qobject_input_type_any\|qobject_input_type_bool\|qobject_input_type_bool_keyval\|qobject_input_type_int64\|qobject_input_type_int64_keyval\|qobject_input_type_null\|qobject_input_type_number\|qobject_input_type_number_keyval\|qobject_input_type_size_keyval\|qobject_input_type_str\|qobject_input_type_str_keyval\|qobject_input_type_uint64\|qobject_input_type_uint64_keyval\|qobject_output_start_list\|qobject_output_start_struct\|qobject_output_type_any\|qobject_output_type_bool\|qobject_output_type_int64\|qobject_output_type_null\|qobject_output_type_number\|qobject_output_type_str\|qobject_output_type_uint64\|start_list\|visit_check_list\|visit_check_struct\|visit_start_alternate\|visit_start_list\|visit_start_struct\|visit_type_."; expression list args; typedef Error; Error err; @@ - fun(args, &err); - if (err) + if (!fun(args, &err)) { ... } A few line breaks tidied up manually. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200707160613.848843-19-armbru@redhat.com>	2020-07-10 15:18:08 +02:00
Markus Armbruster	235e59cf03	qemu-option: Use returned bool to check for failure The previous commit enables conversion of foo(..., &err); if (err) { ... } to if (!foo(..., &err)) { ... } for QemuOpts functions that now return true / false on success / error. Coccinelle script: @@ identifier fun = { opts_do_parse, parse_option_bool, parse_option_number, parse_option_size, qemu_opt_parse, qemu_opt_rename, qemu_opt_set, qemu_opt_set_bool, qemu_opt_set_number, qemu_opts_absorb_qdict, qemu_opts_do_parse, qemu_opts_from_qdict_entry, qemu_opts_set, qemu_opts_validate }; expression list args, args2; typedef Error; Error *err; @@ - fun(args, &err, args2); - if (err) + if (!fun(args, &err, args2)) { ... } A few line breaks tidied up manually. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200707160613.848843-15-armbru@redhat.com> [Conflict with commit `0b6786a9c1` "block/amend: refactor qcow2 amend options" resolved by rerunning Coccinelle on master's version]	2020-07-10 15:17:35 +02:00
Eric Blake	5c86bdf120	block: Call attention to truncation of long NBD exports Commit `93676c88` relaxed our NBD client code to request export names up to the NBD protocol maximum of 4096 bytes without NUL terminator, even though the block layer can't store anything longer than 4096 bytes including NUL terminator for display to the user. Since this means there are some export names where we have to truncate things, we can at least try to make the truncation a bit more obvious for the user. Note that in spite of the truncated display name, we can still communicate with an NBD server using such a long export name; this was deemed nicer than refusing to even connect to such a server (since the server may not be under our control, and since determining our actual length limits gets tricky when nbd://host:port/export and nbd+unix:///export?socket=/path are themselves variable-length expansions beyond the export name but count towards the block layer name length). Reported-by: Xueqiang Wei <xuwei@redhat.com> Fixes: https://bugzilla.redhat.com/1843684 Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200610163741.3745251-3-eblake@redhat.com>	2020-06-10 12:58:59 -05:00
Vladimir Sementsov-Ogievskiy	714eb0dbc5	block/nbd-client: drop max_block restriction from discard The NBD spec was updated (see nbd.git commit 9f30fedb) so that max_block doesn't relate to NBD_CMD_TRIM. So, drop the restriction. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200401150112.9557-3-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> [eblake: tweak commit message to call out NBD commit] Signed-off-by: Eric Blake <eblake@redhat.com>	2020-05-04 15:16:46 -05:00
Vladimir Sementsov-Ogievskiy	6bf792b464	block/nbd-client: drop max_block restriction from block_status The NBD spec was updated (see nbd.git commit 9f30fedb) so that max_block doesn't relate to NBD_CMD_BLOCK_STATUS. So, drop the restriction. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200401150112.9557-2-vsementsov@virtuozzo.com> [eblake: tweak commit message to call out NBD commit] Signed-off-by: Eric Blake <eblake@redhat.com>	2020-05-04 15:13:14 -05:00
Maxim Levitsky	5a5e7f8cd8	block: trickle down the fallback image creation function use to the block drivers Instead of checking the .bdrv_co_create_opts to see if we need the fallback, just implement the .bdrv_co_create_opts in the drivers that need it. This way we don't break various places that need to know if the underlying protocol/format really supports image creation, and this way we still allow some drivers to not support image creation. Fixes: `fd17146cd9` Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1816007 Note that technically this driver reverts the image creation fallback for the vxhs driver since I don't have a means to test it, and IMHO it is better to leave it not supported as it was prior to generic image creation patches. Also drop iscsi_create_opts which was left accidentally. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20200326011218.29230-3-mlevitsk@redhat.com> Reviewed-by: Denis V. Lunev <den@openvz.org> [mreitz: Fixed alignment, and moved bdrv_co_create_opts_simple() and bdrv_create_opts_simple from block.h into block_int.h] Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-03-26 14:44:33 +01:00
Pan Nengyuan	8198cf5ef0	block/nbd: fix memory leak in nbd_open() In currently implementation there will be a memory leak when nbd_client_connect() returns error status. Here is an easy way to reproduce: 1. run qemu-iotests as follow and check the result with asan: ./check -raw 143 Following is the asan output backtrack: Direct leak of 40 byte(s) in 1 object(s) allocated from: #0 0x7f629688a560 in calloc (/usr/lib64/libasan.so.3+0xc7560) #1 0x7f6295e7e015 in g_malloc0 (/usr/lib64/libglib-2.0.so.0+0x50015) #2 0x56281dab4642 in qobject_input_start_struct /mnt/sdb/qemu-4.2.0-rc0/qapi/qobject-input-visitor.c:295 #3 0x56281dab1a04 in visit_start_struct /mnt/sdb/qemu-4.2.0-rc0/qapi/qapi-visit-core.c:49 #4 0x56281dad1827 in visit_type_SocketAddress qapi/qapi-visit-sockets.c:386 #5 0x56281da8062f in nbd_config /mnt/sdb/qemu-4.2.0-rc0/block/nbd.c:1716 #6 0x56281da8062f in nbd_process_options /mnt/sdb/qemu-4.2.0-rc0/block/nbd.c:1829 #7 0x56281da8062f in nbd_open /mnt/sdb/qemu-4.2.0-rc0/block/nbd.c:1873 Direct leak of 15 byte(s) in 1 object(s) allocated from: #0 0x7f629688a3a0 in malloc (/usr/lib64/libasan.so.3+0xc73a0) #1 0x7f6295e7dfbd in g_malloc (/usr/lib64/libglib-2.0.so.0+0x4ffbd) #2 0x7f6295e96ace in g_strdup (/usr/lib64/libglib-2.0.so.0+0x68ace) #3 0x56281da804ac in nbd_process_options /mnt/sdb/qemu-4.2.0-rc0/block/nbd.c:1834 #4 0x56281da804ac in nbd_open /mnt/sdb/qemu-4.2.0-rc0/block/nbd.c:1873 Indirect leak of 24 byte(s) in 1 object(s) allocated from: #0 0x7f629688a3a0 in malloc (/usr/lib64/libasan.so.3+0xc73a0) #1 0x7f6295e7dfbd in g_malloc (/usr/lib64/libglib-2.0.so.0+0x4ffbd) #2 0x7f6295e96ace in g_strdup (/usr/lib64/libglib-2.0.so.0+0x68ace) #3 0x56281dab41a3 in qobject_input_type_str_keyval /mnt/sdb/qemu-4.2.0-rc0/qapi/qobject-input-visitor.c:536 #4 0x56281dab2ee9 in visit_type_str /mnt/sdb/qemu-4.2.0-rc0/qapi/qapi-visit-core.c:297 #5 0x56281dad0fa1 in visit_type_UnixSocketAddress_members qapi/qapi-visit-sockets.c:141 #6 0x56281dad17b6 in visit_type_SocketAddress_members qapi/qapi-visit-sockets.c:366 #7 0x56281dad186a in visit_type_SocketAddress qapi/qapi-visit-sockets.c:393 #8 0x56281da8062f in nbd_config /mnt/sdb/qemu-4.2.0-rc0/block/nbd.c:1716 #9 0x56281da8062f in nbd_process_options /mnt/sdb/qemu-4.2.0-rc0/block/nbd.c:1829 #10 0x56281da8062f in nbd_open /mnt/sdb/qemu-4.2.0-rc0/block/nbd.c:1873 Fixes: `8f071c9db5` Reported-by: Euler Robot <euler.robot@huawei.com> Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Cc: qemu-stable <qemu-stable@nongnu.org> Cc: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <1575517528-44312-3-git-send-email-pannengyuan@huawei.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2020-02-26 17:29:00 -06:00
Pan Nengyuan	7f493662be	block/nbd: extract the common cleanup code The BDRVNBDState cleanup code is common in two places, add nbd_clear_bdrvstate() function to do these cleanups. Suggested-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <1575517528-44312-2-git-send-email-pannengyuan@huawei.com> Reviewed-by: Eric Blake <eblake@redhat.com> [eblake: fix compilation error and commit message] Signed-off-by: Eric Blake <eblake@redhat.com>	2020-02-26 17:28:08 -06:00
Eric Blake	2485f22fe9	nbd-client: Support leading / in NBD URI The NBD URI specification [1] states that only one leading slash at the beginning of the URI path component is stripped, not all such slashes. This becomes important to a patch I just proposed to nbdkit [2], which would allow the exportname to select a file embedded within an ext2 image: ext2fs demands an absolute pathname beginning with '/', and because qemu was inadvertantly stripping it, my nbdkit patch had to work around the behavior. [1] https://github.com/NetworkBlockDevice/nbd/blob/master/doc/uri.md [2] https://www.redhat.com/archives/libguestfs/2020-February/msg00109.html Note that the qemu bug only affects handling of URIs such as nbd://host:port//abs/path (where '/abs/path' should be the export name); it is still possible to use --image-opts and pass the desired export name with a leading slash directly through JSON even without this patch. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20200212023101.1162686-1-eblake@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-02-26 14:45:02 -06:00
Max Reitz	78c81a3f10	block/nbd: Fix hang in .bdrv_close() When nbd_close() is called from a coroutine, the connection_co never gets to run, and thus nbd_teardown_connection() hangs. This is because aio_co_enter() only puts the connection_co into the main coroutine's wake-up queue, so this main coroutine needs to yield and wait for connection_co to terminate. Suggested-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200122164532.178040-2-mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-02-20 16:43:42 +01:00
Vladimir Sementsov-Ogievskiy	d936613547	nbd: assert that Error** is not NULL in nbd_iter_channel_error All callers of nbd_iter_channel_error() pass the address of a local_err variable, and only call this function if an error has already occurred, using this function to propagate that error. This is already implied by its name (local_err instead of the classic errp), but it is worth additionally stressing this by adding an assertion to make it part of the function contract. The local_err parameter is not here to return information about nbd_iter_channel_error failure. Instead it's assumed to be filled when passed to the function. This is already stressed by its name (local_err, instead of classic errp). Stress it additionally by assertion. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20191205174635.18758-22-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2019-12-18 08:43:19 +01:00
Eric Blake	93676c88d7	nbd: Don't send oversize strings Qemu as server currently won't accept export names larger than 256 bytes, nor create dirty bitmap names longer than 1023 bytes, so most uses of qemu as client or server have no reason to get anywhere near the NBD spec maximum of a 4k limit per string. However, we weren't actually enforcing things, ignoring when the remote side violates the protocol on input, and also having several code paths where we send oversize strings on output (for example, qemu-nbd --description could easily send more than 4k). Tighten things up as follows: client: - Perform bounds check on export name and dirty bitmap request prior to handing it to server - Validate that copied server replies are not too long (ignoring NBD_INFO_* replies that are not copied is not too bad) server: - Perform bounds check on export name and description prior to advertising it to client - Reject client name or metadata query that is too long - Adjust things to allow full 4k name limit rather than previous 256 byte limit Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20191114024635.11363-4-eblake@redhat.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>	2019-11-18 16:01:34 -06:00
Vladimir Sementsov-Ogievskiy	f7651539d8	block/nbd: nbd reconnect Implement reconnect. To achieve this: 1. add new modes: connecting-wait: means, that reconnecting is in progress, and there were small number of reconnect attempts, so all requests are waiting for the connection. connecting-nowait: reconnecting is in progress, there were a lot of attempts of reconnect, all requests will return errors. two old modes are used too: connected: normal state quit: exiting after fatal error or on close Possible transitions are: * -> quit connecting-* -> connected connecting-wait -> connecting-nowait (transition is done after reconnect-delay seconds in connecting-wait mode) connected -> connecting-wait 2. Implement reconnect in connection_co. So, in connecting-* mode, connection_co, tries to reconnect unlimited times. 3. Retry nbd queries on channel error, if we are in connecting-wait state. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20191009084158.15614-3-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2019-10-22 09:22:07 -05:00
Maxim Levitsky	e99754b42e	nbd: add empty .bdrv_reopen_prepare Fixes commit job / qemu-img commit, when commiting qcow2 file which is based on nbd export. Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1718727 Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-id: 20190930213820.29777-2-mlevitsk@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-10 10:56:18 +02:00
Eric Blake	f061656cc3	nbd: Implement client use of NBD FAST_ZERO The client side is fairly straightforward: if the server advertised fast zero support, then we can map that to BDRV_REQ_NO_FALLBACK support. A server that advertises FAST_ZERO but not WRITE_ZEROES is technically broken, but we can ignore that situation as it does not change our behavior. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20190823143726.27062-4-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>	2019-09-05 16:03:26 -05:00
Eric Blake	df18c04edf	nbd: Use g_autofree in a few places Thanks to our recent move to use glib's g_autofree, I can join the bandwagon. Getting rid of gotos is fun ;) There are probably more places where we could register cleanup functions and get rid of more gotos; this patch just focuses on the labels that existed merely to call g_free. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20190824172813.29720-2-eblake@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>	2019-09-05 15:52:45 -05:00
Peter Maydell	c6a2225a5a	nbd patches for 2019-08-15 - Addition of InetSocketAddress keep-alive - Addition of BDRV_REQ_PREFETCH for more efficient copy-on-read - Initial refactoring in preparation of NBD reconnect -----BEGIN PGP SIGNATURE----- iQEcBAABCAAGBQJdVaRZAAoJEKeha0olJ0NqrGoIAJSvVLMDeWZIkHr3CQ5AbMHy 6IHUntBwv4PEHw0FyyDU7lLgEWubTwe/7RfvyJ69kQYSJLjvHa3KEic0aa7SOETK hGUlSoIFHEugi+XDcYyy9EG+ItUR7jnunkwomxvFRm4XzjEHFO9ck8fOS+uq/23e LGDHwdoZI6vawUPftbBuRAlB3egCEcBtTWXYMk8lm3MXHOHL7O18DRkfWvwcHfl6 mNIKgTVMtl1gYoJznCUmC5VLHL4jQy+kSNXnyHBQOEEvTcORu0EztJS81H+BODni sxa9seem7JL9NLUTmkJsbGfSM6RKdfypX34oik9yakqUnXRrlxkxI+IX26XfdQ4= =2MAO -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/ericb/tags/pull-nbd-2019-08-15' into staging nbd patches for 2019-08-15 - Addition of InetSocketAddress keep-alive - Addition of BDRV_REQ_PREFETCH for more efficient copy-on-read - Initial refactoring in preparation of NBD reconnect # gpg: Signature made Thu 15 Aug 2019 19:28:41 BST # gpg: using RSA key A7A16B4A2527436A # gpg: Good signature from "Eric Blake <eblake@redhat.com>" [full] # gpg: aka "Eric Blake (Free Software Programmer) <ebb9@byu.net>" [full] # gpg: aka "[jpeg image of size 6874]" [full] # Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2 F3AA A7A1 6B4A 2527 436A * remotes/ericb/tags/pull-nbd-2019-08-15: block/nbd: refactor nbd connection parameters block/nbd: add cmdline and qapi parameter reconnect-delay block/nbd: move from quit to state block/nbd: use non-blocking io channel for nbd negotiation block/nbd: split connection_co start out of nbd_client_connect nbd: improve CMD_CACHE: use BDRV_REQ_PREFETCH block/stream: use BDRV_REQ_PREFETCH block: implement BDRV_REQ_PREFETCH qapi: Add InetSocketAddress member keep-alive Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-08-16 15:53:37 +01:00
Markus Armbruster	db72581598	Include qemu/main-loop.h less In my "build everything" tree, changing qemu/main-loop.h triggers a recompile of some 5600 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). It includes block/aio.h, which in turn includes qemu/event_notifier.h, qemu/notify.h, qemu/processor.h, qemu/qsp.h, qemu/queue.h, qemu/thread-posix.h, qemu/thread.h, qemu/timer.h, and a few more. Include qemu/main-loop.h only where it's needed. Touching it now recompiles only some 1700 objects. For block/aio.h and qemu/event_notifier.h, these numbers drop from 5600 to 2800. For the others, they shrink only slightly. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190812052359.30071-21-armbru@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>	2019-08-16 13:31:52 +02:00
Vladimir Sementsov-Ogievskiy	8f071c9db5	block/nbd: refactor nbd connection parameters We'll need some connection parameters to be available all the time to implement nbd reconnect. So, let's refactor them: define additional parameters in BDRVNBDState, drop them from function parameters, drop nbd_client_init and separate options parsing instead from nbd_open. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20190618114328.55249-6-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> [eblake: Drop useless 'if' before object_unref] Signed-off-by: Eric Blake <eblake@redhat.com>	2019-08-15 13:22:14 -05:00
Vladimir Sementsov-Ogievskiy	b172ae2e0e	block/nbd: add cmdline and qapi parameter reconnect-delay Reconnect will be implemented in the following commit, so for now, in semantics below, disconnect itself is a "serious error". Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20190618114328.55249-5-vsementsov@virtuozzo.com> [eblake: slipped from 4.1 to 4.2] Signed-off-by: Eric Blake <eblake@redhat.com>	2019-08-15 13:22:14 -05:00
Vladimir Sementsov-Ogievskiy	a34b1e5e06	block/nbd: move from quit to state To implement reconnect we need several states for the client: CONNECTED, QUIT and two different CONNECTING states. CONNECTING states will be added in the following patches. This patch implements CONNECTED and QUIT. QUIT means, that we should close the connection and fail all current and further requests (like old quit = true). CONNECTED means that connection is ok, we can send requests (like old quit = false). For receiving loop we use a comparison of the current state with QUIT, because reconnect will be in the same loop, so it should be looping until the end. Opposite, for requests we use a comparison of the current state with CONNECTED, as we don't want to send requests in future CONNECTING states. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20190618114328.55249-4-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2019-08-15 13:22:14 -05:00
Vladimir Sementsov-Ogievskiy	a8e2bb6a76	block/nbd: use non-blocking io channel for nbd negotiation No reason to use blocking channel for negotiation and we'll benefit in further reconnect feature, as qio_channel reads and writes will do qemu_coroutine_yield while waiting for io completion. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20190618114328.55249-3-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2019-08-15 13:22:14 -05:00
Vladimir Sementsov-Ogievskiy	962b7b3d4c	block/nbd: split connection_co start out of nbd_client_connect nbd_client_connect is going to be used from connection_co, so, let's refactor nbd_client_connect in advance, leaving io channel configuration all in nbd_client_connect. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20190618114328.55249-2-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2019-08-15 13:22:13 -05:00
Eric Blake	5cf42b1c1f	nbd: Initialize reply on failure We've had two separate reports of different callers running into use of uninitialized data if s->quit is set (one detected by gcc -O3, another by valgrind), due to checking 'nbd_reply_is_simple(reply) \|\| s->quit' in the wrong order. Rather than chasing down which callers need to pre-initialize reply, and whether there are any other uninitialized uses, it's easier to guarantee that reply will always be set by nbd_co_receive_one_chunk() even on failure. The uninitialized use happens to be harmless (the only time the variable is uninitialized is if s->quit is set, so the conditional results in the same action regardless of what was read from reply), and was introduced in commit `65e01d47`. In fixing the problem, it can also be seen that all (one) callers pass in a non-NULL reply, so there is a dead conditional to also be cleaned up. Reported-by: Thomas Huth <thuth@redhat.com> Reported-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20190719172001.19770-1-eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>	2019-07-19 13:19:18 -05:00
Vladimir Sementsov-Ogievskiy	611ae1d716	block/nbd: merge NBDClientSession struct back to BDRVNBDState No reason to keep it separate, it differs from others block driver behavior and therefore confuses. Instead of generic 'state = (State*)bs->opaque' we have to use special helper. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20190611102720.86114-4-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2019-06-13 10:00:42 -05:00
Vladimir Sementsov-Ogievskiy	86f8cdf3db	block/nbd: merge nbd-client.* to nbd.c No reason for keeping driver handlers realization separate from driver structure. We can get rid of extra header file. While being here, fix comments style, restore forgotten comments for NBD_FOREACH_REPLY_CHUNK and nbd_reply_chunk_iter_receive, remove extra includes. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20190611102720.86114-3-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2019-06-13 09:55:09 -05:00
Eric Blake	7da537f70d	nbd/client: Lower min_block for block-status, unaligned size We have a latent bug in our NBD client code, tickled by the brand new nbdkit 1.11.10 block status support: $ nbdkit --filter=log --filter=truncate -U - \ data data="1" size=511 truncate=64K logfile=/dev/stdout \ --run 'qemu-img convert $nbd /var/tmp/out' ... qemu-img: block/io.c:2122: bdrv_co_block_status: Assertion `pnum && QEMU_IS_ALIGNED(pnum, align) && align > offset - aligned_offset' failed. The culprit? Our implementation of .bdrv_co_block_status can return unaligned block status for any server that operates with a lower actual alignment than what we tell the block layer in request_alignment, in violation of the block layer's constraints. To date, we've been unable to trip the bug, because qemu as NBD server always advertises block sizing (at which point it is a server bug if the server sends unaligned status - although qemu 3.1 is such a server and I've sent separate patches for 4.0 both to get the server to obey the spec, and to let the client to tolerate server oddities at EOF). But nbdkit does not (yet) advertise block sizing, and therefore is not in violation of the spec for returning block status at whatever boundaries it wants, and those unaligned results can occur anywhere rather than just at EOF. While we are still wise to avoid sending sub-sector read/write requests to a server of unknown origin, we MUST consider that a server telling us block status without an advertised block size is correct. So, we either have to munge unaligned answers from the server into aligned ones that we hand back to the block layer, or we have to tell the block layer about a smaller alignment. Similarly, if the server advertises an image size that is not sector-aligned, we might as well assume that the server intends to let us access those tail bytes, and therefore supports a minimum block size of 1, regardless of whether the server supports block status (although we still need more patches to fix the problem that with an unaligned image, we can send read or block status requests that exceed EOF to the server). Again, qemu as server cannot trip this problem (because it rounds images to sector alignment), but nbdkit advertised unaligned size even before it gained block status support. Solve both alignment problems at once by using better heuristics on what alignment to report to the block layer when the server did not give us something to work with. Note that very few NBD servers implement block status (to date, only qemu and nbdkit are known to do so); and as the NBD spec mentioned block sizing constraints prior to documenting block status, it can be assumed that any future implementations of block status are aware that they must advertise block size if they want a minimum size other than 1. We've had a long history of struggles with picking the right alignment to use in the block layer, as evidenced by the commit message of `fd8d372d` (v2.12) that introduced the current choice of forced 512-byte alignment. There is no iotest coverage for this fix, because qemu can't provoke it, and I didn't want to make test 241 dependent on nbdkit. Fixes: `fd8d372d` Reported-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20190329042750.14704-3-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Tested-by: Richard W.M. Jones <rjones@redhat.com>	2019-03-30 20:52:19 -05:00
Max Reitz	998b3a1e5a	block: Purify .bdrv_refresh_filename() Currently, BlockDriver.bdrv_refresh_filename() is supposed to both refresh the filename (BDS.exact_filename) and set BDS.full_open_options. Now that we have generic code in the central bdrv_refresh_filename() for creating BDS.full_open_options, we can drop the latter part from all BlockDriver.bdrv_refresh_filename() implementations. This also means that we can drop all of the existing default code for this from the global bdrv_refresh_filename() itself. Furthermore, we now have to call BlockDriver.bdrv_refresh_filename() after having set BDS.full_open_options, because the block driver's implementation should now be allowed to depend on BDS.full_open_options being set correctly. Finally, with this patch we can drop the @options parameter from BlockDriver.bdrv_refresh_filename(); also, add a comment on this function's purpose in block/block_int.h while touching its interface. This completely obsoletes blklogwrite's implementation of .bdrv_refresh_filename(). Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20190201192935.18394-25-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-02-25 15:11:27 +01:00
Max Reitz	2654267cc1	block: Add strong_runtime_opts to BlockDriver This new field can be set by block drivers to list the runtime options they accept that may influence the contents of the respective BDS. As of a follow-up patch, this list will be used by the common bdrv_refresh_filename() implementation to decide which options to put into BDS.full_open_options (and consequently whether a JSON filename has to be created), thus freeing the drivers of having to implement that logic themselves. Additionally, this patch adds the field to all of the block drivers that need it and sets it accordingly. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 20190201192935.18394-22-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-02-25 15:11:27 +01:00
Max Reitz	8a6239c071	block/nbd: Make bdrv_dirname() return NULL The generic bdrv_dirname() implementation would be able to generate some form of directory name for many NBD nodes, but it would be always wrong. Therefore, we have to explicitly make it an error (until NBD has some form of specification for export paths, if it ever will). Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20190201192935.18394-18-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-02-25 15:11:26 +01:00
Vladimir Sementsov-Ogievskiy	d42f78e940	block/nbd: move connection code from block/nbd to block/nbd-client Keep all connection code in one file, to be able to implement reconnect in further patches. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20190201130138.94525-3-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> [eblake: format tweak] Signed-off-by: Eric Blake <eblake@redhat.com>	2019-02-04 15:11:27 -06:00
Markus Armbruster	ba891d68b4	qstring: Move qstring_from_substr()'s @end one to the right qstring_from_substr() takes the index of the substring's first and last character. qstring_from_substr(s, 0, SIZE_MAX) denotes an empty substring. Awkward. Shift the end index one to the right. This simplifies both qstring_from_substr() and its callers. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20180727062204.10401-3-armbru@redhat.com>	2018-07-28 09:09:58 +02:00
Eric Blake	216ee3657e	nbd/client: Add x-dirty-bitmap to query bitmap from server In order to test that the NBD server is properly advertising dirty bitmaps, we need a bare minimum client that can request and read the context. Since feature freeze for 3.0 is imminent, this is the smallest workable patch, which replaces the qemu block status report with the results of the NBD server's dirty bitmap (making it very easy to use 'qemu-img map --output=json' to learn where the dirty portions are). Note that the NBD protocol defines a dirty section with the same bit but opposite sense that normal "base:allocation" uses to report an allocated section; so in qemu-img map output, "data":true corresponds to clean, "data":false corresponds to dirty. A more complete solution that allows dirty bitmaps to be queried at the same time as normal block status will be required before this addition can lose the x- prefix. Until then, the fact that this replaces normal status with dirty status means actions like 'qemu-img convert' will likely misbehave due to treating dirty regions of the file as if they are unallocated. The next patch adds an iotest to exercise this new code. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20180702191458.28741-2-eblake@redhat.com>	2018-07-02 15:27:38 -05:00
Markus Armbruster	af91062ee1	block: Factor out qobject_input_visitor_new_flat_confused() Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-06-15 14:49:44 +02:00
Markus Armbruster	374c52467a	block: Fix -drive for certain non-string scalars The previous commit fixed -blockdev breakage due to misuse of the qobject input visitor's keyval flavor in bdrv_file_open(). The commit message explain why using the plain flavor would be just as wrong; it would break -drive. Turns out we break it in three places: nbd_open(), sd_open() and ssh_file_open(). They are even marked FIXME. Example breakage: $ qemu-system-x86 -drive node-name=n1,driver=nbd,server.type=inet,server.host=localhost,server.port=1234,server.numeric=off qemu-system-x86: -drive node-name=n1,driver=nbd,server.type=inet,server.host=localhost,server.port=1234,server.numeric=off: Invalid parameter type for 'numeric', expected: boolean Fix it the same way: replace qdict_crumple() by qdict_crumple_for_keyval_qiv(), and switch from plain to the keyval flavor. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-06-15 14:49:44 +02:00
Max Reitz	609f45ea95	block: Add block-specific QDict header There are numerous QDict functions that have been introduced for and are used only by the block layer. Move their declarations into an own header file to reflect that. While qdict_extract_subqdict() is in fact used outside of the block layer (in util/qemu-config.c), it is still a function related very closely to how the block layer works with nested QDicts, namely by sometimes flattening them. Therefore, its declaration is put into this header as well and util/qemu-config.c includes it with a comment stating exactly which function it needs. Suggested-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <20180509165530.29561-7-mreitz@redhat.com> [Copyright note tweaked, superfluous includes dropped] Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-06-15 14:49:44 +02:00
Michael S. Tsirkin	0d8c41dae5	block: use local path for local headers When pulling in headers that are in the same directory as the C file (as opposed to one in include/), we should use its relative path, without a directory. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>	2018-05-31 04:16:06 +03:00
Marc-André Lureau	cb3e7f08ae	qobject: Replace qobject_incref/QINCREF qobject_decref/QDECREF Now that we can safely call QOBJECT() on QObject * as well as its subtypes, we can have macros qobject_ref() / qobject_unref() that work everywhere instead of having to use QINCREF() / QDECREF() for QObject and qobject_incref() / qobject_decref() for its subtypes. The replacement is mechanical, except I broke a long line, and added a cast in monitor_qmp_cleanup_req_queue_locked(). Unlike qobject_decref(), qobject_unref() doesn't accept void *. Note that the new macros evaluate their argument exactly once, thus no need to shout them. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20180419150145.24795-4-marcandre.lureau@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> [Rebased, semantic conflict resolved, commit message improved] Signed-off-by: Markus Armbruster <armbru@redhat.com>	2018-05-04 08:27:53 +02:00
Vladimir Sementsov-Ogievskiy	78a33ab587	nbd: BLOCK_STATUS for standard get_block_status function: client part Minimal realization: only one extent in server answer is supported. Flag NBD_CMD_FLAG_REQ_ONE is used to force this behavior. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20180312152126.286890-6-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> [eblake: grammar tweaks, fix min_block check and 32-bit cap, use -1 instead of errno on failure in nbd_negotiate_simple_meta_context, ensure that block status makes progress on success] Signed-off-by: Eric Blake <eblake@redhat.com>	2018-03-13 15:43:48 -05:00
Markus Armbruster	9af2398977	Include less of the generated modular QAPI headers In my "build everything" tree, a change to the types in qapi-schema.json triggers a recompile of about 4800 out of 5100 objects. The previous commit split up qmp-commands.h, qmp-event.h, qmp-visit.h, qapi-types.h. Each of these headers still includes all its shards. Reduce compile time by including just the shards we actually need. To illustrate the benefits: adding a type to qapi/migration.json now recompiles some 2300 instead of 4800 objects. The next commit will improve it further. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20180211093607.27351-24-armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> [eblake: rebase to master] Signed-off-by: Eric Blake <eblake@redhat.com>	2018-03-02 13:45:50 -06:00
Eric Blake	fd8d372dd3	nbd: Honor server's advertised minimum block size Commit `79ba8c98` (v2.7) changed the setting of request_alignment to occur only during bdrv_refresh_limits(), rather than at at bdrv_open() time; but at the time, NBD was unaffected, because it still used sector-based callbacks, so the block layer defaulted NBD to use 512 request_alignment. Later, commit `70c4fb26` (also v2.7) changed NBD to use byte-based callbacks, without setting request_alignment. This resulted in NBD using request_alignment of 1, which works great when the server supports it (as is the case for qemu-nbd), but falls apart miserably if the server requires alignment (but only if qemu actually sends a sub-sector request; qemu-io can do it, but most qemu operations still perform on sectors or larger). Even later, the NBD protocol was updated to document that clients should learn the server's minimum alignment during NBD_OPT_GO; and recommended that clients should assume a minimum size of 512 unless the server understands NBD_OPT_GO and replied with a smaller size. Commit `081dd1fe` (v2.10) attempted to do that, by assigning request_alignment to whatever was learned from the server; but it has two flaws: the assignment is done during bdrv_open() so it gets unconditionally wiped out back to 1 during any later bdrv_refresh_limits(); and the code is not using a default of 512 when the server did not report a minimum size. Fix these issues by moving the assignment to request_alignment to the right function, and by using a sane default when the server does not advertise a minimum size. CC: qemu-stable@nongnu.org Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20180215032905.27146-1-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy<vsementsov@virtuozzo.com>	2018-03-01 14:02:32 -06:00
Eric Blake	e24d813b29	block: Simplify bdrv_can_write_zeroes_with_unmap() We don't need the can_write_zeroes_with_unmap field in BlockDriverInfo, because it is redundant information with supported_zero_flags & BDRV_REQ_MAY_UNMAP. Note that BlockDriverInfo and supported_zero_flags are both per-device settings, rather than global state about the driver as a whole, which means one or both of these bits of information can already be conditional. Let's audit how they were set: crypto: always setting can_write_ to false is pointless (the struct starts life zero-initialized), no use of supported_ nbd: just recently fixed to set can_write_ if supported_ includes MAY_UNMAP (thus this commit effectively reverts bca80059e and solves the problem mentioned there in a more global way) file-posix, iscsi, qcow2: can_write_ is conditional, while supported_ was unconditional; but passing MAY_UNMAP would fail with ENOTSUP if the condition wasn't met qed: can_write_ is unconditional, but pwrite_zeroes lacks support for MAY_UNMAP and supported_ is not set. Perhaps support can be added later (since it would be similar to qcow2), but for now claiming false is no real loss all other drivers: can_write_ is not set, and supported_ is either unset or a passthrough Simplify the code by moving the conditional into supported_zero_flags for all drivers, then dropping the now-unused BDI field. For callers that relied on bdrv_can_write_zeroes_with_unmap(), we return the same per-device settings for drivers that had conditions (no observable change in behavior there); and can now return true (instead of false) for drivers that support passthrough (for example, the commit driver) which gives those drivers the same fix as nbd just got in bca80059e. For callers that relied on supported_zero_flags, we now have a few more places that can avoid a wasted call to pwrite_zeroes() that will just fail with ENOTSUP. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20180126193439.20219-1-eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2018-02-09 12:32:44 -06:00
Markus Armbruster	922a01a013	Move include qemu/option.h from qemu-common.h to actual users qemu-common.h includes qemu/option.h, but most places that include the former don't actually need the latter. Drop the include, and add it to the places that actually need it. While there, drop superfluous includes of both headers, and separate #include from file comment with a blank line. This cleanup makes the number of objects depending on qemu/option.h drop from 4545 (out of 4743) to 284 in my "build everything" tree. Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20180201111846.21846-20-armbru@redhat.com> [Semantic conflict with commit `bdd6a90a9e` in block/nvme.c resolved]	2018-02-09 13:52:16 +01:00
Markus Armbruster	bbcad965bf	Drop superfluous includes of qapi/qmp/qjson.h Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20180201111846.21846-19-armbru@redhat.com>	2018-02-09 13:52:15 +01:00
Edgar Kaziakhmedov	9776f0db6a	nbd: implement bdrv_get_info callback Since mirror job supports efficient zero out target mechanism (see in mirror_dirty_init()), implement bdrv_get_info to make it work over NBD. Such improvement will allow using the largest chunk possible and will decrease the number of NBD_CMD_WRITE_ZEROES requests on the wire. Signed-off-by: Edgar Kaziakhmedov <edgar.kaziakhmedov@virtuozzo.com> Message-Id: <20180118115158.17219-1-edgar.kaziakhmedov@virtuozzo.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2018-01-26 09:58:46 -06:00
Murilo Opsfelder Araujo	c4365735a7	block/nbd: fix segmentation fault when .desc is not null-terminated The find_desc_by_name() from util/qemu-option.c relies on the .name not being NULL to call strcmp(). This check becomes unsafe when the list is not NULL-terminated, which is the case of nbd_runtime_opts in block/nbd.c, and can result in segmentation fault when strcmp() tries to access an invalid memory: #0 0x00007fff8c75f7d4 in __strcmp_power9 () from /lib64/libc.so.6 #1 0x00000000102d3ec8 in find_desc_by_name (desc=0x1036d6f0, name=0x28e46670 "server.path") at util/qemu-option.c:166 #2 0x00000000102d93e0 in qemu_opts_absorb_qdict (opts=0x28e47a80, qdict=0x28e469a0, errp=0x7fffec247c98) at util/qemu-option.c:1026 #3 0x000000001012a2e4 in nbd_open (bs=0x28e42290, options=0x28e469a0, flags=24578, errp=0x7fffec247d80) at block/nbd.c:406 #4 0x00000000100144e8 in bdrv_open_driver (bs=0x28e42290, drv=0x1036e070 <bdrv_nbd_unix>, node_name=0x0, options=0x28e469a0, open_flags=24578, errp=0x7fffec247f50) at block.c:1135 #5 0x0000000010015b04 in bdrv_open_common (bs=0x28e42290, file=0x0, options=0x28e469a0, errp=0x7fffec247f50) at block.c:1395 >From gdb, the desc[i].name was not NULL and resulted in strcmp() accessing an invalid memory: >>> p desc[5] $8 = { name = 0x1037f098 "R27A", type = 1561964883, help = 0xc0bbb23e <error: Cannot access memory at address 0xc0bbb23e>, def_value_str = 0x2 <error: Cannot access memory at address 0x2> } >>> p desc[6] $9 = { name = 0x103dac78 <__gcov0.do_qemu_init_bdrv_nbd_init> "\001", type = 272101528, help = 0x29ec0b754403e31f <error: Cannot access memory at address 0x29ec0b754403e31f>, def_value_str = 0x81f343b9 <error: Cannot access memory at address 0x81f343b9> } This patch fixes the segmentation fault in strcmp() by adding a NULL element at the end of nbd_runtime_opts.desc list, which is the common practice to most of other structs like runtime_opts in block/null.c. Thus, the desc[i].name != NULL check becomes safe because it will not evaluate to true when .desc list reached its end. Reported-by: R. Nageswara Sastry <nasastry@in.ibm.com> Buglink: https://bugs.launchpad.net/qemu/+bug/1727259 Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.vnet.ibm.com> Message-Id: <20180105133241.14141-2-muriloo@linux.vnet.ibm.com> CC: qemu-stable@nongnu.org Fixes: `7ccc44fd7d` Signed-off-by: Eric Blake <eblake@redhat.com>	2018-01-08 09:12:23 -06:00
Eric Blake	081dd1fe36	nbd: Implement NBD_INFO_BLOCK_SIZE on client The upstream NBD Protocol has defined a new extension to allow the server to advertise block sizes to the client, as well as a way for the client to inform the server whether it intends to obey block sizes. When using the block layer as the client, we will obey block sizes; but when used as 'qemu-nbd -c' to hand off to the kernel nbd module as the client, we are still waiting for the kernel to implement a way for us to learn if it will honor block sizes (perhaps by an addition to sysfs, rather than an ioctl), as well as any way to tell the kernel what additional block sizes to obey (NBD_SET_BLKSIZE appears to be accurate for the minimum size, but preferred and maximum sizes would probably be new ioctl()s), so until then, we need to make our request for block sizes conditional. When using ioctl(NBD_SET_BLKSIZE) to hand off to the kernel, use the minimum block size as the sector size if it is larger than 512, which also has the nice effect of cooperating with (non-qemu) servers that don't do read-modify-write when exposing a block device with 4k sectors; it might also allow us to visit a file larger than 2T on a 32-bit kernel. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20170707203049.534-10-eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-07-14 12:04:42 +02:00
Eric Blake	004a89fce9	nbd: Create struct for tracking export info The NBD Protocol is introducing some additional information about exports, such as minimum request size and alignment, as well as an advertised maximum request size. It will be easier to feed this information back to the block layer if we gather all the information into a struct, rather than adding yet more pointer parameters during negotiation. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20170707203049.534-2-eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-07-14 12:04:41 +02:00
Max Reitz	f69165a8fe	block: Do not strcmp() with NULL uri->scheme uri_parse(...)->scheme may be NULL. In fact, probably every field may be NULL, and the callers do test this for all of the other fields but not for scheme (except for block/gluster.c; block/vxhs.c does not access that field at all). We can easily fix this by using g_strcmp0() instead of strcmp(). Cc: qemu-stable@nongnu.org Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20170613205726.13544-1-mreitz@redhat.com Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-06-26 14:54:46 +02:00
Marc-André Lureau	01b2ffcedd	qapi: merge QInt and QFloat in QNum We would like to use a same QObject type to represent numbers, whether they are int, uint, or floats. Getters will allow some compatibility between the various types if the number fits other representations. Add a few more tests while at it. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20170607163635.17635-7-marcandre.lureau@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> [parse_stats_intervals() simplified a bit, comment in test_visitor_in_int_overflow() tidied up, suppress bogus warnings] Signed-off-by: Markus Armbruster <armbru@redhat.com>	2017-06-20 14:31:31 +02:00
Markus Armbruster	bd269ebc82	sockets: Limit SocketAddressLegacy to external interfaces SocketAddressLegacy is a simple union, and simple unions are awkward: they have their variant members wrapped in a "data" object on the wire, and require additional indirections in C. SocketAddress is the equivalent flat union. Convert all users of SocketAddressLegacy to SocketAddress, except for existing external interfaces. See also commit fce5d53..9445673 and 85a82e8..c5f1ae3. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1493192202-3184-7-git-send-email-armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> [Minor editing accident fixed, commit message and a comment tweaked] Signed-off-by: Markus Armbruster <armbru@redhat.com>	2017-05-09 09:14:40 +02:00
Markus Armbruster	62cf396b5d	sockets: Rename SocketAddressFlat to SocketAddress Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1493192202-3184-6-git-send-email-armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2017-05-09 09:14:40 +02:00

1 2 3 4 5 ...

260 Commits