qemu/qemu-io.c

/*
* Command line utility to exercise the QEMU I/O path.
*
* Copyright (C) 2009 Red Hat, Inc.
* Copyright (c) 2003-2005 Silicon Graphics, Inc.
*
* This work is licensed under the terms of the GNU GPL, version 2 or later.
* See the COPYING file in the top-level directory.
*/
#include "qemu/osdep.h"
#include <getopt.h>
#include <libgen.h>
#ifndef _WIN32
#include <termios.h>
#endif
#include "qemu/help-texts.h"
#include "qemu/cutils.h"
#include "qapi/error.h"
#include "qemu-io.h"
#include "qemu/error-report.h"
#include "qemu/main-loop.h"
#include "qemu/module.h"
#include "qemu/option.h"
#include "qemu/config-file.h"
#include "qemu/readline.h"
#include "qemu/log.h"
#include "qemu/sockets.h"
#include "qapi/qmp/qstring.h"
#include "qapi/qmp/qdict.h"
#include "qom/object_interfaces.h"
#include "sysemu/block-backend.h"
#include "block/block_int.h"
#include "trace/control.h"
#include "crypto/init.h"
#include "qemu-version.h"
#define CMD_NOFILE_OK 0x01
static BlockBackend *qemuio_blk;
static bool quit_qemu_io;
/* qemu-io commands passed using -c */
static int ncmdline;
static char **cmdline;
static bool imageOpts;
static ReadLineState *readline_state;
static int ttyEOF;
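/*
 * Return the byte that signals end of input on stdin: the terminal's VEOF
 * character, Ctrl-D as a fallback, or 0 when stdin is not a terminal.
 */
static int get_eof_char(void)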
{
#ifdef _WIN32
return 0x4; /* Ctrl-D */
#else
struct termios tty;
if (tcgetattr(STDIN_FILENO, &tty) != 0) {
if (errno == ENOTTY) {
return 0x0; /* just expect read() == 0 */
} else {
return 0x4; /* Ctrl-D */
}
}
return tty.c_cc[VEOF];
#endif
}
static int close_f(BlockBackend *blk, int argc, char **argv)
{
blk_unref(qemuio_blk);
qemuio_blk = NULL;
return 0;
}
static const cmdinfo_t close_cmd = {
.name = "close",
.altname = "c",
.cfunc = close_f,
.oneline = "close the current open file",
};
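/*
 * Open @name as the current image, or the image described by @opts alone
 * when @name is NULL. Takes ownership of @opts.
 * Returns 0 on success and 1 on failure.
 */
static int openfile(char *name, int flags, bool writethrough, bool force_share,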
QDict *opts)
{
Error *local_err = NULL;
if (qemuio_blk) {
error_report("file open already, try 'help close'");
qobject_unref(opts);
return 1;
}
if (force_share) {
if (!opts) {
opts = qdict_new();
}
if (qdict_haskey(opts, BDRV_OPT_FORCE_SHARE)
&& strcmp(qdict_get_str(opts, BDRV_OPT_FORCE_SHARE), "on")) {
error_report("-U conflicts with image options");
qobject_unref(opts);
return 1;
}
qdict_put_str(opts, BDRV_OPT_FORCE_SHARE, "on");
}
qemuio_blk = blk_new_open(name, NULL, opts, flags, &local_err);
if (!qemuio_blk) {
error_reportf_err(local_err, "can't open%s%s: ",
name ? " device " : "", name ?: "");
return 1;
}
blk_set_enable_write_cache(qemuio_blk, !writethrough);
return 0;
}
static void open_help(void)
{
printf(
"\n"
" opens a new file in the requested mode\n"
"\n"
" Example:\n"
" 'open -n -o driver=raw /tmp/data' - opens raw data file read-write, uncached\n"
"\n"
" Opens a file for subsequent use by all of the other qemu-io commands.\n"
" -r, -- open file read-only\n"
" -s, -- use snapshot file\n"
" -C, -- use copy-on-read\n"
" -n, -- disable host cache, short for -t none\n"
" -U, -- force shared permissions\n"
" -k, -- use kernel AIO implementation (Linux only, prefer use of -i)\n"
" -i, -- use AIO mode (threads, native or io_uring)\n"
" -t, -- use the given cache mode for the image\n"
" -d, -- use the given discard mode for the image\n"
" -o, -- options to be given to the block driver"
"\n");
}
static int open_f(BlockBackend *blk, int argc, char **argv);
static const cmdinfo_t open_cmd = {
.name = "open",
.altname = "o",
.cfunc = open_f,
.argmin = 1,
.argmax = -1,
.flags = CMD_NOFILE_OK,
.args = "[-rsCnkU] [-t cache] [-d discard] [-i aio] [-o options] [path]",
.oneline = "open the file specified by path",
.help = open_help,
};
static QemuOptsList empty_opts = {
.name = "drive",
.merge_lists = true,
.head = QTAILQ_HEAD_INITIALIZER(empty_opts.head),
.desc = {
/* no elements => accept any params */
{ /* end of list */ }
},
};
static int open_f(BlockBackend *blk, int argc, char **argv)
{
int flags = BDRV_O_UNMAP;
int readonly = 0;
bool writethrough = true;
int c;
int ret;
QemuOpts *qopts;
QDict *opts;
bool force_share = false;
while ((c = getopt(argc, argv, "snCro:ki:t:d:U")) != -1) {
switch (c) {
case 's':
flags |= BDRV_O_SNAPSHOT;
break;
case 'n':
flags |= BDRV_O_NOCACHE;
writethrough = false;
break;
case 'C':
flags |= BDRV_O_COPY_ON_READ;
break;
case 'r':
readonly = 1;
break;
case 'k':
flags |= BDRV_O_NATIVE_AIO;
break;
case 't':
if (bdrv_parse_cache_mode(optarg, &flags, &writethrough) < 0) {
error_report("Invalid cache option: %s", optarg);
qemu_opts_reset(&empty_opts);
return -EINVAL;
}
break;
case 'd':
if (bdrv_parse_discard_flags(optarg, &flags) < 0) {
error_report("Invalid discard option: %s", optarg);
qemu_opts_reset(&empty_opts);
return -EINVAL;
}
break;
case 'i':
if (bdrv_parse_aio(optarg, &flags) < 0) {
error_report("Invalid aio option: %s", optarg);
qemu_opts_reset(&empty_opts);
return -EINVAL;
}
break;
case 'o':
if (imageOpts) {
printf("--image-opts and 'open -o' are mutually exclusive\n");
qemu_opts_reset(&empty_opts);
return -EINVAL;
}
if (!qemu_opts_parse_noisily(&empty_opts, optarg, false)) {
qemu_opts_reset(&empty_opts);
return -EINVAL;
}
break;
case 'U':
force_share = true;
break;
default:
qemu_opts_reset(&empty_opts);
qemuio_command_usage(&open_cmd);
return -EINVAL;
}
}
if (!readonly) {
flags |= BDRV_O_RDWR;
}
if (imageOpts && (optind == argc - 1)) {
if (!qemu_opts_parse_noisily(&empty_opts, argv[optind], false)) {
qemu_opts_reset(&empty_opts);
return -EINVAL;
}
optind++;
}
qopts = qemu_opts_find(&empty_opts, NULL);
opts = qopts ? qemu_opts_to_qdict(qopts, NULL) : NULL;
qemu_opts_reset(&empty_opts);
if (optind == argc - 1) {
ret = openfile(argv[optind], flags, writethrough, force_share, opts);
} else if (optind == argc) {
ret = openfile(NULL, flags, writethrough, force_share, opts);
} else {
qobject_unref(opts);
qemuio_command_usage(&open_cmd);
return -EINVAL;
}
if (ret) {
return -EINVAL;
}
return 0;
}
static int quit_f(BlockBackend *blk, int argc, char **argv)
{
quit_qemu_io = true;
return 0;
}
static const cmdinfo_t quit_cmd = {
.name = "quit",
.altname = "q",
.cfunc = quit_f,
.argmin = -1,
.argmax = -1,
.flags = CMD_FLAG_GLOBAL,
.oneline = "exit the program",
};
static void usage(const char *name)
{
printf(
"Usage: %s [OPTIONS]... [-c STRING]... [file]\n"
"QEMU Disk exerciser\n"
"\n"
" --object OBJECTDEF define an object such as 'secret' for\n"
" passwords and/or encryption keys\n"
" --image-opts treat file as option string\n"
" -c, --cmd STRING execute command with its arguments\n"
" from the given string\n"
" -f, --format FMT specifies the block driver to use\n"
" -r, --read-only open the image read-only\n"
" -s, --snapshot use snapshot file\n"
" -n, --nocache disable host cache, short for -t none\n"
" -C, --copy-on-read enable copy-on-read\n"
" -m, --misalign misalign allocations for O_DIRECT\n"
" -k, --native-aio use kernel AIO implementation\n"
" (Linux only, prefer use of -i)\n"
" -i, --aio=MODE use AIO mode (threads, native or io_uring)\n"
" -t, --cache=MODE use the given cache mode for the image\n"
" -d, --discard=MODE use the given discard mode for the image\n"
" -T, --trace [[enable=]<pattern>][,events=<file>][,file=<file>]\n"
" specify tracing options\n"
" see qemu-img(1) man page for full description\n"
" -U, --force-share force shared permissions\n"
" -h, --help display this help and exit\n"
" -V, --version output version information and exit\n"
"\n"
"See '%s -c help' for information on available commands.\n"
"\n"
QEMU_HELP_BOTTOM "\n",
name, name);
}
static char *get_prompt(void)
{
static char prompt[FILENAME_MAX + 2 /*"> "*/ + 1 /*"\0"*/ ];
if (!prompt[0]) {
snprintf(prompt, sizeof(prompt), "%s> ", g_get_prgname());
}
return prompt;
}
static void G_GNUC_PRINTF(2, 3) readline_printf_func(void *opaque,
const char *fmt, ...)
{
va_list ap;
va_start(ap, fmt);
vprintf(fmt, ap);
va_end(ap);
}
static void readline_flush_func(void *opaque)
{
fflush(stdout);
}
static void readline_func(void *opaque, const char *str, void *readline_opaque)
{
char **line = readline_opaque;
*line = g_strdup(str);
}
static void completion_match(const char *cmd, void *opaque)
{
readline_add_completion(readline_state, cmd);
}
static void readline_completion_func(void *opaque, const char *str)
{
readline_set_completion_index(readline_state, strlen(str));
qemuio_complete_command(str, completion_match, NULL);
}
static char *fetchline_readline(void)
{
char *line = NULL;
readline_start(readline_state, get_prompt(), 0, readline_func, &line);
while (!line) {
int ch = getchar();
if (ttyEOF != 0x0 && ch == ttyEOF) {
printf("\n");
break;
}
readline_handle_byte(readline_state, ch);
}
return line;
}
#define MAXREADLINESZ 1024
static char *fetchline_fgets(void)
{
char *p, *line = g_malloc(MAXREADLINESZ);
if (!fgets(line, MAXREADLINESZ, stdin)) {
g_free(line);
return NULL;
}
p = line + strlen(line);
if (p != line && p[-1] == '\n') {
p[-1] = '\0';
}
return line;
}
static char *fetchline(void)
{
if (readline_state) {
return fetchline_readline();
} else {
return fetchline_fgets();
}
}
static void prep_fetchline(void *opaque)
{
int *fetchable = opaque;
qemu_set_fd_handler(STDIN_FILENO, NULL, NULL, NULL);
*fetchable = 1;
}
static int do_qemuio_command(const char *cmd)
{
int ret;
AioContext *ctx =
qemuio_blk ? blk_get_aio_context(qemuio_blk) : qemu_get_aio_context();
aio_context_acquire(ctx);
ret = qemuio_command(qemuio_blk, cmd);
aio_context_release(ctx);
return ret;
}
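/*
 * Run the commands given with -c, if any; otherwise read commands
 * interactively until "quit" or end of input.
 * Returns the most recent negative command result, or 0 if none failed.
 */
static int command_loop(void)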
{
int i, fetchable = 0, prompted = 0;
int ret, last_error = 0;
char *input;
for (i = 0; !quit_qemu_io && i < ncmdline; i++) {
ret = do_qemuio_command(cmdline[i]);
if (ret < 0) {
last_error = ret;
}
}
if (cmdline) {
g_free(cmdline);
return last_error;
}
while (!quit_qemu_io) {
if (!prompted) {
printf("%s", get_prompt());
fflush(stdout);
qemu_set_fd_handler(STDIN_FILENO, prep_fetchline, NULL, &fetchable);
prompted = 1;
}
main_loop_wait(false);
if (!fetchable) {
continue;
}
input = fetchline();
if (input == NULL) {
break;
}
ret = do_qemuio_command(input);
g_free(input);
if (ret < 0) {
last_error = ret;
}
prompted = 0;
fetchable = 0;
}
qemu_set_fd_handler(STDIN_FILENO, NULL, NULL, NULL);
return last_error;
}

/* Append a command given with -c to the list of commands to execute. */
static void add_user_command(char *user_cmd)
{
    cmdline = g_renew(char *, cmdline, ++ncmdline);
    cmdline[ncmdline - 1] = user_cmd;
}

static void reenable_tty_echo(void)
{
    qemu_set_tty_echo(STDIN_FILENO, true);
}

enum {
    OPTION_OBJECT = 256,
    OPTION_IMAGE_OPTS = 257,
};

static QemuOptsList file_opts = {
    .name = "file",
    .implied_opt_name = "file",
    .head = QTAILQ_HEAD_INITIALIZER(file_opts.head),
    .desc = {
        /* no elements => accept any params */
        { /* end of list */ }
    },
};

int main(int argc, char **argv)
{
    int readonly = 0;
    const char *sopt = "hVc:d:f:rsnCmki:t:T:U";
    const struct option lopt[] = {
        { "help", no_argument, NULL, 'h' },
        { "version", no_argument, NULL, 'V' },
        { "cmd", required_argument, NULL, 'c' },
        { "format", required_argument, NULL, 'f' },
        { "read-only", no_argument, NULL, 'r' },
        { "snapshot", no_argument, NULL, 's' },
        { "nocache", no_argument, NULL, 'n' },
        { "copy-on-read", no_argument, NULL, 'C' },
        { "misalign", no_argument, NULL, 'm' },
        { "native-aio", no_argument, NULL, 'k' },
        { "aio", required_argument, NULL, 'i' },
        { "discard", required_argument, NULL, 'd' },
        { "cache", required_argument, NULL, 't' },
        { "trace", required_argument, NULL, 'T' },
        { "object", required_argument, NULL, OPTION_OBJECT },
        { "image-opts", no_argument, NULL, OPTION_IMAGE_OPTS },
        { "force-share", no_argument, NULL, 'U' },
        { NULL, 0, NULL, 0 }
    };
    int c;
    int opt_index = 0;
    int flags = BDRV_O_UNMAP;
    int ret;
    bool writethrough = true;
    QDict *opts = NULL;
    const char *format = NULL;
    bool force_share = false;

#ifdef CONFIG_POSIX
    signal(SIGPIPE, SIG_IGN);
#endif

    socket_init();
    error_init(argv[0]);
    module_call_init(MODULE_INIT_TRACE);
    qemu_init_exec_dir(argv[0]);

    qcrypto_init(&error_fatal);

    module_call_init(MODULE_INIT_QOM);
    qemu_add_opts(&qemu_trace_opts);
    bdrv_init();

    while ((c = getopt_long(argc, argv, sopt, lopt, &opt_index)) != -1) {
        switch (c) {
        case 's':
            flags |= BDRV_O_SNAPSHOT;
            break;
        case 'n':
            flags |= BDRV_O_NOCACHE;
            writethrough = false;
            break;
        case 'C':
            flags |= BDRV_O_COPY_ON_READ;
            break;
        case 'd':
            if (bdrv_parse_discard_flags(optarg, &flags) < 0) {
                error_report("Invalid discard option: %s", optarg);
                exit(1);
            }
            break;
        case 'f':
            format = optarg;
            break;
        case 'c':
            add_user_command(optarg);
            break;
        case 'r':
            readonly = 1;
            break;
        case 'm':
            qemuio_misalign = true;
            break;
        case 'k':
            flags |= BDRV_O_NATIVE_AIO;
            break;
        case 'i':
            if (bdrv_parse_aio(optarg, &flags) < 0) {
                error_report("Invalid aio option: %s", optarg);
                exit(1);
            }
            break;
        case 't':
            if (bdrv_parse_cache_mode(optarg, &flags, &writethrough) < 0) {
                error_report("Invalid cache option: %s", optarg);
                exit(1);
            }
            break;
        case 'T':
            trace_opt_parse(optarg);
            break;
        case 'V':
            printf("%s version " QEMU_FULL_VERSION "\n"
                   QEMU_COPYRIGHT "\n", g_get_prgname());
            exit(0);
        case 'h':
            usage(g_get_prgname());
            exit(0);
        case 'U':
            force_share = true;
            break;
        case OPTION_OBJECT:
            user_creatable_process_cmdline(optarg);
            break;
        case OPTION_IMAGE_OPTS:
            imageOpts = true;
            break;
        default:
            usage(g_get_prgname());
            exit(1);
        }
    }

    if ((argc - optind) > 1) {
        usage(g_get_prgname());
        exit(1);
    }

    if (format && imageOpts) {
        error_report("--image-opts and -f are mutually exclusive");
        exit(1);
    }

    qemu_init_main_loop(&error_fatal);

    if (!trace_init_backends()) {
        exit(1);
    }
    trace_init_file();
    qemu_set_log(LOG_TRACE, &error_fatal);

    /* initialize commands */
    qemuio_add_command(&quit_cmd);
    qemuio_add_command(&open_cmd);
    qemuio_add_command(&close_cmd);

    if (isatty(STDIN_FILENO)) {
        ttyEOF = get_eof_char();
        readline_state = readline_init(readline_printf_func,
                                       readline_flush_func,
                                       NULL,
                                       readline_completion_func);
        qemu_set_tty_echo(STDIN_FILENO, false);
        atexit(reenable_tty_echo);
    }

    /* open the device */
    if (!readonly) {
        flags |= BDRV_O_RDWR;
    }

    if ((argc - optind) == 1) {
        if (imageOpts) {
            QemuOpts *qopts = NULL;
            qopts = qemu_opts_parse_noisily(&file_opts, argv[optind], false);
            if (!qopts) {
                exit(1);
            }
            opts = qemu_opts_to_qdict(qopts, NULL);
            if (openfile(NULL, flags, writethrough, force_share, opts)) {
                exit(1);
            }
        } else {
            if (format) {
                opts = qdict_new();
                qdict_put_str(opts, "driver", format);
            }
            if (openfile(argv[optind], flags, writethrough,
                         force_share, opts)) {
                exit(1);
            }
        }
    }
    ret = command_loop();

    /*
     * Make sure all outstanding requests complete before the program exits.
     */
    bdrv_drain_all();

    blk_unref(qemuio_blk);
    g_free(readline_state);

    if (ret < 0) {
        return 1;
    } else {
        return 0;
    }
}