/*
 * QEMU Block driver for RADOS (Ceph)
 *
 * Copyright (C) 2010-2011 Christian Brunner <chb@muc.de>,
 *                         Josh Durgin <josh.durgin@dreamhost.com>
 *
 * This work is licensed under the terms of the GNU GPL, version 2. See
 * the COPYING file in the top-level directory.
 *
 * Contributions after 2012-01-13 are licensed under the terms of the
 * GNU GPL, version 2 or (at your option) any later version.
 */

#include "qemu/osdep.h"

rbd: Fix bugs around -drive parameter "server"
qemu_rbd_open() takes option parameters as a flattened QDict, with
keys of the form server.%d.host, server.%d.port, where %d counts up
from zero.
qemu_rbd_array_opts() extracts these values as follows. First, it
calls qdict_array_entries() to find the list's length. For each list
element, it formats the list's key prefix (e.g. "server.0."), then
creates a new QDict holding the options with that key prefix, then
converts that to a QemuOpts, so it can finally get the member values
from there.
If there's one surefire way to make code using QDict more awkward,
it's creating more of them and mixing in QemuOpts for good measure.
The extraction of keys starting with server.%d into another QDict
makes us ignore parameters like server.0.neither-host-nor-port
silently.
The conversion to QemuOpts abuses runtime_opts, as described a few
commits ago.
Rewrite to simply get the values straight from the options QDict.
Fixes -drive not to crash when server.*.* are present, but
server.*.host is absent.
Fixes -drive to reject invalid server.*.*.
Permits cleaning up runtime_opts. Do that, and fix -drive to reject
bogus parameters host and port instead of silently ignoring them.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1490691368-32099-11-git-send-email-armbru@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-28 11:56:08 +03:00
#include <rbd/librbd.h>
include/qemu/osdep.h: Don't include qapi/error.h
Commit 57cb38b included qapi/error.h into qemu/osdep.h to get the
Error typedef. Since then, we've moved to include qemu/osdep.h
everywhere. Its file comment explains: "To avoid getting into
possible circular include dependencies, this file should not include
any other QEMU headers, with the exceptions of config-host.h,
compiler.h, os-posix.h and os-win32.h, all of which are doing a
similar job to this file and are under similar constraints."
qapi/error.h doesn't do a similar job, and it doesn't adhere to
similar constraints: it includes qapi-types.h. That's in excess of
100KiB of crap most .c files don't actually need.
Add the typedef to qemu/typedefs.h, and include that instead of
qapi/error.h. Include qapi/error.h in .c files that need it and don't
get it now. Include qapi-types.h in qom/object.h for uint16List.
Update scripts/clean-includes accordingly. Update it further to match
reality: replace config.h by config-target.h, add sysemu/os-posix.h,
sysemu/os-win32.h. Update the list of includes in the qemu/osdep.h
comment quoted above similarly.
This reduces the number of objects depending on qapi/error.h from "all
of them" to less than a third. Unfortunately, the number depending on
qapi-types.h shrinks only a little. More work is needed for that one.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
[Fix compilation without the spice devel packages. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-14 11:01:28 +03:00
#include "qapi/error.h"
#include "qemu/error-report.h"
#include "qemu/module.h"
#include "qemu/option.h"
#include "block/block_int.h"
#include "block/qdict.h"
#include "crypto/secret.h"
#include "qemu/cutils.h"
#include "sysemu/replay.h"
#include "qapi/qmp/qstring.h"
#include "qapi/qmp/qdict.h"
rbd: Fix regression in legacy key/values containing escaped :
Commit c7cacb3 accidentally broke legacy key-value parsing through
pseudo-filename parsing of -drive file=rbd://..., for any key that
contains an escaped ':'. Such a key is surprisingly common, thanks
to mon_host specifying a 'host:port' string. The break happens
because passing things from QDict through QemuOpts back to another
QDict requires that we pack our parsed key/value pairs into a string,
and then reparse that string, but the intermediate string that we
created ("key1=value1:key2=value2") lost the \: escaping that was
present in the original, so that we could no longer see which : were
used as separators vs. those used as part of the original input.
Fix it by collecting the key/value pairs through a QList, and
sending that list on a round trip through a JSON QString (as in
'["key1","value1","key2","value2"]') on its way through QemuOpts,
rather than hand-rolling our own string. Since the string is only
handled internally, this was faster than creating a full-blown
struct of '[{"key1":"value1"},{"key2":"value2"}]', and safer at
guaranteeing order compared to '{"key1":"value1","key2":"value2"}'.
It would be nicer if we didn't have to round-trip through QemuOpts
in the first place, but that's a much bigger task for later.
Reproducer:
./x86_64-softmmu/qemu-system-x86_64 -nodefaults -nographic -qmp stdio \
-drive 'file=rbd:volumes/volume-ea141b5c-cdb3-4765-910d-e7008b209a70'\
':id=compute:key=AQAVkvxXAAAAABAA9ZxWFYdRmV+DSwKr7BKKXg=='\
':auth_supported=cephx\;none:mon_host=192.168.1.2\:6789'\
',format=raw,if=none,id=drive-virtio-disk0,'\
'serial=ea141b5c-cdb3-4765-910d-e7008b209a70,cache=writeback'
Even without an RBD setup, this serves as a test of whether we get
the incorrect parser error of:
qemu-system-x86_64: -drive file=rbd:...cache=writeback: conf option 6789 has no value
or the correct behavior of hanging while trying to connect to
the requested mon_host of 192.168.1.2:6789.
Reported-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170331152730.12514-1-eblake@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-31 18:27:30 +03:00
#include "qapi/qmp/qjson.h"
#include "qapi/qmp/qlist.h"
#include "qapi/qobject-input-visitor.h"
#include "qapi/qapi-visit-block-core.h"

/*
 * When specifying the image filename use:
 *
 * rbd:poolname/[namespacename/]devicename[@snapshotname][:option1=value1[:option2=value2...]]
 *
 * poolname must be the name of an existing rados pool.
 *
 * namespacename is optional; if it is omitted, the default namespace
 * is used.
 *
 * devicename is the name of the rbd image.
 *
 * Each option given is used to configure rados, and may be any valid
 * Ceph option, "id", or "conf".
 *
 * The "id" option indicates what user we should authenticate as to
 * the Ceph cluster. If it is omitted we will use the Ceph default
 * (normally 'admin').
 *
 * The "conf" option specifies a Ceph configuration file to read. If
 * it is not specified, we will read from the default Ceph locations
 * (e.g., /etc/ceph/ceph.conf). To avoid reading _any_ configuration
 * file, specify conf=/dev/null.
 *
 * Configuration values containing :, @, or = can be escaped with a
 * leading "\".
 */

/* rbd_aio_discard added in 0.1.2 */
#if LIBRBD_VERSION_CODE >= LIBRBD_VERSION(0, 1, 2)
#define LIBRBD_SUPPORTS_DISCARD
#else
#undef LIBRBD_SUPPORTS_DISCARD
#endif

#define OBJ_MAX_SIZE (1UL << OBJ_DEFAULT_OBJ_ORDER)

#define RBD_MAX_SNAPS 100

/* LIBRBD_SUPPORTS_IOVEC is defined in librbd.h */
#ifdef LIBRBD_SUPPORTS_IOVEC
#define LIBRBD_USE_IOVEC 1
#else
#define LIBRBD_USE_IOVEC 0
#endif

typedef enum {
    RBD_AIO_READ,
    RBD_AIO_WRITE,
    RBD_AIO_DISCARD,
    RBD_AIO_FLUSH
} RBDAIOCmd;

typedef struct RBDAIOCB {
    BlockAIOCB common;
    int64_t ret;
    QEMUIOVector *qiov;
    char *bounce;
    RBDAIOCmd cmd;
    int error;
    struct BDRVRBDState *s;
} RBDAIOCB;

typedef struct RADOSCB {
    RBDAIOCB *acb;
    struct BDRVRBDState *s;
    int64_t size;
    char *buf;
    int64_t ret;
} RADOSCB;

typedef struct BDRVRBDState {
    rados_t cluster;
    rados_ioctx_t io_ctx;
    rbd_image_t image;
    char *image_name;
    char *snap;
block/rbd: Add support for ceph namespaces
Starting with Ceph Nautilus, RBD supports namespaces, allowing
finer-grained ACLs on images inside a pool, and tenant isolation.
In the rbd CLI tool documentation, the new image-spec and snap-spec are:
- [pool-name/[namespace-name/]]image-name
- [pool-name/[namespace-name/]]image-name@snap-name
A QEMU without namespace support complains that it cannot find an
image called namespace-name/image-name. So we only need to parse the
image name once more to see whether it contains a '/', and if it does,
use the part before it as the namespace name, to be passed later to
rados_ioctx_set_namespace.
If rados_ioctx_set_namespace is called with an empty string or a null
pointer as the namespace parameter, it does essentially nothing, as it
then falls back to the default namespace.
The namespace is extracted inside qemu_rbd_parse_filename, stored in the
qdict, and used in qemu_rbd_connect to make it work with both qemu-img,
and qemu itself.
Signed-off-by: Florian Florensa <fflorensa@online.net>
Message-Id: <20200110111513.321728-2-fflorensa@online.net>
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-01-10 14:15:13 +03:00
    char *namespace;
    uint64_t image_size;
} BDRVRBDState;

static int qemu_rbd_connect(rados_t *cluster, rados_ioctx_t *io_ctx,
                            BlockdevOptionsRbd *opts, bool cache,
                            const char *keypairs, const char *secretid,
                            Error **errp);

/*
 * Split off the token that ends at the first unescaped @delim in @src.
 * The delimiter is overwritten with NUL and *@p is set to the character
 * after it, or to NULL if no delimiter was found.  A backslash escapes
 * the following character; escapes are left in place for
 * qemu_rbd_unescape().  Returns @src.
 */
static char *qemu_rbd_next_tok(char *src, char delim, char **p)
{
    char *end;

    *p = NULL;

    for (end = src; *end; ++end) {
        if (*end == delim) {
            break;
        }
        if (*end == '\\' && end[1] != '\0') {
            end++;
        }
    }
    if (*end == delim) {
        *p = end + 1;
        *end = '\0';
    }
    return src;
}

/*
 * Remove backslash escapes from @src in place: a backslash followed by
 * another character is dropped and that character is kept literally.
 */
static void qemu_rbd_unescape(char *src)
{
    char *p;

    for (p = src; *src; ++src, ++p) {
        if (*src == '\\' && src[1] != '\0') {
            src++;
        }
        *p = *src;
    }
    *p = '\0';
}
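
The two helpers above are easiest to follow on a concrete string. The following is a minimal standalone sketch, not part of the driver: `next_tok` and `unescape` are local copies of the logic of qemu_rbd_next_tok() and qemu_rbd_unescape(), while `struct rbd_spec` and `split_spec()` are hypothetical names introduced only for illustration. It assumes there is no namespace component and leaves the option list unparsed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Local copy of the qemu_rbd_next_tok() logic, without QEMU dependencies. */
static char *next_tok(char *src, char delim, char **p)
{
    char *end;

    *p = NULL;
    for (end = src; *end; ++end) {
        if (*end == delim) {
            break;
        }
        if (*end == '\\' && end[1] != '\0') {
            end++;                  /* skip the escaped character */
        }
    }
    if (*end == delim) {
        *p = end + 1;               /* rest of the input after the delimiter */
        *end = '\0';                /* terminate the token in place */
    }
    return src;
}

/* Local copy of the qemu_rbd_unescape() logic. */
static void unescape(char *src)
{
    char *p;

    for (p = src; *src; ++src, ++p) {
        if (*src == '\\' && src[1] != '\0') {
            src++;
        }
        *p = *src;
    }
    *p = '\0';
}

/* Hypothetical result holder; all pointers point into the caller's buffer. */
struct rbd_spec {
    char *pool;
    char *image;
    char *snap;     /* NULL if no snapshot was given */
    char *rest;     /* unparsed "key=value:..." options, or NULL */
};

/*
 * Split "pool/image[@snap][:options]" in place (the "rbd:" prefix is
 * assumed to be stripped already).  Returns 0 on success, -1 if the
 * pool name is missing.
 */
int split_spec(char *buf, struct rbd_spec *out)
{
    char *p = buf;

    assert(buf && out);
    out->snap = NULL;
    out->pool = next_tok(p, '/', &p);
    if (!p) {
        return -1;
    }
    unescape(out->pool);
    if (strchr(p, '@')) {
        out->image = next_tok(p, '@', &p);
        out->snap = next_tok(p, ':', &p);
        unescape(out->snap);
    } else {
        out->image = next_tok(p, ':', &p);
    }
    unescape(out->image);
    out->rest = p;
    return 0;
}
```

For the input `volumes/vm\:1@snap1:id=compute`, the sketch yields pool `volumes`, image `vm:1` and snapshot `snap1`, and leaves `id=compute` in `rest`. Escapes are resolved only after splitting, which is why the `\:` in the image name is not mistaken for the option separator.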

static void qemu_rbd_parse_filename(const char *filename, QDict *options,
                                    Error **errp)
{
    const char *start;
    char *p, *buf;
    QList *keypairs = NULL;
    char *found_str, *image_name;

    if (!strstart(filename, "rbd:", &start)) {
        error_setg(errp, "File name must start with 'rbd:'");
        return;
    }

    buf = g_strdup(start);
    p = buf;

    found_str = qemu_rbd_next_tok(p, '/', &p);
    if (!p) {
        error_setg(errp, "Pool name is required");
        goto done;
    }
    qemu_rbd_unescape(found_str);
    qdict_put_str(options, "pool", found_str);

    if (strchr(p, '@')) {
        image_name = qemu_rbd_next_tok(p, '@', &p);

        found_str = qemu_rbd_next_tok(p, ':', &p);
        qemu_rbd_unescape(found_str);
        qdict_put_str(options, "snapshot", found_str);
    } else {
        image_name = qemu_rbd_next_tok(p, ':', &p);
    }
    /* Check for namespace in the image_name */
    if (strchr(image_name, '/')) {
        found_str = qemu_rbd_next_tok(image_name, '/', &image_name);
        qemu_rbd_unescape(found_str);
        qdict_put_str(options, "namespace", found_str);
    } else {
        qdict_put_str(options, "namespace", "");
    }
    qemu_rbd_unescape(image_name);
    qdict_put_str(options, "image", image_name);
    if (!p) {
        goto done;
    }

    /* The following are essentially all key/value pairs, and we treat
     * 'id' and 'conf' a bit special. Key/value pairs may be in any order. */
    while (p) {
        char *name, *value;
        name = qemu_rbd_next_tok(p, '=', &p);
        if (!p) {
            error_setg(errp, "conf option %s has no value", name);
            break;
        }

        qemu_rbd_unescape(name);

        value = qemu_rbd_next_tok(p, ':', &p);
        qemu_rbd_unescape(value);

        if (!strcmp(name, "conf")) {
            qdict_put_str(options, "conf", value);
        } else if (!strcmp(name, "id")) {
            qdict_put_str(options, "user", value);
        } else {
            /*
             * We pass these internally to qemu_rbd_set_keypairs(), so
             * we can get away with the simpler list of [ "key1",
             * "value1", "key2", "value2" ] rather than a raw dict
             * { "key1": "value1", "key2": "value2" } where we can't
             * guarantee order, or even a more correct but complex
             * [ { "key1": "value1" }, { "key2": "value2" } ]
             */
            if (!keypairs) {
                keypairs = qlist_new();
            }
            qlist_append_str(keypairs, name);
            qlist_append_str(keypairs, value);
        }
    }

    if (keypairs) {
        qdict_put(options, "=keyvalue-pairs",
                  qobject_to_json(QOBJECT(keypairs)));
    }

done:
    g_free(buf);
    qobject_unref(keypairs);
    return;
}


static void qemu_rbd_refresh_limits(BlockDriverState *bs, Error **errp)
{
    /* XXX Does RBD support AIO on less than 512-byte alignment? */
    bs->bl.request_alignment = 512;
}


static int qemu_rbd_set_auth(rados_t cluster, BlockdevOptionsRbd *opts,
                             Error **errp)
{
    char *key, *acr;
rbd: New parameter auth-client-required
Parameter auth-client-required lets you configure authentication
methods. We tried to provide that in v2.9.0, but backed out due to
interface design doubts (commit 464444fcc16).
This commit is similar to what we backed out, but simpler: we use a
list of enumeration values instead of a list of objects with a member
of enumeration type.
Let's review our reasons for backing out the first try, as stated in
the commit message:
* The implementation uses deprecated rados_conf_set() key
"auth_supported". No biggie.
Fixed: we use "auth-client-required".
* The implementation makes -drive silently ignore invalid parameters
"auth" and "auth-supported.*.X" where X isn't "auth". Fixable (in
fact I'm going to fix similar bugs around parameter server), so
again no biggie.
That fix is commit 2836284db60. This commit doesn't bring the bugs
back.
* BlockdevOptionsRbd member @password-secret applies only to
authentication method cephx. Should it be a variant member of
RbdAuthMethod?
We've had time to ponder, and we decided to stick to the way Ceph
configuration works: the key configured separately, and silently
ignored if the authentication method doesn't use it.
* BlockdevOptionsRbd member @user could apply to both methods cephx
and none, but I'm not sure it's actually used with none. If it
isn't, should it be a variant member of RbdAuthMethod?
Likewise.
* The client offers a *set* of authentication methods, not a list.
Should the methods be optional members of BlockdevOptionsRbd instead
of members of list @auth-supported? The latter begs the question
what multiple entries for the same method mean. Trivial question
now that RbdAuthMethod contains nothing but @type, but less so when
RbdAuthMethod acquires other members, such as the ones discussed above.
Again, we decided to stick to the way Ceph configuration works, except
we make auth-client-required a list of enumeration values instead of a
string containing keywords separated by delimiters.
* How BlockdevOptionsRbd member @auth-supported interacts with
settings from a configuration file specified with @conf is
undocumented. I suspect it's untested, too.
Not actually true, the documentation for @conf says "Values in the
configuration file will be overridden by options specified via QAPI",
and we've tested this.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-06-14 22:14:42 +03:00
|
|
|
int r;
|
|
|
|
GString *accu;
|
|
|
|
RbdAuthModeList *auth;
|
|
|
|
|
2018-06-14 22:14:43 +03:00
|
|
|
if (opts->key_secret) {
|
|
|
|
key = qcrypto_secret_lookup_as_base64(opts->key_secret, errp);
|
|
|
|
if (!key) {
|
|
|
|
return -EIO;
|
|
|
|
}
|
|
|
|
r = rados_conf_set(cluster, "key", key);
|
|
|
|
g_free(key);
|
|
|
|
if (r < 0) {
|
|
|
|
error_setg_errno(errp, -r, "Could not set 'key'");
|
|
|
|
return r;
|
2018-06-14 22:14:42 +03:00
|
|
|
}
|
2016-01-21 17:19:19 +03:00
|
|
|
}
|
|
|
|
|
2018-06-14 22:14:42 +03:00
|
|
|
if (opts->has_auth_client_required) {
|
|
|
|
accu = g_string_new("");
|
|
|
|
for (auth = opts->auth_client_required; auth; auth = auth->next) {
|
|
|
|
if (accu->str[0]) {
|
|
|
|
g_string_append_c(accu, ';');
|
|
|
|
}
|
|
|
|
g_string_append(accu, RbdAuthMode_str(auth->value));
|
|
|
|
}
|
|
|
|
acr = g_string_free(accu, FALSE);
|
|
|
|
r = rados_conf_set(cluster, "auth_client_required", acr);
|
|
|
|
g_free(acr);
|
|
|
|
if (r < 0) {
|
|
|
|
error_setg_errno(errp, -r,
|
|
|
|
"Could not set 'auth_client_required'");
|
|
|
|
return r;
|
|
|
|
}
|
|
|
|
}
|
2016-01-21 17:19:19 +03:00
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
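The loop in qemu_rbd_set_auth() above turns the list of authentication modes into the single ';'-separated string that rados_conf_set() expects for "auth_client_required". The following is a minimal sketch of that joining step only, written without GLib so it stands alone. The names AuthModeNode and join_auth_modes are hypothetical stand-ins for illustration, not QEMU or QAPI identifiers, and the fixed-size output buffer is an assumption of this sketch (the original uses a growable GString).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the QAPI-generated RbdAuthModeList. */
typedef struct AuthModeNode {
    const char *name;               /* e.g. "cephx" or "none" */
    struct AuthModeNode *next;
} AuthModeNode;

/*
 * Join the mode names with ';' into out, which holds cap bytes.
 * Returns 0 on success, -1 if the joined string would not fit.
 */
static int join_auth_modes(const AuthModeNode *list, char *out, size_t cap)
{
    size_t used = 0;

    assert(out && cap > 0);
    out[0] = '\0';
    for (const AuthModeNode *n = list; n; n = n->next) {
        size_t len = strlen(n->name);
        size_t sep = used ? 1 : 0;  /* separator only between entries */

        if (used + sep + len + 1 > cap) {
            return -1;
        }
        if (sep) {
            out[used++] = ';';
        }
        memcpy(out + used, n->name, len + 1);   /* copies the NUL too */
        used += len;
    }
    return 0;
}
```

For a list holding "cephx" followed by "none", the buffer ends up containing "cephx;none"; an empty list leaves it as the empty string.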
|
|
|
|
|
rbd: Fix regression in legacy key/values containing escaped :
Commit c7cacb3 accidentally broke legacy key-value parsing through
pseudo-filename parsing of -drive file=rbd://..., for any key that
contains an escaped ':'. Such a key is surprisingly common, thanks
to mon_host specifying a 'host:port' string. The break happens
because passing things from QDict through QemuOpts back to another
QDict requires that we pack our parsed key/value pairs into a string,
and then reparse that string, but the intermediate string that we
created ("key1=value1:key2=value2") lost the \: escaping that was
present in the original, so that we could no longer see which : were
used as separators vs. those used as part of the original input.
Fix it by collecting the key/value pairs through a QList, and
sending that list on a round trip through a JSON QString (as in
'["key1","value1","key2","value2"]') on its way through QemuOpts,
rather than hand-rolling our own string. Since the string is only
handled internally, this was faster than creating a full-blown
struct of '[{"key1":"value1"},{"key2":"value2"}]', and safer at
guaranteeing order compared to '{"key1":"value1","key2":"value2"}'.
It would be nicer if we didn't have to round-trip through QemuOpts
in the first place, but that's a much bigger task for later.
Reproducer:
./x86_64-softmmu/qemu-system-x86_64 -nodefaults -nographic -qmp stdio \
-drive 'file=rbd:volumes/volume-ea141b5c-cdb3-4765-910d-e7008b209a70'\
':id=compute:key=AQAVkvxXAAAAABAA9ZxWFYdRmV+DSwKr7BKKXg=='\
':auth_supported=cephx\;none:mon_host=192.168.1.2\:6789'\
',format=raw,if=none,id=drive-virtio-disk0,'\
'serial=ea141b5c-cdb3-4765-910d-e7008b209a70,cache=writeback'
Even without an RBD setup, this serves as a test of whether we get
the incorrect parser error of:
qemu-system-x86_64: -drive file=rbd:...cache=writeback: conf option 6789 has no value
or the correct behavior of hanging while trying to connect to
the requested mon_host of 192.168.1.2:6789.
Reported-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170331152730.12514-1-eblake@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-31 18:27:30 +03:00
|
|
|
static int qemu_rbd_set_keypairs(rados_t cluster, const char *keypairs_json,
|
2017-02-27 01:50:42 +03:00
|
|
|
Error **errp)
|
2011-05-27 03:07:32 +04:00
|
|
|
{
|
2017-03-31 18:27:30 +03:00
|
|
|
QList *keypairs;
|
|
|
|
QString *name;
|
|
|
|
QString *value;
|
|
|
|
const char *key;
|
|
|
|
size_t remaining;
|
2011-05-27 03:07:32 +04:00
|
|
|
int ret = 0;
|
|
|
|
|
2017-03-31 18:27:30 +03:00
|
|
|
if (!keypairs_json) {
|
|
|
|
return ret;
|
|
|
|
}
|
2018-02-24 18:40:29 +03:00
|
|
|
keypairs = qobject_to(QList,
|
|
|
|
qobject_from_json(keypairs_json, &error_abort));
|
2017-03-31 18:27:30 +03:00
|
|
|
remaining = qlist_size(keypairs) / 2;
|
|
|
|
assert(remaining);
|
|
|
|
|
|
|
|
while (remaining--) {
|
2018-02-24 18:40:29 +03:00
|
|
|
name = qobject_to(QString, qlist_pop(keypairs));
|
|
|
|
value = qobject_to(QString, qlist_pop(keypairs));
|
2017-03-31 18:27:30 +03:00
|
|
|
assert(name && value);
|
|
|
|
key = qstring_get_str(name);
|
|
|
|
|
|
|
|
ret = rados_conf_set(cluster, key, qstring_get_str(value));
|
2018-04-19 18:01:43 +03:00
|
|
|
qobject_unref(value);
|
2017-02-27 01:50:42 +03:00
|
|
|
if (ret < 0) {
|
2017-03-31 18:27:30 +03:00
|
|
|
error_setg_errno(errp, -ret, "invalid conf option %s", key);
|
2018-04-19 18:01:43 +03:00
|
|
|
qobject_unref(name);
|
2017-02-27 01:50:42 +03:00
|
|
|
ret = -EINVAL;
|
|
|
|
break;
|
2011-05-27 03:07:32 +04:00
|
|
|
}
|
2018-04-19 18:01:43 +03:00
|
|
|
qobject_unref(name);
|
2011-05-27 03:07:32 +04:00
|
|
|
}
|
|
|
|
|
2018-04-19 18:01:43 +03:00
|
|
|
qobject_unref(keypairs);
|
2011-05-27 03:07:32 +04:00
|
|
|
return ret;
|
|
|
|
}
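qemu_rbd_set_keypairs() above consumes the key/value pairs as a flat list of alternating names and values, which is what lets a value such as "192.168.1.2:6789" keep its ':' without any escaping. The sketch below illustrates that traversal on a plain C array. It is only an illustration under stated assumptions: apply_keypairs and demo_conf_set are hypothetical names, and demo_conf_set merely stands in for rados_conf_set() by rejecting empty keys.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical setter standing in for rados_conf_set(): rejects empty keys. */
static int demo_conf_set(const char *key, const char *value)
{
    (void)value;
    return key[0] ? 0 : -EINVAL;
}

/*
 * Apply pairs stored as ["key1", "value1", "key2", "value2", ...].
 * Stops at the first rejected key and reports it through *bad_key;
 * on success *bad_key is set to NULL.
 */
static int apply_keypairs(const char *const *items, size_t n_items,
                          const char **bad_key)
{
    assert(n_items % 2 == 0);       /* names and values come in pairs */
    for (size_t i = 0; i < n_items; i += 2) {
        if (demo_conf_set(items[i], items[i + 1]) < 0) {
            *bad_key = items[i];
            return -EINVAL;
        }
    }
    *bad_key = NULL;
    return 0;
}
```

Because each value is its own list element, no separator character ever has to be distinguished from payload, which is the property the commit message above relies on.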
|
|
|
|
|
2017-02-21 09:50:03 +03:00
|
|
|
static void qemu_rbd_memset(RADOSCB *rcb, int64_t offs)
|
|
|
|
{
|
|
|
|
if (LIBRBD_USE_IOVEC) {
|
|
|
|
RBDAIOCB *acb = rcb->acb;
|
|
|
|
iov_memset(acb->qiov->iov, acb->qiov->niov, offs, 0,
|
|
|
|
acb->qiov->size - offs);
|
|
|
|
} else {
|
|
|
|
memset(rcb->buf + offs, 0, rcb->size - offs);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2018-06-14 22:14:43 +03:00
|
|
|
/* FIXME Deprecate and remove keypairs or make it available in QMP. */
|
2018-01-31 18:27:38 +03:00
|
|
|
static int qemu_rbd_do_create(BlockdevCreateOptions *options,
|
|
|
|
const char *keypairs, const char *password_secret,
|
|
|
|
Error **errp)
|
2010-12-06 22:53:01 +03:00
|
|
|
{
|
2018-01-31 18:27:38 +03:00
|
|
|
BlockdevCreateOptionsRbd *opts = &options->u.rbd;
|
2011-05-27 03:07:31 +04:00
|
|
|
rados_t cluster;
|
|
|
|
rados_ioctx_t io_ctx;
|
2018-01-31 18:27:38 +03:00
|
|
|
int obj_order = 0;
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
assert(options->driver == BLOCKDEV_DRIVER_RBD);
|
|
|
|
if (opts->location->has_snapshot) {
|
|
|
|
error_setg(errp, "Can't use snapshot name for image creation");
|
|
|
|
return -EINVAL;
|
|
|
|
}
|
2010-12-06 22:53:01 +03:00
|
|
|
|
2018-01-31 18:27:38 +03:00
|
|
|
if (opts->has_cluster_size) {
|
|
|
|
int64_t objsize = opts->cluster_size;
|
2014-06-05 13:21:04 +04:00
|
|
|
if ((objsize - 1) & objsize) { /* not a power of 2? */
|
|
|
|
error_setg(errp, "obj size needs to be power of 2");
|
2018-01-31 18:27:38 +03:00
|
|
|
return -EINVAL;
|
2014-06-05 13:21:04 +04:00
|
|
|
}
|
|
|
|
if (objsize < 4096) {
|
|
|
|
error_setg(errp, "obj size too small");
|
2018-01-31 18:27:38 +03:00
|
|
|
return -EINVAL;
|
2010-12-06 22:53:01 +03:00
|
|
|
}
|
2015-03-23 18:29:26 +03:00
|
|
|
obj_order = ctz32(objsize);
|
2010-12-06 22:53:01 +03:00
|
|
|
}
|
|
|
|
|
2018-02-16 20:48:25 +03:00
|
|
|
ret = qemu_rbd_connect(&cluster, &io_ctx, opts->location, false, keypairs,
|
|
|
|
password_secret, errp);
|
2016-05-09 10:51:59 +03:00
|
|
|
if (ret < 0) {
|
2018-01-31 18:27:38 +03:00
|
|
|
return ret;
|
2010-12-06 22:53:01 +03:00
|
|
|
}
|
|
|
|
|
2018-01-31 18:27:38 +03:00
|
|
|
ret = rbd_create(io_ctx, opts->location->image, opts->size, &obj_order);
|
2016-05-09 10:51:59 +03:00
|
|
|
if (ret < 0) {
|
|
|
|
error_setg_errno(errp, -ret, "error rbd create");
|
2018-02-16 20:48:25 +03:00
|
|
|
goto out;
|
2016-05-09 10:51:59 +03:00
|
|
|
}
|
2010-12-06 22:53:01 +03:00
|
|
|
|
2018-01-31 18:27:38 +03:00
|
|
|
ret = 0;
|
2018-02-16 20:48:25 +03:00
|
|
|
out:
|
|
|
|
rados_ioctx_destroy(io_ctx);
|
2016-10-15 11:26:13 +03:00
|
|
|
rados_shutdown(cluster);
|
2018-01-31 18:27:38 +03:00
|
|
|
return ret;
|
|
|
|
}
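qemu_rbd_do_create() above validates the requested object size (a power of two, at least 4096 bytes) and converts it into the "order" passed to rbd_create(), that is, the base-2 logarithm of the size. The following sketch reproduces that validation and conversion in isolation. The helper name obj_size_to_order is hypothetical, the explicit rejection of non-positive sizes is an addition of this sketch, and a plain shift loop is used where the original calls QEMU's ctz32().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Convert an object size in bytes to an RBD-style order (log2 of the size).
 * Returns 0 and stores the order on success, -EINVAL for sizes that are
 * not a power of two or are smaller than 4096.
 */
static int obj_size_to_order(int64_t objsize, int *order)
{
    int bits = 0;

    assert(order);
    if (objsize <= 0 || ((objsize - 1) & objsize)) {
        return -EINVAL;             /* not a power of two */
    }
    if (objsize < 4096) {
        return -EINVAL;             /* too small */
    }
    while ((objsize >> bits) > 1) { /* count the trailing zero bits */
        bits++;
    }
    *order = bits;
    return 0;
}
```

With this helper, 4096 maps to order 12 and 4 MiB (4194304) maps to order 22, while 6000 and 2048 are both rejected.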
|
|
|
|
|
|
|
|
static int qemu_rbd_co_create(BlockdevCreateOptions *options, Error **errp)
|
|
|
|
{
|
|
|
|
return qemu_rbd_do_create(options, NULL, NULL, errp);
|
|
|
|
}
|
|
|
|
|
2020-03-26 04:12:17 +03:00
|
|
|
static int coroutine_fn qemu_rbd_co_create_opts(BlockDriver *drv,
|
|
|
|
const char *filename,
|
2018-01-31 18:27:38 +03:00
|
|
|
QemuOpts *opts,
|
|
|
|
Error **errp)
|
|
|
|
{
|
|
|
|
BlockdevCreateOptions *create_options;
|
|
|
|
BlockdevCreateOptionsRbd *rbd_opts;
|
|
|
|
BlockdevOptionsRbd *loc;
|
|
|
|
Error *local_err = NULL;
|
|
|
|
const char *keypairs, *password_secret;
|
|
|
|
QDict *options = NULL;
|
|
|
|
int ret = 0;
|
|
|
|
|
|
|
|
create_options = g_new0(BlockdevCreateOptions, 1);
|
|
|
|
create_options->driver = BLOCKDEV_DRIVER_RBD;
|
|
|
|
rbd_opts = &create_options->u.rbd;
|
|
|
|
|
|
|
|
rbd_opts->location = g_new0(BlockdevOptionsRbd, 1);
|
|
|
|
|
|
|
|
password_secret = qemu_opt_get(opts, "password-secret");
|
|
|
|
|
|
|
|
/* Read out options */
|
|
|
|
rbd_opts->size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
|
|
|
|
BDRV_SECTOR_SIZE);
|
|
|
|
rbd_opts->cluster_size = qemu_opt_get_size_del(opts,
|
|
|
|
BLOCK_OPT_CLUSTER_SIZE, 0);
|
|
|
|
rbd_opts->has_cluster_size = (rbd_opts->cluster_size != 0);
|
|
|
|
|
|
|
|
options = qdict_new();
|
|
|
|
qemu_rbd_parse_filename(filename, options, &local_err);
|
|
|
|
if (local_err) {
|
|
|
|
ret = -EINVAL;
|
|
|
|
error_propagate(errp, local_err);
|
|
|
|
goto exit;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Caution: while qdict_get_try_str() is fine, getting non-string
|
|
|
|
* types would require more care. When @options come from -blockdev
|
|
|
|
* or blockdev_add, its members are typed according to the QAPI
|
|
|
|
* schema, but when they come from -drive, they're all QString.
|
|
|
|
*/
|
|
|
|
loc = rbd_opts->location;
|
block/rbd: Add support for ceph namespaces
Starting from ceph Nautilus, RBD has support for namespaces, allowing
for finer-grained ACLs on images inside a pool, and tenant isolation.
In the rbd cli tool documentation, the new image-spec and snap-spec are:
- [pool-name/[namespace-name/]]image-name
- [pool-name/[namespace-name/]]image-name@snap-name
When using a qemu without namespace support, it complains about not
finding the image called namespace-name/image-name. Thus we only need to
parse the image name once more to see whether it contains a '/', and if
it does, use what precedes it as the name of the namespace, to be passed
later to rados_ioctx_set_namespace.
If rados_ioctx_set_namespace is called with an empty string or a null
pointer as the namespace parameter, it pretty much does nothing, as it
then falls back to the default namespace.
The namespace is extracted inside qemu_rbd_parse_filename, stored in the
qdict, and used in qemu_rbd_connect to make it work with both qemu-img,
and qemu itself.
Signed-off-by: Florian Florensa <fflorensa@online.net>
Message-Id: <20200110111513.321728-2-fflorensa@online.net>
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-01-10 14:15:13 +03:00
|
|
|
loc->pool = g_strdup(qdict_get_try_str(options, "pool"));
|
|
|
|
loc->conf = g_strdup(qdict_get_try_str(options, "conf"));
|
|
|
|
loc->has_conf = !!loc->conf;
|
|
|
|
loc->user = g_strdup(qdict_get_try_str(options, "user"));
|
|
|
|
loc->has_user = !!loc->user;
|
|
|
|
loc->q_namespace = g_strdup(qdict_get_try_str(options, "namespace"));
|
|
|
|
loc->image = g_strdup(qdict_get_try_str(options, "image"));
|
|
|
|
keypairs = qdict_get_try_str(options, "=keyvalue-pairs");
|
2018-01-31 18:27:38 +03:00
|
|
|
|
|
|
|
ret = qemu_rbd_do_create(create_options, keypairs, password_secret, errp);
|
|
|
|
if (ret < 0) {
|
|
|
|
goto exit;
|
|
|
|
}
|
2017-02-27 01:50:42 +03:00
|
|
|
|
|
|
|
exit:
|
2018-04-19 18:01:43 +03:00
|
|
|
qobject_unref(options);
|
2018-01-31 18:27:38 +03:00
|
|
|
qapi_free_BlockdevCreateOptions(create_options);
|
2010-12-06 22:53:01 +03:00
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
2013-12-05 19:38:33 +04:00
|
|
|
* This aio completion is being called from rbd_finish_bh() and runs in qemu
|
|
|
|
* BH context.
|
2010-12-06 22:53:01 +03:00
|
|
|
*/
|
2011-05-27 03:07:31 +04:00
|
|
|
static void qemu_rbd_complete_aio(RADOSCB *rcb)
|
2010-12-06 22:53:01 +03:00
|
|
|
{
|
|
|
|
RBDAIOCB *acb = rcb->acb;
|
|
|
|
int64_t r;
|
|
|
|
|
|
|
|
r = rcb->ret;
|
|
|
|
|
2013-03-30 00:03:23 +04:00
|
|
|
if (acb->cmd != RBD_AIO_READ) {
|
2010-12-06 22:53:01 +03:00
|
|
|
if (r < 0) {
|
|
|
|
acb->ret = r;
|
|
|
|
acb->error = 1;
|
|
|
|
} else if (!acb->error) {
|
2011-05-27 03:07:31 +04:00
|
|
|
acb->ret = rcb->size;
|
2010-12-06 22:53:01 +03:00
|
|
|
}
|
|
|
|
} else {
|
2011-05-27 03:07:31 +04:00
|
|
|
if (r < 0) {
|
2017-02-21 09:50:03 +03:00
|
|
|
qemu_rbd_memset(rcb, 0);
|
2010-12-06 22:53:01 +03:00
|
|
|
acb->ret = r;
|
|
|
|
acb->error = 1;
|
2011-05-27 03:07:31 +04:00
|
|
|
} else if (r < rcb->size) {
|
2017-02-21 09:50:03 +03:00
|
|
|
qemu_rbd_memset(rcb, r);
|
2010-12-06 22:53:01 +03:00
|
|
|
if (!acb->error) {
|
2011-05-27 03:07:31 +04:00
|
|
|
acb->ret = rcb->size;
|
2010-12-06 22:53:01 +03:00
|
|
|
}
|
|
|
|
} else if (!acb->error) {
|
2011-05-27 03:07:31 +04:00
|
|
|
acb->ret = r;
|
2010-12-06 22:53:01 +03:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2013-12-05 19:38:33 +04:00
|
|
|
g_free(rcb);
|
2010-12-06 22:53:01 +03:00
|
|
|
|
2017-02-21 09:50:03 +03:00
|
|
|
if (!LIBRBD_USE_IOVEC) {
|
|
|
|
if (acb->cmd == RBD_AIO_READ) {
|
|
|
|
qemu_iovec_from_buf(acb->qiov, 0, acb->bounce, acb->qiov->size);
|
|
|
|
}
|
|
|
|
qemu_vfree(acb->bounce);
|
2013-12-05 19:38:33 +04:00
|
|
|
}
|
2017-02-21 09:50:03 +03:00
|
|
|
|
2013-12-05 19:38:33 +04:00
|
|
|
acb->common.cb(acb->common.opaque, (acb->ret > 0 ? 0 : acb->ret));
|
2010-12-06 22:53:01 +03:00
|
|
|
|
2014-09-11 09:41:28 +04:00
|
|
|
qemu_aio_unref(acb);
|
2010-12-06 22:53:01 +03:00
|
|
|
}
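For reads, qemu_rbd_complete_aio() above zero-fills whatever part of the buffer was not delivered: the whole buffer on error, or the tail after a short read, which is then reported as a full-length read. The sketch below shows that rule on a flat byte buffer. The function finish_read is a hypothetical name for illustration only, and it leaves out the acb->error bookkeeping and the iovec variant handled by qemu_rbd_memset().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * got is the number of bytes actually read, or a negative errno value.
 * On error the whole buffer is zeroed and the error is returned.
 * On a short read the unread tail is zeroed and the full size is returned.
 */
static int64_t finish_read(unsigned char *buf, size_t size, int64_t got)
{
    assert(buf);
    if (got < 0) {
        memset(buf, 0, size);
        return got;
    }
    if ((uint64_t)got < size) {
        memset(buf + got, 0, size - (size_t)got);
    }
    return (int64_t)size;
}
```

Padding with zeros means a caller never sees stale buffer contents for regions the backend did not return.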
|
|
|
|
|
2018-02-15 22:58:24 +03:00
|
|
|
static char *qemu_rbd_mon_host(BlockdevOptionsRbd *opts, Error **errp)
|
2017-02-27 20:36:46 +03:00
|
|
|
{
|
2018-02-15 22:58:24 +03:00
|
|
|
const char **vals;
|
rbd: Fix bugs around -drive parameter "server"
qemu_rbd_open() takes option parameters as a flattened QDict, with
keys of the form server.%d.host, server.%d.port, where %d counts up
from zero.
qemu_rbd_array_opts() extracts these values as follows. First, it
calls qdict_array_entries() to find the list's length. For each list
element, it formats the list's key prefix (e.g. "server.0."), then
creates a new QDict holding the options with that key prefix, then
converts that to a QemuOpts, so it can finally get the member values
from there.
If there's one surefire way to make code using QDict more awkward,
it's creating more of them and mixing in QemuOpts for good measure.
The extraction of keys starting with server.%d into another QDict
makes us ignore parameters like server.0.neither-host-nor-port
silently.
The conversion to QemuOpts abuses runtime_opts, as described a few
commits ago.
Rewrite to simply get the values straight from the options QDict.
Fixes -drive not to crash when server.*.* are present, but
server.*.host is absent.
Fixes -drive to reject invalid server.*.*.
Permits cleaning up runtime_opts. Do that, and fix -drive to reject
bogus parameters host and port instead of silently ignoring them.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1490691368-32099-11-git-send-email-armbru@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-28 11:56:08 +03:00
|
|
|
const char *host, *port;
|
|
|
|
char *rados_str;
|
2018-02-15 22:58:24 +03:00
|
|
|
InetSocketAddressBaseList *p;
|
|
|
|
int i, cnt;
|
|
|
|
|
|
|
|
if (!opts->has_server) {
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
|
|
|
|
for (cnt = 0, p = opts->server; p; p = p->next) {
|
|
|
|
cnt++;
|
|
|
|
}
|
|
|
|
|
|
|
|
vals = g_new(const char *, cnt + 1);
|
|
|
|
|
|
|
|
for (i = 0, p = opts->server; p; p = p->next, i++) {
|
|
|
|
host = p->value->host;
|
|
|
|
port = p->value->port;
|
2017-02-27 20:36:46 +03:00
|
|
|
|
2017-03-28 11:56:08 +03:00
|
|
|
if (strchr(host, ':')) {
|
2018-02-15 22:58:24 +03:00
|
|
|
vals[i] = g_strdup_printf("[%s]:%s", host, port);
|
2017-02-27 20:36:46 +03:00
|
|
|
} else {
|
2018-02-15 22:58:24 +03:00
|
|
|
vals[i] = g_strdup_printf("%s:%s", host, port);
|
2017-02-27 20:36:46 +03:00
|
|
|
}
|
|
|
|
}
|
2017-03-28 11:56:08 +03:00
|
|
|
vals[i] = NULL;
|
2017-02-27 20:36:46 +03:00
|
|
|
|
2017-03-28 11:56:08 +03:00
|
|
|
rados_str = i ? g_strjoinv(";", (char **)vals) : NULL;
|
|
|
|
g_strfreev((char **)vals);
|
2017-02-27 20:36:46 +03:00
|
|
|
return rados_str;
|
|
|
|
}
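/*
 * Illustrative sketch, not part of the driver: the per-server formatting
 * done by qemu_rbd_mon_host() above, written with plain libc instead of
 * GLib. The name sketch_format_mon_addr() and the sample address in the
 * usage note are assumptions made for this example only.
 */
#include <stdio.h>
#include <string.h>

/*
 * Write "host:port", or "[host]:port" when host contains ':' (an IPv6
 * literal). Returns 0 on success, -1 if the result does not fit in buf.
 */
static int sketch_format_mon_addr(char *buf, size_t len,
                                  const char *host, const char *port)
{
    int n;

    if (strchr(host, ':')) {
        n = snprintf(buf, len, "[%s]:%s", host, port);
    } else {
        n = snprintf(buf, len, "%s:%s", host, port);
    }
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Usage: host "fe80::1" with port "6789" yields "[fe80::1]:6789". The
 * driver then joins all such entries with ';' to form the mon_host value.
 */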
|
|
|
|
|
2018-02-15 21:13:47 +03:00
|
|
|
static int qemu_rbd_connect(rados_t *cluster, rados_ioctx_t *io_ctx,
|
2018-02-15 22:58:24 +03:00
|
|
|
BlockdevOptionsRbd *opts, bool cache,
|
2018-02-15 22:31:04 +03:00
|
|
|
const char *keypairs, const char *secretid,
|
|
|
|
Error **errp)
|
2010-12-06 22:53:01 +03:00
|
|
|
{
|
2017-02-27 20:36:46 +03:00
|
|
|
char *mon_host = NULL;
|
2018-02-15 21:13:47 +03:00
|
|
|
Error *local_err = NULL;
|
2010-12-06 22:53:01 +03:00
|
|
|
int r;
|
|
|
|
|
2018-06-14 22:14:43 +03:00
|
|
|
if (secretid) {
|
|
|
|
if (opts->key_secret) {
|
|
|
|
error_setg(errp,
|
|
|
|
"Legacy 'password-secret' clashes with 'key-secret'");
|
|
|
|
return -EINVAL;
|
|
|
|
}
|
|
|
|
opts->key_secret = g_strdup(secretid);
|
|
|
|
opts->has_key_secret = true;
|
|
|
|
}
|
|
|
|
|
2018-02-15 22:58:24 +03:00
|
|
|
mon_host = qemu_rbd_mon_host(opts, &local_err);
|
2017-02-27 20:36:46 +03:00
|
|
|
if (local_err) {
|
|
|
|
error_propagate(errp, local_err);
|
|
|
|
r = -EINVAL;
|
|
|
|
goto failed_opts;
|
|
|
|
}
|
|
|
|
|
2018-02-15 22:58:24 +03:00
|
|
|
r = rados_create(cluster, opts->user);
|
2011-05-27 03:07:31 +04:00
|
|
|
if (r < 0) {
|
2016-05-09 10:51:59 +03:00
|
|
|
error_setg_errno(errp, -r, "error initializing");
|
2013-04-25 17:59:27 +04:00
|
|
|
goto failed_opts;
|
2010-12-06 22:53:01 +03:00
|
|
|
}
|
|
|
|
|
2017-02-27 01:50:42 +03:00
|
|
|
/* try default location when conf=NULL, but ignore failure */
|
2018-02-15 22:58:24 +03:00
|
|
|
r = rados_conf_read_file(*cluster, opts->conf);
|
|
|
|
if (opts->has_conf && r < 0) {
|
|
|
|
error_setg_errno(errp, -r, "error reading conf file %s", opts->conf);
|
2017-02-27 01:50:42 +03:00
|
|
|
goto failed_shutdown;
|
2015-06-11 06:28:45 +03:00
|
|
|
}
|
|
|
|
|
2018-02-15 21:13:47 +03:00
|
|
|
r = qemu_rbd_set_keypairs(*cluster, keypairs, errp);
|
2017-02-27 01:50:42 +03:00
|
|
|
if (r < 0) {
|
|
|
|
goto failed_shutdown;
|
2015-06-11 06:28:45 +03:00
|
|
|
}
|
|
|
|
|
2017-02-27 20:36:46 +03:00
|
|
|
if (mon_host) {
|
2018-02-15 21:13:47 +03:00
|
|
|
r = rados_conf_set(*cluster, "mon_host", mon_host);
|
2017-02-27 20:36:46 +03:00
|
|
|
if (r < 0) {
|
|
|
|
goto failed_shutdown;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2018-06-14 22:14:43 +03:00
|
|
|
r = qemu_rbd_set_auth(*cluster, opts, errp);
|
|
|
|
if (r < 0) {
|
2016-01-21 17:19:19 +03:00
|
|
|
goto failed_shutdown;
|
|
|
|
}
|
|
|
|
|
2012-05-18 00:42:29 +04:00
|
|
|
/*
|
|
|
|
* Fall back to more conservative semantics if setting cache
|
|
|
|
* options fails. Ignore errors from setting rbd_cache because the
|
|
|
|
* only possible error is that the option does not exist, and
|
|
|
|
* librbd defaults to no caching. If write through caching cannot
|
|
|
|
* be set up, fall back to no caching.
|
|
|
|
*/
|
2018-02-15 21:13:47 +03:00
|
|
|
if (cache) {
|
|
|
|
rados_conf_set(*cluster, "rbd_cache", "true");
|
2012-05-18 00:42:29 +04:00
|
|
|
} else {
|
2018-02-15 21:13:47 +03:00
|
|
|
rados_conf_set(*cluster, "rbd_cache", "false");
|
2012-05-18 00:42:29 +04:00
|
|
|
}
|
|
|
|
|
2018-02-15 21:13:47 +03:00
|
|
|
r = rados_connect(*cluster);
|
2011-05-27 03:07:31 +04:00
|
|
|
if (r < 0) {
|
2016-05-09 10:51:59 +03:00
|
|
|
error_setg_errno(errp, -r, "error connecting");
|
2011-09-07 20:28:06 +04:00
|
|
|
goto failed_shutdown;
|
2010-12-06 22:53:01 +03:00
|
|
|
}
|
|
|
|
|
2018-02-15 22:58:24 +03:00
|
|
|
r = rados_ioctx_create(*cluster, opts->pool, io_ctx);
|
2011-05-27 03:07:31 +04:00
|
|
|
if (r < 0) {
|
2018-02-15 22:58:24 +03:00
|
|
|
error_setg_errno(errp, -r, "error opening pool %s", opts->pool);
|
2011-09-07 20:28:06 +04:00
|
|
|
goto failed_shutdown;
|
2010-12-06 22:53:01 +03:00
|
|
|
}
|
block/rbd: Add support for ceph namespaces
Starting from ceph Nautilus, RBD has support for namespaces, allowing
for finer grain ACLs on images inside a pool, and tenant isolation.
In the rbd cli tool documentation, the new image-spec and snap-spec are :
- [pool-name/[namespace-name/]]image-name
- [pool-name/[namespace-name/]]image-name@snap-name
When using a qemu without namespace support, it complains about not
finding the image called namespace-name/image-name, thus we only need to
parse the image once again to find if there is a '/' in its name, and if
there is, use what is before it as the name of the namespace to later
pass it to rados_ioctx_set_namespace.
rados_ioctx_set_namespace, if called with an empty string or a null
pointer as the namespace parameters pretty much does nothing, as it then
defaults to the default namespace.
The namespace is extracted inside qemu_rbd_parse_filename, stored in the
qdict, and used in qemu_rbd_connect to make it work with both qemu-img,
and qemu itself.
Signed-off-by: Florian Florensa <fflorensa@online.net>
Message-Id: <20200110111513.321728-2-fflorensa@online.net>
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-01-10 14:15:13 +03:00
|
|
|
/*
|
|
|
|
* Set the namespace after opening the io context on the pool,
|
|
|
|
* if nspace == NULL or if nspace == "", it is just as if we did nothing
|
|
|
|
*/
|
|
|
|
rados_ioctx_set_namespace(*io_ctx, opts->q_namespace);
|
2010-12-06 22:53:01 +03:00
|
|
|
|
2018-02-15 21:13:47 +03:00
|
|
|
return 0;
|
|
|
|
|
|
|
|
failed_shutdown:
|
|
|
|
rados_shutdown(*cluster);
|
|
|
|
failed_opts:
|
|
|
|
g_free(mon_host);
|
|
|
|
return r;
|
|
|
|
}
|
|
|
|
|
2018-09-12 01:32:30 +03:00
|
|
|
static int qemu_rbd_convert_options(QDict *options, BlockdevOptionsRbd **opts,
|
|
|
|
Error **errp)
|
|
|
|
{
|
|
|
|
Visitor *v;
|
|
|
|
|
|
|
|
/* Convert the remaining options into a QAPI object */
|
|
|
|
v = qobject_input_visitor_new_flat_confused(options, errp);
|
|
|
|
if (!v) {
|
|
|
|
return -EINVAL;
|
|
|
|
}
|
|
|
|
|
2020-07-07 19:06:07 +03:00
|
|
|
visit_type_BlockdevOptionsRbd(v, NULL, opts, errp);
|
2018-09-12 01:32:30 +03:00
|
|
|
visit_free(v);
|
2020-07-07 19:06:07 +03:00
|
|
|
if (!*opts) {
|
2018-09-12 01:32:30 +03:00
|
|
|
return -EINVAL;
|
|
|
|
}
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2018-09-12 01:32:31 +03:00
|
|
|
static int qemu_rbd_attempt_legacy_options(QDict *options,
|
|
|
|
BlockdevOptionsRbd **opts,
|
|
|
|
char **keypairs)
|
|
|
|
{
|
|
|
|
char *filename;
|
|
|
|
int r;
|
|
|
|
|
|
|
|
filename = g_strdup(qdict_get_try_str(options, "filename"));
|
|
|
|
if (!filename) {
|
|
|
|
return -EINVAL;
|
|
|
|
}
|
|
|
|
qdict_del(options, "filename");
|
|
|
|
|
|
|
|
qemu_rbd_parse_filename(filename, options, NULL);
|
|
|
|
|
|
|
|
/* keypairs freed by caller */
|
|
|
|
*keypairs = g_strdup(qdict_get_try_str(options, "=keyvalue-pairs"));
|
|
|
|
if (*keypairs) {
|
|
|
|
qdict_del(options, "=keyvalue-pairs");
|
|
|
|
}
|
|
|
|
|
|
|
|
r = qemu_rbd_convert_options(options, opts, NULL);
|
|
|
|
|
|
|
|
g_free(filename);
|
|
|
|
return r;
|
|
|
|
}
|
|
|
|
|
2018-02-15 21:13:47 +03:00
|
|
|
static int qemu_rbd_open(BlockDriverState *bs, QDict *options, int flags,
|
|
|
|
Error **errp)
|
|
|
|
{
|
|
|
|
BDRVRBDState *s = bs->opaque;
|
2018-02-15 22:58:24 +03:00
|
|
|
BlockdevOptionsRbd *opts = NULL;
|
2018-04-04 18:40:45 +03:00
|
|
|
const QDictEntry *e;
|
2018-02-15 21:13:47 +03:00
|
|
|
Error *local_err = NULL;
|
2018-02-15 22:31:04 +03:00
|
|
|
char *keypairs, *secretid;
|
2018-02-15 21:13:47 +03:00
|
|
|
int r;
|
|
|
|
|
2018-02-15 22:31:04 +03:00
|
|
|
keypairs = g_strdup(qdict_get_try_str(options, "=keyvalue-pairs"));
|
|
|
|
if (keypairs) {
|
|
|
|
qdict_del(options, "=keyvalue-pairs");
|
|
|
|
}
|
|
|
|
|
|
|
|
secretid = g_strdup(qdict_get_try_str(options, "password-secret"));
|
|
|
|
if (secretid) {
|
|
|
|
qdict_del(options, "password-secret");
|
|
|
|
}
|
|
|
|
|
2018-09-12 01:32:30 +03:00
|
|
|
r = qemu_rbd_convert_options(options, &opts, &local_err);
|
2018-02-15 22:58:24 +03:00
|
|
|
if (local_err) {
|
2018-09-12 01:32:31 +03:00
|
|
|
/* If keypairs are present, that means some options are present in
|
|
|
|
* the modern option format. Don't attempt to parse legacy option
|
|
|
|
* formats, as we won't support mixed usage. */
|
|
|
|
if (keypairs) {
|
|
|
|
error_propagate(errp, local_err);
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* If the initial attempt to convert and process the options failed,
|
|
|
|
* we may be attempting to open an image file that has the rbd options
|
|
|
|
* specified in the older format consisting of all key/value pairs
|
|
|
|
* encoded in the filename. Go ahead and attempt to parse the
|
|
|
|
* filename, and see if we can pull out the required options. */
|
|
|
|
r = qemu_rbd_attempt_legacy_options(options, &opts, &keypairs);
|
|
|
|
if (r < 0) {
|
|
|
|
/* Propagate the original error, not the legacy parsing fallback
|
|
|
|
* error, as the latter was just a best-effort attempt. */
|
|
|
|
error_propagate(errp, local_err);
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
/* Take care whenever deciding to actually deprecate; once this ability
|
|
|
|
* is removed, we will not be able to open any images with legacy-styled
|
|
|
|
* backing image strings. */
|
2018-10-17 11:26:27 +03:00
|
|
|
warn_report("RBD options encoded in the filename as keyvalue pairs "
|
|
|
|
"are deprecated");
|
2018-02-15 22:58:24 +03:00
|
|
|
}
|
|
|
|
|
2018-04-04 18:40:45 +03:00
|
|
|
/* Remove the processed options from the QDict (the visitor processes
|
|
|
|
* _all_ options in the QDict) */
|
|
|
|
while ((e = qdict_first(options))) {
|
|
|
|
qdict_del(options, e->key);
|
|
|
|
}
|
|
|
|
|
2018-02-16 20:54:52 +03:00
|
|
|
r = qemu_rbd_connect(&s->cluster, &s->io_ctx, opts,
|
|
|
|
!(flags & BDRV_O_NOCACHE), keypairs, secretid, errp);
|
2018-02-15 21:13:47 +03:00
|
|
|
if (r < 0) {
|
2018-02-15 22:31:04 +03:00
|
|
|
goto out;
|
2018-02-15 21:13:47 +03:00
|
|
|
}
|
|
|
|
|
2018-02-16 20:54:52 +03:00
|
|
|
s->snap = g_strdup(opts->snapshot);
|
|
|
|
s->image_name = g_strdup(opts->image);
|
|
|
|
|
block: do not set BDS read_only if copy_on_read enabled
A few block drivers will set the BDS read_only flag from their
.bdrv_open() function. This means the bs->read_only flag could
be set after we enable copy_on_read, as the BDRV_O_COPY_ON_READ
flag check occurs prior to the call to bdrv->bdrv_open().
This adds an error return to bdrv_set_read_only(), and an error will be
returned if we try to set the BDS to read_only while copy_on_read is
enabled.
This patch also changes the behavior of vvfat. Before, vvfat could
override the drive 'readonly' flag with its own, internal 'rw' flag.
For instance, this -drive parameter would result in a writable image:
"-drive format=vvfat,dir=/tmp/vvfat,rw,if=virtio,readonly=on"
This is not correct. Now, attempting to use the above -drive parameter
will result in an error (i.e., 'rw' is incompatible with 'readonly=on').
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 0c5b4c1cc2c651471b131f21376dfd5ea24d2196.1491597120.git.jcody@redhat.com
2017-04-07 23:55:26 +03:00
|
|
|
/* rbd_open is always r/w */
|
2017-04-07 23:55:31 +03:00
|
|
|
r = rbd_open(s->io_ctx, s->image_name, &s->image, s->snap);
|
2010-12-06 22:53:01 +03:00
|
|
|
if (r < 0) {
|
2017-04-07 23:55:31 +03:00
|
|
|
error_setg_errno(errp, -r, "error reading header from %s",
|
|
|
|
s->image_name);
|
2011-09-07 20:28:06 +04:00
|
|
|
goto failed_open;
|
2010-12-06 22:53:01 +03:00
|
|
|
}
|
|
|
|
|
2019-05-09 17:59:27 +03:00
|
|
|
r = rbd_get_size(s->image, &s->image_size);
|
|
|
|
if (r < 0) {
|
|
|
|
error_setg_errno(errp, -r, "error getting image size from %s",
|
|
|
|
s->image_name);
|
|
|
|
rbd_close(s->image);
|
|
|
|
goto failed_open;
|
|
|
|
}
|
|
|
|
|
2017-04-07 23:55:26 +03:00
|
|
|
/* If we are using an rbd snapshot, we must be r/o, otherwise
|
|
|
|
* leave as-is */
|
|
|
|
if (s->snap != NULL) {
|
2018-10-12 12:27:41 +03:00
|
|
|
r = bdrv_apply_auto_read_only(bs, "rbd snapshots are read-only", errp);
|
|
|
|
if (r < 0) {
|
|
|
|
rbd_close(s->image);
|
|
|
|
goto failed_open;
|
2017-04-07 23:55:26 +03:00
|
|
|
}
|
|
|
|
}
|
2010-12-06 22:53:01 +03:00
|
|
|
|
2020-04-28 23:29:00 +03:00
|
|
|
/* When extending an RBD image, the new area reads as zeros */
|
|
|
|
bs->supported_truncate_flags = BDRV_REQ_ZERO_WRITE;
|
|
|
|
|
2018-02-15 22:31:04 +03:00
|
|
|
r = 0;
|
|
|
|
goto out;
|
2010-12-06 22:53:01 +03:00
|
|
|
|
2011-09-07 20:28:06 +04:00
|
|
|
failed_open:
|
2011-05-27 03:07:31 +04:00
|
|
|
rados_ioctx_destroy(s->io_ctx);
|
2011-09-07 20:28:06 +04:00
|
|
|
g_free(s->snap);
|
2017-04-07 23:55:31 +03:00
|
|
|
g_free(s->image_name);
|
2018-02-15 21:13:47 +03:00
|
|
|
rados_shutdown(s->cluster);
|
2018-02-15 22:31:04 +03:00
|
|
|
out:
|
2018-02-15 22:58:24 +03:00
|
|
|
qapi_free_BlockdevOptionsRbd(opts);
|
2018-02-15 22:31:04 +03:00
|
|
|
g_free(keypairs);
|
|
|
|
g_free(secretid);
|
2010-12-06 22:53:01 +03:00
|
|
|
return r;
|
|
|
|
}
|
|
|
|
|
2017-04-07 23:55:32 +03:00
|
|
|
|
|
|
|
/* Since RBD is currently always opened R/W via the API,
|
|
|
|
* we just need to check if we are using a snapshot or not, in
|
|
|
|
* order to determine if we will allow it to be R/W */
|
|
|
|
static int qemu_rbd_reopen_prepare(BDRVReopenState *state,
|
|
|
|
BlockReopenQueue *queue, Error **errp)
|
|
|
|
{
|
|
|
|
BDRVRBDState *s = state->bs->opaque;
|
|
|
|
int ret = 0;
|
|
|
|
|
|
|
|
if (s->snap && state->flags & BDRV_O_RDWR) {
|
|
|
|
error_setg(errp,
|
|
|
|
"Cannot change node '%s' to r/w when using RBD snapshot",
|
|
|
|
bdrv_get_device_or_node_name(state->bs));
|
|
|
|
ret = -EINVAL;
|
|
|
|
}
|
|
|
|
|
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
2011-05-27 03:07:31 +04:00
|
|
|
static void qemu_rbd_close(BlockDriverState *bs)
|
2010-12-06 22:53:01 +03:00
|
|
|
{
|
|
|
|
BDRVRBDState *s = bs->opaque;
|
|
|
|
|
2011-05-27 03:07:31 +04:00
|
|
|
rbd_close(s->image);
|
|
|
|
rados_ioctx_destroy(s->io_ctx);
|
2011-08-21 07:09:37 +04:00
|
|
|
g_free(s->snap);
|
2017-04-07 23:55:31 +03:00
|
|
|
g_free(s->image_name);
|
2011-05-27 03:07:31 +04:00
|
|
|
rados_shutdown(s->cluster);
|
2010-12-06 22:53:01 +03:00
|
|
|
}
|
|
|
|
|
2019-05-09 17:59:27 +03:00
|
|
|
/* Resize the RBD image and update the 'image_size' with the current size */
|
|
|
|
static int qemu_rbd_resize(BlockDriverState *bs, uint64_t size)
|
|
|
|
{
|
|
|
|
BDRVRBDState *s = bs->opaque;
|
|
|
|
int r;
|
|
|
|
|
|
|
|
r = rbd_resize(s->image, size);
|
|
|
|
if (r < 0) {
|
|
|
|
return r;
|
|
|
|
}
|
|
|
|
|
|
|
|
s->image_size = size;
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2012-10-31 19:34:37 +04:00
|
|
|
static const AIOCBInfo rbd_aiocb_info = {
|
2010-12-06 22:53:01 +03:00
|
|
|
.aiocb_size = sizeof(RBDAIOCB),
|
|
|
|
};
|
|
|
|
|
2013-12-05 19:38:33 +04:00
|
|
|
static void rbd_finish_bh(void *opaque)
|
2010-12-06 22:53:01 +03:00
|
|
|
{
|
2013-12-05 19:38:33 +04:00
|
|
|
RADOSCB *rcb = opaque;
|
|
|
|
qemu_rbd_complete_aio(rcb);
|
2011-05-27 03:07:31 +04:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* This is the callback function for rbd_aio_read and _write
|
|
|
|
*
|
|
|
|
* Note: this function is called from a non-QEMU thread, so
|
|
|
|
* we need to be careful about what we do here. Generally we only
|
2013-12-05 19:38:33 +04:00
|
|
|
* schedule a BH, and do the rest of the io completion handling
|
|
|
|
* from rbd_finish_bh() which runs in a qemu context.
|
2011-05-27 03:07:31 +04:00
|
|
|
*/
|
|
|
|
static void rbd_finish_aiocb(rbd_completion_t c, RADOSCB *rcb)
|
|
|
|
{
|
2013-12-05 19:38:33 +04:00
|
|
|
RBDAIOCB *acb = rcb->acb;
|
|
|
|
|
2011-05-27 03:07:31 +04:00
|
|
|
rcb->ret = rbd_aio_get_return_value(c);
|
|
|
|
rbd_aio_release(c);
|
2010-12-06 22:53:01 +03:00
|
|
|
|
2019-09-17 14:58:19 +03:00
|
|
|
replay_bh_schedule_oneshot_event(bdrv_get_aio_context(acb->common.bs),
|
|
|
|
rbd_finish_bh, rcb);
|
2010-12-06 22:53:01 +03:00
|
|
|
}
|
|
|
|
|
2012-05-01 10:16:45 +04:00
|
|
|
static int rbd_aio_discard_wrapper(rbd_image_t image,
|
|
|
|
uint64_t off,
|
|
|
|
uint64_t len,
|
|
|
|
rbd_completion_t comp)
|
|
|
|
{
|
|
|
|
#ifdef LIBRBD_SUPPORTS_DISCARD
|
|
|
|
return rbd_aio_discard(image, off, len, comp);
|
|
|
|
#else
|
|
|
|
return -ENOTSUP;
|
|
|
|
#endif
|
|
|
|
}
|
|
|
|
|
2013-03-30 00:03:23 +04:00
|
|
|
static int rbd_aio_flush_wrapper(rbd_image_t image,
|
|
|
|
rbd_completion_t comp)
|
|
|
|
{
|
|
|
|
#ifdef LIBRBD_SUPPORTS_AIO_FLUSH
|
|
|
|
return rbd_aio_flush(image, comp);
|
|
|
|
#else
|
|
|
|
return -ENOTSUP;
|
|
|
|
#endif
|
|
|
|
}
|
|
|
|
|
2014-10-07 15:59:14 +04:00
|
|
|
static BlockAIOCB *rbd_start_aio(BlockDriverState *bs,
|
2016-07-16 02:22:56 +03:00
|
|
|
int64_t off,
|
2014-10-07 15:59:14 +04:00
|
|
|
QEMUIOVector *qiov,
|
2016-07-16 02:22:56 +03:00
|
|
|
int64_t size,
|
2014-10-07 15:59:15 +04:00
|
|
|
BlockCompletionFunc *cb,
|
2014-10-07 15:59:14 +04:00
|
|
|
void *opaque,
|
|
|
|
RBDAIOCmd cmd)
|
2010-12-06 22:53:01 +03:00
|
|
|
{
|
|
|
|
RBDAIOCB *acb;
|
2014-05-21 20:11:48 +04:00
|
|
|
RADOSCB *rcb = NULL;
|
2011-05-27 03:07:31 +04:00
|
|
|
rbd_completion_t c;
|
2011-05-27 03:07:33 +04:00
|
|
|
int r;
|
2010-12-06 22:53:01 +03:00
|
|
|
|
|
|
|
BDRVRBDState *s = bs->opaque;
|
|
|
|
|
2012-10-31 19:34:37 +04:00
|
|
|
acb = qemu_aio_get(&rbd_aiocb_info, bs, cb, opaque);
|
2012-05-01 10:16:45 +04:00
|
|
|
acb->cmd = cmd;
|
2010-12-06 22:53:01 +03:00
|
|
|
acb->qiov = qiov;
|
2016-07-16 02:22:56 +03:00
|
|
|
assert(!qiov || qiov->size == size);
|
2017-02-21 09:50:03 +03:00
|
|
|
|
|
|
|
rcb = g_new(RADOSCB, 1);
|
|
|
|
|
|
|
|
if (!LIBRBD_USE_IOVEC) {
|
|
|
|
if (cmd == RBD_AIO_DISCARD || cmd == RBD_AIO_FLUSH) {
|
|
|
|
acb->bounce = NULL;
|
|
|
|
} else {
|
|
|
|
acb->bounce = qemu_try_blockalign(bs, qiov->size);
|
|
|
|
if (acb->bounce == NULL) {
|
|
|
|
goto failed;
|
|
|
|
}
|
2014-05-21 20:11:48 +04:00
|
|
|
}
|
2017-02-21 09:50:03 +03:00
|
|
|
if (cmd == RBD_AIO_WRITE) {
|
|
|
|
qemu_iovec_to_buf(acb->qiov, 0, acb->bounce, qiov->size);
|
|
|
|
}
|
|
|
|
rcb->buf = acb->bounce;
|
2012-05-01 10:16:45 +04:00
|
|
|
}
|
2017-02-21 09:50:03 +03:00
|
|
|
|
2010-12-06 22:53:01 +03:00
|
|
|
acb->ret = 0;
|
|
|
|
acb->error = 0;
|
|
|
|
acb->s = s;
|
|
|
|
|
2011-05-27 03:07:31 +04:00
|
|
|
rcb->acb = acb;
|
|
|
|
rcb->s = acb->s;
|
|
|
|
rcb->size = size;
|
2011-05-27 03:07:33 +04:00
|
|
|
r = rbd_aio_create_completion(rcb, (rbd_callback_t) rbd_finish_aiocb, &c);
|
|
|
|
if (r < 0) {
|
|
|
|
goto failed;
|
|
|
|
}
|
2010-12-06 22:53:01 +03:00
|
|
|
|
2012-05-01 10:16:45 +04:00
|
|
|
switch (cmd) {
|
2019-05-09 17:59:27 +03:00
|
|
|
case RBD_AIO_WRITE: {
|
|
|
|
/*
|
|
|
|
* RBD APIs don't allow us to write more than actual size, so in order
|
|
|
|
* to support growing images, we resize the image before write
|
|
|
|
* operations that exceed the current size.
|
|
|
|
*/
|
|
|
|
if (off + size > s->image_size) {
|
|
|
|
r = qemu_rbd_resize(bs, off + size);
|
|
|
|
if (r < 0) {
|
|
|
|
goto failed_completion;
|
|
|
|
}
|
|
|
|
}
|
2017-02-21 09:50:03 +03:00
|
|
|
#ifdef LIBRBD_SUPPORTS_IOVEC
|
|
|
|
r = rbd_aio_writev(s->image, qiov->iov, qiov->niov, off, c);
|
|
|
|
#else
|
|
|
|
r = rbd_aio_write(s->image, off, size, rcb->buf, c);
|
|
|
|
#endif
|
2012-05-01 10:16:45 +04:00
|
|
|
break;
|
2019-05-09 17:59:27 +03:00
|
|
|
}
|
2012-05-01 10:16:45 +04:00
|
|
|
case RBD_AIO_READ:
|
2017-02-21 09:50:03 +03:00
|
|
|
#ifdef LIBRBD_SUPPORTS_IOVEC
|
|
|
|
r = rbd_aio_readv(s->image, qiov->iov, qiov->niov, off, c);
|
|
|
|
#else
|
|
|
|
r = rbd_aio_read(s->image, off, size, rcb->buf, c);
|
|
|
|
#endif
|
2012-05-01 10:16:45 +04:00
|
|
|
break;
|
|
|
|
case RBD_AIO_DISCARD:
|
|
|
|
r = rbd_aio_discard_wrapper(s->image, off, size, c);
|
|
|
|
break;
|
2013-03-30 00:03:23 +04:00
|
|
|
case RBD_AIO_FLUSH:
|
|
|
|
r = rbd_aio_flush_wrapper(s->image, c);
|
|
|
|
break;
|
2012-05-01 10:16:45 +04:00
|
|
|
default:
|
|
|
|
r = -EINVAL;
|
2011-05-27 03:07:33 +04:00
|
|
|
}
|
|
|
|
|
|
|
|
if (r < 0) {
|
2014-06-05 18:19:26 +04:00
|
|
|
goto failed_completion;
|
2010-12-06 22:53:01 +03:00
|
|
|
}
|
|
|
|
return &acb->common;
|
2011-05-27 03:07:33 +04:00
|
|
|
|
2014-06-05 18:19:26 +04:00
|
|
|
failed_completion:
|
|
|
|
rbd_aio_release(c);
|
2011-05-27 03:07:33 +04:00
|
|
|
failed:
|
2011-08-21 07:09:37 +04:00
|
|
|
g_free(rcb);
|
2017-02-21 09:50:03 +03:00
|
|
|
if (!LIBRBD_USE_IOVEC) {
|
|
|
|
qemu_vfree(acb->bounce);
|
|
|
|
}
|
|
|
|
|
2014-09-11 09:41:28 +04:00
|
|
|
qemu_aio_unref(acb);
|
2011-05-27 03:07:33 +04:00
|
|
|
return NULL;
|
2010-12-06 22:53:01 +03:00
|
|
|
}
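/*
 * Illustrative sketch, not part of the driver: the grow-on-write check from
 * the RBD_AIO_WRITE case of rbd_start_aio() above. It only computes the
 * image size that a write would require; the driver itself calls
 * qemu_rbd_resize() when the write ends past the cached image_size.
 * sketch_size_after_write() is a hypothetical name used only here, and it
 * assumes off and size are non-negative.
 */
#include <stdint.h>

static uint64_t sketch_size_after_write(uint64_t image_size,
                                        int64_t off, int64_t size)
{
    uint64_t end = (uint64_t)off + (uint64_t)size;

    /* A write ending beyond the image needs the image grown to its end */
    return end > image_size ? end : image_size;
}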
|
|
|
|
|
2018-04-24 22:25:04 +03:00
|
|
|
static BlockAIOCB *qemu_rbd_aio_preadv(BlockDriverState *bs,
|
|
|
|
uint64_t offset, uint64_t bytes,
|
|
|
|
QEMUIOVector *qiov, int flags,
|
|
|
|
BlockCompletionFunc *cb,
|
|
|
|
void *opaque)
|
2010-12-06 22:53:01 +03:00
|
|
|
{
|
2018-04-24 22:25:04 +03:00
|
|
|
return rbd_start_aio(bs, offset, qiov, bytes, cb, opaque,
|
2012-05-01 10:16:45 +04:00
|
|
|
RBD_AIO_READ);
|
2010-12-06 22:53:01 +03:00
|
|
|
}
|
|
|
|
|
2018-04-24 22:25:04 +03:00
|
|
|
static BlockAIOCB *qemu_rbd_aio_pwritev(BlockDriverState *bs,
|
|
|
|
uint64_t offset, uint64_t bytes,
|
|
|
|
QEMUIOVector *qiov, int flags,
|
|
|
|
BlockCompletionFunc *cb,
|
|
|
|
void *opaque)
|
2010-12-06 22:53:01 +03:00
|
|
|
{
|
2018-04-24 22:25:04 +03:00
|
|
|
return rbd_start_aio(bs, offset, qiov, bytes, cb, opaque,
|
2012-05-01 10:16:45 +04:00
|
|
|
RBD_AIO_WRITE);
|
2010-12-06 22:53:01 +03:00
|
|
|
}
|
|
|
|
|
2013-03-30 00:03:23 +04:00
|
|
|
#ifdef LIBRBD_SUPPORTS_AIO_FLUSH
|
2014-10-07 15:59:14 +04:00
|
|
|
static BlockAIOCB *qemu_rbd_aio_flush(BlockDriverState *bs,
|
2014-10-07 15:59:15 +04:00
|
|
|
BlockCompletionFunc *cb,
|
2014-10-07 15:59:14 +04:00
|
|
|
void *opaque)
|
2013-03-30 00:03:23 +04:00
|
|
|
{
|
|
|
|
return rbd_start_aio(bs, 0, NULL, 0, cb, opaque, RBD_AIO_FLUSH);
|
|
|
|
}
|
|
|
|
|
|
|
|
#else
|
|
|
|
|
2011-10-20 15:16:24 +04:00
|
|
|
static int qemu_rbd_co_flush(BlockDriverState *bs)
|
2011-09-16 01:11:11 +04:00
|
|
|
{
|
|
|
|
#if LIBRBD_VERSION_CODE >= LIBRBD_VERSION(0, 1, 1)
|
|
|
|
/* rbd_flush added in 0.1.1 */
|
|
|
|
BDRVRBDState *s = bs->opaque;
|
|
|
|
return rbd_flush(s->image);
|
|
|
|
#else
|
|
|
|
return 0;
|
|
|
|
#endif
|
|
|
|
}
|
2013-03-30 00:03:23 +04:00
|
|
|
#endif
|
2011-09-16 01:11:11 +04:00
|
|
|
|
2011-05-27 03:07:31 +04:00
|
|
|
static int qemu_rbd_getinfo(BlockDriverState *bs, BlockDriverInfo *bdi)
|
2010-12-06 22:53:01 +03:00
|
|
|
{
|
|
|
|
BDRVRBDState *s = bs->opaque;
|
2011-05-27 03:07:31 +04:00
|
|
|
rbd_image_info_t info;
|
|
|
|
int r;
|
|
|
|
|
|
|
|
r = rbd_stat(s->image, &info, sizeof(info));
|
|
|
|
if (r < 0) {
|
|
|
|
return r;
|
|
|
|
}
|
|
|
|
|
|
|
|
bdi->cluster_size = info.obj_size;
|
2010-12-06 22:53:01 +03:00
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2011-05-27 03:07:31 +04:00
|
|
|
static int64_t qemu_rbd_getlength(BlockDriverState *bs)
|
2010-12-06 22:53:01 +03:00
|
|
|
{
|
|
|
|
BDRVRBDState *s = bs->opaque;
|
2011-05-27 03:07:31 +04:00
|
|
|
rbd_image_info_t info;
|
|
|
|
int r;
|
2010-12-06 22:53:01 +03:00
|
|
|
|
2011-05-27 03:07:31 +04:00
|
|
|
r = rbd_stat(s->image, &info, sizeof(info));
|
|
|
|
if (r < 0) {
|
|
|
|
return r;
|
|
|
|
}
|
|
|
|
|
|
|
|
return info.size;
|
2010-12-06 22:53:01 +03:00
|
|
|
}
|
|
|
|
|
block: Convert .bdrv_truncate callback to coroutine_fn
bdrv_truncate() is an operation that can block (even for a quite long
time, depending on the PreallocMode) in I/O paths that shouldn't block.
Convert it to a coroutine_fn so that we have the infrastructure for
drivers to make their .bdrv_co_truncate implementation asynchronous.
This change could potentially introduce new race conditions because
bdrv_truncate() isn't necessarily executed atomically any more. Whether
this is a problem needs to be evaluated for each block driver that
supports truncate:
* file-posix/win32, gluster, iscsi, nfs, rbd, ssh, sheepdog: The
protocol drivers are trivially safe because they don't actually yield
yet, so there is no change in behaviour.
* copy-on-read, crypto, raw-format: Essentially just filter drivers that
pass the request to a child node, no problem.
* qcow2: The implementation modifies metadata, so it needs to hold
s->lock to be safe with concurrent I/O requests. In order to avoid
double locking, this requires pulling the locking out into
preallocate_co() and using qcow2_write_caches() instead of
bdrv_flush().
* qed: Does a single header update, this is fine without locking.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2018-06-21 18:54:35 +03:00
|
|
|
static int coroutine_fn qemu_rbd_co_truncate(BlockDriverState *bs,
|
|
|
|
int64_t offset,
|
2019-09-18 12:51:40 +03:00
|
|
|
bool exact,
|
2018-06-21 18:54:35 +03:00
|
|
|
PreallocMode prealloc,
|
2020-04-24 15:54:39 +03:00
|
|
|
BdrvRequestFlags flags,
|
2018-06-21 18:54:35 +03:00
|
|
|
Error **errp)
|
2011-05-27 03:07:34 +04:00
|
|
|
{
|
|
|
|
int r;
|
|
|
|
|
2017-06-13 23:20:52 +03:00
|
|
|
if (prealloc != PREALLOC_MODE_OFF) {
|
|
|
|
error_setg(errp, "Unsupported preallocation mode '%s'",
|
2017-08-24 11:46:08 +03:00
|
|
|
PreallocMode_str(prealloc));
|
2017-06-13 23:20:52 +03:00
|
|
|
return -ENOTSUP;
|
|
|
|
}
|
|
|
|
|
2019-05-09 17:59:27 +03:00
|
|
|
r = qemu_rbd_resize(bs, offset);
|
2011-05-27 03:07:34 +04:00
|
|
|
if (r < 0) {
|
2017-03-28 23:51:29 +03:00
|
|
|
error_setg_errno(errp, -r, "Failed to resize file");
|
2011-05-27 03:07:34 +04:00
|
|
|
return r;
|
|
|
|
}
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2011-05-27 03:07:31 +04:00
|
|
|
static int qemu_rbd_snap_create(BlockDriverState *bs,
|
|
|
|
QEMUSnapshotInfo *sn_info)
|
2010-12-06 22:53:01 +03:00
|
|
|
{
|
|
|
|
BDRVRBDState *s = bs->opaque;
|
|
|
|
int r;
|
|
|
|
|
|
|
|
if (sn_info->name[0] == '\0') {
|
|
|
|
return -EINVAL; /* we need a name for rbd snapshots */
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* rbd snapshots use the name as the user-controlled unique identifier;
|
|
|
|
* we can't use the rbd snapid for that purpose, as it can't be set
|
|
|
|
*/
|
|
|
|
if (sn_info->id_str[0] != '\0' &&
|
|
|
|
strcmp(sn_info->id_str, sn_info->name) != 0) {
|
|
|
|
return -EINVAL;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* qemu_rbd_snap_list() also reports the name as id_str, so it must fit */
if (strlen(sn_info->name) >= sizeof(sn_info->id_str)) {
|
|
|
|
return -ERANGE;
|
|
|
|
}
|
|
|
|
|
2011-05-27 03:07:31 +04:00
|
|
|
r = rbd_snap_create(s->image, sn_info->name);
|
2010-12-06 22:53:01 +03:00
|
|
|
if (r < 0) {
|
2011-05-27 03:07:31 +04:00
|
|
|
error_report("failed to create snap: %s", strerror(-r));
|
2010-12-06 22:53:01 +03:00
|
|
|
return r;
|
|
|
|
}
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
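The argument checks at the top of qemu_rbd_snap_create() can be tried in isolation. This is only a minimal sketch, not QEMU code: `SnapInfo`, its buffer sizes, and `check_snap_request()` are hypothetical stand-ins for QEMUSnapshotInfo and the inline checks above.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical stand-in for QEMUSnapshotInfo; the sizes are assumptions. */
typedef struct SnapInfo {
    char id_str[128];
    char name[256];
} SnapInfo;

/* Returns 0 if the request is acceptable, or a negative errno value. */
static int check_snap_request(const SnapInfo *sn)
{
    assert(sn != NULL);

    if (sn->name[0] == '\0') {
        return -EINVAL;     /* rbd snapshots must be named */
    }
    if (sn->id_str[0] != '\0' && strcmp(sn->id_str, sn->name) != 0) {
        return -EINVAL;     /* an id, if given, must equal the name */
    }
    if (strlen(sn->name) >= sizeof(sn->id_str)) {
        return -ERANGE;     /* the name must also fit in id_str */
    }
    return 0;
}
```

An empty id with a short name passes; a missing name or a mismatched id yields -EINVAL, and a name too long for id_str yields -ERANGE.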
|
|
|
|
|
2012-01-11 23:53:52 +04:00
|
|
|
static int qemu_rbd_snap_remove(BlockDriverState *bs,
|
snapshot: distinguish id and name in snapshot delete
Snapshot creation already distinguishes id and name, since it takes
a structured parameter *sn, but delete can't. An accurate delete will be
needed later in qmp_transaction abort and blockdev-snapshot-delete-sync,
so change its prototype. Also, *errp is added to report errors, but the return
value is kept so the caller can check what kind of error happened. Existing
callers (savevm, delvm and qemu-img) are not affected: they go through
a new function, bdrv_snapshot_delete_by_id_or_name(), which
checks the return value and retries the operation.
Before this patch:
For qcow2, it searches by id first, then by name, to find the one to delete.
For rbd, it searches by name.
For sheepdog, it does nothing.
After this patch:
For qcow2, the logic is the same, because the caller calls it twice.
For rbd, delete by id always fails, but the second try still searches by
name, so there is no change for the user.
Some code for *errp is based on Pavel's patch.
Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-11 10:04:33 +04:00
|
|
|
const char *snapshot_id,
|
|
|
|
const char *snapshot_name,
|
|
|
|
Error **errp)
|
2012-01-11 23:53:52 +04:00
|
|
|
{
|
|
|
|
BDRVRBDState *s = bs->opaque;
|
|
|
|
int r;
|
|
|
|
|
2013-09-11 10:04:33 +04:00
|
|
|
if (!snapshot_name) {
|
|
|
|
error_setg(errp, "rbd needs a valid snapshot name");
|
|
|
|
return -EINVAL;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* If snapshot_id is specified, it must be equal to name, see
|
|
|
|
qemu_rbd_snap_list() */
|
|
|
|
if (snapshot_id && strcmp(snapshot_id, snapshot_name)) {
|
|
|
|
error_setg(errp,
|
|
|
|
"rbd does not support snapshot id, it should be NULL or "
|
|
|
|
"equal to snapshot name");
|
|
|
|
return -EINVAL;
|
|
|
|
}
|
|
|
|
|
2012-01-11 23:53:52 +04:00
|
|
|
r = rbd_snap_remove(s->image, snapshot_name);
|
2013-09-11 10:04:33 +04:00
|
|
|
if (r < 0) {
|
|
|
|
error_setg_errno(errp, -r, "Failed to remove the snapshot");
|
|
|
|
}
|
2012-01-11 23:53:52 +04:00
|
|
|
return r;
|
|
|
|
}
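The commit message for this change says that, for rbd, a delete by id always fails and the caller's second try searches by name. The sketch below illustrates that fallback under stated assumptions: `demo_rbd_delete()` and `demo_delete_by_id_or_name()` are hypothetical names, the single snapshot "snap1" is made up, and the set of error codes that triggers the retry is an assumption rather than a quote of bdrv_snapshot_delete_by_id_or_name().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical driver callback: mirrors the argument checks in
 * qemu_rbd_snap_remove(), with one hard-coded snapshot "snap1". */
static int demo_rbd_delete(const char *id, const char *name)
{
    if (!name) {
        return -EINVAL;     /* rbd can only look a snapshot up by name */
    }
    if (id && strcmp(id, name) != 0) {
        return -EINVAL;     /* id must be NULL or equal to the name */
    }
    return strcmp(name, "snap1") == 0 ? 0 : -ENOENT;
}

/* Assumed caller-side fallback: try the string as an id, then as a name. */
static int demo_delete_by_id_or_name(const char *id_or_name)
{
    int ret;

    assert(id_or_name != NULL);
    ret = demo_rbd_delete(id_or_name, NULL);
    if (ret == -ENOENT || ret == -EINVAL) {
        ret = demo_rbd_delete(NULL, id_or_name);
    }
    return ret;
}
```

With this shape the first attempt always returns -EINVAL for rbd, so the user-visible result comes entirely from the by-name lookup.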
|
|
|
|
|
|
|
|
static int qemu_rbd_snap_rollback(BlockDriverState *bs,
|
|
|
|
const char *snapshot_name)
|
|
|
|
{
|
|
|
|
BDRVRBDState *s = bs->opaque;
|
|
|
|
|
2016-06-14 00:57:58 +03:00
|
|
|
return rbd_snap_rollback(s->image, snapshot_name);
|
2012-01-11 23:53:52 +04:00
|
|
|
}
|
|
|
|
|
2011-05-27 03:07:31 +04:00
|
|
|
static int qemu_rbd_snap_list(BlockDriverState *bs,
|
|
|
|
QEMUSnapshotInfo **psn_tab)
|
2010-12-06 22:53:01 +03:00
|
|
|
{
|
|
|
|
BDRVRBDState *s = bs->opaque;
|
|
|
|
QEMUSnapshotInfo *sn_info, *sn_tab = NULL;
|
2011-05-27 03:07:31 +04:00
|
|
|
int i, snap_count;
|
|
|
|
rbd_snap_info_t *snaps;
|
|
|
|
int max_snaps = RBD_MAX_SNAPS;
|
2010-12-06 22:53:01 +03:00
|
|
|
|
2011-05-27 03:07:31 +04:00
|
|
|
/* on -ERANGE, librbd updates max_snaps to the required size; retry */
do {
|
2014-08-19 12:31:09 +04:00
|
|
|
snaps = g_new(rbd_snap_info_t, max_snaps);
|
2011-05-27 03:07:31 +04:00
|
|
|
snap_count = rbd_snap_list(s->image, snaps, &max_snaps);
|
2013-09-25 18:00:48 +04:00
|
|
|
if (snap_count <= 0) {
|
2011-08-21 07:09:37 +04:00
|
|
|
g_free(snaps);
|
2010-12-06 22:53:01 +03:00
|
|
|
}
|
2011-05-27 03:07:31 +04:00
|
|
|
} while (snap_count == -ERANGE);
|
2010-12-06 22:53:01 +03:00
|
|
|
|
2011-05-27 03:07:31 +04:00
|
|
|
if (snap_count <= 0) {
|
2011-12-07 05:05:10 +04:00
|
|
|
goto done;
|
2010-12-06 22:53:01 +03:00
|
|
|
}
|
|
|
|
|
block: Use g_new() & friends where that makes obvious sense
g_new(T, n) is neater than g_malloc(sizeof(T) * n). It's also safer,
for two reasons. One, it catches multiplication overflowing size_t.
Two, it returns T * rather than void *, which lets the compiler catch
more type errors.
Patch created with Coccinelle, with two manual changes on top:
* Add const to bdrv_iterate_format() to keep the types straight
* Convert the allocation in bdrv_drop_intermediate(), which Coccinelle
inexplicably misses
Coccinelle semantic patch:
@@
type T;
@@
-g_malloc(sizeof(T))
+g_new(T, 1)
@@
type T;
@@
-g_try_malloc(sizeof(T))
+g_try_new(T, 1)
@@
type T;
@@
-g_malloc0(sizeof(T))
+g_new0(T, 1)
@@
type T;
@@
-g_try_malloc0(sizeof(T))
+g_try_new0(T, 1)
@@
type T;
expression n;
@@
-g_malloc(sizeof(T) * (n))
+g_new(T, n)
@@
type T;
expression n;
@@
-g_try_malloc(sizeof(T) * (n))
+g_try_new(T, n)
@@
type T;
expression n;
@@
-g_malloc0(sizeof(T) * (n))
+g_new0(T, n)
@@
type T;
expression n;
@@
-g_try_malloc0(sizeof(T) * (n))
+g_try_new0(T, n)
@@
type T;
expression p, n;
@@
-g_realloc(p, sizeof(T) * (n))
+g_renew(T, p, n)
@@
type T;
expression p, n;
@@
-g_try_realloc(p, sizeof(T) * (n))
+g_try_renew(T, p, n)
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-19 12:31:08 +04:00
|
|
|
sn_tab = g_new0(QEMUSnapshotInfo, snap_count);
|
2010-12-06 22:53:01 +03:00
|
|
|
|
2011-05-27 03:07:31 +04:00
|
|
|
for (i = 0; i < snap_count; i++) {
|
|
|
|
const char *snap_name = snaps[i].name;
|
2010-12-06 22:53:01 +03:00
|
|
|
|
|
|
|
sn_info = sn_tab + i;
|
|
|
|
pstrcpy(sn_info->id_str, sizeof(sn_info->id_str), snap_name);
|
|
|
|
pstrcpy(sn_info->name, sizeof(sn_info->name), snap_name);
|
|
|
|
|
2011-05-27 03:07:31 +04:00
|
|
|
sn_info->vm_state_size = snaps[i].size;
|
2010-12-06 22:53:01 +03:00
|
|
|
sn_info->date_sec = 0;
|
|
|
|
sn_info->date_nsec = 0;
|
|
|
|
sn_info->vm_clock_nsec = 0;
|
|
|
|
}
|
2011-05-27 03:07:31 +04:00
|
|
|
rbd_snap_list_end(snaps);
|
2013-09-25 18:00:48 +04:00
|
|
|
g_free(snaps);
|
2011-05-27 03:07:31 +04:00
|
|
|
|
2011-12-07 05:05:10 +04:00
|
|
|
done:
|
2010-12-06 22:53:01 +03:00
|
|
|
*psn_tab = sn_tab;
|
|
|
|
return snap_count;
|
|
|
|
}
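The allocate, call, retry-on-`-ERANGE` loop in qemu_rbd_snap_list() is a general pattern for APIs that report the capacity they need. A minimal sketch follows, using plain malloc() instead of GLib; `demo_snap_list()` is a hypothetical stand-in for rbd_snap_list(), and the snapshot count of 5 is made up.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define DEMO_TOTAL_SNAPS 5   /* hypothetical number of snapshots */

/* Hypothetical stand-in for rbd_snap_list(): fills ids[] and returns the
 * count, or returns -ERANGE and stores the required capacity in *max. */
static int demo_snap_list(int *ids, int *max)
{
    assert(ids != NULL && max != NULL);
    if (*max < DEMO_TOTAL_SNAPS) {
        *max = DEMO_TOTAL_SNAPS;
        return -ERANGE;
    }
    for (int i = 0; i < DEMO_TOTAL_SNAPS; i++) {
        ids[i] = i + 1;
    }
    return DEMO_TOTAL_SNAPS;
}

/* Grow-and-retry: on success *out owns the buffer; otherwise *out is NULL. */
static int demo_list_all(int **out, int initial_max)
{
    int max = initial_max;
    int count;
    int *ids;

    assert(out != NULL && initial_max > 0);
    *out = NULL;
    do {
        ids = malloc(sizeof(*ids) * (size_t)max);
        if (!ids) {
            return -ENOMEM;
        }
        count = demo_snap_list(ids, &max);
        if (count <= 0) {
            free(ids);      /* nothing usable in this buffer */
        }
    } while (count == -ERANGE);

    if (count > 0) {
        *out = ids;
    }
    return count;
}
```

Freeing inside the loop whenever the count is not positive keeps every exit path leak-free, which is why the caller only has to release the buffer on success.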
|
|
|
|
|
2012-05-01 10:16:45 +04:00
|
|
|
#ifdef LIBRBD_SUPPORTS_DISCARD
|
2016-07-16 02:22:57 +03:00
|
|
|
static BlockAIOCB *qemu_rbd_aio_pdiscard(BlockDriverState *bs,
|
|
|
|
int64_t offset,
|
2017-06-09 13:18:08 +03:00
|
|
|
int bytes,
|
2016-07-16 02:22:57 +03:00
|
|
|
BlockCompletionFunc *cb,
|
|
|
|
void *opaque)
|
2012-05-01 10:16:45 +04:00
|
|
|
{
|
2017-06-09 13:18:08 +03:00
|
|
|
return rbd_start_aio(bs, offset, NULL, bytes, cb, opaque,
|
2012-05-01 10:16:45 +04:00
|
|
|
RBD_AIO_DISCARD);
|
|
|
|
}
|
|
|
|
#endif
|
|
|
|
|
2014-10-09 22:44:32 +04:00
|
|
|
#ifdef LIBRBD_SUPPORTS_INVALIDATE
|
2018-03-01 19:36:18 +03:00
|
|
|
static void coroutine_fn qemu_rbd_co_invalidate_cache(BlockDriverState *bs,
|
|
|
|
Error **errp)
|
2014-10-09 22:44:32 +04:00
|
|
|
{
|
|
|
|
BDRVRBDState *s = bs->opaque;
|
|
|
|
int r = rbd_invalidate_cache(s->image);
|
|
|
|
if (r < 0) {
|
|
|
|
error_setg_errno(errp, -r, "Failed to invalidate the cache");
|
|
|
|
}
|
|
|
|
}
|
|
|
|
#endif
|
|
|
|
|
2014-06-05 13:21:04 +04:00
|
|
|
static QemuOptsList qemu_rbd_create_opts = {
|
|
|
|
.name = "rbd-create-opts",
|
|
|
|
.head = QTAILQ_HEAD_INITIALIZER(qemu_rbd_create_opts.head),
|
|
|
|
.desc = {
|
|
|
|
{
|
|
|
|
.name = BLOCK_OPT_SIZE,
|
|
|
|
.type = QEMU_OPT_SIZE,
|
|
|
|
.help = "Virtual disk size"
|
|
|
|
},
|
|
|
|
{
|
|
|
|
.name = BLOCK_OPT_CLUSTER_SIZE,
|
|
|
|
.type = QEMU_OPT_SIZE,
|
|
|
|
.help = "RBD object size"
|
|
|
|
},
|
2016-01-21 17:19:19 +03:00
|
|
|
{
|
|
|
|
.name = "password-secret",
|
|
|
|
.type = QEMU_OPT_STRING,
|
|
|
|
.help = "ID of secret providing the password",
|
|
|
|
},
|
2014-06-05 13:21:04 +04:00
|
|
|
{ /* end of list */ }
|
|
|
|
}
|
2010-12-06 22:53:01 +03:00
|
|
|
};
|
|
|
|
|
2019-02-01 22:29:25 +03:00
|
|
|
static const char *const qemu_rbd_strong_runtime_opts[] = {
|
|
|
|
"pool",
|
2020-09-14 22:05:53 +03:00
|
|
|
"namespace",
|
2019-02-01 22:29:25 +03:00
|
|
|
"image",
|
|
|
|
"conf",
|
|
|
|
"snapshot",
|
|
|
|
"user",
|
|
|
|
"server.",
|
|
|
|
"password-secret",
|
|
|
|
|
|
|
|
NULL
|
|
|
|
};
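The "server." entry differs from the others in that it ends with a dot, while the flattened option keys look like server.0.host and server.0.port. The sketch below shows one way such a list could be matched, assuming that an entry ending in '.' is treated as a key prefix and every other entry as an exact name; `demo_strong_opts` and `demo_is_strong_opt()` are hypothetical and do not reproduce the block layer's own lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Same shape as qemu_rbd_strong_runtime_opts, shortened for the demo. */
static const char *const demo_strong_opts[] = {
    "pool", "image", "server.", NULL
};

/* Assumption: an entry ending in '.' matches every key with that prefix;
 * any other entry must match the key exactly. */
static bool demo_is_strong_opt(const char *key)
{
    assert(key != NULL);
    for (const char *const *opt = demo_strong_opts; *opt; opt++) {
        size_t len = strlen(*opt);

        if (len > 0 && (*opt)[len - 1] == '.') {
            if (strncmp(key, *opt, len) == 0) {
                return true;
            }
        } else if (strcmp(key, *opt) == 0) {
            return true;
        }
    }
    return false;
}
```

Under that assumption a single "server." entry covers every element of the server list without enumerating the indices.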
|
|
|
|
|
2010-12-06 22:53:01 +03:00
|
|
|
static BlockDriver bdrv_rbd = {
|
2017-02-27 01:50:42 +03:00
|
|
|
.format_name = "rbd",
|
|
|
|
.instance_size = sizeof(BDRVRBDState),
|
|
|
|
.bdrv_parse_filename = qemu_rbd_parse_filename,
|
2018-04-24 22:25:04 +03:00
|
|
|
.bdrv_refresh_limits = qemu_rbd_refresh_limits,
|
2017-02-27 01:50:42 +03:00
|
|
|
.bdrv_file_open = qemu_rbd_open,
|
|
|
|
.bdrv_close = qemu_rbd_close,
|
2017-04-07 23:55:32 +03:00
|
|
|
.bdrv_reopen_prepare = qemu_rbd_reopen_prepare,
|
2018-01-31 18:27:38 +03:00
|
|
|
.bdrv_co_create = qemu_rbd_co_create,
|
2018-01-18 15:43:45 +03:00
|
|
|
.bdrv_co_create_opts = qemu_rbd_co_create_opts,
|
2017-02-27 01:50:42 +03:00
|
|
|
.bdrv_has_zero_init = bdrv_has_zero_init_1,
|
|
|
|
.bdrv_get_info = qemu_rbd_getinfo,
|
|
|
|
.create_opts = &qemu_rbd_create_opts,
|
|
|
|
.bdrv_getlength = qemu_rbd_getlength,
|
2018-06-21 18:54:35 +03:00
|
|
|
.bdrv_co_truncate = qemu_rbd_co_truncate,
|
2017-02-27 01:50:42 +03:00
|
|
|
.protocol_name = "rbd",
|
2010-12-06 22:53:01 +03:00
|
|
|
|
2018-04-24 22:25:04 +03:00
|
|
|
.bdrv_aio_preadv = qemu_rbd_aio_preadv,
|
|
|
|
.bdrv_aio_pwritev = qemu_rbd_aio_pwritev,
|
2013-03-30 00:03:23 +04:00
|
|
|
|
|
|
|
#ifdef LIBRBD_SUPPORTS_AIO_FLUSH
|
|
|
|
.bdrv_aio_flush = qemu_rbd_aio_flush,
|
|
|
|
#else
|
2011-11-10 20:25:44 +04:00
|
|
|
.bdrv_co_flush_to_disk = qemu_rbd_co_flush,
|
2013-03-30 00:03:23 +04:00
|
|
|
#endif
|
2010-12-06 22:53:01 +03:00
|
|
|
|
2012-05-01 10:16:45 +04:00
|
|
|
#ifdef LIBRBD_SUPPORTS_DISCARD
|
2016-07-16 02:22:57 +03:00
|
|
|
.bdrv_aio_pdiscard = qemu_rbd_aio_pdiscard,
|
2012-05-01 10:16:45 +04:00
|
|
|
#endif
|
|
|
|
|
2011-11-10 20:25:44 +04:00
|
|
|
.bdrv_snapshot_create = qemu_rbd_snap_create,
|
2012-01-11 23:53:52 +04:00
|
|
|
.bdrv_snapshot_delete = qemu_rbd_snap_remove,
|
2011-11-10 20:25:44 +04:00
|
|
|
.bdrv_snapshot_list = qemu_rbd_snap_list,
|
2012-01-11 23:53:52 +04:00
|
|
|
.bdrv_snapshot_goto = qemu_rbd_snap_rollback,
|
2014-10-09 22:44:32 +04:00
|
|
|
#ifdef LIBRBD_SUPPORTS_INVALIDATE
|
2018-03-01 19:36:18 +03:00
|
|
|
.bdrv_co_invalidate_cache = qemu_rbd_co_invalidate_cache,
|
2014-10-09 22:44:32 +04:00
|
|
|
#endif
|
2019-02-01 22:29:25 +03:00
|
|
|
|
|
|
|
.strong_runtime_opts = qemu_rbd_strong_runtime_opts,
|
2010-12-06 22:53:01 +03:00
|
|
|
};
|
|
|
|
|
|
|
|
static void bdrv_rbd_init(void)
|
|
|
|
{
|
|
|
|
bdrv_register(&bdrv_rbd);
|
|
|
|
}
|
|
|
|
|
|
|
|
block_init(bdrv_rbd_init);
|