2010-06-21 00:01:00 +04:00
|
|
|
/*
|
|
|
|
* Copyright (C) 2009-2010 Nippon Telegraph and Telephone Corporation.
|
|
|
|
*
|
|
|
|
* This program is free software; you can redistribute it and/or
|
|
|
|
* modify it under the terms of the GNU General Public License version
|
|
|
|
* 2 as published by the Free Software Foundation.
|
|
|
|
*
|
|
|
|
* You should have received a copy of the GNU General Public License
|
|
|
|
* along with this program. If not, see <http://www.gnu.org/licenses/>.
|
2012-01-13 20:44:23 +04:00
|
|
|
*
|
|
|
|
* Contributions after 2012-01-13 are licensed under the terms of the
|
|
|
|
* GNU GPL, version 2 or (at your option) any later version.
|
2010-06-21 00:01:00 +04:00
|
|
|
*/
|
|
|
|
|
2016-01-18 21:01:42 +03:00
|
|
|
#include "qemu/osdep.h"
|
2019-05-23 17:35:08 +03:00
|
|
|
#include "qemu-common.h"
|
include/qemu/osdep.h: Don't include qapi/error.h
Commit 57cb38b included qapi/error.h into qemu/osdep.h to get the
Error typedef. Since then, we've moved to include qemu/osdep.h
everywhere. Its file comment explains: "To avoid getting into
possible circular include dependencies, this file should not include
any other QEMU headers, with the exceptions of config-host.h,
compiler.h, os-posix.h and os-win32.h, all of which are doing a
similar job to this file and are under similar constraints."
qapi/error.h doesn't do a similar job, and it doesn't adhere to
similar constraints: it includes qapi-types.h. That's in excess of
100KiB of crap most .c files don't actually need.
Add the typedef to qemu/typedefs.h, and include that instead of
qapi/error.h. Include qapi/error.h in .c files that need it and don't
get it now. Include qapi-types.h in qom/object.h for uint16List.
Update scripts/clean-includes accordingly. Update it further to match
reality: replace config.h by config-target.h, add sysemu/os-posix.h,
sysemu/os-win32.h. Update the list of includes in the qemu/osdep.h
comment quoted above similarly.
This reduces the number of objects depending on qapi/error.h from "all
of them" to less than a third. Unfortunately, the number depending on
qapi-types.h shrinks only a little. More work is needed for that one.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
[Fix compilation without the spice devel packages. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-14 11:01:28 +03:00
|
|
|
#include "qapi/error.h"
|
2018-02-11 12:36:01 +03:00
|
|
|
#include "qapi/qapi-visit-sockets.h"
|
2018-02-01 15:20:44 +03:00
|
|
|
#include "qapi/qapi-visit-block-core.h"
|
2017-03-06 22:00:43 +03:00
|
|
|
#include "qapi/qmp/qdict.h"
|
sheepdog: Fix blockdev-add
Commit 831acdc "sheepdog: Implement bdrv_parse_filename()" and commit
d282f34 "sheepdog: Support blockdev-add" have different ideas on how
the QemuOpts parameters for the server address are named. Fix that.
While there, rename BlockdevOptionsSheepdog member addr to server, for
consistency with BlockdevOptionsSsh, BlockdevOptionsGluster,
BlockdevOptionsNbd.
Commit 831acdc's example becomes
--drive driver=sheepdog,server.type=inet,server.host=fido,server.port=7000,vdi=dolly
instead of
--drive driver=sheepdog,host=fido,vdi=dolly
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Kashyap Chamarthy <kchamart@redhat.com>
Message-id: 1490895797-29094-10-git-send-email-armbru@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-03-30 20:43:17 +03:00
|
|
|
#include "qapi/qobject-input-visitor.h"
|
2018-02-01 15:20:44 +03:00
|
|
|
#include "qapi/qobject-output-visitor.h"
|
2013-02-22 07:39:51 +04:00
|
|
|
#include "qemu/uri.h"
|
2012-12-17 21:20:00 +04:00
|
|
|
#include "qemu/error-report.h"
|
Include qemu/main-loop.h less
In my "build everything" tree, changing qemu/main-loop.h triggers a
recompile of some 5600 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h). It includes block/aio.h,
which in turn includes qemu/event_notifier.h, qemu/notify.h,
qemu/processor.h, qemu/qsp.h, qemu/queue.h, qemu/thread-posix.h,
qemu/thread.h, qemu/timer.h, and a few more.
Include qemu/main-loop.h only where it's needed. Touching it now
recompiles only some 1700 objects. For block/aio.h and
qemu/event_notifier.h, these numbers drop from 5600 to 2800. For the
others, they shrink only slightly.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190812052359.30071-21-armbru@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-08-12 08:23:50 +03:00
|
|
|
#include "qemu/main-loop.h"
|
2019-05-23 17:35:07 +03:00
|
|
|
#include "qemu/module.h"
|
2018-02-01 14:18:46 +03:00
|
|
|
#include "qemu/option.h"
|
2012-12-17 21:20:00 +04:00
|
|
|
#include "qemu/sockets.h"
|
2012-12-17 21:19:44 +04:00
|
|
|
#include "block/block_int.h"
|
2018-06-14 22:14:28 +03:00
|
|
|
#include "block/qdict.h"
|
2016-03-08 17:57:05 +03:00
|
|
|
#include "sysemu/block-backend.h"
|
2012-12-17 21:20:00 +04:00
|
|
|
#include "qemu/bitops.h"
|
2016-03-20 20:16:19 +03:00
|
|
|
#include "qemu/cutils.h"
|
2018-12-13 19:27:27 +03:00
|
|
|
#include "trace.h"
|
2010-06-21 00:01:00 +04:00
|
|
|
|
|
|
|
#define SD_PROTO_VER 0x01
|
|
|
|
|
|
|
|
#define SD_DEFAULT_ADDR "localhost"
|
2013-02-22 07:39:52 +04:00
|
|
|
#define SD_DEFAULT_PORT 7000
|
2010-06-21 00:01:00 +04:00
|
|
|
|
|
|
|
#define SD_OP_CREATE_AND_WRITE_OBJ 0x01
|
|
|
|
#define SD_OP_READ_OBJ 0x02
|
|
|
|
#define SD_OP_WRITE_OBJ 0x03
|
2013-04-23 10:03:33 +04:00
|
|
|
/* 0x04 is used internally by Sheepdog */
|
2010-06-21 00:01:00 +04:00
|
|
|
|
|
|
|
#define SD_OP_NEW_VDI 0x11
|
|
|
|
#define SD_OP_LOCK_VDI 0x12
|
|
|
|
#define SD_OP_RELEASE_VDI 0x13
|
|
|
|
#define SD_OP_GET_VDI_INFO 0x14
|
|
|
|
#define SD_OP_READ_VDIS 0x15
|
2012-04-04 00:03:58 +04:00
|
|
|
#define SD_OP_FLUSH_VDI 0x16
|
2013-04-25 16:49:39 +04:00
|
|
|
#define SD_OP_DEL_VDI 0x17
|
2015-02-13 12:20:53 +03:00
|
|
|
#define SD_OP_GET_CLUSTER_DEFAULT 0x18
|
2010-06-21 00:01:00 +04:00
|
|
|
|
|
|
|
#define SD_FLAG_CMD_WRITE 0x01
|
|
|
|
#define SD_FLAG_CMD_COW 0x02
|
2013-01-10 12:03:47 +04:00
|
|
|
#define SD_FLAG_CMD_CACHE 0x04 /* Writeback mode for cache */
|
|
|
|
#define SD_FLAG_CMD_DIRECT 0x08 /* Don't use cache */
|
2010-06-21 00:01:00 +04:00
|
|
|
|
|
|
|
#define SD_RES_SUCCESS 0x00 /* Success */
|
|
|
|
#define SD_RES_UNKNOWN 0x01 /* Unknown error */
|
|
|
|
#define SD_RES_NO_OBJ 0x02 /* No object found */
|
|
|
|
#define SD_RES_EIO 0x03 /* I/O error */
|
|
|
|
#define SD_RES_VDI_EXIST 0x04 /* Vdi exists already */
|
|
|
|
#define SD_RES_INVALID_PARMS 0x05 /* Invalid parameters */
|
|
|
|
#define SD_RES_SYSTEM_ERROR 0x06 /* System error */
|
|
|
|
#define SD_RES_VDI_LOCKED 0x07 /* Vdi is locked */
|
|
|
|
#define SD_RES_NO_VDI 0x08 /* No vdi found */
|
|
|
|
#define SD_RES_NO_BASE_VDI 0x09 /* No base vdi found */
|
|
|
|
#define SD_RES_VDI_READ 0x0A /* Cannot read requested vdi */
|
|
|
|
#define SD_RES_VDI_WRITE 0x0B /* Cannot write requested vdi */
|
|
|
|
#define SD_RES_BASE_VDI_READ 0x0C /* Cannot read base vdi */
|
|
|
|
#define SD_RES_BASE_VDI_WRITE 0x0D /* Cannot write base vdi */
|
|
|
|
#define SD_RES_NO_TAG 0x0E /* Requested tag is not found */
|
|
|
|
#define SD_RES_STARTUP 0x0F /* Sheepdog is starting up */
|
|
|
|
#define SD_RES_VDI_NOT_LOCKED 0x10 /* Vdi is not locked */
|
|
|
|
#define SD_RES_SHUTDOWN 0x11 /* Sheepdog is shutting down */
|
|
|
|
#define SD_RES_NO_MEM 0x12 /* Cannot allocate memory */
|
|
|
|
#define SD_RES_FULL_VDI 0x13 /* we already have the maximum vdis */
|
|
|
|
#define SD_RES_VER_MISMATCH 0x14 /* Protocol version mismatch */
|
|
|
|
#define SD_RES_NO_SPACE 0x15 /* Server has no room for new objects */
|
|
|
|
#define SD_RES_WAIT_FOR_FORMAT 0x16 /* Waiting for a format operation */
|
|
|
|
#define SD_RES_WAIT_FOR_JOIN 0x17 /* Waiting for other nodes joining */
|
|
|
|
#define SD_RES_JOIN_FAILED 0x18 /* Target node had failed to join sheepdog */
|
2013-03-18 10:27:55 +04:00
|
|
|
#define SD_RES_HALT 0x19 /* Sheepdog has stopped serving I/O requests */
|
2013-04-25 20:19:52 +04:00
|
|
|
#define SD_RES_READONLY 0x1A /* Object is read-only */
|
2010-06-21 00:01:00 +04:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Object ID rules
|
|
|
|
*
|
|
|
|
* 0 - 19 (20 bits): data object space
|
|
|
|
* 20 - 31 (12 bits): reserved data object space
|
|
|
|
* 32 - 55 (24 bits): vdi object space
|
|
|
|
* 56 - 59 ( 4 bits): reserved vdi object space
|
2011-10-14 11:41:06 +04:00
|
|
|
* 60 - 63 ( 4 bits): object type identifier space
|
2010-06-21 00:01:00 +04:00
|
|
|
*/
|
|
|
|
|
|
|
|
#define VDI_SPACE_SHIFT 32
|
|
|
|
#define VDI_BIT (UINT64_C(1) << 63)
|
|
|
|
#define VMSTATE_BIT (UINT64_C(1) << 62)
|
|
|
|
#define MAX_DATA_OBJS (UINT64_C(1) << 20)
|
|
|
|
#define MAX_CHILDREN 1024
|
|
|
|
#define SD_MAX_VDI_LEN 256
|
|
|
|
#define SD_MAX_VDI_TAG_LEN 256
|
|
|
|
#define SD_NR_VDIS (1U << 24)
|
|
|
|
#define SD_DATA_OBJ_SIZE (UINT64_C(1) << 22)
|
|
|
|
#define SD_MAX_VDI_SIZE (SD_DATA_OBJ_SIZE * MAX_DATA_OBJS)
|
2015-02-13 12:20:53 +03:00
|
|
|
#define SD_DEFAULT_BLOCK_SIZE_SHIFT 22
|
2013-11-07 18:56:38 +04:00
|
|
|
/*
|
|
|
|
* For erasure coding, we use at most SD_EC_MAX_STRIP for data strips and
|
|
|
|
* (SD_EC_MAX_STRIP - 1) for parity strips
|
|
|
|
*
|
|
|
|
* SD_MAX_COPIES is sum of number of data strips and parity strips.
|
|
|
|
*/
|
|
|
|
#define SD_EC_MAX_STRIP 16
|
|
|
|
#define SD_MAX_COPIES (SD_EC_MAX_STRIP * 2 - 1)
|
2010-06-21 00:01:00 +04:00
|
|
|
|
|
|
|
#define SD_INODE_SIZE (sizeof(SheepdogInode))
|
|
|
|
#define CURRENT_VDI_ID 0
|
|
|
|
|
2014-08-11 09:43:45 +04:00
|
|
|
#define LOCK_TYPE_NORMAL 0
|
|
|
|
#define LOCK_TYPE_SHARED 1 /* for iSCSI multipath */
|
|
|
|
|
2010-06-21 00:01:00 +04:00
|
|
|
typedef struct SheepdogReq {
|
|
|
|
uint8_t proto_ver;
|
|
|
|
uint8_t opcode;
|
|
|
|
uint16_t flags;
|
|
|
|
uint32_t epoch;
|
|
|
|
uint32_t id;
|
|
|
|
uint32_t data_length;
|
|
|
|
uint32_t opcode_specific[8];
|
|
|
|
} SheepdogReq;
|
|
|
|
|
|
|
|
typedef struct SheepdogRsp {
|
|
|
|
uint8_t proto_ver;
|
|
|
|
uint8_t opcode;
|
|
|
|
uint16_t flags;
|
|
|
|
uint32_t epoch;
|
|
|
|
uint32_t id;
|
|
|
|
uint32_t data_length;
|
|
|
|
uint32_t result;
|
|
|
|
uint32_t opcode_specific[7];
|
|
|
|
} SheepdogRsp;
|
|
|
|
|
|
|
|
typedef struct SheepdogObjReq {
|
|
|
|
uint8_t proto_ver;
|
|
|
|
uint8_t opcode;
|
|
|
|
uint16_t flags;
|
|
|
|
uint32_t epoch;
|
|
|
|
uint32_t id;
|
|
|
|
uint32_t data_length;
|
|
|
|
uint64_t oid;
|
|
|
|
uint64_t cow_oid;
|
2013-10-23 12:51:51 +04:00
|
|
|
uint8_t copies;
|
2013-10-23 12:51:52 +04:00
|
|
|
uint8_t copy_policy;
|
|
|
|
uint8_t reserved[6];
|
2010-06-21 00:01:00 +04:00
|
|
|
uint64_t offset;
|
|
|
|
} SheepdogObjReq;
|
|
|
|
|
|
|
|
typedef struct SheepdogObjRsp {
|
|
|
|
uint8_t proto_ver;
|
|
|
|
uint8_t opcode;
|
|
|
|
uint16_t flags;
|
|
|
|
uint32_t epoch;
|
|
|
|
uint32_t id;
|
|
|
|
uint32_t data_length;
|
|
|
|
uint32_t result;
|
2013-10-23 12:51:51 +04:00
|
|
|
uint8_t copies;
|
2013-10-23 12:51:52 +04:00
|
|
|
uint8_t copy_policy;
|
|
|
|
uint8_t reserved[2];
|
2010-06-21 00:01:00 +04:00
|
|
|
uint32_t pad[6];
|
|
|
|
} SheepdogObjRsp;
|
|
|
|
|
|
|
|
typedef struct SheepdogVdiReq {
|
|
|
|
uint8_t proto_ver;
|
|
|
|
uint8_t opcode;
|
|
|
|
uint16_t flags;
|
|
|
|
uint32_t epoch;
|
|
|
|
uint32_t id;
|
|
|
|
uint32_t data_length;
|
|
|
|
uint64_t vdi_size;
|
2014-01-03 16:13:12 +04:00
|
|
|
uint32_t base_vdi_id;
|
2013-10-23 12:51:51 +04:00
|
|
|
uint8_t copies;
|
2013-10-23 12:51:52 +04:00
|
|
|
uint8_t copy_policy;
|
2015-02-13 12:20:53 +03:00
|
|
|
uint8_t store_policy;
|
|
|
|
uint8_t block_size_shift;
|
2010-06-21 00:01:00 +04:00
|
|
|
uint32_t snapid;
|
2014-08-11 09:43:45 +04:00
|
|
|
uint32_t type;
|
|
|
|
uint32_t pad[2];
|
2010-06-21 00:01:00 +04:00
|
|
|
} SheepdogVdiReq;
|
|
|
|
|
|
|
|
typedef struct SheepdogVdiRsp {
|
|
|
|
uint8_t proto_ver;
|
|
|
|
uint8_t opcode;
|
|
|
|
uint16_t flags;
|
|
|
|
uint32_t epoch;
|
|
|
|
uint32_t id;
|
|
|
|
uint32_t data_length;
|
|
|
|
uint32_t result;
|
|
|
|
uint32_t rsvd;
|
|
|
|
uint32_t vdi_id;
|
|
|
|
uint32_t pad[5];
|
|
|
|
} SheepdogVdiRsp;
|
|
|
|
|
2015-02-13 12:20:53 +03:00
|
|
|
typedef struct SheepdogClusterRsp {
|
|
|
|
uint8_t proto_ver;
|
|
|
|
uint8_t opcode;
|
|
|
|
uint16_t flags;
|
|
|
|
uint32_t epoch;
|
|
|
|
uint32_t id;
|
|
|
|
uint32_t data_length;
|
|
|
|
uint32_t result;
|
|
|
|
uint8_t nr_copies;
|
|
|
|
uint8_t copy_policy;
|
|
|
|
uint8_t block_size_shift;
|
|
|
|
uint8_t __pad1;
|
|
|
|
uint32_t __pad2[6];
|
|
|
|
} SheepdogClusterRsp;
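The request and response structs above all share the same leading fields (proto_ver through data_length) and differ only in the opcode-specific tail. The following is a minimal sketch, assuming a common ABI with natural alignment, that restates two of the structs under hypothetical `Demo` names and checks that they have the same size and that the opcode-specific part starts at the same offset. The `Demo*` names and the helper are illustrative only and are not part of the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Restated copy of the generic request header from the listing above. */
typedef struct DemoSheepdogReq {
    uint8_t proto_ver;
    uint8_t opcode;
    uint16_t flags;
    uint32_t epoch;
    uint32_t id;
    uint32_t data_length;
    uint32_t opcode_specific[8];
} DemoSheepdogReq;

/* Restated copy of the object request from the listing above. */
typedef struct DemoSheepdogObjReq {
    uint8_t proto_ver;
    uint8_t opcode;
    uint16_t flags;
    uint32_t epoch;
    uint32_t id;
    uint32_t data_length;
    uint64_t oid;
    uint64_t cow_oid;
    uint8_t copies;
    uint8_t copy_policy;
    uint8_t reserved[6];
    uint64_t offset;
} DemoSheepdogObjReq;

/*
 * Hypothetical helper: returns 1 when the object request has the same
 * size as the generic header and its specific fields begin where
 * opcode_specific[] begins.
 */
static inline int demo_headers_are_compatible(void)
{
    return sizeof(DemoSheepdogReq) == sizeof(DemoSheepdogObjReq) &&
           offsetof(DemoSheepdogReq, opcode_specific) ==
           offsetof(DemoSheepdogObjReq, oid);
}

/* Small self-check of the layout assumption stated in the text. */
static inline void demo_header_self_check(void)
{
    assert(demo_headers_are_compatible());
}
```

Summing the field widths gives 48 bytes for each struct (16 bytes of common fields plus 32 bytes of opcode-specific data), which is what allows a specific request to be passed where a generic header pointer is expected.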
|
|
|
|
|
2010-06-21 00:01:00 +04:00
|
|
|
typedef struct SheepdogInode {
|
|
|
|
char name[SD_MAX_VDI_LEN];
|
|
|
|
char tag[SD_MAX_VDI_TAG_LEN];
|
|
|
|
uint64_t ctime;
|
|
|
|
uint64_t snap_ctime;
|
|
|
|
uint64_t vm_clock_nsec;
|
|
|
|
uint64_t vdi_size;
|
|
|
|
uint64_t vm_state_size;
|
|
|
|
uint16_t copy_policy;
|
|
|
|
uint8_t nr_copies;
|
|
|
|
uint8_t block_size_shift;
|
|
|
|
uint32_t snap_id;
|
|
|
|
uint32_t vdi_id;
|
|
|
|
uint32_t parent_vdi_id;
|
|
|
|
uint32_t child_vdi_id[MAX_CHILDREN];
|
|
|
|
uint32_t data_vdi_id[MAX_DATA_OBJS];
|
|
|
|
} SheepdogInode;
|
|
|
|
|
2014-06-06 08:35:12 +04:00
|
|
|
#define SD_INODE_HEADER_SIZE offsetof(SheepdogInode, data_vdi_id)
|
|
|
|
|
2010-06-21 00:01:00 +04:00
|
|
|
/*
|
|
|
|
* 64 bit FNV-1a non-zero initial basis
|
|
|
|
*/
|
|
|
|
#define FNV1A_64_INIT ((uint64_t)0xcbf29ce484222325ULL)
|
|
|
|
|
|
|
|
/*
|
|
|
|
* 64 bit Fowler/Noll/Vo FNV-1a hash code
|
|
|
|
*/
|
|
|
|
static inline uint64_t fnv_64a_buf(void *buf, size_t len, uint64_t hval)
|
|
|
|
{
|
|
|
|
unsigned char *bp = buf;
|
|
|
|
unsigned char *be = bp + len;
|
|
|
|
while (bp < be) {
|
|
|
|
hval ^= (uint64_t) *bp++;
|
|
|
|
hval += (hval << 1) + (hval << 4) + (hval << 5) +
|
|
|
|
(hval << 7) + (hval << 8) + (hval << 40);
|
|
|
|
}
|
|
|
|
return hval;
|
|
|
|
}
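The shift-and-add expression in fnv_64a_buf() is an expanded multiplication: the shifts sum to 1 + 2 + 16 + 32 + 128 + 256 + 2^40 = 0x100000001b3, the 64-bit FNV prime. The sketch below restates the loop under a hypothetical `demo_` name next to a plain multiplication form and checks that both agree. The `DEMO_*` macros and function names are illustrative and are not defined in the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_FNV1A_64_INIT  UINT64_C(0xcbf29ce484222325)
#define DEMO_FNV1A_64_PRIME UINT64_C(0x100000001b3)

/* Shift-and-add form, mirroring fnv_64a_buf() in the listing above. */
static inline uint64_t demo_fnv1a_shift(const void *buf, size_t len,
                                        uint64_t hval)
{
    const unsigned char *bp = buf;
    const unsigned char *be = bp + len;

    while (bp < be) {
        hval ^= (uint64_t)*bp++;
        hval += (hval << 1) + (hval << 4) + (hval << 5) +
                (hval << 7) + (hval << 8) + (hval << 40);
    }
    return hval;
}

/* Equivalent form: XOR the byte in, then multiply by the FNV prime. */
static inline uint64_t demo_fnv1a_mul(const void *buf, size_t len,
                                      uint64_t hval)
{
    const unsigned char *bp = buf;
    const unsigned char *be = bp + len;

    while (bp < be) {
        hval ^= (uint64_t)*bp++;
        hval *= DEMO_FNV1A_64_PRIME;
    }
    return hval;
}

/* Both forms must agree for any input; "dolly" is an arbitrary sample. */
static inline void demo_fnv1a_self_check(void)
{
    const char name[] = "dolly";

    assert(demo_fnv1a_shift(name, 5, DEMO_FNV1A_64_INIT) ==
           demo_fnv1a_mul(name, 5, DEMO_FNV1A_64_INIT));
}
```

An empty buffer returns the initial basis unchanged, which is why callers pass FNV1A_64_INIT as the starting hval.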
|
|
|
|
|
2012-10-06 20:57:14 +04:00
|
|
|
static inline bool is_data_obj_writable(SheepdogInode *inode, unsigned int idx)
|
2010-06-21 00:01:00 +04:00
|
|
|
{
|
|
|
|
return inode->vdi_id == inode->data_vdi_id[idx];
|
|
|
|
}
|
|
|
|
|
2012-10-06 20:57:14 +04:00
|
|
|
static inline bool is_data_obj(uint64_t oid)
|
2010-06-21 00:01:00 +04:00
|
|
|
{
|
|
|
|
return !(VDI_BIT & oid);
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline uint64_t data_oid_to_idx(uint64_t oid)
|
|
|
|
{
|
|
|
|
return oid & (MAX_DATA_OBJS - 1);
|
|
|
|
}
|
|
|
|
|
2013-10-24 11:01:13 +04:00
|
|
|
static inline uint32_t oid_to_vid(uint64_t oid)
|
|
|
|
{
|
|
|
|
return (oid & ~VDI_BIT) >> VDI_SPACE_SHIFT;
|
|
|
|
}
|
|
|
|
|
2010-06-21 00:01:00 +04:00
|
|
|
static inline uint64_t vid_to_vdi_oid(uint32_t vid)
|
|
|
|
{
|
|
|
|
return VDI_BIT | ((uint64_t)vid << VDI_SPACE_SHIFT);
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline uint64_t vid_to_vmstate_oid(uint32_t vid, uint32_t idx)
|
|
|
|
{
|
|
|
|
return VMSTATE_BIT | ((uint64_t)vid << VDI_SPACE_SHIFT) | idx;
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline uint64_t vid_to_data_oid(uint32_t vid, uint32_t idx)
|
|
|
|
{
|
|
|
|
return ((uint64_t)vid << VDI_SPACE_SHIFT) | idx;
|
|
|
|
}
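The helpers above pack a VDI id and a data-object index into one 64-bit object id following the "Object ID rules" comment: the index sits in the low bits, the VDI id starts at bit 32, and bit 63 marks a VDI (inode) object. The sketch below restates those helpers under hypothetical `demo_` names so the round trip can be tried in isolation. The names and the sample id 0x123456 are illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_VDI_SPACE_SHIFT 32
#define DEMO_VDI_BIT         (UINT64_C(1) << 63)
#define DEMO_MAX_DATA_OBJS   (UINT64_C(1) << 20)

/* Data object id: VDI id in bits 32 and up, object index in the low bits. */
static inline uint64_t demo_vid_to_data_oid(uint32_t vid, uint32_t idx)
{
    return ((uint64_t)vid << DEMO_VDI_SPACE_SHIFT) | idx;
}

/* VDI (inode) object id: same VDI id field, with bit 63 set. */
static inline uint64_t demo_vid_to_vdi_oid(uint32_t vid)
{
    return DEMO_VDI_BIT | ((uint64_t)vid << DEMO_VDI_SPACE_SHIFT);
}

/* Recover the VDI id by clearing bit 63 and shifting down. */
static inline uint32_t demo_oid_to_vid(uint64_t oid)
{
    return (oid & ~DEMO_VDI_BIT) >> DEMO_VDI_SPACE_SHIFT;
}

/* Recover the data-object index from the low 20 bits. */
static inline uint64_t demo_data_oid_to_idx(uint64_t oid)
{
    return oid & (DEMO_MAX_DATA_OBJS - 1);
}

static inline bool demo_is_data_obj(uint64_t oid)
{
    return !(DEMO_VDI_BIT & oid);
}

/* Round trip for one sample id. */
static inline void demo_oid_self_check(void)
{
    uint64_t oid = demo_vid_to_data_oid(0x123456, 7);

    assert(demo_oid_to_vid(oid) == 0x123456);
    assert(demo_data_oid_to_idx(oid) == 7);
}
```

For VDI id 0x123456 and index 7 the data object id is 0x0012345600000007, and the matching inode object id is 0x8012345600000000.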
|
|
|
|
|
2012-10-06 20:57:14 +04:00
|
|
|
static inline bool is_snapshot(struct SheepdogInode *inode)
|
2010-06-21 00:01:00 +04:00
|
|
|
{
|
|
|
|
return !!inode->snap_ctime;
|
|
|
|
}
|
|
|
|
|
2015-12-23 15:22:26 +03:00
|
|
|
static inline size_t count_data_objs(const struct SheepdogInode *inode)
|
|
|
|
{
|
|
|
|
return DIV_ROUND_UP(inode->vdi_size,
|
|
|
|
(1UL << inode->block_size_shift));
|
|
|
|
}
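count_data_objs() rounds the VDI size up to a whole number of objects of size 2^block_size_shift. A minimal sketch of the same arithmetic follows, assuming DIV_ROUND_UP(n, d) expands to the usual (n + d - 1) / d. The function name is hypothetical, and this sketch uses a 64-bit constant for the shift rather than the 1UL in the listing.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for count_data_objs(): number of data objects
 * needed to hold vdi_size bytes when each object is
 * 2^block_size_shift bytes.
 */
static inline uint64_t demo_count_data_objs(uint64_t vdi_size,
                                            uint8_t block_size_shift)
{
    uint64_t object_size = UINT64_C(1) << block_size_shift;

    /* Round up: a partially filled last object still counts. */
    return (vdi_size + object_size - 1) / object_size;
}

/* A 10 MiB image with the default 4 MiB objects needs three objects. */
static inline void demo_count_self_check(void)
{
    assert(demo_count_data_objs(UINT64_C(10) << 20, 22) == 3);
}
```

With the default shift of 22 (SD_DEFAULT_BLOCK_SIZE_SHIFT), an image of SD_MAX_VDI_SIZE bytes (2^42) needs exactly MAX_DATA_OBJS (2^20) objects.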
|
|
|
|
|
2010-06-21 00:01:00 +04:00
|
|
|
typedef struct SheepdogAIOCB SheepdogAIOCB;
|
2016-11-29 14:32:43 +03:00
|
|
|
typedef struct BDRVSheepdogState BDRVSheepdogState;
|
2010-06-21 00:01:00 +04:00
|
|
|
|
|
|
|
typedef struct AIOReq {
|
|
|
|
SheepdogAIOCB *aiocb;
|
|
|
|
unsigned int iov_offset;
|
|
|
|
|
|
|
|
uint64_t oid;
|
|
|
|
uint64_t base_oid;
|
|
|
|
uint64_t offset;
|
|
|
|
unsigned int data_len;
|
|
|
|
uint8_t flags;
|
|
|
|
uint32_t id;
|
2014-06-06 08:35:11 +04:00
|
|
|
bool create;
|
2010-06-21 00:01:00 +04:00
|
|
|
|
2012-06-27 02:26:22 +04:00
|
|
|
QLIST_ENTRY(AIOReq) aio_siblings;
|
2010-06-21 00:01:00 +04:00
|
|
|
} AIOReq;
|
|
|
|
|
|
|
|
enum AIOCBState {
|
|
|
|
AIOCB_WRITE_UDATA,
|
|
|
|
AIOCB_READ_UDATA,
|
2013-01-15 12:28:55 +04:00
|
|
|
AIOCB_FLUSH_CACHE,
|
2013-04-23 10:03:33 +04:00
|
|
|
AIOCB_DISCARD_OBJ,
|
2010-06-21 00:01:00 +04:00
|
|
|
};
|
|
|
|
|
2015-09-01 06:03:09 +03:00
|
|
|
#define AIOCBOverlapping(x, y) \
|
2015-07-17 19:44:24 +03:00
|
|
|
(!(x->max_affect_data_idx < y->min_affect_data_idx \
|
|
|
|
|| y->max_affect_data_idx < x->min_affect_data_idx))
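AIOCBOverlapping() is a closed-interval intersection test: two requests overlap unless one range ends strictly before the other begins. The sketch below expresses the same predicate as a function over a hypothetical `DemoRange` struct. The struct, the function names, and the four-argument wrapper are illustrative and do not exist in the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the min/max_affect_data_idx pair. */
typedef struct DemoRange {
    uint32_t min_idx;
    uint32_t max_idx;
} DemoRange;

/* Same predicate as the AIOCBOverlapping() macro, on closed ranges. */
static inline bool demo_ranges_overlap(const DemoRange *x, const DemoRange *y)
{
    return !(x->max_idx < y->min_idx || y->max_idx < x->min_idx);
}

/* Convenience wrapper so the ranges can be given as plain numbers. */
static inline bool demo_overlap(uint32_t min1, uint32_t max1,
                                uint32_t min2, uint32_t max2)
{
    DemoRange a = { min1, max1 };
    DemoRange b = { min2, max2 };

    return demo_ranges_overlap(&a, &b);
}

/* Ranges that share an endpoint overlap; disjoint ranges do not. */
static inline void demo_overlap_self_check(void)
{
    assert(demo_overlap(0, 1, 1, 2));
    assert(!demo_overlap(0, 1, 2, 3));
}
```

Because the ranges are closed, two requests that touch the same object index at their boundary are treated as overlapping.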
|
|
|
|
|
2010-06-21 00:01:00 +04:00
|
|
|
struct SheepdogAIOCB {
|
2016-11-29 14:32:43 +03:00
|
|
|
BDRVSheepdogState *s;
|
2010-06-21 00:01:00 +04:00
|
|
|
|
|
|
|
QEMUIOVector *qiov;
|
|
|
|
|
|
|
|
int64_t sector_num;
|
|
|
|
int nb_sectors;
|
|
|
|
|
|
|
|
int ret;
|
|
|
|
enum AIOCBState aiocb_type;
|
|
|
|
|
2011-08-12 16:33:15 +04:00
|
|
|
Coroutine *coroutine;
|
2012-06-27 02:26:21 +04:00
|
|
|
int nr_pending;
|
2015-07-17 19:44:24 +03:00
|
|
|
|
|
|
|
uint32_t min_affect_data_idx;
|
|
|
|
uint32_t max_affect_data_idx;
|
|
|
|
|
2015-09-01 06:03:09 +03:00
|
|
|
/*
|
|
|
|
* The difference between affect_data_idx and dirty_data_idx:
|
|
|
|
* affect_data_idx represents range of index of all request types.
|
|
|
|
* dirty_data_idx represents range of index updated by COW requests.
|
|
|
|
* dirty_data_idx is used for updating an inode object.
|
|
|
|
*/
|
|
|
|
uint32_t min_dirty_data_idx;
|
|
|
|
uint32_t max_dirty_data_idx;
|
|
|
|
|
2015-07-17 19:44:24 +03:00
|
|
|
QLIST_ENTRY(SheepdogAIOCB) aiocb_siblings;
|
2010-06-21 00:01:00 +04:00
|
|
|
};
|
|
|
|
|
2016-11-29 14:32:43 +03:00
|
|
|
struct BDRVSheepdogState {
|
2013-10-24 11:01:15 +04:00
|
|
|
BlockDriverState *bs;
|
2014-05-08 18:34:52 +04:00
|
|
|
AioContext *aio_context;
|
2013-10-24 11:01:15 +04:00
|
|
|
|
2010-06-21 00:01:00 +04:00
|
|
|
SheepdogInode inode;
|
|
|
|
|
|
|
|
char name[SD_MAX_VDI_LEN];
|
2012-10-06 20:57:14 +04:00
|
|
|
bool is_snapshot;
|
2013-01-10 12:03:47 +04:00
|
|
|
uint32_t cache_flags;
|
2013-04-23 10:03:33 +04:00
|
|
|
bool discard_supported;
|
2010-06-21 00:01:00 +04:00
|
|
|
|
2017-04-26 10:36:41 +03:00
|
|
|
SocketAddress *addr;
|
2010-06-21 00:01:00 +04:00
|
|
|
int fd;
|
|
|
|
|
2011-08-12 16:33:15 +04:00
|
|
|
CoMutex lock;
|
|
|
|
Coroutine *co_send;
|
|
|
|
Coroutine *co_recv;
|
|
|
|
|
2010-06-21 00:01:00 +04:00
|
|
|
uint32_t aioreq_seq_num;
|
2013-10-24 11:01:15 +04:00
|
|
|
|
|
|
|
/* Every aio request must be linked to either of these queues. */
|
2018-12-06 13:58:10 +03:00
|
|
|
QLIST_HEAD(, AIOReq) inflight_aio_head;
|
|
|
|
QLIST_HEAD(, AIOReq) failed_aio_head;
|
2015-07-17 19:44:24 +03:00
|
|
|
|
2017-06-29 16:27:48 +03:00
|
|
|
CoMutex queue_lock;
|
2015-09-01 06:03:09 +03:00
|
|
|
CoQueue overlapping_queue;
|
2018-12-06 13:58:10 +03:00
|
|
|
QLIST_HEAD(, SheepdogAIOCB) inflight_aiocb_head;
|
2016-11-29 14:32:43 +03:00
|
|
|
};
|
2010-06-21 00:01:00 +04:00
|
|
|
|
2015-08-28 05:53:58 +03:00
|
|
|
typedef struct BDRVSheepdogReopenState {
|
|
|
|
int fd;
|
|
|
|
int cache_flags;
|
|
|
|
} BDRVSheepdogReopenState;
|
|
|
|
|
2017-11-08 01:27:21 +03:00
|
|
|
static const char *sd_strerror(int err)
|
2010-06-21 00:01:00 +04:00
|
|
|
{
|
|
|
|
int i;
|
|
|
|
|
|
|
|
static const struct {
|
|
|
|
int err;
|
|
|
|
const char *desc;
|
|
|
|
} errors[] = {
|
|
|
|
{SD_RES_SUCCESS, "Success"},
|
|
|
|
{SD_RES_UNKNOWN, "Unknown error"},
|
|
|
|
{SD_RES_NO_OBJ, "No object found"},
|
|
|
|
{SD_RES_EIO, "I/O error"},
|
|
|
|
{SD_RES_VDI_EXIST, "VDI exists already"},
|
|
|
|
{SD_RES_INVALID_PARMS, "Invalid parameters"},
|
|
|
|
{SD_RES_SYSTEM_ERROR, "System error"},
|
|
|
|
{SD_RES_VDI_LOCKED, "VDI is already locked"},
|
|
|
|
{SD_RES_NO_VDI, "No vdi found"},
|
|
|
|
{SD_RES_NO_BASE_VDI, "No base VDI found"},
|
|
|
|
{SD_RES_VDI_READ, "Failed to read the requested VDI"},
|
|
|
|
{SD_RES_VDI_WRITE, "Failed to write the requested VDI"},
|
|
|
|
{SD_RES_BASE_VDI_READ, "Failed to read the base VDI"},
|
|
|
|
{SD_RES_BASE_VDI_WRITE, "Failed to write the base VDI"},
|
|
|
|
{SD_RES_NO_TAG, "Failed to find the requested tag"},
|
|
|
|
{SD_RES_STARTUP, "The system is still booting"},
|
|
|
|
{SD_RES_VDI_NOT_LOCKED, "VDI isn't locked"},
|
|
|
|
{SD_RES_SHUTDOWN, "The system is shutting down"},
|
|
|
|
{SD_RES_NO_MEM, "Out of memory on the server"},
|
|
|
|
{SD_RES_FULL_VDI, "We already have the maximum vdis"},
|
|
|
|
{SD_RES_VER_MISMATCH, "Protocol version mismatch"},
|
|
|
|
{SD_RES_NO_SPACE, "Server has no space for new objects"},
|
|
|
|
{SD_RES_WAIT_FOR_FORMAT, "Sheepdog is waiting for a format operation"},
|
|
|
|
{SD_RES_WAIT_FOR_JOIN, "Sheepdog is waiting for other nodes joining"},
|
|
|
|
{SD_RES_JOIN_FAILED, "Target node had failed to join sheepdog"},
|
2013-03-18 10:27:55 +04:00
|
|
|
{SD_RES_HALT, "Sheepdog has stopped serving I/O requests"},
|
2013-04-25 20:19:52 +04:00
|
|
|
{SD_RES_READONLY, "Object is read-only"},
|
2010-06-21 00:01:00 +04:00
|
|
|
};
|
|
|
|
|
|
|
|
for (i = 0; i < ARRAY_SIZE(errors); ++i) {
|
|
|
|
if (errors[i].err == err) {
|
|
|
|
return errors[i].desc;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
return "Invalid error code";
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Sheepdog I/O handling:
|
|
|
|
*
|
2011-08-12 16:33:15 +04:00
|
|
|
* 1. In sd_co_rw_vector, we send the I/O requests to the server and
|
2012-06-27 02:26:22 +04:00
|
|
|
* link the requests to the inflight_aio_head list in the
|
2016-11-29 14:32:42 +03:00
|
|
|
* BDRVSheepdogState. The function yields while waiting for
|
2011-08-12 16:33:15 +04:00
|
|
|
* the response.
|
2010-06-21 00:01:00 +04:00
|
|
|
*
|
2011-08-12 16:33:15 +04:00
|
|
|
* 2. We receive the response in aio_read_response, the fd handler to
|
2016-11-29 14:32:42 +03:00
|
|
|
* the sheepdog connection. We switch back to sd_co_readv/sd_co_writev
|
|
|
|
* after all the requests belonging to the AIOCB are finished. If
|
|
|
|
* needed, sd_co_writev will send another request for the vdi object.
|
2010-06-21 00:01:00 +04:00
|
|
|
*/
|
|
|
|
|
|
|
|
static inline AIOReq *alloc_aio_req(BDRVSheepdogState *s, SheepdogAIOCB *acb,
|
|
|
|
uint64_t oid, unsigned int data_len,
|
2014-06-06 08:35:11 +04:00
|
|
|
uint64_t offset, uint8_t flags, bool create,
|
2010-06-21 00:01:00 +04:00
|
|
|
uint64_t base_oid, unsigned int iov_offset)
|
|
|
|
{
|
|
|
|
AIOReq *aio_req;
|
|
|
|
|
2011-08-21 07:09:37 +04:00
|
|
|
aio_req = g_malloc(sizeof(*aio_req));
|
2010-06-21 00:01:00 +04:00
|
|
|
aio_req->aiocb = acb;
|
|
|
|
aio_req->iov_offset = iov_offset;
|
|
|
|
aio_req->oid = oid;
|
|
|
|
aio_req->base_oid = base_oid;
|
|
|
|
aio_req->offset = offset;
|
|
|
|
aio_req->data_len = data_len;
|
|
|
|
aio_req->flags = flags;
|
|
|
|
aio_req->id = s->aioreq_seq_num++;
|
2014-06-06 08:35:11 +04:00
|
|
|
aio_req->create = create;
|
2010-06-21 00:01:00 +04:00
|
|
|
|
2012-06-27 02:26:21 +04:00
|
|
|
acb->nr_pending++;
|
2010-06-21 00:01:00 +04:00
|
|
|
return aio_req;
|
|
|
|
}
|
|
|
|
|
2016-11-29 14:32:45 +03:00
|
|
|
static void wait_for_overlapping_aiocb(BDRVSheepdogState *s, SheepdogAIOCB *acb)
|
|
|
|
{
|
|
|
|
SheepdogAIOCB *cb;
|
|
|
|
|
|
|
|
retry:
|
|
|
|
QLIST_FOREACH(cb, &s->inflight_aiocb_head, aiocb_siblings) {
|
|
|
|
if (AIOCBOverlapping(acb, cb)) {
|
2017-06-29 16:27:48 +03:00
|
|
|
qemu_co_queue_wait(&s->overlapping_queue, &s->queue_lock);
|
2016-11-29 14:32:45 +03:00
|
|
|
goto retry;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2016-11-29 14:32:43 +03:00
|
|
|
static void sd_aio_setup(SheepdogAIOCB *acb, BDRVSheepdogState *s,
|
|
|
|
QEMUIOVector *qiov, int64_t sector_num, int nb_sectors,
|
|
|
|
int type)
|
2010-06-21 00:01:00 +04:00
|
|
|
{
|
2015-07-17 19:44:24 +03:00
|
|
|
uint32_t object_size;
|
|
|
|
|
|
|
|
object_size = (UINT32_C(1) << s->inode.block_size_shift);
|
2010-06-21 00:01:00 +04:00
|
|
|
|
2016-11-29 14:32:43 +03:00
|
|
|
acb->s = s;
|
2010-06-21 00:01:00 +04:00
|
|
|
|
|
|
|
acb->qiov = qiov;
|
|
|
|
|
|
|
|
acb->sector_num = sector_num;
|
|
|
|
acb->nb_sectors = nb_sectors;
|
|
|
|
|
2011-08-12 16:33:15 +04:00
|
|
|
acb->coroutine = qemu_coroutine_self();
|
2010-06-21 00:01:00 +04:00
|
|
|
acb->ret = 0;
|
2012-06-27 02:26:21 +04:00
|
|
|
acb->nr_pending = 0;
|
2015-07-17 19:44:24 +03:00
|
|
|
|
|
|
|
acb->min_affect_data_idx = acb->sector_num * BDRV_SECTOR_SIZE / object_size;
|
|
|
|
acb->max_affect_data_idx = (acb->sector_num * BDRV_SECTOR_SIZE +
|
|
|
|
acb->nb_sectors * BDRV_SECTOR_SIZE) / object_size;
|
|
|
|
|
2015-09-01 06:03:09 +03:00
|
|
|
acb->min_dirty_data_idx = UINT32_MAX;
|
|
|
|
acb->max_dirty_data_idx = 0;
|
2016-11-29 14:32:43 +03:00
|
|
|
acb->aiocb_type = type;
|
2016-11-29 14:32:45 +03:00
|
|
|
|
|
|
|
if (type == AIOCB_FLUSH_CACHE) {
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
2017-06-29 16:27:48 +03:00
|
|
|
qemu_co_mutex_lock(&s->queue_lock);
|
2016-11-29 14:32:45 +03:00
|
|
|
wait_for_overlapping_aiocb(s, acb);
|
|
|
|
QLIST_INSERT_HEAD(&s->inflight_aiocb_head, acb, aiocb_siblings);
|
2017-06-29 16:27:48 +03:00
|
|
|
qemu_co_mutex_unlock(&s->queue_lock);
|
2010-06-21 00:01:00 +04:00
|
|
|
}
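sd_aio_setup() converts the sector range of a request into a range of data-object indices: the start byte offset and the end byte offset are each divided by the object size. The sketch below reproduces that arithmetic under hypothetical names, assuming BDRV_SECTOR_SIZE is 512 bytes. The `demo_` functions and the `DEMO_SECTOR_SIZE` macro are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: BDRV_SECTOR_SIZE in the listing is 512 bytes. */
#define DEMO_SECTOR_SIZE 512

/* Index of the first data object touched by the request. */
static inline uint32_t demo_min_affect_idx(int64_t sector_num,
                                           uint8_t block_size_shift)
{
    uint32_t object_size = UINT32_C(1) << block_size_shift;

    return sector_num * DEMO_SECTOR_SIZE / object_size;
}

/*
 * Upper index as computed in sd_aio_setup(). The end offset is one
 * past the last byte, so a request that ends exactly on an object
 * boundary also covers the following index.
 */
static inline uint32_t demo_max_affect_idx(int64_t sector_num, int nb_sectors,
                                           uint8_t block_size_shift)
{
    uint32_t object_size = UINT32_C(1) << block_size_shift;

    return (sector_num * DEMO_SECTOR_SIZE +
            (int64_t)nb_sectors * DEMO_SECTOR_SIZE) / object_size;
}

/* A two-sector request straddling the first 4 MiB boundary. */
static inline void demo_affect_idx_self_check(void)
{
    assert(demo_min_affect_idx(8191, 22) == 0);
    assert(demo_max_affect_idx(8191, 2, 22) == 1);
}
```

With 4 MiB objects, sector 8191 is the last sector of object 0, so a two-sector request starting there affects indices 0 through 1.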
|
|
|
|
|
2017-04-26 10:36:41 +03:00
|
|
|
static SocketAddress *sd_server_config(QDict *options, Error **errp)
|
2017-03-30 20:43:17 +03:00
|
|
|
{
|
|
|
|
QDict *server = NULL;
|
|
|
|
Visitor *iv = NULL;
|
2017-04-26 10:36:41 +03:00
|
|
|
SocketAddress *saddr = NULL;
|
2017-03-30 20:43:17 +03:00
|
|
|
Error *local_err = NULL;
|
|
|
|
|
|
|
|
qdict_extract_subqdict(options, &server, "server.");
|
|
|
|
|
2018-06-14 22:14:33 +03:00
|
|
|
iv = qobject_input_visitor_new_flat_confused(server, errp);
|
|
|
|
if (!iv) {
|
2017-03-30 20:43:17 +03:00
|
|
|
goto done;
|
|
|
|
}
|
|
|
|
|
2017-04-26 10:36:41 +03:00
|
|
|
visit_type_SocketAddress(iv, NULL, &saddr, &local_err);
|
2017-03-30 20:43:17 +03:00
|
|
|
if (local_err) {
|
|
|
|
error_propagate(errp, local_err);
|
|
|
|
goto done;
|
|
|
|
}
|
|
|
|
|
|
|
|
done:
|
|
|
|
visit_free(iv);
|
2018-04-19 18:01:43 +03:00
|
|
|
qobject_unref(server);
|
2017-03-30 20:43:17 +03:00
|
|
|
return saddr;
|
|
|
|
}
|
|
|
|
|
2015-02-18 06:57:55 +03:00
|
|
|
/* Return -EIO in case of error, file descriptor on success */
|
2014-05-16 13:00:19 +04:00
|
|
|
static int connect_to_sdog(BDRVSheepdogState *s, Error **errp)
|
2010-06-21 00:01:00 +04:00
|
|
|
{
|
2013-02-22 07:39:52 +04:00
|
|
|
int fd;
|
2010-06-21 00:01:00 +04:00
|
|
|
|
2017-06-16 11:54:45 +03:00
|
|
|
fd = socket_connect(s->addr, errp);
|
2013-02-22 07:39:53 +04:00
|
|
|
|
2017-04-26 10:36:41 +03:00
|
|
|
if (s->addr->type == SOCKET_ADDRESS_TYPE_INET && fd >= 0) {
|
2017-03-06 22:00:42 +03:00
|
|
|
int ret = socket_set_nodelay(fd);
|
|
|
|
if (ret < 0) {
|
2018-10-17 11:26:27 +03:00
|
|
|
warn_report("can't set TCP_NODELAY: %s", strerror(errno));
|
2013-02-22 07:39:53 +04:00
|
|
|
}
|
|
|
|
}
|
2010-06-21 00:01:00 +04:00
|
|
|
|
2014-05-16 13:00:19 +04:00
|
|
|
if (fd >= 0) {
|
2013-03-27 13:10:43 +04:00
|
|
|
qemu_set_nonblock(fd);
|
2015-02-18 06:57:55 +03:00
|
|
|
} else {
|
|
|
|
fd = -EIO;
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
|
|
|
|
return fd;
|
|
|
|
}
|
|
|
|
|
2015-02-18 06:57:55 +03:00
|
|
|
/* Return a non-negative value on success and -errno in case of error */
|
2012-05-30 04:03:55 +04:00
|
|
|
static coroutine_fn int send_co_req(int sockfd, SheepdogReq *hdr, void *data,
|
|
|
|
unsigned int *wlen)
|
2012-04-04 00:03:58 +04:00
|
|
|
{
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
ret = qemu_co_send(sockfd, hdr, sizeof(*hdr));
|
2013-10-24 11:01:11 +04:00
|
|
|
if (ret != sizeof(*hdr)) {
|
2012-04-04 00:03:58 +04:00
|
|
|
error_report("failed to send a req, %s", strerror(errno));
|
2016-03-07 23:36:03 +03:00
|
|
|
return -errno;
|
2012-04-04 00:03:58 +04:00
|
|
|
}
|
|
|
|
|
|
|
|
ret = qemu_co_send(sockfd, data, *wlen);
|
2013-10-24 11:01:11 +04:00
|
|
|
if (ret != *wlen) {
|
2012-04-04 00:03:58 +04:00
|
|
|
error_report("failed to send a req, %s", strerror(errno));
|
2016-03-07 23:36:03 +03:00
|
|
|
return -errno;
|
2012-04-04 00:03:58 +04:00
|
|
|
}
|
|
|
|
|
|
|
|
return ret;
|
|
|
|
}
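The send path above writes the fixed-size header first and then the payload, and treats any short transfer as an error. Below is a minimal sketch of that ordering, assuming plain POSIX `write()` in place of QEMU's coroutine-aware `qemu_co_send()`. The names `req_hdr_sketch`, `write_all_sketch` and `send_req_sketch` are hypothetical, and the two-field header is an assumption made for illustration, not the real `SheepdogReq` layout.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical, simplified request header (not the real SheepdogReq). */
struct req_hdr_sketch {
    uint32_t opcode;
    uint32_t data_length;   /* number of payload bytes that follow */
};

/* Write exactly len bytes, retrying on EINTR; 0 on success, -errno on error. */
static int write_all_sketch(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR) {
                continue;
            }
            return -errno;
        }
        if (n == 0) {
            return -EIO;    /* no progress: treat as an I/O error */
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Send the header, then the payload; stop at the first failure. */
static int send_req_sketch(int fd, const struct req_hdr_sketch *hdr,
                           const void *data)
{
    int ret;

    assert(hdr != NULL);
    ret = write_all_sketch(fd, hdr, sizeof(*hdr));
    if (ret < 0) {
        return ret;
    }
    return write_all_sketch(fd, data, hdr->data_length);
}
```

Unlike the driver code, this sketch captures `errno` immediately after the failing call, before anything else can overwrite it.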
|
2012-05-30 04:03:55 +04:00
|
|
|
|
2012-07-04 20:41:06 +04:00
|
|
|
typedef struct SheepdogReqCo {
|
|
|
|
int sockfd;
|
2016-10-27 13:48:58 +03:00
|
|
|
BlockDriverState *bs;
|
2014-05-08 18:34:52 +04:00
|
|
|
AioContext *aio_context;
|
2012-07-04 20:41:06 +04:00
|
|
|
SheepdogReq *hdr;
|
|
|
|
void *data;
|
|
|
|
unsigned int *wlen;
|
|
|
|
unsigned int *rlen;
|
|
|
|
int ret;
|
|
|
|
bool finished;
|
2017-02-13 16:52:30 +03:00
|
|
|
Coroutine *co;
|
2012-07-04 20:41:06 +04:00
|
|
|
} SheepdogReqCo;
|
|
|
|
|
2017-02-13 16:52:30 +03:00
|
|
|
static void restart_co_req(void *opaque)
|
|
|
|
{
|
|
|
|
SheepdogReqCo *srco = opaque;
|
|
|
|
|
|
|
|
aio_co_wake(srco->co);
|
|
|
|
}
|
|
|
|
|
2012-07-04 20:41:06 +04:00
|
|
|
static coroutine_fn void do_co_req(void *opaque)
|
2012-04-04 00:03:58 +04:00
|
|
|
{
|
|
|
|
int ret;
|
2012-07-04 20:41:06 +04:00
|
|
|
SheepdogReqCo *srco = opaque;
|
|
|
|
int sockfd = srco->sockfd;
|
|
|
|
SheepdogReq *hdr = srco->hdr;
|
|
|
|
void *data = srco->data;
|
|
|
|
unsigned int *wlen = srco->wlen;
|
|
|
|
unsigned int *rlen = srco->rlen;
|
2012-06-27 02:26:19 +04:00
|
|
|
|
2017-02-13 16:52:30 +03:00
|
|
|
srco->co = qemu_coroutine_self();
|
2015-10-23 06:08:05 +03:00
|
|
|
aio_set_fd_handler(srco->aio_context, sockfd, false,
|
2017-02-13 16:52:30 +03:00
|
|
|
NULL, restart_co_req, NULL, srco);
|
2012-04-04 00:03:58 +04:00
|
|
|
|
|
|
|
ret = send_co_req(sockfd, hdr, data, wlen);
|
|
|
|
if (ret < 0) {
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
2015-10-23 06:08:05 +03:00
|
|
|
aio_set_fd_handler(srco->aio_context, sockfd, false,
|
2017-02-13 16:52:30 +03:00
|
|
|
restart_co_req, NULL, NULL, srco);
|
2012-06-27 02:26:19 +04:00
|
|
|
|
2012-04-04 00:03:58 +04:00
|
|
|
ret = qemu_co_recv(sockfd, hdr, sizeof(*hdr));
|
2013-10-24 11:01:11 +04:00
|
|
|
if (ret != sizeof(*hdr)) {
|
2012-04-04 00:03:58 +04:00
|
|
|
error_report("failed to get a rsp, %s", strerror(errno));
|
2012-05-16 22:15:33 +04:00
|
|
|
ret = -errno;
|
2012-04-04 00:03:58 +04:00
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (*rlen > hdr->data_length) {
|
|
|
|
*rlen = hdr->data_length;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (*rlen) {
|
|
|
|
ret = qemu_co_recv(sockfd, data, *rlen);
|
2013-10-24 11:01:11 +04:00
|
|
|
if (ret != *rlen) {
|
2012-04-04 00:03:58 +04:00
|
|
|
error_report("failed to get the data, %s", strerror(errno));
|
2012-05-16 22:15:33 +04:00
|
|
|
ret = -errno;
|
2012-04-04 00:03:58 +04:00
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
ret = 0;
|
|
|
|
out:
|
2013-03-12 11:05:43 +04:00
|
|
|
/* there is at most one request for this sockfd, so it is safe to
|
|
|
|
* set each handler to NULL. */
|
2015-10-23 06:08:05 +03:00
|
|
|
aio_set_fd_handler(srco->aio_context, sockfd, false,
|
2016-12-01 22:26:41 +03:00
|
|
|
NULL, NULL, NULL, NULL);
|
2012-07-04 20:41:06 +04:00
|
|
|
|
2017-02-13 16:52:30 +03:00
|
|
|
srco->co = NULL;
|
2012-07-04 20:41:06 +04:00
|
|
|
srco->ret = ret;
|
2017-06-05 15:38:54 +03:00
|
|
|
/* Set srco->finished before reading bs->wakeup. */
|
|
|
|
atomic_mb_set(&srco->finished, true);
|
2016-10-27 13:49:05 +03:00
|
|
|
if (srco->bs) {
|
|
|
|
bdrv_wakeup(srco->bs);
|
|
|
|
}
|
2012-07-04 20:41:06 +04:00
|
|
|
}
|
|
|
|
|
2015-02-18 06:57:55 +03:00
|
|
|
/*
|
|
|
|
* Send the request to the sheep in a synchronous manner.
|
|
|
|
*
|
|
|
|
* Return 0 on success, -errno in case of error.
|
|
|
|
*/
|
2016-10-27 13:48:58 +03:00
|
|
|
static int do_req(int sockfd, BlockDriverState *bs, SheepdogReq *hdr,
|
2014-05-08 18:34:52 +04:00
|
|
|
void *data, unsigned int *wlen, unsigned int *rlen)
|
2012-07-04 20:41:06 +04:00
|
|
|
{
|
|
|
|
Coroutine *co;
|
|
|
|
SheepdogReqCo srco = {
|
|
|
|
.sockfd = sockfd,
|
2016-10-27 13:48:58 +03:00
|
|
|
.aio_context = bs ? bdrv_get_aio_context(bs) : qemu_get_aio_context(),
|
|
|
|
.bs = bs,
|
2012-07-04 20:41:06 +04:00
|
|
|
.hdr = hdr,
|
|
|
|
.data = data,
|
|
|
|
.wlen = wlen,
|
|
|
|
.rlen = rlen,
|
|
|
|
.ret = 0,
|
|
|
|
.finished = false,
|
|
|
|
};
|
|
|
|
|
|
|
|
if (qemu_in_coroutine()) {
|
|
|
|
do_co_req(&srco);
|
|
|
|
} else {
|
coroutine: move entry argument to qemu_coroutine_create
In practice the entry argument is always known at creation time, and
it is confusing that sometimes qemu_coroutine_enter is used with a
non-NULL argument to re-enter a coroutine (this happens in
block/sheepdog.c and tests/test-coroutine.c). So pass the opaque value
at creation time, for consistency with e.g. aio_bh_new.
Mostly done with the following semantic patch:
@ entry1 @
expression entry, arg, co;
@@
- co = qemu_coroutine_create(entry);
+ co = qemu_coroutine_create(entry, arg);
...
- qemu_coroutine_enter(co, arg);
+ qemu_coroutine_enter(co);
@ entry2 @
expression entry, arg;
identifier co;
@@
- Coroutine *co = qemu_coroutine_create(entry);
+ Coroutine *co = qemu_coroutine_create(entry, arg);
...
- qemu_coroutine_enter(co, arg);
+ qemu_coroutine_enter(co);
@ entry3 @
expression entry, arg;
@@
- qemu_coroutine_enter(qemu_coroutine_create(entry), arg);
+ qemu_coroutine_enter(qemu_coroutine_create(entry, arg));
@ reentry @
expression co;
@@
- qemu_coroutine_enter(co, NULL);
+ qemu_coroutine_enter(co);
except for the aforementioned few places where the semantic patch
stumbled (as expected) and for test_co_queue, which would otherwise
produce an uninitialized variable warning.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-07-04 20:10:01 +03:00
|
|
|
co = qemu_coroutine_create(do_co_req, &srco);
|
2016-10-27 13:48:58 +03:00
|
|
|
if (bs) {
|
2017-04-11 14:43:52 +03:00
|
|
|
bdrv_coroutine_enter(bs, co);
|
2016-10-27 13:48:58 +03:00
|
|
|
BDRV_POLL_WHILE(bs, !srco.finished);
|
|
|
|
} else {
|
|
|
|
qemu_coroutine_enter(co);
|
|
|
|
while (!srco.finished) {
|
|
|
|
aio_poll(qemu_get_aio_context(), true);
|
|
|
|
}
|
2012-07-04 20:41:06 +04:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
return srco.ret;
|
2012-04-04 00:03:58 +04:00
|
|
|
}
|
|
|
|
|
2013-10-24 11:01:16 +04:00
|
|
|
static void coroutine_fn add_aio_request(BDRVSheepdogState *s, AIOReq *aio_req,
|
2014-06-06 08:35:11 +04:00
|
|
|
struct iovec *iov, int niov,
|
|
|
|
enum AIOCBState aiocb_type);
|
2013-10-24 11:01:16 +04:00
|
|
|
static void coroutine_fn resend_aioreq(BDRVSheepdogState *s, AIOReq *aio_req);
|
2013-10-24 11:01:13 +04:00
|
|
|
static int reload_inode(BDRVSheepdogState *s, uint32_t snapid, const char *tag);
|
2014-05-16 13:00:20 +04:00
|
|
|
static int get_sheep_fd(BDRVSheepdogState *s, Error **errp);
|
2013-10-24 11:01:15 +04:00
|
|
|
static void co_write_request(void *opaque);
|
2012-06-27 02:26:23 +04:00
|
|
|
|
2013-10-24 11:01:15 +04:00
|
|
|
static coroutine_fn void reconnect_to_sdog(void *opaque)
|
|
|
|
{
|
|
|
|
BDRVSheepdogState *s = opaque;
|
|
|
|
AIOReq *aio_req, *next;
|
|
|
|
|
2015-10-23 06:08:05 +03:00
|
|
|
aio_set_fd_handler(s->aio_context, s->fd, false, NULL,
|
2016-12-01 22:26:41 +03:00
|
|
|
NULL, NULL, NULL);
|
2013-10-24 11:01:15 +04:00
|
|
|
close(s->fd);
|
|
|
|
s->fd = -1;
|
|
|
|
|
|
|
|
/* Wait for outstanding write requests to be completed. */
|
|
|
|
while (s->co_send != NULL) {
|
|
|
|
co_write_request(opaque);
|
|
|
|
}
|
|
|
|
|
|
|
|
/* Try to reconnect to the sheepdog server once every second. */
|
|
|
|
while (s->fd < 0) {
|
2014-08-28 14:27:55 +04:00
|
|
|
Error *local_err = NULL;
|
2014-05-16 13:00:20 +04:00
|
|
|
s->fd = get_sheep_fd(s, &local_err);
|
2013-10-24 11:01:15 +04:00
|
|
|
if (s->fd < 0) {
|
2018-12-13 19:27:27 +03:00
|
|
|
trace_sheepdog_reconnect_to_sdog();
|
2015-02-12 15:55:05 +03:00
|
|
|
error_report_err(local_err);
|
2017-11-09 13:26:52 +03:00
|
|
|
qemu_co_sleep_ns(QEMU_CLOCK_REALTIME, 1000000000ULL);
|
2013-10-24 11:01:15 +04:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Now we have to resend all the requests in the inflight queue. However,
|
|
|
|
* resend_aioreq() can yield and newly created requests can be added to the
|
|
|
|
* inflight queue before the coroutine is resumed. To avoid mixing them, we
|
|
|
|
* have to move all the inflight requests to the failed queue before
|
|
|
|
* resend_aioreq() is called.
|
|
|
|
*/
|
2017-06-29 16:27:48 +03:00
|
|
|
qemu_co_mutex_lock(&s->queue_lock);
|
2013-10-24 11:01:15 +04:00
|
|
|
QLIST_FOREACH_SAFE(aio_req, &s->inflight_aio_head, aio_siblings, next) {
|
|
|
|
QLIST_REMOVE(aio_req, aio_siblings);
|
|
|
|
QLIST_INSERT_HEAD(&s->failed_aio_head, aio_req, aio_siblings);
|
|
|
|
}
|
|
|
|
|
|
|
|
/* Resend all the failed aio requests. */
|
|
|
|
while (!QLIST_EMPTY(&s->failed_aio_head)) {
|
|
|
|
aio_req = QLIST_FIRST(&s->failed_aio_head);
|
|
|
|
QLIST_REMOVE(aio_req, aio_siblings);
|
2017-06-29 16:27:48 +03:00
|
|
|
qemu_co_mutex_unlock(&s->queue_lock);
|
2013-10-24 11:01:15 +04:00
|
|
|
resend_aioreq(s, aio_req);
|
2017-06-29 16:27:48 +03:00
|
|
|
qemu_co_mutex_lock(&s->queue_lock);
|
2013-10-24 11:01:15 +04:00
|
|
|
}
|
2017-06-29 16:27:48 +03:00
|
|
|
qemu_co_mutex_unlock(&s->queue_lock);
|
2013-10-24 11:01:15 +04:00
|
|
|
}
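The comment in reconnect_to_sdog() explains why the inflight requests are moved to a separate failed queue before any of them is resent: resending can yield, and requests created in the meantime must not be mixed with the ones being retried. Below is a minimal sketch of that two-queue idea, assuming a hand-rolled singly linked list instead of QEMU's `QLIST` macros. `req_sketch`, `move_all_sketch` and `pop_failed_sketch` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request node; the real driver uses AIOReq with QLIST links. */
struct req_sketch {
    int id;
    struct req_sketch *next;
};

/* Move every node from *inflight to the head of *failed; return the count. */
static int move_all_sketch(struct req_sketch **inflight,
                           struct req_sketch **failed)
{
    int moved = 0;

    assert(inflight != NULL && failed != NULL);
    while (*inflight != NULL) {
        struct req_sketch *r = *inflight;
        *inflight = r->next;    /* unlink from the inflight list */
        r->next = *failed;      /* insert at the head of the failed list */
        *failed = r;
        moved++;
    }
    return moved;
}

/* Detach and return the first failed request, or NULL if none is left. */
static struct req_sketch *pop_failed_sketch(struct req_sketch **failed)
{
    struct req_sketch *r = *failed;

    if (r != NULL) {
        *failed = r->next;
        r->next = NULL;
    }
    return r;
}
```

Once the move is done, anything newly added to the inflight list stays there, so a loop that drains the failed list only ever sees the requests that were pending at reconnect time.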
|
|
|
|
|
2010-06-21 00:01:00 +04:00
|
|
|
/*
|
|
|
|
* Receive responses of the I/O requests.
|
|
|
|
*
|
|
|
|
* This function is registered as a fd handler, and called from the
|
|
|
|
* main loop when s->fd is ready for reading responses.
|
|
|
|
*/
|
2011-10-05 11:17:31 +04:00
|
|
|
static void coroutine_fn aio_read_response(void *opaque)
|
2010-06-21 00:01:00 +04:00
|
|
|
{
|
|
|
|
SheepdogObjRsp rsp;
|
|
|
|
BDRVSheepdogState *s = opaque;
|
|
|
|
int fd = s->fd;
|
|
|
|
int ret;
|
|
|
|
AIOReq *aio_req = NULL;
|
|
|
|
SheepdogAIOCB *acb;
|
2013-04-23 10:03:33 +04:00
|
|
|
uint64_t idx;
|
2010-06-21 00:01:00 +04:00
|
|
|
|
|
|
|
/* read a header */
|
2011-09-08 15:46:25 +04:00
|
|
|
ret = qemu_co_recv(fd, &rsp, sizeof(rsp));
|
2013-10-24 11:01:11 +04:00
|
|
|
if (ret != sizeof(rsp)) {
|
2011-06-22 16:03:54 +04:00
|
|
|
error_report("failed to get the header, %s", strerror(errno));
|
2013-10-24 11:01:15 +04:00
|
|
|
goto err;
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
|
2012-06-27 02:26:22 +04:00
|
|
|
/* find the right aio_req from the inflight aio list */
|
|
|
|
QLIST_FOREACH(aio_req, &s->inflight_aio_head, aio_siblings) {
|
2010-06-21 00:01:00 +04:00
|
|
|
if (aio_req->id == rsp.id) {
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
if (!aio_req) {
|
2011-06-22 16:03:54 +04:00
|
|
|
error_report("cannot find aio_req %x", rsp.id);
|
2013-10-24 11:01:15 +04:00
|
|
|
goto err;
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
|
|
|
|
acb = aio_req->aiocb;
|
|
|
|
|
|
|
|
switch (acb->aiocb_type) {
|
|
|
|
case AIOCB_WRITE_UDATA:
|
|
|
|
if (!is_data_obj(aio_req->oid)) {
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
idx = data_oid_to_idx(aio_req->oid);
|
|
|
|
|
2014-06-06 08:35:11 +04:00
|
|
|
if (aio_req->create) {
|
2010-06-21 00:01:00 +04:00
|
|
|
/*
|
|
|
|
* If the object is a newly created one, we need to update
|
|
|
|
* the vdi object (metadata object). min_dirty_data_idx
|
|
|
|
* and max_dirty_data_idx are changed to include the updated
|
|
|
|
* index between them.
|
|
|
|
*/
|
2012-12-17 10:17:26 +04:00
|
|
|
if (rsp.result == SD_RES_SUCCESS) {
|
|
|
|
s->inode.data_vdi_id[idx] = s->inode.vdi_id;
|
2015-09-01 06:03:09 +03:00
|
|
|
acb->max_dirty_data_idx = MAX(idx, acb->max_dirty_data_idx);
|
|
|
|
acb->min_dirty_data_idx = MIN(idx, acb->min_dirty_data_idx);
|
2012-12-17 10:17:26 +04:00
|
|
|
}
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
break;
|
|
|
|
case AIOCB_READ_UDATA:
|
2012-06-07 20:22:46 +04:00
|
|
|
ret = qemu_co_recvv(fd, acb->qiov->iov, acb->qiov->niov,
|
|
|
|
aio_req->iov_offset, rsp.data_length);
|
2013-10-24 11:01:11 +04:00
|
|
|
if (ret != rsp.data_length) {
|
2011-06-22 16:03:54 +04:00
|
|
|
error_report("failed to get the data, %s", strerror(errno));
|
2013-10-24 11:01:15 +04:00
|
|
|
goto err;
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
break;
|
2013-01-15 12:28:55 +04:00
|
|
|
case AIOCB_FLUSH_CACHE:
|
|
|
|
if (rsp.result == SD_RES_INVALID_PARMS) {
|
2018-12-13 19:27:27 +03:00
|
|
|
trace_sheepdog_aio_read_response();
|
2013-01-15 12:28:55 +04:00
|
|
|
s->cache_flags = SD_FLAG_CMD_DIRECT;
|
|
|
|
rsp.result = SD_RES_SUCCESS;
|
|
|
|
}
|
|
|
|
break;
|
2013-04-23 10:03:33 +04:00
|
|
|
case AIOCB_DISCARD_OBJ:
|
|
|
|
switch (rsp.result) {
|
|
|
|
case SD_RES_INVALID_PARMS:
|
2017-03-06 22:00:42 +03:00
|
|
|
error_report("server doesn't support discard command");
|
2013-04-23 10:03:33 +04:00
|
|
|
rsp.result = SD_RES_SUCCESS;
|
|
|
|
s->discard_supported = false;
|
|
|
|
break;
|
|
|
|
default:
|
|
|
|
break;
|
|
|
|
}
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
|
2016-11-29 14:32:42 +03:00
|
|
|
/* No more data for this aio_req (reload_inode below uses its own file
|
|
|
|
* descriptor handler which doesn't use co_recv).
|
|
|
|
*/
|
|
|
|
s->co_recv = NULL;
|
|
|
|
|
2017-06-29 16:27:48 +03:00
|
|
|
qemu_co_mutex_lock(&s->queue_lock);
|
2016-11-29 14:32:44 +03:00
|
|
|
QLIST_REMOVE(aio_req, aio_siblings);
|
2017-06-29 16:27:48 +03:00
|
|
|
qemu_co_mutex_unlock(&s->queue_lock);
|
|
|
|
|
2013-04-25 20:19:54 +04:00
|
|
|
switch (rsp.result) {
|
|
|
|
case SD_RES_SUCCESS:
|
|
|
|
break;
|
|
|
|
case SD_RES_READONLY:
|
2013-10-24 11:01:13 +04:00
|
|
|
if (s->inode.vdi_id == oid_to_vid(aio_req->oid)) {
|
|
|
|
ret = reload_inode(s, 0, "");
|
|
|
|
if (ret < 0) {
|
2013-10-24 11:01:15 +04:00
|
|
|
goto err;
|
2013-10-24 11:01:13 +04:00
|
|
|
}
|
|
|
|
}
|
|
|
|
if (is_data_obj(aio_req->oid)) {
|
|
|
|
aio_req->oid = vid_to_data_oid(s->inode.vdi_id,
|
|
|
|
data_oid_to_idx(aio_req->oid));
|
|
|
|
} else {
|
|
|
|
aio_req->oid = vid_to_vdi_oid(s->inode.vdi_id);
|
|
|
|
}
|
2013-10-24 11:01:16 +04:00
|
|
|
resend_aioreq(s, aio_req);
|
2016-11-29 14:32:42 +03:00
|
|
|
return;
|
2013-04-25 20:19:54 +04:00
|
|
|
default:
|
2010-06-21 00:01:00 +04:00
|
|
|
acb->ret = -EIO;
|
2011-06-22 16:03:54 +04:00
|
|
|
error_report("%s", sd_strerror(rsp.result));
|
2013-04-25 20:19:54 +04:00
|
|
|
break;
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
|
2016-11-29 14:32:44 +03:00
|
|
|
g_free(aio_req);
|
|
|
|
|
|
|
|
if (!--acb->nr_pending) {
|
2010-06-21 00:01:00 +04:00
|
|
|
/*
|
|
|
|
* We've finished all requests which belong to the AIOCB, so
|
2011-08-12 16:33:15 +04:00
|
|
|
* we can switch back to sd_co_readv/writev now.
|
2010-06-21 00:01:00 +04:00
|
|
|
*/
|
2017-02-13 16:52:30 +03:00
|
|
|
aio_co_wake(acb->coroutine);
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
2016-11-29 14:32:42 +03:00
|
|
|
|
2013-10-24 11:01:15 +04:00
|
|
|
return;
|
2016-11-29 14:32:42 +03:00
|
|
|
|
2013-10-24 11:01:15 +04:00
|
|
|
err:
|
|
|
|
reconnect_to_sdog(opaque);
|
2011-08-12 16:33:15 +04:00
|
|
|
}
|
|
|
|
|
|
|
|
static void co_read_response(void *opaque)
|
|
|
|
{
|
|
|
|
BDRVSheepdogState *s = opaque;
|
|
|
|
|
|
|
|
if (!s->co_recv) {
|
2016-07-04 20:10:01 +03:00
|
|
|
s->co_recv = qemu_coroutine_create(aio_read_response, opaque);
|
2011-08-12 16:33:15 +04:00
|
|
|
}
|
|
|
|
|
2017-04-11 17:08:53 +03:00
|
|
|
aio_co_enter(s->aio_context, s->co_recv);
|
2011-08-12 16:33:15 +04:00
|
|
|
}
|
|
|
|
|
|
|
|
static void co_write_request(void *opaque)
|
|
|
|
{
|
|
|
|
BDRVSheepdogState *s = opaque;
|
|
|
|
|
2017-02-13 16:52:30 +03:00
|
|
|
aio_co_wake(s->co_send);
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
2014-03-24 12:30:17 +04:00
|
|
|
* Return a socket descriptor to read/write objects.
|
2010-06-21 00:01:00 +04:00
|
|
|
*
|
2014-03-24 12:30:17 +04:00
|
|
|
* We cannot use this descriptor for other operations because
|
2010-06-21 00:01:00 +04:00
|
|
|
* the block driver may be waiting for a response from the server.
|
|
|
|
*/
|
2014-05-16 13:00:20 +04:00
|
|
|
static int get_sheep_fd(BDRVSheepdogState *s, Error **errp)
|
2010-06-21 00:01:00 +04:00
|
|
|
{
|
2013-02-22 07:39:53 +04:00
|
|
|
int fd;
|
2010-06-21 00:01:00 +04:00
|
|
|
|
2014-05-16 13:00:20 +04:00
|
|
|
fd = connect_to_sdog(s, errp);
|
2010-06-21 00:01:00 +04:00
|
|
|
if (fd < 0) {
|
2012-05-16 22:15:33 +04:00
|
|
|
return fd;
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
|
2015-10-23 06:08:05 +03:00
|
|
|
aio_set_fd_handler(s->aio_context, fd, false,
|
2016-12-01 22:26:41 +03:00
|
|
|
co_read_response, NULL, NULL, s);
|
2010-06-21 00:01:00 +04:00
|
|
|
return fd;
|
|
|
|
}
|
|
|
|
|
2017-03-06 22:00:39 +03:00
|
|
|
/*
|
|
|
|
* Parse numeric snapshot ID in @str
|
|
|
|
* If @str can't be parsed as number, return false.
|
|
|
|
* Else, if the number is zero or too large, set *@snapid to zero and
|
|
|
|
* return true.
|
|
|
|
* Else, set *@snapid to the number and return true.
|
|
|
|
*/
|
|
|
|
static bool sd_parse_snapid(const char *str, uint32_t *snapid)
|
|
|
|
{
|
|
|
|
unsigned long ul;
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
ret = qemu_strtoul(str, NULL, 10, &ul);
|
|
|
|
if (ret == -ERANGE) {
|
|
|
|
ul = ret = 0;
|
|
|
|
}
|
|
|
|
if (ret) {
|
|
|
|
return false;
|
|
|
|
}
|
|
|
|
if (ul > UINT32_MAX) {
|
|
|
|
ul = 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
*snapid = ul;
|
|
|
|
return true;
|
|
|
|
}
|
|
|
|
|
|
|
|
static bool sd_parse_snapid_or_tag(const char *str,
|
|
|
|
uint32_t *snapid, char tag[])
|
|
|
|
{
|
|
|
|
if (!sd_parse_snapid(str, snapid)) {
|
|
|
|
*snapid = 0;
|
|
|
|
if (g_strlcpy(tag, str, SD_MAX_VDI_TAG_LEN) >= SD_MAX_VDI_TAG_LEN) {
|
|
|
|
return false;
|
|
|
|
}
|
|
|
|
} else if (!*snapid) {
|
|
|
|
return false;
|
|
|
|
} else {
|
|
|
|
tag[0] = 0;
|
|
|
|
}
|
|
|
|
return true;
|
|
|
|
}
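The two parsers above implement a small decision table: a non-zero numeric string selects a snapshot ID, a non-numeric string is taken as a tag, and zero or an out-of-range number is rejected. Below is a minimal sketch of that logic, assuming standard C `strtoul()` in place of `qemu_strtoul()` and `strlen()`/`memcpy()` in place of `g_strlcpy()`. `parse_snapid_sketch`, `parse_snapid_or_tag_sketch` and `TAG_LEN_SKETCH` are hypothetical names, and the buffer size of 256 is an assumed value.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TAG_LEN_SKETCH 256  /* assumed size, stands in for SD_MAX_VDI_TAG_LEN */

/* false if str is not a number; otherwise true, with out-of-range mapped to 0 */
static bool parse_snapid_sketch(const char *str, uint32_t *snapid)
{
    char *end;
    unsigned long ul;

    assert(str != NULL && snapid != NULL);
    errno = 0;
    ul = strtoul(str, &end, 10);
    if (end == str || *end != '\0') {
        return false;           /* empty, non-numeric, or trailing garbage */
    }
    if (errno == ERANGE || ul > UINT32_MAX) {
        ul = 0;                 /* too large: collapse to the invalid ID 0 */
    }
    *snapid = (uint32_t)ul;
    return true;
}

/* Accept either a non-zero snapshot ID or a tag that fits the buffer. */
static bool parse_snapid_or_tag_sketch(const char *str, uint32_t *snapid,
                                       char tag[TAG_LEN_SKETCH])
{
    if (!parse_snapid_sketch(str, snapid)) {
        size_t len = strlen(str);

        *snapid = 0;
        if (len >= TAG_LEN_SKETCH) {
            return false;       /* tag would not fit with its terminator */
        }
        memcpy(tag, str, len + 1);
    } else if (*snapid == 0) {
        return false;           /* numeric, but zero or out of range */
    } else {
        tag[0] = '\0';
    }
    return true;
}
```

With this sketch, "42" yields ID 42 with an empty tag, "nightly" yields ID 0 with the tag "nightly", and "0" is rejected.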
|
|
|
|
|
2017-03-06 22:00:43 +03:00
|
|
|
typedef struct {
|
|
|
|
const char *path; /* non-null iff transport is unix */
|
|
|
|
const char *host; /* valid when transport is tcp */
|
|
|
|
int port; /* valid when transport is tcp */
|
|
|
|
char vdi[SD_MAX_VDI_LEN];
|
|
|
|
char tag[SD_MAX_VDI_TAG_LEN];
|
|
|
|
uint32_t snap_id;
|
|
|
|
/* Remainder is only for sd_config_done() */
|
|
|
|
URI *uri;
|
|
|
|
QueryParams *qp;
|
|
|
|
} SheepdogConfig;
|
|
|
|
|
|
|
|
static void sd_config_done(SheepdogConfig *cfg)
|
|
|
|
{
|
|
|
|
if (cfg->qp) {
|
|
|
|
query_params_free(cfg->qp);
|
|
|
|
}
|
|
|
|
uri_free(cfg->uri);
|
|
|
|
}
|
|
|
|
|
|
|
|
static void sd_parse_uri(SheepdogConfig *cfg, const char *filename,
|
sheepdog: Report errors in pseudo-filename more usefully
Errors in the pseudo-filename are all reported with the same laconic
"Can't parse filename" message.
Add real error reporting, such as:
$ qemu-system-x86_64 --drive driver=sheepdog,filename=sheepdog:///
qemu-system-x86_64: --drive driver=sheepdog,filename=sheepdog:///: missing file path in URI
$ qemu-system-x86_64 --drive driver=sheepdog,filename=sheepgod:///vdi
qemu-system-x86_64: --drive driver=sheepdog,filename=sheepgod:///vdi: URI scheme must be 'sheepdog', 'sheepdog+tcp', or 'sheepdog+unix'
$ qemu-system-x86_64 --drive driver=sheepdog,filename=sheepdog+unix:///vdi?socke=sheepdog.sock
qemu-system-x86_64: --drive driver=sheepdog,filename=sheepdog+unix:///vdi?socke=sheepdog.sock: unexpected query parameters
The code to translate legacy syntax to URI fails to escape URI
meta-characters. The new error messages are misleading then. Replace
them by the old "Can't parse filename" message. "Internal error"
would be more honest. Anyway, no worse than before. Also add a FIXME
comment.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-06 22:00:41 +03:00
|
|
|
Error **errp)
|
2013-02-22 07:39:51 +04:00
|
|
|
{
|
2017-03-06 22:00:41 +03:00
|
|
|
Error *err = NULL;
|
2013-02-22 07:39:51 +04:00
|
|
|
QueryParams *qp = NULL;
|
2017-03-06 22:00:42 +03:00
|
|
|
bool is_unix;
|
|
|
|
URI *uri;
|
2013-02-22 07:39:51 +04:00
|
|
|
|
2017-03-06 22:00:43 +03:00
|
|
|
memset(cfg, 0, sizeof(*cfg));
|
|
|
|
|
|
|
|
cfg->uri = uri = uri_parse(filename);
|
2013-02-22 07:39:51 +04:00
|
|
|
if (!uri) {
|
block: include original filename when reporting invalid URIs
Consider passing a JSON based block driver to "qemu-img commit"
$ qemu-img commit 'json:{"driver":"qcow2","file":{"driver":"gluster",\
"volume":"gv0","path":"sn1.qcow2",
"server":[{"type":\
"tcp","host":"10.73.199.197","port":"24007"}]},}'
Currently it will commit the content and then report an incredibly
useless error message when trying to re-open the committed image:
qemu-img: invalid URI
Usage: file=gluster[+transport]://[host[:port]]volume/path[?socket=...][,file.debug=N][,file.logfile=/path/filename.log]
With this fix we get:
qemu-img: invalid URI json:{"server.0.host": "10.73.199.197",
"driver": "gluster", "path": "luks.qcow2", "server.0.type":
"tcp", "server.0.port": "24007", "volume": "gv0"}
Of course the root cause problem still exists, but now we know
what actually needs fixing.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20180206105204.14817-1-berrange@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2018-02-06 13:52:04 +03:00
|
|
|
error_setg(&err, "invalid URI '%s'", filename);
|
2017-03-06 22:00:41 +03:00
|
|
|
goto out;
|
2013-02-22 07:39:51 +04:00
|
|
|
}
|
|
|
|
|
2013-02-22 07:39:53 +04:00
|
|
|
/* transport */
|
2017-06-13 23:57:26 +03:00
|
|
|
if (!g_strcmp0(uri->scheme, "sheepdog")) {
|
2017-03-06 22:00:42 +03:00
|
|
|
is_unix = false;
|
2017-06-13 23:57:26 +03:00
|
|
|
} else if (!g_strcmp0(uri->scheme, "sheepdog+tcp")) {
|
2017-03-06 22:00:42 +03:00
|
|
|
is_unix = false;
|
2017-06-13 23:57:26 +03:00
|
|
|
} else if (!g_strcmp0(uri->scheme, "sheepdog+unix")) {
|
2017-03-06 22:00:42 +03:00
|
|
|
is_unix = true;
|
2013-02-22 07:39:53 +04:00
|
|
|
} else {
|
2017-03-06 22:00:41 +03:00
|
|
|
error_setg(&err, "URI scheme must be 'sheepdog', 'sheepdog+tcp',"
|
|
|
|
" or 'sheepdog+unix'");
|
2013-02-22 07:39:53 +04:00
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
2013-02-22 07:39:51 +04:00
|
|
|
if (uri->path == NULL || !strcmp(uri->path, "/")) {
|
2017-03-06 22:00:41 +03:00
|
|
|
error_setg(&err, "missing file path in URI");
|
2013-02-22 07:39:51 +04:00
|
|
|
goto out;
|
|
|
|
}
|
2017-03-06 22:00:43 +03:00
|
|
|
if (g_strlcpy(cfg->vdi, uri->path + 1, SD_MAX_VDI_LEN)
|
|
|
|
>= SD_MAX_VDI_LEN) {
|
2017-03-06 22:00:41 +03:00
|
|
|
error_setg(&err, "VDI name is too long");
|
2017-03-06 22:00:40 +03:00
|
|
|
goto out;
|
|
|
|
}
|
2013-02-22 07:39:51 +04:00
|
|
|
|
2017-03-06 22:00:43 +03:00
|
|
|
cfg->qp = qp = query_params_parse(uri->query);
|
2013-02-22 07:39:53 +04:00
|
|
|
|
2017-03-06 22:00:42 +03:00
|
|
|
if (is_unix) {
|
2013-02-22 07:39:53 +04:00
|
|
|
/* sheepdog+unix:///vdiname?socket=path */
|
2017-03-06 22:00:41 +03:00
|
|
|
if (uri->server || uri->port) {
|
|
|
|
error_setg(&err, "URI scheme %s doesn't accept a server address",
|
|
|
|
uri->scheme);
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
if (!qp->n) {
|
|
|
|
error_setg(&err,
|
|
|
|
"URI scheme %s requires query parameter 'socket'",
|
|
|
|
uri->scheme);
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
if (qp->n != 1 || strcmp(qp->p[0].name, "socket")) {
|
|
|
|
error_setg(&err, "unexpected query parameters");
|
2013-02-22 07:39:53 +04:00
|
|
|
goto out;
|
|
|
|
}
|
2017-03-06 22:00:43 +03:00
|
|
|
cfg->path = qp->p[0].value;
|
2013-02-22 07:39:53 +04:00
|
|
|
} else {
|
|
|
|
/* sheepdog[+tcp]://[host:port]/vdiname */
|
2017-03-06 22:00:41 +03:00
|
|
|
if (qp->n) {
|
|
|
|
error_setg(&err, "unexpected query parameters");
|
|
|
|
goto out;
|
|
|
|
}
|
2017-03-06 22:00:43 +03:00
|
|
|
cfg->host = uri->server;
|
|
|
|
cfg->port = uri->port;
|
2013-02-22 07:39:53 +04:00
|
|
|
}
|
2013-02-22 07:39:51 +04:00
|
|
|
|
|
|
|
/* snapshot tag */
|
|
|
|
if (uri->fragment) {
|
2017-03-06 22:00:43 +03:00
|
|
|
if (!sd_parse_snapid_or_tag(uri->fragment,
|
|
|
|
&cfg->snap_id, cfg->tag)) {
|
2017-03-06 22:00:41 +03:00
|
|
|
error_setg(&err, "'%s' is not a valid snapshot ID",
|
|
|
|
uri->fragment);
|
2017-03-06 22:00:39 +03:00
|
|
|
goto out;
|
2013-02-22 07:39:51 +04:00
|
|
|
}
|
|
|
|
} else {
|
2017-03-06 22:00:43 +03:00
|
|
|
cfg->snap_id = CURRENT_VDI_ID; /* search current vdi */
|
2013-02-22 07:39:51 +04:00
|
|
|
}
|
|
|
|
|
|
|
|
out:
|
2017-03-06 22:00:42 +03:00
|
|
|
if (err) {
|
|
|
|
error_propagate(errp, err);
|
2017-03-06 22:00:43 +03:00
|
|
|
sd_config_done(cfg);
|
2013-02-22 07:39:51 +04:00
|
|
|
}
|
|
|
|
}
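The transport checks above can be sketched in isolation. This is a minimal illustration, not the driver's code: `struct parsed_uri` and `check_transport()` are hypothetical names, and the struct stands in for the fields the real function reads from the parsed URI and its query parameters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified view of an already-parsed URI. */
struct parsed_uri {
    bool is_unix;            /* true for the sheepdog+unix scheme */
    const char *server;      /* NULL if the URI names no server */
    int port;                /* 0 if the URI names no port */
    size_t n_params;         /* number of query parameters */
    const char *param_name;  /* name of the first query parameter, if any */
};

/*
 * Apply the same rules as the code above:
 *  - unix transport: no server address, exactly one parameter "socket"
 *  - tcp transport: no query parameters at all
 * Returns NULL when the URI is acceptable, else a static message.
 */
const char *check_transport(const struct parsed_uri *u)
{
    assert(u != NULL);

    if (u->is_unix) {
        if (u->server || u->port) {
            return "scheme doesn't accept a server address";
        }
        if (u->n_params == 0) {
            return "scheme requires query parameter 'socket'";
        }
        if (u->n_params != 1 || strcmp(u->param_name, "socket") != 0) {
            return "unexpected query parameters";
        }
    } else if (u->n_params != 0) {
        return "unexpected query parameters";
    }
    return NULL;
}
```

Under these assumptions, a unix URI whose only parameter is misspelled `socke` is rejected with "unexpected query parameters", which matches the example in the commit message.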
|
|
|
|
|
2010-06-21 00:01:00 +04:00
|
|
|
/*
|
2013-02-22 07:39:51 +04:00
|
|
|
* Parse a filename (old syntax)
|
2010-06-21 00:01:00 +04:00
|
|
|
*
|
|
|
|
* filename must be one of the following formats:
|
|
|
|
* 1. [vdiname]
|
|
|
|
* 2. [vdiname]:[snapid]
|
|
|
|
* 3. [vdiname]:[tag]
|
|
|
|
* 4. [hostname]:[port]:[vdiname]
|
|
|
|
* 5. [hostname]:[port]:[vdiname]:[snapid]
|
|
|
|
* 6. [hostname]:[port]:[vdiname]:[tag]
|
|
|
|
*
|
|
|
|
* You can boot from the snapshot images by specifying `snapid' or
|
|
|
|
* `tag'.
|
|
|
|
*
|
|
|
|
* You can run VMs outside the Sheepdog cluster by specifying
|
|
|
|
* `hostname' and `port' (experimental).
|
|
|
|
*/
|
2017-03-06 22:00:43 +03:00
|
|
|
static void parse_vdiname(SheepdogConfig *cfg, const char *filename,
|
2017-03-06 22:00:41 +03:00
|
|
|
Error **errp)
|
2010-06-21 00:01:00 +04:00
|
|
|
{
|
2017-03-06 22:00:41 +03:00
|
|
|
Error *err = NULL;
|
2013-02-22 07:39:51 +04:00
|
|
|
char *p, *q, *uri;
|
|
|
|
const char *host_spec, *vdi_spec;
|
2017-03-06 22:00:41 +03:00
|
|
|
int nr_sep;
|
2010-06-21 00:01:00 +04:00
|
|
|
|
2016-06-15 19:14:37 +03:00
|
|
|
strstart(filename, "sheepdog:", &filename);
|
2011-08-21 07:09:37 +04:00
|
|
|
p = q = g_strdup(filename);
|
2010-06-21 00:01:00 +04:00
|
|
|
|
|
|
|
/* count the number of separators */
|
|
|
|
nr_sep = 0;
|
|
|
|
while (*p) {
|
|
|
|
if (*p == ':') {
|
|
|
|
nr_sep++;
|
|
|
|
}
|
|
|
|
p++;
|
|
|
|
}
|
|
|
|
p = q;
|
|
|
|
|
2013-02-22 07:39:51 +04:00
|
|
|
/* use the first two tokens as host_spec. */
|
2010-06-21 00:01:00 +04:00
|
|
|
if (nr_sep >= 2) {
|
2013-02-22 07:39:51 +04:00
|
|
|
host_spec = p;
|
2010-06-21 00:01:00 +04:00
|
|
|
p = strchr(p, ':');
|
2013-02-22 07:39:51 +04:00
|
|
|
p++;
|
2010-06-21 00:01:00 +04:00
|
|
|
p = strchr(p, ':');
|
|
|
|
*p++ = '\0';
|
|
|
|
} else {
|
2013-02-22 07:39:51 +04:00
|
|
|
host_spec = "";
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
|
2013-02-22 07:39:51 +04:00
|
|
|
vdi_spec = p;
|
2010-06-21 00:01:00 +04:00
|
|
|
|
2013-02-22 07:39:51 +04:00
|
|
|
p = strchr(vdi_spec, ':');
|
2010-06-21 00:01:00 +04:00
|
|
|
if (p) {
|
2013-02-22 07:39:51 +04:00
|
|
|
*p++ = '#';
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
|
2013-02-22 07:39:51 +04:00
|
|
|
uri = g_strdup_printf("sheepdog://%s/%s", host_spec, vdi_spec);
|
2010-06-21 00:01:00 +04:00
|
|
|
|
2017-03-06 22:00:41 +03:00
|
|
|
/*
|
|
|
|
* FIXME We need to escape URI meta-characters, e.g. "x?y=z"
|
|
|
|
* produces "sheepdog:///x?y=z". Because of that ...
|
|
|
|
*/
|
2017-03-06 22:00:43 +03:00
|
|
|
sd_parse_uri(cfg, uri, &err);
|
2017-03-06 22:00:41 +03:00
|
|
|
if (err) {
|
|
|
|
/*
|
|
|
|
* ... this can fail, but the error message is misleading.
|
|
|
|
* Replace it by the traditional useless one until the
|
|
|
|
* escaping is fixed.
|
|
|
|
*/
|
|
|
|
error_free(err);
|
|
|
|
error_setg(errp, "Can't parse filename");
|
|
|
|
}
|
2013-02-22 07:39:51 +04:00
|
|
|
|
|
|
|
g_free(q);
|
|
|
|
g_free(uri);
|
2010-06-21 00:01:00 +04:00
|
|
|
}
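The legacy-to-URI translation above can be sketched with the standard C library only. This is an illustration under stated assumptions: `legacy_to_uri()` is a hypothetical helper, it uses malloc() where the driver uses GLib allocation, and it skips the driver's stripping of a leading "sheepdog:" prefix. Like the original, it does not escape URI meta-characters.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: translate "[host:port:]vdi[:snapid-or-tag]"
 * into a sheepdog:// URI.  Returns a malloc()ed string that the caller
 * must free(), or NULL if allocation fails.
 */
char *legacy_to_uri(const char *spec)
{
    const char *host_spec = "";
    char *copy, *p, *vdi_spec, *uri;
    size_t len;
    int nr_sep = 0;

    assert(spec != NULL);

    copy = malloc(strlen(spec) + 1);
    if (!copy) {
        return NULL;
    }
    strcpy(copy, spec);

    /* count the ':' separators */
    for (p = copy; *p; p++) {
        if (*p == ':') {
            nr_sep++;
        }
    }

    /* with two or more separators, the first two tokens are host:port */
    p = copy;
    if (nr_sep >= 2) {
        host_spec = p;
        p = strchr(p, ':');      /* separator after the host */
        p = strchr(p + 1, ':');  /* separator after the port */
        *p++ = '\0';
    }
    vdi_spec = p;

    /* a remaining ':' introduces the snapshot id or tag: make it a fragment */
    p = strchr(vdi_spec, ':');
    if (p) {
        *p = '#';
    }

    len = strlen("sheepdog://") + strlen(host_spec) + 1 + strlen(vdi_spec) + 1;
    uri = malloc(len);
    if (uri) {
        snprintf(uri, len, "sheepdog://%s/%s", host_spec, vdi_spec);
    }
    free(copy);
    return uri;
}
```

With this sketch, an input containing `?`, such as `x?y=z`, passes through unescaped, which is the situation the FIXME comment in the original describes.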
|
|
|
|
|
2017-03-06 22:00:43 +03:00
|
|
|
static void sd_parse_filename(const char *filename, QDict *options,
|
|
|
|
Error **errp)
|
|
|
|
{
|
|
|
|
Error *err = NULL;
|
|
|
|
SheepdogConfig cfg;
|
|
|
|
char buf[32];
|
|
|
|
|
|
|
|
if (strstr(filename, "://")) {
|
|
|
|
sd_parse_uri(&cfg, filename, &err);
|
|
|
|
} else {
|
|
|
|
parse_vdiname(&cfg, filename, &err);
|
|
|
|
}
|
|
|
|
if (err) {
|
|
|
|
error_propagate(errp, err);
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (cfg.path) {
|
sheepdog: Fix blockdev-add
Commit 831acdc "sheepdog: Implement bdrv_parse_filename()" and commit
d282f34 "sheepdog: Support blockdev-add" have different ideas on how
the QemuOpts parameters for the server address are named. Fix that.
While there, rename BlockdevOptionsSheepdog member addr to server, for
consistency with BlockdevOptionsSsh, BlockdevOptionsGluster,
BlockdevOptionsNbd.
Commit 831acdc's example becomes
--drive driver=sheepdog,server.type=inet,server.host=fido,server.port=7000,vdi=dolly
instead of
--drive driver=sheepdog,host=fido,vdi=dolly
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Kashyap Chamarthy <kchamart@redhat.com>
Message-id: 1490895797-29094-10-git-send-email-armbru@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-03-30 20:43:17 +03:00
|
|
|
qdict_set_default_str(options, "server.path", cfg.path);
|
|
|
|
qdict_set_default_str(options, "server.type", "unix");
|
|
|
|
} else {
|
|
|
|
qdict_set_default_str(options, "server.type", "inet");
|
|
|
|
qdict_set_default_str(options, "server.host",
|
|
|
|
cfg.host ?: SD_DEFAULT_ADDR);
|
|
|
|
snprintf(buf, sizeof(buf), "%d", cfg.port ?: SD_DEFAULT_PORT);
|
|
|
|
qdict_set_default_str(options, "server.port", buf);
|
2017-03-06 22:00:43 +03:00
|
|
|
}
|
|
|
|
qdict_set_default_str(options, "vdi", cfg.vdi);
|
|
|
|
qdict_set_default_str(options, "tag", cfg.tag);
|
|
|
|
if (cfg.snap_id) {
|
|
|
|
snprintf(buf, sizeof(buf), "%d", cfg.snap_id);
|
|
|
|
qdict_set_default_str(options, "snap-id", buf);
|
|
|
|
}
|
|
|
|
|
|
|
|
sd_config_done(&cfg);
|
|
|
|
}
|
|
|
|
|
2013-04-25 20:19:51 +04:00
|
|
|
static int find_vdi_name(BDRVSheepdogState *s, const char *filename,
|
|
|
|
uint32_t snapid, const char *tag, uint32_t *vid,
|
2014-05-16 13:00:23 +04:00
|
|
|
bool lock, Error **errp)
|
2010-06-21 00:01:00 +04:00
|
|
|
{
|
|
|
|
int ret, fd;
|
|
|
|
SheepdogVdiReq hdr;
|
|
|
|
SheepdogVdiRsp *rsp = (SheepdogVdiRsp *)&hdr;
|
|
|
|
unsigned int wlen, rlen = 0;
|
block/sheepdog: Use QEMU_NONSTRING for non NUL-terminated arrays
GCC 8 added a -Wstringop-truncation warning:
The -Wstringop-truncation warning added in GCC 8.0 via r254630 for
bug 81117 is specifically intended to highlight likely unintended
uses of the strncpy function that truncate the terminating NUL
character from the source string.
This new warning leads to compilation failures:
CC block/sheepdog.o
qemu/block/sheepdog.c: In function 'find_vdi_name':
qemu/block/sheepdog.c:1239:5: error: 'strncpy' specified bound 256 equals destination size [-Werror=stringop-truncation]
strncpy(buf + SD_MAX_VDI_LEN, tag, SD_MAX_VDI_TAG_LEN);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
make: *** [qemu/rules.mak:69: block/sheepdog.o] Error 1
As described previous to the strncpy() calls, the use of strncpy() is
correct here:
/* This pair of strncpy calls ensures that the buffer is zero-filled,
* which is desirable since we'll soon be sending those bytes, and
* don't want the send_req to read uninitialized data.
*/
strncpy(buf, filename, SD_MAX_VDI_LEN);
strncpy(buf + SD_MAX_VDI_LEN, tag, SD_MAX_VDI_TAG_LEN);
Use the QEMU_NONSTRING attribute, since this array is intended to store
character arrays that do not necessarily contain a terminating NUL.
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-01-03 11:56:35 +03:00
|
|
|
char buf[SD_MAX_VDI_LEN + SD_MAX_VDI_TAG_LEN] QEMU_NONSTRING;
|
2010-06-21 00:01:00 +04:00
|
|
|
|
2014-05-16 13:00:23 +04:00
|
|
|
fd = connect_to_sdog(s, errp);
|
2010-06-21 00:01:00 +04:00
|
|
|
if (fd < 0) {
|
2012-05-16 22:15:33 +04:00
|
|
|
return fd;
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
|
2012-10-04 15:09:47 +04:00
|
|
|
/* This pair of strncpy calls ensures that the buffer is zero-filled,
|
|
|
|
* which is desirable since we'll soon be sending those bytes, and
|
|
|
|
* don't want the send_req to read uninitialized data.
|
|
|
|
*/
|
2010-06-21 00:01:00 +04:00
|
|
|
strncpy(buf, filename, SD_MAX_VDI_LEN);
|
|
|
|
strncpy(buf + SD_MAX_VDI_LEN, tag, SD_MAX_VDI_TAG_LEN);
|
|
|
|
|
|
|
|
memset(&hdr, 0, sizeof(hdr));
|
2013-04-25 20:19:51 +04:00
|
|
|
if (lock) {
|
2010-06-21 00:01:00 +04:00
|
|
|
hdr.opcode = SD_OP_LOCK_VDI;
|
2014-08-11 09:43:45 +04:00
|
|
|
hdr.type = LOCK_TYPE_NORMAL;
|
2013-04-25 20:19:51 +04:00
|
|
|
} else {
|
|
|
|
hdr.opcode = SD_OP_GET_VDI_INFO;
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
wlen = SD_MAX_VDI_LEN + SD_MAX_VDI_TAG_LEN;
|
|
|
|
hdr.proto_ver = SD_PROTO_VER;
|
|
|
|
hdr.data_length = wlen;
|
|
|
|
hdr.snapid = snapid;
|
|
|
|
hdr.flags = SD_FLAG_CMD_WRITE;
|
|
|
|
|
2016-10-27 13:48:58 +03:00
|
|
|
ret = do_req(fd, s->bs, (SheepdogReq *)&hdr, buf, &wlen, &rlen);
|
2010-06-21 00:01:00 +04:00
|
|
|
if (ret) {
|
2014-05-16 13:00:23 +04:00
|
|
|
error_setg_errno(errp, -ret, "cannot get vdi info");
|
2010-06-21 00:01:00 +04:00
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (rsp->result != SD_RES_SUCCESS) {
|
2014-05-16 13:00:23 +04:00
|
|
|
error_setg(errp, "cannot get vdi info, %s, %s %" PRIu32 " %s",
|
|
|
|
sd_strerror(rsp->result), filename, snapid, tag);
|
2012-05-16 22:15:33 +04:00
|
|
|
if (rsp->result == SD_RES_NO_VDI) {
|
|
|
|
ret = -ENOENT;
|
2014-08-11 09:43:46 +04:00
|
|
|
} else if (rsp->result == SD_RES_VDI_LOCKED) {
|
|
|
|
ret = -EBUSY;
|
2012-05-16 22:15:33 +04:00
|
|
|
} else {
|
|
|
|
ret = -EIO;
|
|
|
|
}
|
2010-06-21 00:01:00 +04:00
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
*vid = rsp->vdi_id;
|
|
|
|
|
|
|
|
ret = 0;
|
|
|
|
out:
|
|
|
|
closesocket(fd);
|
|
|
|
return ret;
|
|
|
|
}
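The zero-filling behavior that the pair of strncpy() calls relies on can be shown in isolation. This is a minimal sketch: `NAME_LEN`, `TAG_LEN`, `pack_vdi_request()` and `all_zero()` are hypothetical names, and the sizes are assumed values standing in for SD_MAX_VDI_LEN and SD_MAX_VDI_TAG_LEN, which are defined elsewhere in the file.

```c
#include <assert.h>
#include <string.h>

/* Assumed sizes for illustration only. */
#define NAME_LEN 256
#define TAG_LEN  256

/*
 * Hypothetical helper: pack name and tag into one fixed-size buffer.
 * strncpy() pads the rest of each field with NUL bytes when the source
 * is shorter than the bound, so every byte of buf ends up initialized.
 * A source of exactly the bound length is stored without a terminating
 * NUL, which is why the driver marks its buffer as a non-string.
 */
void pack_vdi_request(char buf[NAME_LEN + TAG_LEN],
                      const char *name, const char *tag)
{
    assert(buf && name && tag);
    strncpy(buf, name, NAME_LEN);
    strncpy(buf + NAME_LEN, tag, TAG_LEN);
}

/* Return 1 if every byte in buf[from..to) is zero, else 0. */
int all_zero(const char *buf, size_t from, size_t to)
{
    size_t i;

    for (i = from; i < to; i++) {
        if (buf[i] != 0) {
            return 0;
        }
    }
    return 1;
}
```

Pre-filling the buffer with a non-zero pattern before calling `pack_vdi_request()` and then checking the padding with `all_zero()` confirms that no stale bytes would be sent.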
|
|
|
|
|
2013-10-24 11:01:16 +04:00
|
|
|
static void coroutine_fn add_aio_request(BDRVSheepdogState *s, AIOReq *aio_req,
|
2014-06-06 08:35:11 +04:00
|
|
|
struct iovec *iov, int niov,
|
|
|
|
enum AIOCBState aiocb_type)
|
2010-06-21 00:01:00 +04:00
|
|
|
{
|
|
|
|
int nr_copies = s->inode.nr_copies;
|
|
|
|
SheepdogObjReq hdr;
|
2013-01-15 12:28:55 +04:00
|
|
|
unsigned int wlen = 0;
|
2010-06-21 00:01:00 +04:00
|
|
|
int ret;
|
|
|
|
uint64_t oid = aio_req->oid;
|
|
|
|
unsigned int datalen = aio_req->data_len;
|
|
|
|
uint64_t offset = aio_req->offset;
|
|
|
|
uint8_t flags = aio_req->flags;
|
|
|
|
uint64_t old_oid = aio_req->base_oid;
|
2014-06-06 08:35:11 +04:00
|
|
|
bool create = aio_req->create;
|
2010-06-21 00:01:00 +04:00
|
|
|
|
2017-06-29 16:27:48 +03:00
|
|
|
qemu_co_mutex_lock(&s->queue_lock);
|
2016-11-29 14:32:44 +03:00
|
|
|
QLIST_INSERT_HEAD(&s->inflight_aio_head, aio_req, aio_siblings);
|
2017-06-29 16:27:48 +03:00
|
|
|
qemu_co_mutex_unlock(&s->queue_lock);
|
2016-11-29 14:32:44 +03:00
|
|
|
|
2010-06-21 00:01:00 +04:00
|
|
|
if (!nr_copies) {
|
2011-06-22 16:03:54 +04:00
|
|
|
error_report("bug");
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
|
|
|
|
memset(&hdr, 0, sizeof(hdr));
|
|
|
|
|
2013-01-15 12:28:55 +04:00
|
|
|
switch (aiocb_type) {
|
|
|
|
case AIOCB_FLUSH_CACHE:
|
|
|
|
hdr.opcode = SD_OP_FLUSH_VDI;
|
|
|
|
break;
|
|
|
|
case AIOCB_READ_UDATA:
|
2010-06-21 00:01:00 +04:00
|
|
|
hdr.opcode = SD_OP_READ_OBJ;
|
|
|
|
hdr.flags = flags;
|
2013-01-15 12:28:55 +04:00
|
|
|
break;
|
|
|
|
case AIOCB_WRITE_UDATA:
|
|
|
|
if (create) {
|
|
|
|
hdr.opcode = SD_OP_CREATE_AND_WRITE_OBJ;
|
|
|
|
} else {
|
|
|
|
hdr.opcode = SD_OP_WRITE_OBJ;
|
|
|
|
}
|
2010-06-21 00:01:00 +04:00
|
|
|
wlen = datalen;
|
|
|
|
hdr.flags = SD_FLAG_CMD_WRITE | flags;
|
2013-01-15 12:28:55 +04:00
|
|
|
break;
|
2013-04-23 10:03:33 +04:00
|
|
|
case AIOCB_DISCARD_OBJ:
|
2015-09-01 06:03:10 +03:00
|
|
|
hdr.opcode = SD_OP_WRITE_OBJ;
|
|
|
|
hdr.flags = SD_FLAG_CMD_WRITE | flags;
|
|
|
|
s->inode.data_vdi_id[data_oid_to_idx(oid)] = 0;
|
|
|
|
offset = offsetof(SheepdogInode,
|
|
|
|
data_vdi_id[data_oid_to_idx(oid)]);
|
|
|
|
oid = vid_to_vdi_oid(s->inode.vdi_id);
|
|
|
|
wlen = datalen = sizeof(uint32_t);
|
2013-04-23 10:03:33 +04:00
|
|
|
break;
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
|
2013-01-10 12:03:47 +04:00
|
|
|
if (s->cache_flags) {
|
|
|
|
hdr.flags |= s->cache_flags;
|
2012-04-04 00:03:58 +04:00
|
|
|
}
|
|
|
|
|
2010-06-21 00:01:00 +04:00
|
|
|
hdr.oid = oid;
|
|
|
|
hdr.cow_oid = old_oid;
|
|
|
|
hdr.copies = s->inode.nr_copies;
|
|
|
|
|
|
|
|
hdr.data_length = datalen;
|
|
|
|
hdr.offset = offset;
|
|
|
|
|
|
|
|
hdr.id = aio_req->id;
|
|
|
|
|
2011-08-12 16:33:15 +04:00
|
|
|
qemu_co_mutex_lock(&s->lock);
|
|
|
|
s->co_send = qemu_coroutine_self();
|
2015-10-23 06:08:05 +03:00
|
|
|
aio_set_fd_handler(s->aio_context, s->fd, false,
|
2016-12-01 22:26:41 +03:00
|
|
|
co_read_response, co_write_request, NULL, s);
|
2011-09-21 14:36:48 +04:00
|
|
|
socket_set_cork(s->fd, 1);
|
2010-06-21 00:01:00 +04:00
|
|
|
|
|
|
|
/* send a header */
|
2011-09-08 15:46:25 +04:00
|
|
|
ret = qemu_co_send(s->fd, &hdr, sizeof(hdr));
|
2013-10-24 11:01:11 +04:00
|
|
|
if (ret != sizeof(hdr)) {
|
2011-06-22 16:03:54 +04:00
|
|
|
error_report("failed to send a req, %s", strerror(errno));
|
2013-10-24 11:01:15 +04:00
|
|
|
goto out;
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
|
|
|
|
if (wlen) {
|
2012-06-07 20:22:46 +04:00
|
|
|
ret = qemu_co_sendv(s->fd, iov, niov, aio_req->iov_offset, wlen);
|
2013-10-24 11:01:11 +04:00
|
|
|
if (ret != wlen) {
|
2011-06-22 16:03:54 +04:00
|
|
|
error_report("failed to send data, %s", strerror(errno));
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
}
|
2013-10-24 11:01:15 +04:00
|
|
|
out:
|
2011-09-21 14:36:48 +04:00
|
|
|
socket_set_cork(s->fd, 0);
|
2015-10-23 06:08:05 +03:00
|
|
|
aio_set_fd_handler(s->aio_context, s->fd, false,
|
2016-12-01 22:26:41 +03:00
|
|
|
co_read_response, NULL, NULL, s);
|
2013-10-24 11:01:15 +04:00
|
|
|
s->co_send = NULL;
|
2011-08-12 16:33:15 +04:00
|
|
|
qemu_co_mutex_unlock(&s->lock);
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
|
2016-10-27 13:48:58 +03:00
|
|
|
static int read_write_object(int fd, BlockDriverState *bs, char *buf,
|
2014-05-08 18:34:52 +04:00
|
|
|
uint64_t oid, uint8_t copies,
|
2010-06-21 00:01:00 +04:00
|
|
|
unsigned int datalen, uint64_t offset,
|
2013-01-10 12:03:47 +04:00
|
|
|
bool write, bool create, uint32_t cache_flags)
|
2010-06-21 00:01:00 +04:00
|
|
|
{
|
|
|
|
SheepdogObjReq hdr;
|
|
|
|
SheepdogObjRsp *rsp = (SheepdogObjRsp *)&hdr;
|
|
|
|
unsigned int wlen, rlen;
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
memset(&hdr, 0, sizeof(hdr));
|
|
|
|
|
|
|
|
if (write) {
|
|
|
|
wlen = datalen;
|
|
|
|
rlen = 0;
|
|
|
|
hdr.flags = SD_FLAG_CMD_WRITE;
|
|
|
|
if (create) {
|
|
|
|
hdr.opcode = SD_OP_CREATE_AND_WRITE_OBJ;
|
|
|
|
} else {
|
|
|
|
hdr.opcode = SD_OP_WRITE_OBJ;
|
|
|
|
}
|
|
|
|
} else {
|
|
|
|
wlen = 0;
|
|
|
|
rlen = datalen;
|
|
|
|
hdr.opcode = SD_OP_READ_OBJ;
|
|
|
|
}
|
2012-04-04 00:03:58 +04:00
|
|
|
|
2013-01-10 12:03:47 +04:00
|
|
|
hdr.flags |= cache_flags;
|
2012-04-04 00:03:58 +04:00
|
|
|
|
2010-06-21 00:01:00 +04:00
|
|
|
hdr.oid = oid;
|
|
|
|
hdr.data_length = datalen;
|
|
|
|
hdr.offset = offset;
|
|
|
|
hdr.copies = copies;
|
|
|
|
|
2016-10-27 13:48:58 +03:00
|
|
|
ret = do_req(fd, bs, (SheepdogReq *)&hdr, buf, &wlen, &rlen);
|
2010-06-21 00:01:00 +04:00
|
|
|
if (ret) {
|
2011-06-22 16:03:54 +04:00
|
|
|
error_report("failed to send a request to the sheep");
|
2012-05-16 22:15:33 +04:00
|
|
|
return ret;
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
|
|
|
|
switch (rsp->result) {
|
|
|
|
case SD_RES_SUCCESS:
|
|
|
|
return 0;
|
|
|
|
default:
|
2011-06-22 16:03:54 +04:00
|
|
|
error_report("%s", sd_strerror(rsp->result));
|
2012-05-16 22:15:33 +04:00
|
|
|
return -EIO;
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2016-10-27 13:48:58 +03:00
|
|
|
static int read_object(int fd, BlockDriverState *bs, char *buf,
|
2014-05-08 18:34:52 +04:00
|
|
|
uint64_t oid, uint8_t copies,
|
2013-01-10 12:03:47 +04:00
|
|
|
unsigned int datalen, uint64_t offset,
|
|
|
|
uint32_t cache_flags)
|
2010-06-21 00:01:00 +04:00
|
|
|
{
|
2016-10-27 13:48:58 +03:00
|
|
|
return read_write_object(fd, bs, buf, oid, copies,
|
2014-05-08 18:34:52 +04:00
|
|
|
datalen, offset, false,
|
2013-01-10 12:03:47 +04:00
|
|
|
false, cache_flags);
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
|
2016-10-27 13:48:58 +03:00
|
|
|
static int write_object(int fd, BlockDriverState *bs, char *buf,
|
2014-05-08 18:34:52 +04:00
|
|
|
uint64_t oid, uint8_t copies,
|
2012-10-06 20:57:14 +04:00
|
|
|
unsigned int datalen, uint64_t offset, bool create,
|
2013-01-10 12:03:47 +04:00
|
|
|
uint32_t cache_flags)
|
2010-06-21 00:01:00 +04:00
|
|
|
{
|
2016-10-27 13:48:58 +03:00
|
|
|
return read_write_object(fd, bs, buf, oid, copies,
|
2014-05-08 18:34:52 +04:00
|
|
|
datalen, offset, true,
|
2013-01-10 12:03:47 +04:00
|
|
|
create, cache_flags);
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
|
2013-04-25 20:19:53 +04:00
|
|
|
/* update inode with the latest state */
|
|
|
|
static int reload_inode(BDRVSheepdogState *s, uint32_t snapid, const char *tag)
|
|
|
|
{
|
2014-05-16 13:00:19 +04:00
|
|
|
Error *local_err = NULL;
|
2013-04-25 20:19:53 +04:00
|
|
|
SheepdogInode *inode;
|
|
|
|
int ret = 0, fd;
|
|
|
|
uint32_t vid = 0;
|
|
|
|
|
2014-05-16 13:00:19 +04:00
|
|
|
fd = connect_to_sdog(s, &local_err);
|
2013-04-25 20:19:53 +04:00
|
|
|
if (fd < 0) {
|
2015-02-12 15:55:05 +03:00
|
|
|
error_report_err(local_err);
|
2013-04-25 20:19:53 +04:00
|
|
|
return -EIO;
|
|
|
|
}
|
|
|
|
|
2014-06-06 08:35:12 +04:00
|
|
|
inode = g_malloc(SD_INODE_HEADER_SIZE);
|
2013-04-25 20:19:53 +04:00
|
|
|
|
2014-05-16 13:00:23 +04:00
|
|
|
ret = find_vdi_name(s, s->name, snapid, tag, &vid, false, &local_err);
|
2013-04-25 20:19:53 +04:00
|
|
|
if (ret) {
|
2015-02-12 15:55:05 +03:00
|
|
|
error_report_err(local_err);
|
2013-04-25 20:19:53 +04:00
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
2016-10-27 13:48:58 +03:00
|
|
|
ret = read_object(fd, s->bs, (char *)inode, vid_to_vdi_oid(vid),
|
2014-06-06 08:35:12 +04:00
|
|
|
s->inode.nr_copies, SD_INODE_HEADER_SIZE, 0,
|
|
|
|
s->cache_flags);
|
2013-04-25 20:19:53 +04:00
|
|
|
if (ret < 0) {
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (inode->vdi_id != s->inode.vdi_id) {
|
2014-06-06 08:35:12 +04:00
|
|
|
memcpy(&s->inode, inode, SD_INODE_HEADER_SIZE);
|
2013-04-25 20:19:53 +04:00
|
|
|
}
|
|
|
|
|
|
|
|
out:
|
|
|
|
g_free(inode);
|
|
|
|
closesocket(fd);
|
|
|
|
|
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
2013-10-24 11:01:16 +04:00
|
|
|
static void coroutine_fn resend_aioreq(BDRVSheepdogState *s, AIOReq *aio_req)
|
2013-04-25 20:19:54 +04:00
|
|
|
{
|
|
|
|
SheepdogAIOCB *acb = aio_req->aiocb;
|
2014-06-06 08:35:11 +04:00
|
|
|
|
|
|
|
aio_req->create = false;
|
2013-04-25 20:19:54 +04:00
|
|
|
|
|
|
|
/* check whether this request becomes a CoW one */
|
2013-10-24 11:01:12 +04:00
|
|
|
if (acb->aiocb_type == AIOCB_WRITE_UDATA && is_data_obj(aio_req->oid)) {
|
2013-04-25 20:19:54 +04:00
|
|
|
int idx = data_oid_to_idx(aio_req->oid);
|
|
|
|
|
|
|
|
if (is_data_obj_writable(&s->inode, idx)) {
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
2013-10-24 11:01:18 +04:00
|
|
|
if (s->inode.data_vdi_id[idx]) {
|
|
|
|
aio_req->base_oid = vid_to_data_oid(s->inode.data_vdi_id[idx], idx);
|
|
|
|
aio_req->flags |= SD_FLAG_CMD_COW;
|
|
|
|
}
|
2014-06-06 08:35:11 +04:00
|
|
|
aio_req->create = true;
|
2013-04-25 20:19:54 +04:00
|
|
|
}
|
|
|
|
out:
|
2013-10-24 11:01:12 +04:00
|
|
|
if (is_data_obj(aio_req->oid)) {
|
2014-06-06 08:35:11 +04:00
|
|
|
add_aio_request(s, aio_req, acb->qiov->iov, acb->qiov->niov,
|
2013-10-24 11:01:16 +04:00
|
|
|
acb->aiocb_type);
|
2013-10-24 11:01:12 +04:00
|
|
|
} else {
|
|
|
|
struct iovec iov;
|
|
|
|
iov.iov_base = &s->inode;
|
|
|
|
iov.iov_len = sizeof(s->inode);
|
2014-06-06 08:35:11 +04:00
|
|
|
add_aio_request(s, aio_req, &iov, 1, AIOCB_WRITE_UDATA);
|
2013-10-24 11:01:12 +04:00
|
|
|
}
|
2013-04-25 20:19:54 +04:00
|
|
|
}
|
|
|
|
|
2014-05-08 18:34:52 +04:00
|
|
|
static void sd_detach_aio_context(BlockDriverState *bs)
|
|
|
|
{
|
|
|
|
BDRVSheepdogState *s = bs->opaque;
|
|
|
|
|
2015-10-23 06:08:05 +03:00
|
|
|
aio_set_fd_handler(s->aio_context, s->fd, false, NULL,
|
2016-12-01 22:26:41 +03:00
|
|
|
NULL, NULL, NULL);
|
2014-05-08 18:34:52 +04:00
|
|
|
}
|
|
|
|
|
|
|
|
static void sd_attach_aio_context(BlockDriverState *bs,
|
|
|
|
AioContext *new_context)
|
|
|
|
{
|
|
|
|
BDRVSheepdogState *s = bs->opaque;
|
|
|
|
|
|
|
|
s->aio_context = new_context;
|
2015-10-23 06:08:05 +03:00
|
|
|
aio_set_fd_handler(new_context, s->fd, false,
|
2016-12-01 22:26:41 +03:00
|
|
|
co_read_response, NULL, NULL, s);
|
2014-05-08 18:34:52 +04:00
|
|
|
}
|
|
|
|
|
2013-04-12 20:10:49 +04:00
|
|
|
static QemuOptsList runtime_opts = {
|
|
|
|
.name = "sheepdog",
|
|
|
|
.head = QTAILQ_HEAD_INITIALIZER(runtime_opts.head),
|
|
|
|
.desc = {
|
2017-03-06 22:00:43 +03:00
|
|
|
{
|
|
|
|
.name = "vdi",
|
|
|
|
.type = QEMU_OPT_STRING,
|
|
|
|
},
|
|
|
|
{
|
|
|
|
.name = "snap-id",
|
|
|
|
.type = QEMU_OPT_NUMBER,
|
|
|
|
},
|
|
|
|
{
|
|
|
|
.name = "tag",
|
2013-04-12 20:10:49 +04:00
|
|
|
.type = QEMU_OPT_STRING,
|
|
|
|
},
|
|
|
|
{ /* end of list */ }
|
|
|
|
},
|
|
|
|
};
|
|
|
|
|
2013-09-05 16:22:29 +04:00
|
|
|
static int sd_open(BlockDriverState *bs, QDict *options, int flags,
|
|
|
|
Error **errp)
|
2010-06-21 00:01:00 +04:00
|
|
|
{
|
|
|
|
int ret, fd;
|
|
|
|
uint32_t vid = 0;
|
|
|
|
BDRVSheepdogState *s = bs->opaque;
|
2017-03-30 20:43:17 +03:00
|
|
|
const char *vdi, *snap_id_str, *tag;
|
2017-03-06 22:00:43 +03:00
|
|
|
uint64_t snap_id;
|
2010-06-21 00:01:00 +04:00
|
|
|
char *buf = NULL;
|
2013-04-12 20:10:49 +04:00
|
|
|
QemuOpts *opts;
|
|
|
|
Error *local_err = NULL;
|
|
|
|
|
2013-10-24 11:01:15 +04:00
|
|
|
s->bs = bs;
|
2014-05-08 18:34:52 +04:00
|
|
|
s->aio_context = bdrv_get_aio_context(bs);
|
2013-10-24 11:01:15 +04:00
|
|
|
|
2014-01-02 06:49:17 +04:00
|
|
|
opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
|
2013-04-12 20:10:49 +04:00
|
|
|
qemu_opts_absorb_qdict(opts, options, &local_err);
|
2014-01-30 18:07:28 +04:00
|
|
|
if (local_err) {
|
2014-05-16 13:00:24 +04:00
|
|
|
error_propagate(errp, local_err);
|
2013-04-12 20:10:49 +04:00
|
|
|
ret = -EINVAL;
|
sheepdog: Defuse time bomb in sd_open() error handling
When qemu_opts_absorb_qdict() fails, sd_open() closes stdin, because
sd->fd is still zero. Fortunately, qemu_opts_absorb_qdict() can't
fail, because:
1. it only fails when qemu_opt_parse() fails, and
2. the only member of runtime_opts.desc[] is a QEMU_OPT_STRING, and
3. qemu_opt_parse() can't fail for QEMU_OPT_STRING.
Defuse this ticking time bomb by jumping behind the file descriptor
cleanup on error.
Also do that for the error paths where sd->fd is still -1. The file
descriptor cleanup happens to do nothing then, but let's not rely on
that here.
While there, rename label out to err, because it's on the error path,
not the normal path out of the function.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-06 22:00:35 +03:00
|
|
|
goto err_no_fd;
|
2013-04-12 20:10:49 +04:00
|
|
|
}
|
|
|
|
|
2017-03-30 20:43:17 +03:00
|
|
|
s->addr = sd_server_config(options, errp);
|
|
|
|
if (!s->addr) {
|
|
|
|
ret = -EINVAL;
|
|
|
|
goto err_no_fd;
|
|
|
|
}
|
|
|
|
|
2017-03-06 22:00:43 +03:00
|
|
|
vdi = qemu_opt_get(opts, "vdi");
|
|
|
|
snap_id_str = qemu_opt_get(opts, "snap-id");
|
|
|
|
snap_id = qemu_opt_get_number(opts, "snap-id", CURRENT_VDI_ID);
|
|
|
|
tag = qemu_opt_get(opts, "tag");
|
2010-06-21 00:01:00 +04:00
|
|
|
|
2017-03-06 22:00:43 +03:00
|
|
|
if (!vdi) {
|
|
|
|
error_setg(errp, "parameter 'vdi' is missing");
|
|
|
|
ret = -EINVAL;
|
|
|
|
goto err_no_fd;
|
|
|
|
}
|
|
|
|
if (strlen(vdi) >= SD_MAX_VDI_LEN) {
|
|
|
|
error_setg(errp, "value of parameter 'vdi' is too long");
|
|
|
|
ret = -EINVAL;
|
|
|
|
goto err_no_fd;
|
|
|
|
}
|
2010-06-21 00:01:00 +04:00
|
|
|
|
2017-03-06 22:00:43 +03:00
|
|
|
if (snap_id > UINT32_MAX) {
|
|
|
|
snap_id = 0;
|
|
|
|
}
|
|
|
|
if (snap_id_str && !snap_id) {
|
|
|
|
error_setg(errp, "'snap-id=%s' is not a valid snapshot ID",
|
|
|
|
snap_id_str);
|
|
|
|
ret = -EINVAL;
|
|
|
|
goto err_no_fd;
|
|
|
|
}
|
2013-02-22 07:39:51 +04:00
|
|
|
|
2017-03-06 22:00:43 +03:00
|
|
|
if (!tag) {
|
|
|
|
tag = "";
|
2013-02-22 07:39:51 +04:00
|
|
|
}
|
2017-11-08 01:27:20 +03:00
|
|
|
if (strlen(tag) >= SD_MAX_VDI_TAG_LEN) {
|
2017-03-06 22:00:43 +03:00
|
|
|
error_setg(errp, "value of parameter 'tag' is too long");
|
sheepdog: Report errors in pseudo-filename more usefully
Errors in the pseudo-filename are all reported with the same laconic
"Can't parse filename" message.
Add real error reporting, such as:
$ qemu-system-x86_64 --drive driver=sheepdog,filename=sheepdog:///
qemu-system-x86_64: --drive driver=sheepdog,filename=sheepdog:///: missing file path in URI
$ qemu-system-x86_64 --drive driver=sheepdog,filename=sheepgod:///vdi
qemu-system-x86_64: --drive driver=sheepdog,filename=sheepgod:///vdi: URI scheme must be 'sheepdog', 'sheepdog+tcp', or 'sheepdog+unix'
$ qemu-system-x86_64 --drive driver=sheepdog,filename=sheepdog+unix:///vdi?socke=sheepdog.sock
qemu-system-x86_64: --drive driver=sheepdog,filename=sheepdog+unix:///vdi?socke=sheepdog.sock: unexpected query parameters
The code to translate legacy syntax to URI fails to escape URI
meta-characters. The new error messages are misleading then. Replace
them by the old "Can't parse filename" message. "Internal error"
would be more honest. Anyway, no worse than before. Also add a FIXME
comment.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-06 22:00:41 +03:00
|
|
|
ret = -EINVAL;
|
2017-03-06 22:00:35 +03:00
|
|
|
goto err_no_fd;
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
2017-03-06 22:00:43 +03:00
|
|
|
|
|
|
|
QLIST_INIT(&s->inflight_aio_head);
|
|
|
|
QLIST_INIT(&s->failed_aio_head);
|
|
|
|
QLIST_INIT(&s->inflight_aiocb_head);
|
|
|
|
|
2014-05-16 13:00:24 +04:00
|
|
|
s->fd = get_sheep_fd(s, errp);
|
2010-06-21 00:01:00 +04:00
|
|
|
if (s->fd < 0) {
|
2012-05-16 22:15:33 +04:00
|
|
|
ret = s->fd;
|
2017-03-06 22:00:35 +03:00
|
|
|
goto err_no_fd;
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
|
2017-03-06 22:00:43 +03:00
|
|
|
ret = find_vdi_name(s, vdi, (uint32_t)snap_id, tag, &vid, true, errp);
|
2010-06-21 00:01:00 +04:00
|
|
|
if (ret) {
|
2017-03-06 22:00:35 +03:00
|
|
|
goto err;
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
|
2013-01-10 12:03:47 +04:00
|
|
|
/*
|
|
|
|
* QEMU block layer emulates writethrough cache as 'writeback + flush', so
|
|
|
|
* we always set SD_FLAG_CMD_CACHE (writeback cache) as default.
|
|
|
|
*/
|
|
|
|
s->cache_flags = SD_FLAG_CMD_CACHE;
|
|
|
|
if (flags & BDRV_O_NOCACHE) {
|
|
|
|
s->cache_flags = SD_FLAG_CMD_DIRECT;
|
|
|
|
}
|
2013-04-23 10:03:33 +04:00
|
|
|
s->discard_supported = true;
|
2013-01-10 12:03:47 +04:00
|
|
|
|
2017-03-06 22:00:43 +03:00
|
|
|
if (snap_id || tag[0]) {
|
2018-12-13 19:27:27 +03:00
|
|
|
trace_sheepdog_open(vid);
|
2012-10-06 20:57:14 +04:00
|
|
|
s->is_snapshot = true;
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
|
2014-05-16 13:00:24 +04:00
|
|
|
fd = connect_to_sdog(s, errp);
|
2010-06-21 00:01:00 +04:00
|
|
|
if (fd < 0) {
|
2012-05-16 22:15:33 +04:00
|
|
|
ret = fd;
|
2017-03-06 22:00:35 +03:00
|
|
|
goto err;
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
|
2011-08-21 07:09:37 +04:00
|
|
|
buf = g_malloc(SD_INODE_SIZE);
|
2016-10-27 13:48:58 +03:00
|
|
|
ret = read_object(fd, s->bs, buf, vid_to_vdi_oid(vid),
|
2014-05-08 18:34:52 +04:00
|
|
|
0, SD_INODE_SIZE, 0, s->cache_flags);
|
2010-06-21 00:01:00 +04:00
|
|
|
|
|
|
|
closesocket(fd);
|
|
|
|
|
|
|
|
if (ret) {
|
2014-05-16 13:00:25 +04:00
|
|
|
error_setg(errp, "Can't read snapshot inode");
|
2017-03-06 22:00:35 +03:00
|
|
|
goto err;
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
|
|
|
|
memcpy(&s->inode, buf, sizeof(s->inode));
|
|
|
|
|
2013-04-23 10:03:34 +04:00
|
|
|
bs->total_sectors = s->inode.vdi_size / BDRV_SECTOR_SIZE;
|
2012-10-04 15:09:47 +04:00
|
|
|
pstrcpy(s->name, sizeof(s->name), vdi);
|
2011-08-12 16:33:15 +04:00
|
|
|
qemu_co_mutex_init(&s->lock);
|
2017-06-29 16:27:48 +03:00
|
|
|
qemu_co_mutex_init(&s->queue_lock);
|
2015-09-01 06:03:09 +03:00
|
|
|
qemu_co_queue_init(&s->overlapping_queue);
|
2013-04-12 20:10:49 +04:00
|
|
|
qemu_opts_del(opts);
|
2011-08-21 07:09:37 +04:00
|
|
|
g_free(buf);
|
2010-06-21 00:01:00 +04:00
|
|
|
return 0;
|
2017-03-06 22:00:35 +03:00
|
|
|
|
|
|
|
err:
|
2015-10-23 06:08:05 +03:00
|
|
|
aio_set_fd_handler(bdrv_get_aio_context(bs), s->fd,
|
2016-12-01 22:26:41 +03:00
|
|
|
false, NULL, NULL, NULL, NULL);
|
2017-03-06 22:00:35 +03:00
|
|
|
closesocket(s->fd);
|
|
|
|
err_no_fd:
|
2013-04-12 20:10:49 +04:00
|
|
|
qemu_opts_del(opts);
|
2011-08-21 07:09:37 +04:00
|
|
|
g_free(buf);
|
2012-05-16 22:15:33 +04:00
|
|
|
return ret;
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
|
2015-08-28 05:53:58 +03:00
|
|
|
static int sd_reopen_prepare(BDRVReopenState *state, BlockReopenQueue *queue,
|
|
|
|
Error **errp)
|
|
|
|
{
|
|
|
|
BDRVSheepdogState *s = state->bs->opaque;
|
|
|
|
BDRVSheepdogReopenState *re_s;
|
|
|
|
int ret = 0;
|
|
|
|
|
|
|
|
re_s = state->opaque = g_new0(BDRVSheepdogReopenState, 1);
|
|
|
|
|
|
|
|
re_s->cache_flags = SD_FLAG_CMD_CACHE;
|
|
|
|
if (state->flags & BDRV_O_NOCACHE) {
|
|
|
|
re_s->cache_flags = SD_FLAG_CMD_DIRECT;
|
|
|
|
}
|
|
|
|
|
|
|
|
re_s->fd = get_sheep_fd(s, errp);
|
|
|
|
if (re_s->fd < 0) {
|
|
|
|
ret = re_s->fd;
|
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
|
|
|
static void sd_reopen_commit(BDRVReopenState *state)
|
|
|
|
{
|
|
|
|
BDRVSheepdogReopenState *re_s = state->opaque;
|
|
|
|
BDRVSheepdogState *s = state->bs->opaque;
|
|
|
|
|
|
|
|
if (s->fd) {
|
2015-10-23 06:08:05 +03:00
|
|
|
aio_set_fd_handler(s->aio_context, s->fd, false,
|
2016-12-01 22:26:41 +03:00
|
|
|
NULL, NULL, NULL, NULL);
|
2015-08-28 05:53:58 +03:00
|
|
|
closesocket(s->fd);
|
|
|
|
}
|
|
|
|
|
|
|
|
s->fd = re_s->fd;
|
|
|
|
s->cache_flags = re_s->cache_flags;
|
|
|
|
|
|
|
|
g_free(state->opaque);
|
|
|
|
state->opaque = NULL;
|
|
|
|
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
|
|
|
static void sd_reopen_abort(BDRVReopenState *state)
|
|
|
|
{
|
|
|
|
BDRVSheepdogReopenState *re_s = state->opaque;
|
|
|
|
BDRVSheepdogState *s = state->bs->opaque;
|
|
|
|
|
|
|
|
if (re_s == NULL) {
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (re_s->fd) {
|
2015-10-23 06:08:05 +03:00
|
|
|
aio_set_fd_handler(s->aio_context, re_s->fd, false,
|
2016-12-01 22:26:41 +03:00
|
|
|
NULL, NULL, NULL, NULL);
|
2015-08-28 05:53:58 +03:00
|
|
|
closesocket(re_s->fd);
|
|
|
|
}
|
|
|
|
|
|
|
|
g_free(state->opaque);
|
|
|
|
state->opaque = NULL;
|
|
|
|
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
2014-05-16 13:00:22 +04:00
|
|
|
static int do_sd_create(BDRVSheepdogState *s, uint32_t *vdi_id, int snapshot,
|
|
|
|
Error **errp)
|
2010-06-21 00:01:00 +04:00
|
|
|
{
|
|
|
|
SheepdogVdiReq hdr;
|
|
|
|
SheepdogVdiRsp *rsp = (SheepdogVdiRsp *)&hdr;
|
|
|
|
int fd, ret;
|
|
|
|
unsigned int wlen, rlen = 0;
|
|
|
|
char buf[SD_MAX_VDI_LEN];
|
|
|
|
|
2014-05-16 13:00:22 +04:00
|
|
|
fd = connect_to_sdog(s, errp);
|
2010-06-21 00:01:00 +04:00
|
|
|
if (fd < 0) {
|
2012-05-16 22:15:33 +04:00
|
|
|
return fd;
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
|
2012-10-04 15:09:47 +04:00
|
|
|
/* FIXME: would it be better to fail (e.g., return -EIO) when filename
|
|
|
|
* does not fit in buf? For now, just truncate and avoid buffer overrun.
|
|
|
|
*/
|
2010-06-21 00:01:00 +04:00
|
|
|
memset(buf, 0, sizeof(buf));
|
2013-11-07 18:56:37 +04:00
|
|
|
pstrcpy(buf, sizeof(buf), s->name);
|
2010-06-21 00:01:00 +04:00
|
|
|
|
|
|
|
memset(&hdr, 0, sizeof(hdr));
|
|
|
|
hdr.opcode = SD_OP_NEW_VDI;
|
2014-01-03 16:13:12 +04:00
|
|
|
hdr.base_vdi_id = s->inode.vdi_id;
|
2010-06-21 00:01:00 +04:00
|
|
|
|
|
|
|
wlen = SD_MAX_VDI_LEN;
|
|
|
|
|
|
|
|
hdr.flags = SD_FLAG_CMD_WRITE;
|
|
|
|
hdr.snapid = snapshot;
|
|
|
|
|
|
|
|
hdr.data_length = wlen;
|
2013-11-07 18:56:37 +04:00
|
|
|
hdr.vdi_size = s->inode.vdi_size;
|
|
|
|
hdr.copy_policy = s->inode.copy_policy;
|
2013-11-07 18:56:38 +04:00
|
|
|
hdr.copies = s->inode.nr_copies;
|
2015-02-13 12:20:53 +03:00
|
|
|
hdr.block_size_shift = s->inode.block_size_shift;
|
2010-06-21 00:01:00 +04:00
|
|
|
|
2016-10-27 13:48:58 +03:00
|
|
|
ret = do_req(fd, NULL, (SheepdogReq *)&hdr, buf, &wlen, &rlen);
|
2010-06-21 00:01:00 +04:00
|
|
|
|
|
|
|
closesocket(fd);
|
|
|
|
|
|
|
|
if (ret) {
|
2014-05-16 13:00:22 +04:00
|
|
|
error_setg_errno(errp, -ret, "create failed");
|
2012-05-16 22:15:33 +04:00
|
|
|
return ret;
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
|
|
|
|
if (rsp->result != SD_RES_SUCCESS) {
|
2014-05-16 13:00:22 +04:00
|
|
|
error_setg(errp, "%s, %s", sd_strerror(rsp->result), s->inode.name);
|
2010-06-21 00:01:00 +04:00
|
|
|
return -EIO;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (vdi_id) {
|
|
|
|
*vdi_id = rsp->vdi_id;
|
|
|
|
}
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2018-02-13 16:03:55 +03:00
|
|
|
static int sd_prealloc(BlockDriverState *bs, int64_t old_size, int64_t new_size,
|
|
|
|
Error **errp)
|
2011-07-05 22:38:48 +04:00
|
|
|
{
|
2016-03-08 17:57:05 +03:00
|
|
|
BlockBackend *blk = NULL;
|
2018-02-13 16:03:54 +03:00
|
|
|
BDRVSheepdogState *base = bs->opaque;
|
2015-02-13 12:20:53 +03:00
|
|
|
unsigned long buf_size;
|
2011-07-05 22:38:48 +04:00
|
|
|
uint32_t idx, max_idx;
|
2015-02-13 12:20:53 +03:00
|
|
|
uint32_t object_size;
|
|
|
|
void *buf = NULL;
|
2011-07-05 22:38:48 +04:00
|
|
|
int ret;
|
|
|
|
|
2019-04-25 15:25:10 +03:00
|
|
|
blk = blk_new(bdrv_get_aio_context(bs),
|
|
|
|
BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE | BLK_PERM_RESIZE,
|
2018-02-13 16:03:54 +03:00
|
|
|
BLK_PERM_ALL);
|
|
|
|
|
|
|
|
ret = blk_insert_bs(blk, bs, errp);
|
|
|
|
if (ret < 0) {
|
2014-05-16 13:00:21 +04:00
|
|
|
goto out_with_err_set;
|
2011-07-05 22:38:48 +04:00
|
|
|
}
|
|
|
|
|
2016-03-08 17:57:05 +03:00
|
|
|
blk_set_allow_write_beyond_eof(blk, true);
|
|
|
|
|
2015-02-13 12:20:53 +03:00
|
|
|
object_size = (UINT32_C(1) << base->inode.block_size_shift);
|
|
|
|
buf_size = MIN(object_size, SD_DATA_OBJ_SIZE);
|
|
|
|
buf = g_malloc0(buf_size);
|
|
|
|
|
2018-02-13 16:03:55 +03:00
|
|
|
max_idx = DIV_ROUND_UP(new_size, buf_size);
|
2011-07-05 22:38:48 +04:00
|
|
|
|
2018-02-13 16:03:55 +03:00
|
|
|
for (idx = old_size / buf_size; idx < max_idx; idx++) {
|
2011-07-05 22:38:48 +04:00
|
|
|
/*
|
|
|
|
* The created image can be a cloned image, so we need to read
|
|
|
|
* data from the source image.
|
|
|
|
*/
|
2016-03-08 17:57:05 +03:00
|
|
|
ret = blk_pread(blk, idx * buf_size, buf, buf_size);
|
2011-07-05 22:38:48 +04:00
|
|
|
if (ret < 0) {
|
|
|
|
goto out;
|
|
|
|
}
|
2016-05-06 19:26:27 +03:00
|
|
|
ret = blk_pwrite(blk, idx * buf_size, buf, buf_size, 0);
|
2011-07-05 22:38:48 +04:00
|
|
|
if (ret < 0) {
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
}
|
2014-05-16 13:00:21 +04:00
|
|
|
|
2016-03-08 17:57:05 +03:00
|
|
|
ret = 0;
|
2011-07-05 22:38:48 +04:00
|
|
|
out:
|
2014-05-16 13:00:21 +04:00
|
|
|
if (ret < 0) {
|
|
|
|
error_setg_errno(errp, -ret, "Can't pre-allocate");
|
|
|
|
}
|
|
|
|
out_with_err_set:
|
2018-05-18 21:17:17 +03:00
|
|
|
blk_unref(blk);
|
2011-08-21 07:09:37 +04:00
|
|
|
g_free(buf);
|
2011-07-05 22:38:48 +04:00
|
|
|
|
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
2018-02-01 15:20:44 +03:00
|
|
|
static int sd_create_prealloc(BlockdevOptionsSheepdog *location, int64_t size,
|
|
|
|
Error **errp)
|
|
|
|
{
|
|
|
|
BlockDriverState *bs;
|
|
|
|
Visitor *v;
|
|
|
|
QObject *obj = NULL;
|
|
|
|
QDict *qdict;
|
|
|
|
Error *local_err = NULL;
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
v = qobject_output_visitor_new(&obj);
|
|
|
|
visit_type_BlockdevOptionsSheepdog(v, NULL, &location, &local_err);
|
|
|
|
visit_free(v);
|
|
|
|
|
|
|
|
if (local_err) {
|
|
|
|
error_propagate(errp, local_err);
|
2018-04-19 18:01:43 +03:00
|
|
|
qobject_unref(obj);
|
2018-02-01 15:20:44 +03:00
|
|
|
return -EINVAL;
|
|
|
|
}
|
|
|
|
|
2018-02-24 18:40:29 +03:00
|
|
|
qdict = qobject_to(QDict, obj);
|
2018-02-01 15:20:44 +03:00
|
|
|
qdict_flatten(qdict);
|
|
|
|
|
|
|
|
qdict_put_str(qdict, "driver", "sheepdog");
|
|
|
|
|
|
|
|
bs = bdrv_open(NULL, NULL, qdict, BDRV_O_PROTOCOL | BDRV_O_RDWR, errp);
|
|
|
|
if (bs == NULL) {
|
|
|
|
ret = -EIO;
|
|
|
|
goto fail;
|
|
|
|
}
|
|
|
|
|
|
|
|
ret = sd_prealloc(bs, 0, size, errp);
|
|
|
|
fail:
|
|
|
|
bdrv_unref(bs);
|
2018-04-19 18:01:43 +03:00
|
|
|
qobject_unref(qdict);
|
2018-02-01 15:20:44 +03:00
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
2018-02-01 19:26:27 +03:00
|
|
|
static int parse_redundancy(BDRVSheepdogState *s, SheepdogRedundancy *opt)
|
|
|
|
{
|
|
|
|
struct SheepdogInode *inode = &s->inode;
|
|
|
|
|
|
|
|
switch (opt->type) {
|
|
|
|
case SHEEPDOG_REDUNDANCY_TYPE_FULL:
|
|
|
|
if (opt->u.full.copies > SD_MAX_COPIES || opt->u.full.copies < 1) {
|
|
|
|
return -EINVAL;
|
|
|
|
}
|
|
|
|
inode->copy_policy = 0;
|
|
|
|
inode->nr_copies = opt->u.full.copies;
|
|
|
|
return 0;
|
|
|
|
|
|
|
|
case SHEEPDOG_REDUNDANCY_TYPE_ERASURE_CODED:
|
|
|
|
{
|
|
|
|
int64_t copy = opt->u.erasure_coded.data_strips;
|
|
|
|
int64_t parity = opt->u.erasure_coded.parity_strips;
|
|
|
|
|
|
|
|
if (copy != 2 && copy != 4 && copy != 8 && copy != 16) {
|
|
|
|
return -EINVAL;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (parity >= SD_EC_MAX_STRIP || parity < 1) {
|
|
|
|
return -EINVAL;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* 4 bits for parity and 4 bits for data.
|
|
|
|
* The data strip count is stored halved because 4 bits cannot represent 16.
|
|
|
|
*/
|
|
|
|
inode->copy_policy = ((copy / 2) << 4) + parity;
|
|
|
|
inode->nr_copies = copy + parity;
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
default:
|
|
|
|
g_assert_not_reached();
|
|
|
|
}
|
|
|
|
|
|
|
|
return -EINVAL;
|
|
|
|
}
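The erasure-coded branch of parse_redundancy() above packs both strip counts into a single byte. The following is a minimal standalone sketch of that packing, not the driver's code: the `example_` names and `EXAMPLE_EC_MAX_STRIP` (assumed here to be 16) are hypothetical stand-ins for the driver's own definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed limit for this sketch; the driver uses SD_EC_MAX_STRIP. */
#define EXAMPLE_EC_MAX_STRIP 16

/*
 * Pack data and parity strip counts into one byte:
 * high nibble = data strips / 2, low nibble = parity strips.
 * Returns -1 for combinations that parse_redundancy() rejects.
 */
static int example_encode_copy_policy(int data_strips, int parity_strips)
{
    int policy;

    if (data_strips != 2 && data_strips != 4 &&
        data_strips != 8 && data_strips != 16) {
        return -1;
    }
    if (parity_strips < 1 || parity_strips >= EXAMPLE_EC_MAX_STRIP) {
        return -1;
    }

    policy = ((data_strips / 2) << 4) + parity_strips;
    /* Both halves fit in a nibble, so the result fits in a uint8_t. */
    assert(policy > 0 && policy <= UINT8_MAX);
    return policy;
}

/* Recover the data strip count from the high nibble. */
static int example_policy_data_strips(uint8_t policy)
{
    return (policy >> 4) * 2;
}

/* Recover the parity strip count from the low nibble. */
static int example_policy_parity_strips(uint8_t policy)
{
    return policy & 0x0f;
}
```

Halving the data strip count is what lets the largest allowed value, 16, fit in the upper four bits as 8.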
|
|
|
|
|
2013-11-07 18:56:38 +04:00
|
|
|
/*
|
|
|
|
* Sheepdog supports two kinds of redundancy: full replication and erasure
|
|
|
|
* coding.
|
|
|
|
*
|
|
|
|
* # create a fully replicated vdi with x copies
|
|
|
|
* -o redundancy=x (1 <= x <= SD_MAX_COPIES)
|
|
|
|
*
|
|
|
|
* # create an erasure-coded vdi with x data strips and y parity strips
|
|
|
|
* -o redundancy=x:y (x must be one of {2,4,8,16} and 1 <= y < SD_EC_MAX_STRIP)
|
|
|
|
*/
|
2018-02-01 15:20:44 +03:00
|
|
|
static SheepdogRedundancy *parse_redundancy_str(const char *opt)
|
2013-11-07 18:56:38 +04:00
|
|
|
{
|
2018-02-01 15:20:44 +03:00
|
|
|
SheepdogRedundancy *redundancy;
|
2013-11-07 18:56:38 +04:00
|
|
|
const char *n1, *n2;
|
|
|
|
long copy, parity;
|
|
|
|
char p[10];
|
2018-02-01 19:26:27 +03:00
|
|
|
int ret;
|
2013-11-07 18:56:38 +04:00
|
|
|
|
|
|
|
pstrcpy(p, sizeof(p), opt);
|
|
|
|
n1 = strtok(p, ":");
|
|
|
|
n2 = strtok(NULL, ":");
|
|
|
|
|
|
|
|
if (!n1) {
|
2018-02-01 15:20:44 +03:00
|
|
|
return NULL;
|
2013-11-07 18:56:38 +04:00
|
|
|
}
|
|
|
|
|
2018-02-01 19:26:27 +03:00
|
|
|
ret = qemu_strtol(n1, NULL, 10, &copy);
|
|
|
|
if (ret < 0) {
|
2018-02-01 15:20:44 +03:00
|
|
|
return NULL;
|
2013-11-07 18:56:38 +04:00
|
|
|
}
|
|
|
|
|
2018-02-01 15:20:44 +03:00
|
|
|
redundancy = g_new0(SheepdogRedundancy, 1);
|
2018-02-01 19:26:27 +03:00
|
|
|
if (!n2) {
|
2018-02-01 15:20:44 +03:00
|
|
|
*redundancy = (SheepdogRedundancy) {
|
2018-02-01 19:26:27 +03:00
|
|
|
.type = SHEEPDOG_REDUNDANCY_TYPE_FULL,
|
|
|
|
.u.full.copies = copy,
|
|
|
|
};
|
|
|
|
} else {
|
|
|
|
ret = qemu_strtol(n2, NULL, 10, &parity);
|
|
|
|
if (ret < 0) {
|
2018-05-03 18:35:09 +03:00
|
|
|
g_free(redundancy);
|
2018-02-01 15:20:44 +03:00
|
|
|
return NULL;
|
2018-02-01 19:26:27 +03:00
|
|
|
}
|
2013-11-07 18:56:38 +04:00
|
|
|
|
2018-02-01 15:20:44 +03:00
|
|
|
*redundancy = (SheepdogRedundancy) {
|
2018-02-01 19:26:27 +03:00
|
|
|
.type = SHEEPDOG_REDUNDANCY_TYPE_ERASURE_CODED,
|
|
|
|
.u.erasure_coded = {
|
|
|
|
.data_strips = copy,
|
|
|
|
.parity_strips = parity,
|
|
|
|
},
|
|
|
|
};
|
2013-11-07 18:56:38 +04:00
|
|
|
}
|
|
|
|
|
2018-02-01 15:20:44 +03:00
|
|
|
return redundancy;
|
2013-11-07 18:56:38 +04:00
|
|
|
}
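The "x" and "x:y" option forms handled by parse_redundancy_str() above can be illustrated with plain libc calls. This is a hedged sketch only: `struct example_redundancy`, `example_parse_long()` and `example_parse_redundancy()` are hypothetical names, `strtol()` stands in for `qemu_strtol()`, and overlong input is rejected here rather than truncated as `pstrcpy()` would do.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type; the driver uses the QAPI SheepdogRedundancy. */
struct example_redundancy {
    int erasure_coded;  /* 0: full replication, 1: erasure coding */
    long first;         /* copies (full) or data strips (erasure coded) */
    long parity;        /* parity strips, only meaningful if erasure_coded */
};

/* Parse a whole string as a base-10 long; reject empty or trailing junk. */
static int example_parse_long(const char *s, long *out)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(s, &end, 10);
    if (errno != 0 || end == s || *end != '\0') {
        return -1;
    }
    *out = v;
    return 0;
}

/* Accept "x" (full replication) or "x:y" (erasure coding). */
static int example_parse_redundancy(const char *opt,
                                    struct example_redundancy *r)
{
    char buf[10];
    char *colon;

    if (strlen(opt) >= sizeof(buf)) {
        return -1;              /* sketch choice: reject, do not truncate */
    }
    strcpy(buf, opt);

    colon = strchr(buf, ':');
    if (colon) {
        *colon = '\0';          /* split into "x" and "y" in place */
    }
    if (example_parse_long(buf, &r->first) < 0) {
        return -1;
    }
    if (!colon) {
        r->erasure_coded = 0;
        r->parity = 0;
        return 0;
    }
    if (example_parse_long(colon + 1, &r->parity) < 0) {
        return -1;
    }
    r->erasure_coded = 1;
    return 0;
}
```

As in the driver, range checks on the parsed numbers are left to a separate validation step.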
|
|
|
|
|
2018-02-01 15:20:44 +03:00
|
|
|
static int parse_block_size_shift(BDRVSheepdogState *s,
|
|
|
|
BlockdevCreateOptionsSheepdog *opts)
|
2015-02-13 12:20:53 +03:00
|
|
|
{
|
|
|
|
struct SheepdogInode *inode = &s->inode;
|
|
|
|
uint64_t object_size;
|
|
|
|
int obj_order;
|
|
|
|
|
2018-02-01 15:20:44 +03:00
|
|
|
if (opts->has_object_size) {
|
|
|
|
object_size = opts->object_size;
|
|
|
|
|
2015-02-13 12:20:53 +03:00
|
|
|
if ((object_size - 1) & object_size) { /* not a power of 2? */
|
|
|
|
return -EINVAL;
|
|
|
|
}
|
2015-03-23 18:29:26 +03:00
|
|
|
obj_order = ctz32(object_size);
|
2015-02-13 12:20:53 +03:00
|
|
|
if (obj_order < 20 || obj_order > 31) {
|
|
|
|
return -EINVAL;
|
|
|
|
}
|
|
|
|
inode->block_size_shift = (uint8_t)obj_order;
|
|
|
|
}
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
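The checks in parse_block_size_shift() above amount to "object size must be a power of two between 2^20 and 2^31". A minimal portable sketch of the same idea follows; `example_object_size_to_shift()` is a hypothetical helper, it uses a loop instead of `ctz32()`, and it rejects zero explicitly as an assumption of this sketch.

```c
#include <stdint.h>

/*
 * Return log2(object_size) if object_size is a power of two in the
 * range [2^20, 2^31], or -1 otherwise.
 */
static int example_object_size_to_shift(uint64_t object_size)
{
    int shift = 0;

    /* A power of two has exactly one bit set. */
    if (object_size == 0 || (object_size & (object_size - 1)) != 0) {
        return -1;
    }
    /* Count how far the single set bit is from bit 0. */
    while ((object_size >> shift) != 1) {
        shift++;
    }
    if (shift < 20 || shift > 31) {
        return -1;
    }
    return shift;
}
```

For example, a 4 MiB object size yields a shift of 22, while 3 MiB is rejected because it is not a power of two.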
|
|
|
|
|
2018-02-01 15:20:44 +03:00
|
|
|
static int sd_co_create(BlockdevCreateOptions *options, Error **errp)
|
2010-06-21 00:01:00 +04:00
|
|
|
{
|
2018-02-01 15:20:44 +03:00
|
|
|
BlockdevCreateOptionsSheepdog *opts = &options->u.sheepdog;
|
2012-05-16 22:15:34 +04:00
|
|
|
int ret = 0;
|
2013-11-07 18:56:37 +04:00
|
|
|
uint32_t vid = 0;
|
2010-06-21 00:01:00 +04:00
|
|
|
char *backing_file = NULL;
|
2014-06-05 13:21:05 +04:00
|
|
|
char *buf = NULL;
|
2012-05-16 22:15:34 +04:00
|
|
|
BDRVSheepdogState *s;
|
2015-02-13 12:20:53 +03:00
|
|
|
uint64_t max_vdi_size;
|
2012-10-06 20:57:14 +04:00
|
|
|
bool prealloc = false;
|
2010-06-21 00:01:00 +04:00
|
|
|
|
2018-02-01 15:20:44 +03:00
|
|
|
assert(options->driver == BLOCKDEV_DRIVER_SHEEPDOG);
|
|
|
|
|
block: Use g_new() & friends where that makes obvious sense
g_new(T, n) is neater than g_malloc(sizeof(T) * n). It's also safer,
for two reasons. One, it catches multiplication overflowing size_t.
Two, it returns T * rather than void *, which lets the compiler catch
more type errors.
Patch created with Coccinelle, with two manual changes on top:
* Add const to bdrv_iterate_format() to keep the types straight
* Convert the allocation in bdrv_drop_intermediate(), which Coccinelle
inexplicably misses
Coccinelle semantic patch:
@@
type T;
@@
-g_malloc(sizeof(T))
+g_new(T, 1)
@@
type T;
@@
-g_try_malloc(sizeof(T))
+g_try_new(T, 1)
@@
type T;
@@
-g_malloc0(sizeof(T))
+g_new0(T, 1)
@@
type T;
@@
-g_try_malloc0(sizeof(T))
+g_try_new0(T, 1)
@@
type T;
expression n;
@@
-g_malloc(sizeof(T) * (n))
+g_new(T, n)
@@
type T;
expression n;
@@
-g_try_malloc(sizeof(T) * (n))
+g_try_new(T, n)
@@
type T;
expression n;
@@
-g_malloc0(sizeof(T) * (n))
+g_new0(T, n)
@@
type T;
expression n;
@@
-g_try_malloc0(sizeof(T) * (n))
+g_try_new0(T, n)
@@
type T;
expression p, n;
@@
-g_realloc(p, sizeof(T) * (n))
+g_renew(T, p, n)
@@
type T;
expression p, n;
@@
-g_try_realloc(p, sizeof(T) * (n))
+g_try_renew(T, p, n)
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-19 12:31:08 +04:00
|
|
|
s = g_new0(BDRVSheepdogState, 1);
|
2012-05-16 22:15:34 +04:00
|
|
|
|
2018-02-01 15:20:44 +03:00
|
|
|
/* Steal SocketAddress from QAPI, set NULL to prevent double free */
|
|
|
|
s->addr = opts->location->server;
|
|
|
|
opts->location->server = NULL;
|
|
|
|
|
|
|
|
if (strlen(opts->location->vdi) >= sizeof(s->name)) {
|
|
|
|
error_setg(errp, "'vdi' string too long");
|
|
|
|
ret = -EINVAL;
|
2012-05-16 22:15:34 +04:00
|
|
|
goto out;
|
2011-01-27 19:33:10 +03:00
|
|
|
}
|
2018-02-01 15:20:44 +03:00
|
|
|
pstrcpy(s->name, sizeof(s->name), opts->location->vdi);
|
2011-01-27 19:33:10 +03:00
|
|
|
|
2018-02-01 15:20:44 +03:00
|
|
|
s->inode.vdi_size = opts->size;
|
|
|
|
backing_file = opts->backing_file;
|
2017-03-06 22:00:43 +03:00
|
|
|
|
2018-02-01 15:20:44 +03:00
|
|
|
if (!opts->has_preallocation) {
|
|
|
|
opts->preallocation = PREALLOC_MODE_OFF;
|
|
|
|
}
|
|
|
|
switch (opts->preallocation) {
|
|
|
|
case PREALLOC_MODE_OFF:
|
2014-06-05 13:21:05 +04:00
|
|
|
prealloc = false;
|
2018-02-01 15:20:44 +03:00
|
|
|
break;
|
|
|
|
case PREALLOC_MODE_FULL:
|
2014-06-05 13:21:05 +04:00
|
|
|
prealloc = true;
|
2018-02-01 15:20:44 +03:00
|
|
|
break;
|
|
|
|
default:
|
|
|
|
error_setg(errp, "Preallocation mode not supported for Sheepdog");
|
2014-06-05 13:21:05 +04:00
|
|
|
ret = -EINVAL;
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
2018-02-01 15:20:44 +03:00
|
|
|
if (opts->has_redundancy) {
|
|
|
|
ret = parse_redundancy(s, opts->redundancy);
|
2014-06-05 13:21:05 +04:00
|
|
|
if (ret < 0) {
|
2018-02-01 15:20:44 +03:00
|
|
|
error_setg(errp, "Invalid redundancy mode");
|
2014-06-05 13:21:05 +04:00
|
|
|
goto out;
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
}
|
2015-02-13 12:20:53 +03:00
|
|
|
ret = parse_block_size_shift(s, opts);
|
|
|
|
if (ret < 0) {
|
|
|
|
error_setg(errp, "Invalid object_size."
|
|
|
|
" object_size needs to be a power of 2"
|
|
|
|
" and in the range 2^20 to 2^31");
|
2012-05-16 22:15:34 +04:00
|
|
|
goto out;
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
|
2018-02-01 15:20:44 +03:00
|
|
|
if (opts->has_backing_file) {
|
2016-03-08 17:57:05 +03:00
|
|
|
BlockBackend *blk;
|
2014-01-03 16:13:12 +04:00
|
|
|
BDRVSheepdogState *base;
|
2010-06-21 00:01:00 +04:00
|
|
|
BlockDriver *drv;
|
|
|
|
|
|
|
|
/* Currently, only a Sheepdog backing image is supported. */
|
2018-02-01 15:20:44 +03:00
|
|
|
drv = bdrv_find_protocol(opts->backing_file, true, NULL);
|
2010-06-21 00:01:00 +04:00
|
|
|
if (!drv || strcmp(drv->protocol_name, "sheepdog") != 0) {
|
2014-05-16 13:00:24 +04:00
|
|
|
error_setg(errp, "backing_file must be a sheepdog image");
|
2012-05-16 22:15:34 +04:00
|
|
|
ret = -EINVAL;
|
|
|
|
goto out;
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
|
2018-02-01 15:20:44 +03:00
|
|
|
blk = blk_new_open(opts->backing_file, NULL, NULL,
|
2016-03-15 16:34:37 +03:00
|
|
|
BDRV_O_PROTOCOL, errp);
|
2016-03-08 17:57:05 +03:00
|
|
|
if (blk == NULL) {
|
|
|
|
ret = -EIO;
|
2012-05-16 22:15:34 +04:00
|
|
|
goto out;
|
2012-05-16 22:15:33 +04:00
|
|
|
}
|
2010-06-21 00:01:00 +04:00
|
|
|
|
2016-03-08 17:57:05 +03:00
|
|
|
base = blk_bs(blk)->opaque;
|
2010-06-21 00:01:00 +04:00
|
|
|
|
2014-01-03 16:13:12 +04:00
|
|
|
if (!is_snapshot(&base->inode)) {
|
2014-05-16 13:00:24 +04:00
|
|
|
error_setg(errp, "cannot clone from a non snapshot vdi");
|
2016-03-08 17:57:05 +03:00
|
|
|
blk_unref(blk);
|
2012-05-16 22:15:34 +04:00
|
|
|
ret = -EINVAL;
|
|
|
|
goto out;
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
2014-01-03 16:13:12 +04:00
|
|
|
s->inode.vdi_id = base->inode.vdi_id;
|
2016-03-08 17:57:05 +03:00
|
|
|
blk_unref(blk);
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
|
2014-06-17 09:45:35 +04:00
|
|
|
s->aio_context = qemu_get_aio_context();
|
2015-02-13 12:20:53 +03:00
|
|
|
|
|
|
|
/* if block_size_shift is not specified, get cluster default value */
|
|
|
|
if (s->inode.block_size_shift == 0) {
|
|
|
|
SheepdogVdiReq hdr;
|
|
|
|
SheepdogClusterRsp *rsp = (SheepdogClusterRsp *)&hdr;
|
|
|
|
int fd;
|
|
|
|
unsigned int wlen = 0, rlen = 0;
|
|
|
|
|
2017-03-06 22:00:37 +03:00
|
|
|
fd = connect_to_sdog(s, errp);
|
2015-02-13 12:20:53 +03:00
|
|
|
if (fd < 0) {
|
2017-03-06 22:00:37 +03:00
|
|
|
ret = fd;
|
2015-02-13 12:20:53 +03:00
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
|
|
|
memset(&hdr, 0, sizeof(hdr));
|
|
|
|
hdr.opcode = SD_OP_GET_CLUSTER_DEFAULT;
|
|
|
|
hdr.proto_ver = SD_PROTO_VER;
|
|
|
|
|
2016-10-27 13:48:58 +03:00
|
|
|
ret = do_req(fd, NULL, (SheepdogReq *)&hdr,
|
2015-02-13 12:20:53 +03:00
|
|
|
NULL, &wlen, &rlen);
|
|
|
|
closesocket(fd);
|
|
|
|
if (ret) {
|
|
|
|
error_setg_errno(errp, -ret, "failed to get cluster default");
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
if (rsp->result == SD_RES_SUCCESS) {
|
|
|
|
s->inode.block_size_shift = rsp->block_size_shift;
|
|
|
|
} else {
|
|
|
|
s->inode.block_size_shift = SD_DEFAULT_BLOCK_SIZE_SHIFT;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
max_vdi_size = (UINT64_C(1) << s->inode.block_size_shift) * MAX_DATA_OBJS;
|
|
|
|
|
|
|
|
if (s->inode.vdi_size > max_vdi_size) {
|
|
|
|
error_setg(errp, "An image is too large."
|
|
|
|
" The maximum image size is %"PRIu64 "GB",
|
|
|
|
max_vdi_size / 1024 / 1024 / 1024);
|
|
|
|
ret = -EINVAL;
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
2014-05-16 13:00:24 +04:00
|
|
|
ret = do_sd_create(s, &vid, 0, errp);
|
2014-05-16 13:00:22 +04:00
|
|
|
if (ret) {
|
2012-05-16 22:15:34 +04:00
|
|
|
goto out;
|
2011-07-05 22:38:48 +04:00
|
|
|
}
|
|
|
|
|
2014-05-16 13:00:22 +04:00
|
|
|
if (prealloc) {
|
2018-02-01 15:20:44 +03:00
|
|
|
ret = sd_create_prealloc(opts->location, opts->size, errp);
|
2014-05-16 13:00:21 +04:00
|
|
|
}
|
2012-05-16 22:15:34 +04:00
|
|
|
out:
|
2014-06-05 13:21:05 +04:00
|
|
|
g_free(backing_file);
|
|
|
|
g_free(buf);
|
2018-02-01 15:20:44 +03:00
|
|
|
g_free(s->addr);
|
2012-05-16 22:15:34 +04:00
|
|
|
g_free(s);
|
|
|
|
return ret;
|
2010-06-21 00:01:00 +04:00
|
|
|
}
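
The size limit checked in sd_co_create() follows directly from the object size and the number of data objects an inode can address. A minimal sketch of that arithmetic, assuming an illustrative limit of 2^20 data objects (SKETCH_MAX_DATA_OBJS is a hypothetical stand-in; the real MAX_DATA_OBJS is defined elsewhere in the driver and may differ):

```c
#include <assert.h>
#include <stdint.h>

/* Assumed object limit, for illustration only. */
#define SKETCH_MAX_DATA_OBJS (UINT64_C(1) << 20)

/* Largest image size in bytes for a given object-size shift. */
uint64_t sketch_max_vdi_size(uint8_t block_size_shift)
{
    /* The driver's error message limits object sizes to 2^20..2^31. */
    assert(block_size_shift >= 20 && block_size_shift <= 31);
    return (UINT64_C(1) << block_size_shift) * SKETCH_MAX_DATA_OBJS;
}

/* The same limit divided by 1024^3, as the "too large" error prints it. */
uint64_t sketch_max_vdi_size_gib(uint8_t block_size_shift)
{
    return sketch_max_vdi_size(block_size_shift) / 1024 / 1024 / 1024;
}
```

Under these assumptions, a shift of 22 (4 MiB objects) gives a 4 TiB limit. Using UINT64_C(1) for the shift keeps the multiplication in 64 bits, which matters once the shift approaches 31.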
|
|
|
|
|
2020-03-26 04:12:17 +03:00
|
|
|
static int coroutine_fn sd_co_create_opts(BlockDriver *drv,
|
|
|
|
const char *filename,
|
|
|
|
QemuOpts *opts,
|
2018-02-01 15:20:44 +03:00
|
|
|
Error **errp)
|
|
|
|
{
|
|
|
|
BlockdevCreateOptions *create_options = NULL;
|
|
|
|
QDict *qdict, *location_qdict;
|
|
|
|
Visitor *v;
|
2018-05-03 18:35:09 +03:00
|
|
|
char *redundancy;
|
2018-02-01 15:20:44 +03:00
|
|
|
Error *local_err = NULL;
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
redundancy = qemu_opt_get_del(opts, BLOCK_OPT_REDUNDANCY);
|
|
|
|
|
|
|
|
qdict = qemu_opts_to_qdict(opts, NULL);
|
|
|
|
qdict_put_str(qdict, "driver", "sheepdog");
|
|
|
|
|
|
|
|
location_qdict = qdict_new();
|
|
|
|
qdict_put(qdict, "location", location_qdict);
|
|
|
|
|
|
|
|
sd_parse_filename(filename, location_qdict, &local_err);
|
|
|
|
if (local_err) {
|
|
|
|
error_propagate(errp, local_err);
|
|
|
|
ret = -EINVAL;
|
|
|
|
goto fail;
|
|
|
|
}
|
|
|
|
|
|
|
|
qdict_flatten(qdict);
|
|
|
|
|
|
|
|
/* Change legacy command line options into QMP ones */
|
|
|
|
static const QDictRenames opt_renames[] = {
|
|
|
|
{ BLOCK_OPT_BACKING_FILE, "backing-file" },
|
|
|
|
{ BLOCK_OPT_OBJECT_SIZE, "object-size" },
|
|
|
|
{ NULL, NULL },
|
|
|
|
};
|
|
|
|
|
|
|
|
if (!qdict_rename_keys(qdict, opt_renames, errp)) {
|
|
|
|
ret = -EINVAL;
|
|
|
|
goto fail;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* Get the QAPI object */
|
2018-06-14 22:14:33 +03:00
|
|
|
v = qobject_input_visitor_new_flat_confused(qdict, errp);
|
|
|
|
if (!v) {
|
2018-02-01 15:20:44 +03:00
|
|
|
ret = -EINVAL;
|
|
|
|
goto fail;
|
|
|
|
}
|
|
|
|
|
|
|
|
visit_type_BlockdevCreateOptions(v, NULL, &create_options, &local_err);
|
|
|
|
visit_free(v);
|
|
|
|
|
|
|
|
if (local_err) {
|
|
|
|
error_propagate(errp, local_err);
|
|
|
|
ret = -EINVAL;
|
|
|
|
goto fail;
|
|
|
|
}
|
|
|
|
|
|
|
|
assert(create_options->driver == BLOCKDEV_DRIVER_SHEEPDOG);
|
|
|
|
create_options->u.sheepdog.size =
|
|
|
|
ROUND_UP(create_options->u.sheepdog.size, BDRV_SECTOR_SIZE);
|
|
|
|
|
|
|
|
if (redundancy) {
|
|
|
|
create_options->u.sheepdog.has_redundancy = true;
|
|
|
|
create_options->u.sheepdog.redundancy =
|
|
|
|
parse_redundancy_str(redundancy);
|
|
|
|
if (create_options->u.sheepdog.redundancy == NULL) {
|
|
|
|
error_setg(errp, "Invalid redundancy mode");
|
|
|
|
ret = -EINVAL;
|
|
|
|
goto fail;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
ret = sd_co_create(create_options, errp);
|
|
|
|
fail:
|
|
|
|
qapi_free_BlockdevCreateOptions(create_options);
|
2018-04-19 18:01:43 +03:00
|
|
|
qobject_unref(qdict);
|
2018-05-03 18:35:09 +03:00
|
|
|
g_free(redundancy);
|
2018-02-01 15:20:44 +03:00
|
|
|
return ret;
|
|
|
|
}
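
sd_co_create_opts() rounds the requested size up to a whole number of sectors before creating the image. The rounding can be sketched on its own, assuming a 512-byte sector; sketch_round_up() is a hypothetical helper written for this example and is not QEMU's ROUND_UP macro:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed sector size; BDRV_SECTOR_SIZE is taken to be 512 here. */
#define SKETCH_SECTOR_SIZE UINT64_C(512)

/*
 * Round n up to the next multiple of d.
 * Works for any non-zero d, not only powers of two.
 */
uint64_t sketch_round_up(uint64_t n, uint64_t d)
{
    assert(d != 0);
    return (n + d - 1) / d * d;
}
```

A size that is already sector-aligned is returned unchanged, so applying the rounding twice has no further effect.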
|
|
|
|
|
2010-06-21 00:01:00 +04:00
|
|
|
static void sd_close(BlockDriverState *bs)
|
|
|
|
{
|
2014-05-16 13:00:19 +04:00
|
|
|
Error *local_err = NULL;
|
2010-06-21 00:01:00 +04:00
|
|
|
BDRVSheepdogState *s = bs->opaque;
|
|
|
|
SheepdogVdiReq hdr;
|
|
|
|
SheepdogVdiRsp *rsp = (SheepdogVdiRsp *)&hdr;
|
|
|
|
unsigned int wlen, rlen = 0;
|
|
|
|
int fd, ret;
|
|
|
|
|
2018-12-13 19:27:27 +03:00
|
|
|
trace_sheepdog_close(s->name);
|
2010-06-21 00:01:00 +04:00
|
|
|
|
2014-05-16 13:00:19 +04:00
|
|
|
fd = connect_to_sdog(s, &local_err);
|
2010-06-21 00:01:00 +04:00
|
|
|
if (fd < 0) {
|
2015-02-12 15:55:05 +03:00
|
|
|
error_report_err(local_err);
|
2010-06-21 00:01:00 +04:00
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
|
|
|
memset(&hdr, 0, sizeof(hdr));
|
|
|
|
|
|
|
|
hdr.opcode = SD_OP_RELEASE_VDI;
|
2014-08-11 09:43:45 +04:00
|
|
|
hdr.type = LOCK_TYPE_NORMAL;
|
2014-01-03 16:13:12 +04:00
|
|
|
hdr.base_vdi_id = s->inode.vdi_id;
|
2010-06-21 00:01:00 +04:00
|
|
|
wlen = strlen(s->name) + 1;
|
|
|
|
hdr.data_length = wlen;
|
|
|
|
hdr.flags = SD_FLAG_CMD_WRITE;
|
|
|
|
|
2016-10-27 13:48:58 +03:00
|
|
|
ret = do_req(fd, s->bs, (SheepdogReq *)&hdr,
|
2014-05-08 18:34:52 +04:00
|
|
|
s->name, &wlen, &rlen);
|
2010-06-21 00:01:00 +04:00
|
|
|
|
|
|
|
closesocket(fd);
|
|
|
|
|
|
|
|
if (!ret && rsp->result != SD_RES_SUCCESS &&
|
|
|
|
rsp->result != SD_RES_VDI_NOT_LOCKED) {
|
2011-06-22 16:03:54 +04:00
|
|
|
error_report("%s, %s", sd_strerror(rsp->result), s->name);
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
|
2015-10-23 06:08:05 +03:00
|
|
|
aio_set_fd_handler(bdrv_get_aio_context(bs), s->fd,
|
2016-12-01 22:26:41 +03:00
|
|
|
false, NULL, NULL, NULL, NULL);
|
2010-06-21 00:01:00 +04:00
|
|
|
closesocket(s->fd);
|
2017-04-26 10:36:41 +03:00
|
|
|
qapi_free_SocketAddress(s->addr);
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
|
|
|
|
static int64_t sd_getlength(BlockDriverState *bs)
|
|
|
|
{
|
|
|
|
BDRVSheepdogState *s = bs->opaque;
|
|
|
|
|
|
|
|
return s->inode.vdi_size;
|
|
|
|
}
|
|
|
|
|
block: Convert .bdrv_truncate callback to coroutine_fn
bdrv_truncate() is an operation that can block (even for a quite long
time, depending on the PreallocMode) in I/O paths that shouldn't block.
Convert it to a coroutine_fn so that we have the infrastructure for
drivers to make their .bdrv_co_truncate implementation asynchronous.
This change could potentially introduce new race conditions because
bdrv_truncate() isn't necessarily executed atomically any more. Whether
this is a problem needs to be evaluated for each block driver that
supports truncate:
* file-posix/win32, gluster, iscsi, nfs, rbd, ssh, sheepdog: The
protocol drivers are trivially safe because they don't actually yield
yet, so there is no change in behaviour.
* copy-on-read, crypto, raw-format: Essentially just filter drivers that
pass the request to a child node, no problem.
* qcow2: The implementation modifies metadata, so it needs to hold
s->lock to be safe with concurrent I/O requests. In order to avoid
double locking, this requires pulling the locking out into
preallocate_co() and using qcow2_write_caches() instead of
bdrv_flush().
* qed: Does a single header update, this is fine without locking.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2018-06-21 18:54:35 +03:00
|
|
|
static int coroutine_fn sd_co_truncate(BlockDriverState *bs, int64_t offset,
|
2019-09-18 12:51:40 +03:00
|
|
|
bool exact, PreallocMode prealloc,
|
|
|
|
Error **errp)
|
2010-06-21 00:01:00 +04:00
|
|
|
{
|
|
|
|
BDRVSheepdogState *s = bs->opaque;
|
|
|
|
int ret, fd;
|
|
|
|
unsigned int datalen;
|
2015-02-13 12:20:53 +03:00
|
|
|
uint64_t max_vdi_size;
|
2018-02-13 16:03:56 +03:00
|
|
|
int64_t old_size = s->inode.vdi_size;
|
2010-06-21 00:01:00 +04:00
|
|
|
|
2018-02-13 16:03:56 +03:00
|
|
|
if (prealloc != PREALLOC_MODE_OFF && prealloc != PREALLOC_MODE_FULL) {
|
2017-06-13 23:20:52 +03:00
|
|
|
error_setg(errp, "Unsupported preallocation mode '%s'",
|
2017-08-24 11:46:08 +03:00
|
|
|
PreallocMode_str(prealloc));
|
2017-06-13 23:20:52 +03:00
|
|
|
return -ENOTSUP;
|
|
|
|
}
|
|
|
|
|
2015-02-13 12:20:53 +03:00
|
|
|
max_vdi_size = (UINT64_C(1) << s->inode.block_size_shift) * MAX_DATA_OBJS;
|
2018-02-13 16:03:56 +03:00
|
|
|
if (offset < old_size) {
|
2017-03-28 23:51:28 +03:00
|
|
|
error_setg(errp, "shrinking is not supported");
|
2010-06-21 00:01:00 +04:00
|
|
|
return -EINVAL;
|
2015-02-13 12:20:53 +03:00
|
|
|
} else if (offset > max_vdi_size) {
|
2017-03-28 23:51:28 +03:00
|
|
|
error_setg(errp, "too big image size");
|
2010-06-21 00:01:00 +04:00
|
|
|
return -EINVAL;
|
|
|
|
}
|
|
|
|
|
2017-03-28 23:51:28 +03:00
|
|
|
fd = connect_to_sdog(s, errp);
|
2010-06-21 00:01:00 +04:00
|
|
|
if (fd < 0) {
|
2012-05-16 22:15:33 +04:00
|
|
|
return fd;
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
|
|
|
|
/* we don't need to update the entire object */
|
2018-05-23 19:07:20 +03:00
|
|
|
datalen = SD_INODE_HEADER_SIZE;
|
2010-06-21 00:01:00 +04:00
|
|
|
s->inode.vdi_size = offset;
|
2016-10-27 13:48:58 +03:00
|
|
|
ret = write_object(fd, s->bs, (char *)&s->inode,
|
2014-05-08 18:34:52 +04:00
|
|
|
vid_to_vdi_oid(s->inode.vdi_id), s->inode.nr_copies,
|
|
|
|
datalen, 0, false, s->cache_flags);
|
2010-06-21 00:01:00 +04:00
|
|
|
closesocket(fd);
|
|
|
|
|
|
|
|
if (ret < 0) {
|
2017-03-28 23:51:28 +03:00
|
|
|
error_setg_errno(errp, -ret, "failed to update an inode");
|
2018-02-13 16:03:56 +03:00
|
|
|
return ret;
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
|
2018-02-13 16:03:56 +03:00
|
|
|
if (prealloc == PREALLOC_MODE_FULL) {
|
|
|
|
ret = sd_prealloc(bs, old_size, offset, errp);
|
|
|
|
if (ret < 0) {
|
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
return 0;
|
2010-06-21 00:01:00 +04:00
|
|
|
}
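
The two size checks at the top of sd_co_truncate() can be read as a small predicate: growing is allowed up to the maximum VDI size, shrinking is rejected. A minimal sketch, with sketch_check_resize() as a hypothetical name that is not part of the driver:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Validate a resize request.
 * Returns 0 if the new size is acceptable, -EINVAL otherwise.
 */
int sketch_check_resize(int64_t old_size, int64_t new_size, uint64_t max_size)
{
    assert(old_size >= 0 && new_size >= 0);
    if (new_size < old_size) {
        return -EINVAL;     /* shrinking is not supported */
    }
    if ((uint64_t)new_size > max_size) {
        return -EINVAL;     /* image would be too big */
    }
    return 0;
}
```

Keeping the size unchanged passes both checks, which matches the driver: only a strictly smaller offset is treated as shrinking.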
|
|
|
|
|
|
|
|
/*
|
|
|
|
* This function is called after writing data objects. If we need to
|
|
|
|
* update metadata, this sends a write request to the vdi object.
|
|
|
|
*/
|
2011-10-05 11:17:31 +04:00
|
|
|
static void coroutine_fn sd_write_done(SheepdogAIOCB *acb)
|
2010-06-21 00:01:00 +04:00
|
|
|
{
|
2016-11-29 14:32:43 +03:00
|
|
|
BDRVSheepdogState *s = acb->s;
|
2010-06-21 00:01:00 +04:00
|
|
|
struct iovec iov;
|
|
|
|
AIOReq *aio_req;
|
|
|
|
uint32_t offset, data_len, mn, mx;
|
|
|
|
|
2015-09-01 06:03:09 +03:00
|
|
|
mn = acb->min_dirty_data_idx;
|
|
|
|
mx = acb->max_dirty_data_idx;
|
2010-06-21 00:01:00 +04:00
|
|
|
if (mn <= mx) {
|
|
|
|
/* we need to update the vdi object. */
|
2016-11-29 14:32:42 +03:00
|
|
|
++acb->nr_pending;
|
2010-06-21 00:01:00 +04:00
|
|
|
offset = sizeof(s->inode) - sizeof(s->inode.data_vdi_id) +
|
|
|
|
mn * sizeof(s->inode.data_vdi_id[0]);
|
|
|
|
data_len = (mx - mn + 1) * sizeof(s->inode.data_vdi_id[0]);
|
|
|
|
|
2015-09-01 06:03:09 +03:00
|
|
|
acb->min_dirty_data_idx = UINT32_MAX;
|
|
|
|
acb->max_dirty_data_idx = 0;
|
2010-06-21 00:01:00 +04:00
|
|
|
|
|
|
|
iov.iov_base = &s->inode;
|
|
|
|
iov.iov_len = sizeof(s->inode);
|
|
|
|
aio_req = alloc_aio_req(s, acb, vid_to_vdi_oid(s->inode.vdi_id),
|
2014-06-06 08:35:11 +04:00
|
|
|
data_len, offset, 0, false, 0, offset);
|
|
|
|
add_aio_request(s, aio_req, &iov, 1, AIOCB_WRITE_UDATA);
|
2016-11-29 14:32:42 +03:00
|
|
|
if (--acb->nr_pending) {
|
|
|
|
qemu_coroutine_yield();
|
|
|
|
}
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
}
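
sd_write_done() writes back only the slice of data_vdi_id[] between the lowest and highest dirty index. The offset and length arithmetic can be sketched with a much smaller structure; struct sketch_inode is a hypothetical stand-in for SheepdogInode, and the calculation assumes, as the driver's expression does, that the array is the last member with no trailing padding:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, much smaller stand-in for SheepdogInode. */
struct sketch_inode {
    uint8_t  header[16];        /* fixed-size metadata */
    uint32_t data_vdi_id[8];    /* one entry per data object */
};

/*
 * Compute the byte offset and length of entries [mn, mx] of
 * data_vdi_id within the inode.
 */
void sketch_dirty_range(uint32_t mn, uint32_t mx,
                        uint32_t *offset, uint32_t *data_len)
{
    assert(mn <= mx && mx < 8);
    *offset = sizeof(struct sketch_inode)
              - sizeof(((struct sketch_inode *)0)->data_vdi_id)
              + mn * sizeof(uint32_t);
    *data_len = (mx - mn + 1) * sizeof(uint32_t);
}
```

With this layout the result equals offsetof(struct sketch_inode, data_vdi_id) plus mn entries, which is an easy way to sanity-check the expression.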
|
|
|
|
|
2013-04-25 16:49:39 +04:00
|
|
|
/* Delete current working VDI on the snapshot chain */
|
|
|
|
static bool sd_delete(BDRVSheepdogState *s)
|
|
|
|
{
|
2014-05-16 13:00:19 +04:00
|
|
|
Error *local_err = NULL;
|
2013-04-25 16:49:39 +04:00
|
|
|
unsigned int wlen = SD_MAX_VDI_LEN, rlen = 0;
|
|
|
|
SheepdogVdiReq hdr = {
|
|
|
|
.opcode = SD_OP_DEL_VDI,
|
2014-01-03 16:13:12 +04:00
|
|
|
.base_vdi_id = s->inode.vdi_id,
|
2013-04-25 16:49:39 +04:00
|
|
|
.data_length = wlen,
|
|
|
|
.flags = SD_FLAG_CMD_WRITE,
|
|
|
|
};
|
|
|
|
SheepdogVdiRsp *rsp = (SheepdogVdiRsp *)&hdr;
|
|
|
|
int fd, ret;
|
|
|
|
|
2014-05-16 13:00:19 +04:00
|
|
|
fd = connect_to_sdog(s, &local_err);
|
2013-04-25 16:49:39 +04:00
|
|
|
if (fd < 0) {
|
2015-02-12 15:55:05 +03:00
|
|
|
error_report_err(local_err);
|
2013-04-25 16:49:39 +04:00
|
|
|
return false;
|
|
|
|
}
|
|
|
|
|
2016-10-27 13:48:58 +03:00
|
|
|
ret = do_req(fd, s->bs, (SheepdogReq *)&hdr,
|
2014-05-08 18:34:52 +04:00
|
|
|
s->name, &wlen, &rlen);
|
2013-04-25 16:49:39 +04:00
|
|
|
closesocket(fd);
|
|
|
|
if (ret) {
|
|
|
|
return false;
|
|
|
|
}
|
|
|
|
switch (rsp->result) {
|
|
|
|
case SD_RES_NO_VDI:
|
|
|
|
error_report("%s was already deleted", s->name);
|
|
|
|
/* fall through */
|
|
|
|
case SD_RES_SUCCESS:
|
|
|
|
break;
|
|
|
|
default:
|
|
|
|
error_report("%s, %s", sd_strerror(rsp->result), s->name);
|
|
|
|
return false;
|
|
|
|
}
|
|
|
|
|
|
|
|
return true;
|
|
|
|
}
|
|
|
|
|
2010-06-21 00:01:00 +04:00
|
|
|
/*
|
|
|
|
* Create a writable VDI from a snapshot
|
|
|
|
*/
|
|
|
|
static int sd_create_branch(BDRVSheepdogState *s)
|
|
|
|
{
|
2014-05-16 13:00:19 +04:00
|
|
|
Error *local_err = NULL;
|
2010-06-21 00:01:00 +04:00
|
|
|
int ret, fd;
|
|
|
|
uint32_t vid;
|
|
|
|
char *buf;
|
2013-04-25 16:49:39 +04:00
|
|
|
bool deleted;
|
2010-06-21 00:01:00 +04:00
|
|
|
|
2018-12-13 19:27:27 +03:00
|
|
|
trace_sheepdog_create_branch_snapshot(s->inode.vdi_id);
|
2010-06-21 00:01:00 +04:00
|
|
|
|
2011-08-21 07:09:37 +04:00
|
|
|
buf = g_malloc(SD_INODE_SIZE);
|
2010-06-21 00:01:00 +04:00
|
|
|
|
2013-04-25 16:49:39 +04:00
|
|
|
/*
|
|
|
|
* Even if deletion fails, we will just create an extra snapshot based on
|
2014-03-24 12:30:17 +04:00
|
|
|
* the working VDI which was supposed to be deleted. So no need to
|
2013-04-25 16:49:39 +04:00
|
|
|
* bail out on failure.
|
|
|
|
*/
|
|
|
|
deleted = sd_delete(s);
|
2014-05-16 13:00:22 +04:00
|
|
|
ret = do_sd_create(s, &vid, !deleted, &local_err);
|
2010-06-21 00:01:00 +04:00
|
|
|
if (ret) {
|
2015-02-12 15:55:05 +03:00
|
|
|
error_report_err(local_err);
|
2010-06-21 00:01:00 +04:00
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
2018-12-13 19:27:27 +03:00
|
|
|
trace_sheepdog_create_branch_created(vid);
|
2010-06-21 00:01:00 +04:00
|
|
|
|
2014-05-16 13:00:19 +04:00
|
|
|
fd = connect_to_sdog(s, &local_err);
|
2010-06-21 00:01:00 +04:00
|
|
|
if (fd < 0) {
|
2015-02-12 15:55:05 +03:00
|
|
|
error_report_err(local_err);
|
2012-05-16 22:15:33 +04:00
|
|
|
ret = fd;
|
2010-06-21 00:01:00 +04:00
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
2016-10-27 13:48:58 +03:00
|
|
|
ret = read_object(fd, s->bs, buf, vid_to_vdi_oid(vid),
|
2014-05-08 18:34:52 +04:00
|
|
|
s->inode.nr_copies, SD_INODE_SIZE, 0, s->cache_flags);
|
2010-06-21 00:01:00 +04:00
|
|
|
|
|
|
|
closesocket(fd);
|
|
|
|
|
|
|
|
if (ret < 0) {
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
|
|
|
memcpy(&s->inode, buf, sizeof(s->inode));
|
|
|
|
|
2012-10-06 20:57:14 +04:00
|
|
|
s->is_snapshot = false;
|
2010-06-21 00:01:00 +04:00
|
|
|
ret = 0;
|
2018-12-13 19:27:27 +03:00
|
|
|
trace_sheepdog_create_branch_new(s->inode.vdi_id);
|
2010-06-21 00:01:00 +04:00
|
|
|
|
|
|
|
out:
|
2011-08-21 07:09:37 +04:00
|
|
|
g_free(buf);
|
2010-06-21 00:01:00 +04:00
|
|
|
|
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Send I/O requests to the server.
|
|
|
|
*
|
|
|
|
* This function sends requests to the server, links the requests to
|
2012-06-27 02:26:22 +04:00
|
|
|
* the inflight_list in BDRVSheepdogState, and exits without
|
2010-06-21 00:01:00 +04:00
|
|
|
* waiting for the response. The responses are received in the
|
|
|
|
* `aio_read_response' function which is called from the main loop as
|
|
|
|
* a fd handler.
|
2011-08-12 16:33:15 +04:00
|
|
|
*
|
|
|
|
* This function returns no value. If creating a writable branch
|
|
|
|
* fails, it sets acb->ret to -EIO and returns without sending a request.
|
2010-06-21 00:01:00 +04:00
|
|
|
*/
|
2016-11-29 14:32:43 +03:00
|
|
|
static void coroutine_fn sd_co_rw_vector(SheepdogAIOCB *acb)
|
2010-06-21 00:01:00 +04:00
|
|
|
{
|
|
|
|
int ret = 0;
|
2013-04-23 10:03:34 +04:00
|
|
|
unsigned long len, done = 0, total = acb->nb_sectors * BDRV_SECTOR_SIZE;
|
2015-02-13 12:20:53 +03:00
|
|
|
unsigned long idx;
|
|
|
|
uint32_t object_size;
|
2010-06-21 00:01:00 +04:00
|
|
|
uint64_t oid;
|
2015-02-13 12:20:53 +03:00
|
|
|
uint64_t offset;
|
2016-11-29 14:32:43 +03:00
|
|
|
BDRVSheepdogState *s = acb->s;
|
2010-06-21 00:01:00 +04:00
|
|
|
SheepdogInode *inode = &s->inode;
|
|
|
|
AIOReq *aio_req;
|
|
|
|
|
|
|
|
if (acb->aiocb_type == AIOCB_WRITE_UDATA && s->is_snapshot) {
|
|
|
|
/*
|
|
|
|
* When a snapshot VDI has been opened, Sheepdog creates the
|
|
|
|
* writable VDI on the first write operation.
|
|
|
|
*/
|
|
|
|
ret = sd_create_branch(s);
|
|
|
|
if (ret) {
|
|
|
|
acb->ret = -EIO;
|
2016-11-29 14:32:42 +03:00
|
|
|
return;
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2015-02-13 12:20:53 +03:00
|
|
|
object_size = (UINT32_C(1) << inode->block_size_shift);
|
|
|
|
idx = acb->sector_num * BDRV_SECTOR_SIZE / object_size;
|
|
|
|
offset = (acb->sector_num * BDRV_SECTOR_SIZE) % object_size;
|
|
|
|
|
2012-06-27 02:26:21 +04:00
|
|
|
/*
|
|
|
|
* Make sure we don't free the aiocb before we are done with all requests.
|
|
|
|
* This additional reference is dropped at the end of this function.
|
|
|
|
*/
|
|
|
|
acb->nr_pending++;
|
|
|
|
|
2010-06-21 00:01:00 +04:00
|
|
|
while (done != total) {
|
|
|
|
uint8_t flags = 0;
|
|
|
|
uint64_t old_oid = 0;
|
2012-10-06 20:57:14 +04:00
|
|
|
bool create = false;
|
2010-06-21 00:01:00 +04:00
|
|
|
|
|
|
|
oid = vid_to_data_oid(inode->data_vdi_id[idx], idx);
|
|
|
|
|
2015-02-13 12:20:53 +03:00
|
|
|
len = MIN(total - done, object_size - offset);
|
2010-06-21 00:01:00 +04:00
|
|
|
|
2012-07-10 18:12:27 +04:00
|
|
|
switch (acb->aiocb_type) {
|
|
|
|
case AIOCB_READ_UDATA:
|
|
|
|
if (!inode->data_vdi_id[idx]) {
|
|
|
|
qemu_iovec_memset(acb->qiov, done, 0, len);
|
2010-06-21 00:01:00 +04:00
|
|
|
goto done;
|
|
|
|
}
|
2012-07-10 18:12:27 +04:00
|
|
|
break;
|
|
|
|
case AIOCB_WRITE_UDATA:
|
|
|
|
if (!inode->data_vdi_id[idx]) {
|
2012-10-06 20:57:14 +04:00
|
|
|
create = true;
|
2012-07-10 18:12:27 +04:00
|
|
|
} else if (!is_data_obj_writable(inode, idx)) {
|
|
|
|
/* Copy-On-Write */
|
2012-10-06 20:57:14 +04:00
|
|
|
create = true;
|
2012-07-10 18:12:27 +04:00
|
|
|
old_oid = oid;
|
|
|
|
flags = SD_FLAG_CMD_COW;
|
|
|
|
}
|
|
|
|
break;
|
2013-04-23 10:03:33 +04:00
|
|
|
case AIOCB_DISCARD_OBJ:
|
|
|
|
/*
|
|
|
|
* We discard the object only when the whole object is
|
|
|
|
* 1) allocated 2) trimmed. Otherwise, simply skip it.
|
|
|
|
*/
|
2015-02-13 12:20:53 +03:00
|
|
|
if (len != object_size || inode->data_vdi_id[idx] == 0) {
|
2013-04-23 10:03:33 +04:00
|
|
|
goto done;
|
|
|
|
}
|
|
|
|
break;
|
2012-07-10 18:12:27 +04:00
|
|
|
default:
|
|
|
|
break;
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
|
|
|
|
if (create) {
|
2018-12-13 19:27:27 +03:00
|
|
|
trace_sheepdog_co_rw_vector_update(inode->vdi_id, oid,
|
|
|
|
vid_to_data_oid(inode->data_vdi_id[idx], idx),
|
|
|
|
idx);
|
2010-06-21 00:01:00 +04:00
|
|
|
oid = vid_to_data_oid(inode->vdi_id, idx);
|
2018-12-13 19:27:27 +03:00
|
|
|
trace_sheepdog_co_rw_vector_new(oid);
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
|
2014-06-06 08:35:11 +04:00
|
|
|
aio_req = alloc_aio_req(s, acb, oid, len, offset, flags, create,
|
2015-09-01 06:03:10 +03:00
|
|
|
old_oid,
|
|
|
|
acb->aiocb_type == AIOCB_DISCARD_OBJ ?
|
|
|
|
0 : done);
|
2014-06-06 08:35:11 +04:00
|
|
|
add_aio_request(s, aio_req, acb->qiov->iov, acb->qiov->niov,
|
2013-10-24 11:01:16 +04:00
|
|
|
acb->aiocb_type);
|
2010-06-21 00:01:00 +04:00
|
|
|
done:
|
|
|
|
offset = 0;
|
|
|
|
idx++;
|
|
|
|
done += len;
|
|
|
|
}
|
2016-11-29 14:32:42 +03:00
|
|
|
if (--acb->nr_pending) {
|
|
|
|
qemu_coroutine_yield();
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
}
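
The loop in sd_co_rw_vector() cuts a sector range into pieces that each fall inside one data object. The splitting alone can be sketched as below, assuming 512-byte sectors; struct sketch_chunk and sketch_split_request() are hypothetical names, and the sketch only records the pieces instead of issuing requests:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_SECTOR_SIZE 512u   /* assumed value of BDRV_SECTOR_SIZE */

struct sketch_chunk {
    uint64_t idx;      /* index of the data object */
    uint64_t offset;   /* byte offset inside that object */
    uint64_t len;      /* bytes covered in that object */
};

/*
 * Split [sector_num, sector_num + nb_sectors) into per-object chunks.
 * Stores at most max_chunks entries in out and returns the number of
 * chunks the request needs.
 */
size_t sketch_split_request(uint64_t sector_num, uint64_t nb_sectors,
                            uint8_t block_size_shift,
                            struct sketch_chunk *out, size_t max_chunks)
{
    assert(block_size_shift < 64);

    uint64_t object_size = UINT64_C(1) << block_size_shift;
    uint64_t start = sector_num * SKETCH_SECTOR_SIZE;
    uint64_t total = nb_sectors * SKETCH_SECTOR_SIZE;
    uint64_t idx = start / object_size;
    uint64_t offset = start % object_size;
    uint64_t done = 0;
    size_t n = 0;

    while (done != total) {
        uint64_t room = object_size - offset;
        uint64_t len = total - done < room ? total - done : room;

        if (n < max_chunks) {
            out[n] = (struct sketch_chunk){ idx, offset, len };
        }
        n++;
        offset = 0;   /* every later chunk starts at an object boundary */
        idx++;
        done += len;
    }
    return n;
}
```

Only the first chunk can start at a non-zero offset, which is why the driver resets offset to 0 at its done: label.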
|
|
|
|
|
2016-11-29 14:32:45 +03:00
|
|
|
static void sd_aio_complete(SheepdogAIOCB *acb)
|
2015-07-17 19:44:24 +03:00
|
|
|
{
|
2017-06-29 16:27:48 +03:00
|
|
|
BDRVSheepdogState *s;
|
2016-11-29 14:32:45 +03:00
|
|
|
if (acb->aiocb_type == AIOCB_FLUSH_CACHE) {
|
|
|
|
return;
|
2015-07-17 19:44:24 +03:00
|
|
|
}
|
|
|
|
|
2017-06-29 16:27:48 +03:00
|
|
|
s = acb->s;
|
|
|
|
qemu_co_mutex_lock(&s->queue_lock);
|
2016-11-29 14:32:45 +03:00
|
|
|
QLIST_REMOVE(acb, aiocb_siblings);
|
2017-06-29 16:27:48 +03:00
|
|
|
qemu_co_queue_restart_all(&s->overlapping_queue);
|
|
|
|
qemu_co_mutex_unlock(&s->queue_lock);
|
2015-07-17 19:44:24 +03:00
|
|
|
}
|
|
|
|
|
2011-11-10 12:23:22 +04:00
|
|
|
static coroutine_fn int sd_co_writev(BlockDriverState *bs, int64_t sector_num,
|
2018-04-25 01:01:57 +03:00
|
|
|
int nb_sectors, QEMUIOVector *qiov,
|
|
|
|
int flags)
|
2010-06-21 00:01:00 +04:00
|
|
|
{
|
2016-11-29 14:32:43 +03:00
|
|
|
SheepdogAIOCB acb;
|
2011-08-12 16:33:15 +04:00
|
|
|
int ret;
|
2013-12-13 21:29:28 +04:00
|
|
|
int64_t offset = (sector_num + nb_sectors) * BDRV_SECTOR_SIZE;
|
|
|
|
BDRVSheepdogState *s = bs->opaque;
|
2010-06-21 00:01:00 +04:00
|
|
|
|
2018-04-25 01:01:57 +03:00
|
|
|
assert(!flags);
|
2015-02-05 21:58:24 +03:00
|
|
|
if (offset > s->inode.vdi_size) {
|
2019-09-18 12:51:40 +03:00
|
|
|
ret = sd_co_truncate(bs, offset, false, PREALLOC_MODE_OFF, NULL);
|
2012-05-16 22:15:33 +04:00
|
|
|
if (ret < 0) {
|
|
|
|
return ret;
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2016-11-29 14:32:43 +03:00
|
|
|
sd_aio_setup(&acb, s, qiov, sector_num, nb_sectors, AIOCB_WRITE_UDATA);
|
|
|
|
sd_co_rw_vector(&acb);
|
|
|
|
sd_write_done(&acb);
|
2016-11-29 14:32:45 +03:00
|
|
|
sd_aio_complete(&acb);
|
2011-08-12 16:33:15 +04:00
|
|
|
|
2016-11-29 14:32:43 +03:00
|
|
|
return acb.ret;
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
|
2011-11-10 12:23:22 +04:00
|
|
|
static coroutine_fn int sd_co_readv(BlockDriverState *bs, int64_t sector_num,
|
2011-08-12 16:33:15 +04:00
|
|
|
int nb_sectors, QEMUIOVector *qiov)
|
2010-06-21 00:01:00 +04:00
|
|
|
{
|
2016-11-29 14:32:43 +03:00
|
|
|
SheepdogAIOCB acb;
|
2015-07-17 19:44:24 +03:00
|
|
|
BDRVSheepdogState *s = bs->opaque;
|
2010-06-21 00:01:00 +04:00
|
|
|
|
2016-11-29 14:32:43 +03:00
|
|
|
sd_aio_setup(&acb, s, qiov, sector_num, nb_sectors, AIOCB_READ_UDATA);
|
|
|
|
sd_co_rw_vector(&acb);
|
2016-11-29 14:32:45 +03:00
|
|
|
sd_aio_complete(&acb);
|
2011-08-12 16:33:15 +04:00
|
|
|
|
2016-11-29 14:32:43 +03:00
|
|
|
return acb.ret;
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
|
2012-04-04 00:03:58 +04:00
|
|
|
static int coroutine_fn sd_co_flush_to_disk(BlockDriverState *bs)
|
|
|
|
{
|
|
|
|
BDRVSheepdogState *s = bs->opaque;
|
2016-11-29 14:32:43 +03:00
|
|
|
SheepdogAIOCB acb;
|
2013-01-15 12:28:55 +04:00
|
|
|
AIOReq *aio_req;
|
2012-04-04 00:03:58 +04:00
|
|
|
|
2013-01-10 12:03:47 +04:00
|
|
|
if (s->cache_flags != SD_FLAG_CMD_CACHE) {
|
2012-04-04 00:03:58 +04:00
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2016-11-29 14:32:43 +03:00
|
|
|
sd_aio_setup(&acb, s, NULL, 0, 0, AIOCB_FLUSH_CACHE);
|
2012-04-04 00:03:58 +04:00
|
|
|
|
2016-11-29 14:32:43 +03:00
|
|
|
acb.nr_pending++;
|
|
|
|
aio_req = alloc_aio_req(s, &acb, vid_to_vdi_oid(s->inode.vdi_id),
|
2014-06-06 08:35:11 +04:00
|
|
|
0, 0, 0, false, 0, 0);
|
2016-11-29 14:32:43 +03:00
|
|
|
add_aio_request(s, aio_req, NULL, 0, acb.aiocb_type);
|
2012-04-04 00:03:58 +04:00
|
|
|
|
2016-11-29 14:32:43 +03:00
|
|
|
if (--acb.nr_pending) {
|
2016-11-29 14:32:42 +03:00
|
|
|
qemu_coroutine_yield();
|
|
|
|
}
|
2016-11-29 14:32:45 +03:00
|
|
|
|
|
|
|
sd_aio_complete(&acb);
|
2016-11-29 14:32:43 +03:00
|
|
|
return acb.ret;
|
2012-04-04 00:03:58 +04:00
|
|
|
}
|
|
|
|
|
2010-06-21 00:01:00 +04:00
|
|
|
static int sd_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info)
|
|
|
|
{
|
2014-05-16 13:00:19 +04:00
|
|
|
Error *local_err = NULL;
|
2010-06-21 00:01:00 +04:00
|
|
|
BDRVSheepdogState *s = bs->opaque;
|
|
|
|
int ret, fd;
|
|
|
|
uint32_t new_vid;
|
|
|
|
SheepdogInode *inode;
|
|
|
|
unsigned int datalen;
|
|
|
|
|
2018-12-13 19:27:27 +03:00
|
|
|
trace_sheepdog_snapshot_create_info(sn_info->name, sn_info->id_str, s->name,
|
|
|
|
sn_info->vm_state_size, s->is_snapshot);
|
2010-06-21 00:01:00 +04:00
|
|
|
|
|
|
|
if (s->is_snapshot) {
|
|
|
|
error_report("You can't create a snapshot of a snapshot VDI, "
|
2011-06-22 16:03:54 +04:00
|
|
|
"%s (%" PRIu32 ").", s->name, s->inode.vdi_id);
|
2010-06-21 00:01:00 +04:00
|
|
|
|
|
|
|
return -EINVAL;
|
|
|
|
}
|
|
|
|
|
2018-12-13 19:27:27 +03:00
|
|
|
trace_sheepdog_snapshot_create(sn_info->name, sn_info->id_str);
|
2010-06-21 00:01:00 +04:00
|
|
|
|
|
|
|
s->inode.vm_state_size = sn_info->vm_state_size;
|
|
|
|
s->inode.vm_clock_nsec = sn_info->vm_clock_nsec;
|
2012-10-04 15:09:47 +04:00
|
|
|
/* It appears that inode.tag does not require a NUL terminator,
|
|
|
|
* which means this use of strncpy is ok.
|
|
|
|
*/
|
2010-06-21 00:01:00 +04:00
|
|
|
strncpy(s->inode.tag, sn_info->name, sizeof(s->inode.tag));
|
|
|
|
/* we don't need to update the entire object */
|
2018-05-23 19:07:20 +03:00
|
|
|
datalen = SD_INODE_HEADER_SIZE;
|
2014-05-28 13:17:06 +04:00
|
|
|
inode = g_malloc(datalen);
|
2010-06-21 00:01:00 +04:00
|
|
|
|
|
|
|
/* refresh inode. */
|
2014-05-16 13:00:19 +04:00
|
|
|
fd = connect_to_sdog(s, &local_err);
|
2010-06-21 00:01:00 +04:00
|
|
|
if (fd < 0) {
|
2015-02-12 15:55:05 +03:00
|
|
|
error_report_err(local_err);
|
2012-05-16 22:15:33 +04:00
|
|
|
ret = fd;
|
2010-06-21 00:01:00 +04:00
|
|
|
goto cleanup;
|
|
|
|
}
|
|
|
|
|
2016-10-27 13:48:58 +03:00
|
|
|
ret = write_object(fd, s->bs, (char *)&s->inode,
|
2014-05-08 18:34:52 +04:00
|
|
|
vid_to_vdi_oid(s->inode.vdi_id), s->inode.nr_copies,
|
|
|
|
datalen, 0, false, s->cache_flags);
|
2010-06-21 00:01:00 +04:00
|
|
|
if (ret < 0) {
|
2011-06-22 16:03:54 +04:00
|
|
|
error_report("failed to write snapshot's inode.");
|
2010-06-21 00:01:00 +04:00
|
|
|
goto cleanup;
|
|
|
|
}
|
|
|
|
|
2014-05-16 13:00:22 +04:00
|
|
|
ret = do_sd_create(s, &new_vid, 1, &local_err);
|
2010-06-21 00:01:00 +04:00
|
|
|
if (ret < 0) {
|
2015-12-18 18:35:14 +03:00
|
|
|
error_reportf_err(local_err,
|
|
|
|
"failed to create inode for snapshot: ");
|
2010-06-21 00:01:00 +04:00
|
|
|
goto cleanup;
|
|
|
|
}
|
|
|
|
|
2016-10-27 13:48:58 +03:00
|
|
|
ret = read_object(fd, s->bs, (char *)inode,
|
2014-05-08 18:34:52 +04:00
|
|
|
vid_to_vdi_oid(new_vid), s->inode.nr_copies, datalen, 0,
|
|
|
|
s->cache_flags);
|
2010-06-21 00:01:00 +04:00
|
|
|
|
|
|
|
if (ret < 0) {
|
2011-06-22 16:03:54 +04:00
|
|
|
error_report("failed to read new inode info. %s", strerror(-ret));
|
2010-06-21 00:01:00 +04:00
|
|
|
goto cleanup;
|
|
|
|
}
|
|
|
|
|
|
|
|
memcpy(&s->inode, inode, datalen);
|
2018-12-13 19:27:27 +03:00
|
|
|
trace_sheepdog_snapshot_create_inode(s->inode.name, s->inode.snap_id,
|
|
|
|
s->inode.vdi_id);
|
2010-06-21 00:01:00 +04:00
|
|
|
|
|
|
|
cleanup:
|
2014-05-28 13:17:06 +04:00
|
|
|
g_free(inode);
|
2010-06-21 00:01:00 +04:00
|
|
|
closesocket(fd);
|
|
|
|
return ret;
|
|
|
|
}
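
The comment in sd_snapshot_create() notes that inode.tag does not need a NUL terminator, so strncpy() with the full field size is acceptable there. The consequence for readers of such a field is that they must bound every access by the field width. A minimal sketch with hypothetical helper names:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy name into a fixed-width field of cap bytes. Shorter names are
 * zero-padded by strncpy(); a name of cap bytes or more fills the
 * field completely and leaves no NUL terminator.
 */
void sketch_set_tag(char *tag, size_t cap, const char *name)
{
    assert(tag != NULL && name != NULL);
    strncpy(tag, name, cap);
}

/* Length of the stored tag, never reading past the field. */
size_t sketch_tag_len(const char *tag, size_t cap)
{
    size_t n = 0;

    while (n < cap && tag[n] != '\0') {
        n++;
    }
    return n;
}
```

A field filled this way can be printed safely with a precision, for example printf("%.*s", (int)sketch_tag_len(tag, cap), tag).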
|
|
|
|
|
2013-04-25 16:49:39 +04:00
|
|
|
/*
|
|
|
|
* We implement the rollback (loadvm) operation to the specified snapshot by
|
|
|
|
* 1) switching to the snapshot,
|
|
|
|
* 2) relying on sd_create_branch to delete the working VDI, and
|
2014-03-24 12:30:17 +04:00
|
|
|
* 3) creating a new working VDI based on the specified snapshot.
|
2013-04-25 16:49:39 +04:00
|
|
|
*/
|
2010-06-21 00:01:00 +04:00
|
|
|
static int sd_snapshot_goto(BlockDriverState *bs, const char *snapshot_id)
|
|
|
|
{
|
|
|
|
BDRVSheepdogState *s = bs->opaque;
|
|
|
|
BDRVSheepdogState *old_s;
|
2013-04-25 20:19:53 +04:00
|
|
|
char tag[SD_MAX_VDI_TAG_LEN];
|
2010-06-21 00:01:00 +04:00
|
|
|
uint32_t snapid = 0;
|
2017-03-06 22:00:39 +03:00
|
|
|
int ret;
|
|
|
|
|
|
|
|
if (!sd_parse_snapid_or_tag(snapshot_id, &snapid, tag)) {
|
|
|
|
return -EINVAL;
|
|
|
|
}
|
2010-06-21 00:01:00 +04:00
|
|
|
|
block: Use g_new() & friends where that makes obvious sense
g_new(T, n) is neater than g_malloc(sizeof(T) * n). It's also safer,
for two reasons. One, it catches multiplication overflowing size_t.
Two, it returns T * rather than void *, which lets the compiler catch
more type errors.
Patch created with Coccinelle, with two manual changes on top:
* Add const to bdrv_iterate_format() to keep the types straight
* Convert the allocation in bdrv_drop_intermediate(), which Coccinelle
inexplicably misses
Coccinelle semantic patch:
@@
type T;
@@
-g_malloc(sizeof(T))
+g_new(T, 1)
@@
type T;
@@
-g_try_malloc(sizeof(T))
+g_try_new(T, 1)
@@
type T;
@@
-g_malloc0(sizeof(T))
+g_new0(T, 1)
@@
type T;
@@
-g_try_malloc0(sizeof(T))
+g_try_new0(T, 1)
@@
type T;
expression n;
@@
-g_malloc(sizeof(T) * (n))
+g_new(T, n)
@@
type T;
expression n;
@@
-g_try_malloc(sizeof(T) * (n))
+g_try_new(T, n)
@@
type T;
expression n;
@@
-g_malloc0(sizeof(T) * (n))
+g_new0(T, n)
@@
type T;
expression n;
@@
-g_try_malloc0(sizeof(T) * (n))
+g_try_new0(T, n)
@@
type T;
expression p, n;
@@
-g_realloc(p, sizeof(T) * (n))
+g_renew(T, p, n)
@@
type T;
expression p, n;
@@
-g_try_realloc(p, sizeof(T) * (n))
+g_try_renew(T, p, n)
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-19 12:31:08 +04:00
|
|
|
old_s = g_new(BDRVSheepdogState, 1);
|
2010-06-21 00:01:00 +04:00
|
|
|
|
|
|
|
memcpy(old_s, s, sizeof(BDRVSheepdogState));
|
|
|
|
|
2013-04-25 20:19:53 +04:00
|
|
|
ret = reload_inode(s, snapid, tag);
|
2010-06-21 00:01:00 +04:00
|
|
|
if (ret) {
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
2013-06-07 21:54:26 +04:00
|
|
|
ret = sd_create_branch(s);
|
|
|
|
if (ret) {
|
2010-06-21 00:01:00 +04:00
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
2011-08-21 07:09:37 +04:00
|
|
|
g_free(old_s);
|
2010-06-21 00:01:00 +04:00
|
|
|
|
|
|
|
return 0;
|
|
|
|
out:
|
|
|
|
/* recover bdrv_sd_state */
|
|
|
|
memcpy(s, old_s, sizeof(BDRVSheepdogState));
|
2011-08-21 07:09:37 +04:00
|
|
|
g_free(old_s);
|
2010-06-21 00:01:00 +04:00
|
|
|
|
2011-06-22 16:03:54 +04:00
|
|
|
error_report("failed to open. recover old bdrv_sd_state.");
|
2010-06-21 00:01:00 +04:00
|
|
|
|
|
|
|
return ret;
|
|
|
|
}
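
sd_snapshot_goto() copies the driver state before changing it and copies it back if any step fails. The same save-and-restore pattern can be sketched with a tiny stand-in structure. All names below are hypothetical, and the sketch keeps the saved copy on the stack, whereas the driver allocates it with g_new():

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical, tiny stand-in for BDRVSheepdogState. */
struct sketch_state {
    unsigned int vdi_id;
    int is_snapshot;
};

/* Hypothetical step that modifies the state and may fail. */
typedef int (*sketch_step_fn)(struct sketch_state *s);

/*
 * Run step on s. If it fails, restore s from a copy taken beforehand
 * and pass the error code through.
 */
int sketch_apply_or_rollback(struct sketch_state *s, sketch_step_fn step)
{
    struct sketch_state saved;
    int ret;

    assert(s != NULL && step != NULL);
    memcpy(&saved, s, sizeof(saved));
    ret = step(s);
    if (ret) {
        memcpy(s, &saved, sizeof(saved));   /* recover the old state */
    }
    return ret;
}

/* Example steps used to exercise the helper. */
int sketch_step_ok(struct sketch_state *s)
{
    s->vdi_id = 42;
    return 0;
}

int sketch_step_fail(struct sketch_state *s)
{
    s->vdi_id = 99;    /* partial modification before the failure */
    return -EIO;
}
```

The copy is shallow, so it restores scalar fields and pointer values only; anything a failed step did to memory behind those pointers is not undone.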
|
|
|
|
|
2015-12-23 15:22:26 +03:00
|
|
|
#define NR_BATCHED_DISCARD 128
|
|
|
|
|
2017-03-06 22:00:36 +03:00
|
|
|
static int remove_objects(BDRVSheepdogState *s, Error **errp)
|
2015-12-23 15:22:26 +03:00
|
|
|
{
|
|
|
|
int fd, i = 0, nr_objs = 0;
|
2017-03-06 22:00:36 +03:00
|
|
|
int ret;
|
2015-12-23 15:22:26 +03:00
|
|
|
SheepdogInode *inode = &s->inode;
|
|
|
|
|
2017-03-06 22:00:36 +03:00
|
|
|
fd = connect_to_sdog(s, errp);
|
2015-12-23 15:22:26 +03:00
|
|
|
if (fd < 0) {
|
2017-03-06 22:00:36 +03:00
|
|
|
return fd;
|
2015-12-23 15:22:26 +03:00
|
|
|
}
|
|
|
|
|
|
|
|
nr_objs = count_data_objs(inode);
|
|
|
|
while (i < nr_objs) {
|
|
|
|
int start_idx, nr_filled_idx;
|
|
|
|
|
|
|
|
while (i < nr_objs && !inode->data_vdi_id[i]) {
|
|
|
|
i++;
|
|
|
|
}
|
|
|
|
start_idx = i;
|
|
|
|
|
|
|
|
nr_filled_idx = 0;
|
|
|
|
while (i < nr_objs && nr_filled_idx < NR_BATCHED_DISCARD) {
|
|
|
|
if (inode->data_vdi_id[i]) {
|
|
|
|
inode->data_vdi_id[i] = 0;
|
|
|
|
nr_filled_idx++;
|
|
|
|
}
|
|
|
|
|
|
|
|
i++;
|
|
|
|
}
|
|
|
|
|
2016-10-27 13:48:58 +03:00
|
|
|
ret = write_object(fd, s->bs,
|
2015-12-23 15:22:26 +03:00
|
|
|
(char *)&inode->data_vdi_id[start_idx],
|
|
|
|
vid_to_vdi_oid(s->inode.vdi_id), inode->nr_copies,
|
|
|
|
(i - start_idx) * sizeof(uint32_t),
|
|
|
|
offsetof(struct SheepdogInode,
|
|
|
|
data_vdi_id[start_idx]),
|
|
|
|
false, s->cache_flags);
|
|
|
|
if (ret < 0) {
|
2017-03-06 22:00:36 +03:00
|
|
|
error_setg(errp, "Failed to discard snapshot inode");
|
2015-12-23 15:22:26 +03:00
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2017-03-06 22:00:36 +03:00
|
|
|
ret = 0;
|
2015-12-23 15:22:26 +03:00
|
|
|
out:
|
|
|
|
closesocket(fd);
|
2017-03-06 22:00:36 +03:00
|
|
|
return ret;
|
2015-12-23 15:22:26 +03:00
|
|
|
}
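
The loop in remove_objects() above can be read as a batching scan: skip empty slots, then clear at most NR_BATCHED_DISCARD filled slots and remember the index window that one write has to cover. The following is a minimal standalone sketch of that scan only, assuming a plain uint32_t array in place of inode->data_vdi_id. The names `struct batch` and `next_batch` are hypothetical and are not part of the driver; no sheepdog I/O is performed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_BATCHED_DISCARD 128

/* Hypothetical helper type: the index window [start, end) of one batch. */
struct batch {
    size_t start;
    size_t end;
};

/*
 * Starting at *pos, skip empty (zero) slots, then zero out at most
 * NR_BATCHED_DISCARD filled slots. Returns the window that was touched
 * and advances *pos past it. An empty window (start == end) means the
 * scan reached the end of the table without finding a filled slot.
 */
static struct batch next_batch(uint32_t *ids, size_t nr, size_t *pos)
{
    size_t i, filled = 0;
    struct batch b;

    assert(ids != NULL && pos != NULL);
    i = *pos;

    /* Skip the leading run of empty slots. */
    while (i < nr && ids[i] == 0) {
        i++;
    }
    b.start = i;

    /* Clear filled slots until the batch is full or the table ends. */
    while (i < nr && filled < NR_BATCHED_DISCARD) {
        if (ids[i] != 0) {
            ids[i] = 0;
            filled++;
        }
        i++;
    }
    b.end = i;
    *pos = i;
    return b;
}
```

A caller would loop on next_batch() until it returns an empty window, writing back `(b.end - b.start) * sizeof(uint32_t)` bytes starting at slot `b.start` for each batch.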
|
|
|
|
|
snapshot: distinguish id and name in snapshot delete
Snapshot creation already distinguishes id and name, since it takes
a structured parameter *sn, but delete cannot. An accurate delete
will be needed later in qmp_transaction abort and blockdev-snapshot-delete-sync,
so change its prototype. Also add *errp to report errors, but keep the return
value so that callers can check what kind of error happened. The existing
callers (savevm, delvm and qemu-img) are not affected, because they now go
through a new function, bdrv_snapshot_delete_by_id_or_name(), which
checks the return value and retries the operation.
Before this patch:
For qcow2, it searches by id first, then by name, to find the snapshot to delete.
For rbd, it searches by name.
For sheepdog, it does nothing.
After this patch:
For qcow2, the logic is the same, because the caller invokes it twice.
For rbd, deleting by id always fails, but the second try still searches by name,
so nothing changes for the user.
Some code for *errp is based on Pavel's patch.
Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-11 10:04:33 +04:00
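
The commit message above describes a wrapper that first tries the string as an id and, if that fails, tries it again as a name. A minimal sketch of that pattern on a toy in-memory table follows. Everything here is an assumption made for illustration: `struct toy_snap`, `toy_delete` and `toy_delete_by_id_or_name` are hypothetical names, and retrying on -ENOENT or -EINVAL is the sketch's choice, not a statement about the real bdrv_snapshot_delete_by_id_or_name().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical snapshot record for the sketch. */
struct toy_snap {
    const char *id;
    const char *name;
    int deleted;
};

/*
 * Delete the first live snapshot that matches every non-NULL criterion.
 * Returns 0 on success, -EINVAL if neither criterion is given,
 * -ENOENT if nothing matches.
 */
static int toy_delete(struct toy_snap *tab, size_t n,
                      const char *id, const char *name)
{
    size_t i;

    assert(tab != NULL || n == 0);
    if (!id && !name) {
        return -EINVAL;
    }
    for (i = 0; i < n; i++) {
        if (tab[i].deleted) {
            continue;
        }
        if (id && strcmp(tab[i].id, id) != 0) {
            continue;
        }
        if (name && strcmp(tab[i].name, name) != 0) {
            continue;
        }
        tab[i].deleted = 1;
        return 0;
    }
    return -ENOENT;
}

/* Try the string as an id first; on failure, try it again as a name. */
static int toy_delete_by_id_or_name(struct toy_snap *tab, size_t n,
                                    const char *id_or_name)
{
    int ret = toy_delete(tab, n, id_or_name, NULL);

    if (ret == -ENOENT || ret == -EINVAL) {
        ret = toy_delete(tab, n, NULL, id_or_name);
    }
    return ret;
}
```

Keeping the strict two-argument delete separate from the wrapper lets callers that know which field they hold avoid the ambiguity of a snapshot whose name looks like another snapshot's id.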
|
|
|
static int sd_snapshot_delete(BlockDriverState *bs,
|
|
|
|
const char *snapshot_id,
|
|
|
|
const char *name,
|
|
|
|
Error **errp)
|
2010-06-21 00:01:00 +04:00
|
|
|
{
|
2017-03-06 22:00:38 +03:00
|
|
|
/*
|
|
|
|
* FIXME should delete the snapshot matching both @snapshot_id and
|
|
|
|
* @name, but @name is not used here
|
|
|
|
*/
|
2016-03-02 19:24:42 +03:00
|
|
|
unsigned long snap_id = 0;
|
2015-12-23 15:22:26 +03:00
|
|
|
char snap_tag[SD_MAX_VDI_TAG_LEN];
|
|
|
|
int fd, ret;
|
|
|
|
char buf[SD_MAX_VDI_LEN + SD_MAX_VDI_TAG_LEN];
|
|
|
|
BDRVSheepdogState *s = bs->opaque;
|
|
|
|
unsigned int wlen = SD_MAX_VDI_LEN + SD_MAX_VDI_TAG_LEN, rlen = 0;
|
|
|
|
uint32_t vid;
|
|
|
|
SheepdogVdiReq hdr = {
|
|
|
|
.opcode = SD_OP_DEL_VDI,
|
|
|
|
.data_length = wlen,
|
|
|
|
.flags = SD_FLAG_CMD_WRITE,
|
|
|
|
};
|
|
|
|
SheepdogVdiRsp *rsp = (SheepdogVdiRsp *)&hdr;
|
|
|
|
|
2017-03-06 22:00:36 +03:00
|
|
|
ret = remove_objects(s, errp);
|
|
|
|
if (ret) {
|
|
|
|
return ret;
|
2015-12-23 15:22:26 +03:00
|
|
|
}
|
|
|
|
|
|
|
|
memset(buf, 0, sizeof(buf));
|
|
|
|
memset(snap_tag, 0, sizeof(snap_tag));
|
|
|
|
pstrcpy(buf, SD_MAX_VDI_LEN, s->name);
|
2017-03-06 22:00:39 +03:00
|
|
|
/* TODO Use sd_parse_snapid() once this mess is cleaned up */
|
2016-03-02 19:24:42 +03:00
|
|
|
ret = qemu_strtoul(snapshot_id, NULL, 10, &snap_id);
|
|
|
|
if (ret || snap_id > UINT32_MAX) {
|
2017-03-06 22:00:38 +03:00
|
|
|
/*
|
|
|
|
* FIXME Since qemu_strtoul() returns -EINVAL when
|
|
|
|
* @snapshot_id is null, @snapshot_id is mandatory. The correct
|
|
|
|
* fix would be to require at least one of @snapshot_id and @name.
|
|
|
|
*/
|
2016-03-02 19:24:42 +03:00
|
|
|
error_setg(errp, "Invalid snapshot ID: %s",
|
|
|
|
snapshot_id ? snapshot_id : "<null>");
|
|
|
|
return -EINVAL;
|
2015-12-23 15:22:26 +03:00
|
|
|
}
|
|
|
|
|
|
|
|
if (snap_id) {
|
2016-03-02 19:24:42 +03:00
|
|
|
hdr.snapid = (uint32_t) snap_id;
|
2015-12-23 15:22:26 +03:00
|
|
|
} else {
|
2017-03-06 22:00:38 +03:00
|
|
|
/* FIXME I suspect we should use @name here */
|
2017-03-06 22:00:39 +03:00
|
|
|
/* FIXME don't truncate silently */
|
2015-12-23 15:22:26 +03:00
|
|
|
pstrcpy(snap_tag, sizeof(snap_tag), snapshot_id);
|
|
|
|
pstrcpy(buf + SD_MAX_VDI_LEN, SD_MAX_VDI_TAG_LEN, snap_tag);
|
|
|
|
}
|
|
|
|
|
2017-03-06 22:00:36 +03:00
|
|
|
ret = find_vdi_name(s, s->name, snap_id, snap_tag, &vid, true, errp);
|
2015-12-23 15:22:26 +03:00
|
|
|
if (ret) {
|
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
2017-03-06 22:00:36 +03:00
|
|
|
fd = connect_to_sdog(s, errp);
|
2015-12-23 15:22:26 +03:00
|
|
|
if (fd < 0) {
|
2017-03-06 22:00:36 +03:00
|
|
|
return fd;
|
2015-12-23 15:22:26 +03:00
|
|
|
}
|
|
|
|
|
2016-10-27 13:48:58 +03:00
|
|
|
ret = do_req(fd, s->bs, (SheepdogReq *)&hdr,
|
2015-12-23 15:22:26 +03:00
|
|
|
buf, &wlen, &rlen);
|
|
|
|
closesocket(fd);
|
|
|
|
if (ret) {
|
2017-03-06 22:00:36 +03:00
|
|
|
error_setg_errno(errp, -ret, "Couldn't send request to server");
|
2015-12-23 15:22:26 +03:00
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
|
|
|
switch (rsp->result) {
|
|
|
|
case SD_RES_NO_VDI:
|
2017-03-06 22:00:36 +03:00
|
|
|
error_setg(errp, "Can't find the snapshot");
|
|
|
|
return -ENOENT;
|
2015-12-23 15:22:26 +03:00
|
|
|
case SD_RES_SUCCESS:
|
|
|
|
break;
|
|
|
|
default:
|
2017-03-06 22:00:36 +03:00
|
|
|
error_setg(errp, "%s", sd_strerror(rsp->result));
|
|
|
|
return -EIO;
|
2015-12-23 15:22:26 +03:00
|
|
|
}
|
|
|
|
|
2017-03-06 22:00:36 +03:00
|
|
|
return 0;
|
2010-06-21 00:01:00 +04:00
|
|
|
}
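
The FIXME comments in sd_snapshot_delete() above turn on how the snapshot_id string is parsed. As a point of comparison, here is a minimal sketch of strict decimal parsing into a uint32_t using only the standard strtoul(). It is not qemu_strtoul() and not sd_parse_snapid(); the name `parse_snap_id_strict` and its error codes are assumptions made for this sketch.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse @s as an unsigned decimal number that fits in 32 bits.
 * Returns 0 and stores the value in *id on success.
 * Returns -EINVAL for NULL, empty, signed, space-prefixed or
 * trailing-garbage input, and -ERANGE for values above UINT32_MAX.
 * *id is left untouched on failure.
 */
static int parse_snap_id_strict(const char *s, uint32_t *id)
{
    char *end;
    unsigned long v;

    assert(id != NULL);

    /* strtoul() would accept whitespace and a sign; reject them here. */
    if (!s || !isdigit((unsigned char)s[0])) {
        return -EINVAL;
    }

    errno = 0;
    v = strtoul(s, &end, 10);
    if (errno == ERANGE || v > UINT32_MAX) {
        return -ERANGE;
    }
    if (*end != '\0') {
        return -EINVAL;
    }

    *id = (uint32_t)v;
    return 0;
}
```

With a parser of this shape, a caller can treat any -EINVAL input as a candidate tag rather than folding "not a number" and "the number zero" into the same branch.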
|
|
|
|
|
|
|
|
static int sd_snapshot_list(BlockDriverState *bs, QEMUSnapshotInfo **psn_tab)
|
|
|
|
{
|
2014-05-16 13:00:19 +04:00
|
|
|
Error *local_err = NULL;
|
2010-06-21 00:01:00 +04:00
|
|
|
BDRVSheepdogState *s = bs->opaque;
|
|
|
|
SheepdogReq req;
|
|
|
|
int fd, nr = 1024, ret, max = BITS_TO_LONGS(SD_NR_VDIS) * sizeof(long);
|
|
|
|
QEMUSnapshotInfo *sn_tab = NULL;
|
|
|
|
unsigned wlen, rlen;
|
|
|
|
int found = 0;
|
2018-05-23 19:07:21 +03:00
|
|
|
SheepdogInode *inode;
|
2010-06-21 00:01:00 +04:00
|
|
|
unsigned long *vdi_inuse;
|
|
|
|
unsigned int start_nr;
|
|
|
|
uint64_t hval;
|
|
|
|
uint32_t vid;
|
|
|
|
|
2011-08-21 07:09:37 +04:00
|
|
|
vdi_inuse = g_malloc(max);
|
2018-05-23 19:07:21 +03:00
|
|
|
inode = g_malloc(SD_INODE_HEADER_SIZE);
|
2010-06-21 00:01:00 +04:00
|
|
|
|
2014-05-16 13:00:19 +04:00
|
|
|
fd = connect_to_sdog(s, &local_err);
|
2010-06-21 00:01:00 +04:00
|
|
|
if (fd < 0) {
|
2015-02-12 15:55:05 +03:00
|
|
|
error_report_err(local_err);
|
2012-05-16 22:15:33 +04:00
|
|
|
ret = fd;
|
2010-06-21 00:01:00 +04:00
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
|
|
|
rlen = max;
|
|
|
|
wlen = 0;
|
|
|
|
|
|
|
|
memset(&req, 0, sizeof(req));
|
|
|
|
|
|
|
|
req.opcode = SD_OP_READ_VDIS;
|
|
|
|
req.data_length = max;
|
|
|
|
|
2016-10-27 13:48:58 +03:00
|
|
|
ret = do_req(fd, s->bs, &req, vdi_inuse, &wlen, &rlen);
|
2010-06-21 00:01:00 +04:00
|
|
|
|
|
|
|
closesocket(fd);
|
|
|
|
if (ret) {
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
2014-08-19 12:31:09 +04:00
|
|
|
sn_tab = g_new0(QEMUSnapshotInfo, nr);
|
2010-06-21 00:01:00 +04:00
|
|
|
|
|
|
|
/* calculate a vdi id with hash function */
|
|
|
|
hval = fnv_64a_buf(s->name, strlen(s->name), FNV1A_64_INIT);
|
|
|
|
start_nr = hval & (SD_NR_VDIS - 1);
|
|
|
|
|
2014-05-16 13:00:19 +04:00
|
|
|
fd = connect_to_sdog(s, &local_err);
|
2010-06-21 00:01:00 +04:00
|
|
|
if (fd < 0) {
|
2015-02-12 15:55:05 +03:00
|
|
|
error_report_err(local_err);
|
2012-05-16 22:15:33 +04:00
|
|
|
ret = fd;
|
2010-06-21 00:01:00 +04:00
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
|
|
|
for (vid = start_nr; found < nr; vid = (vid + 1) % SD_NR_VDIS) {
|
|
|
|
if (!test_bit(vid, vdi_inuse)) {
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* we don't need to read the entire object */
|
2018-05-23 19:07:21 +03:00
|
|
|
ret = read_object(fd, s->bs, (char *)inode,
|
2014-05-08 18:34:52 +04:00
|
|
|
vid_to_vdi_oid(vid),
|
2018-05-23 19:07:20 +03:00
|
|
|
0, SD_INODE_HEADER_SIZE, 0,
|
2013-01-10 12:03:47 +04:00
|
|
|
s->cache_flags);
|
2010-06-21 00:01:00 +04:00
|
|
|
|
|
|
|
if (ret) {
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
|
2018-05-23 19:07:21 +03:00
|
|
|
if (!strcmp(inode->name, s->name) && is_snapshot(inode)) {
|
|
|
|
sn_tab[found].date_sec = inode->snap_ctime >> 32;
|
|
|
|
sn_tab[found].date_nsec = inode->snap_ctime & 0xffffffff;
|
|
|
|
sn_tab[found].vm_state_size = inode->vm_state_size;
|
|
|
|
sn_tab[found].vm_clock_nsec = inode->vm_clock_nsec;
|
2010-06-21 00:01:00 +04:00
|
|
|
|
2014-04-29 21:03:12 +04:00
|
|
|
snprintf(sn_tab[found].id_str, sizeof(sn_tab[found].id_str),
|
2018-05-23 19:07:21 +03:00
|
|
|
"%" PRIu32, inode->snap_id);
|
2012-10-04 15:09:47 +04:00
|
|
|
pstrcpy(sn_tab[found].name,
|
2018-05-23 19:07:21 +03:00
|
|
|
MIN(sizeof(sn_tab[found].name), sizeof(inode->tag)),
|
|
|
|
inode->tag);
|
2010-06-21 00:01:00 +04:00
|
|
|
found++;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
closesocket(fd);
|
|
|
|
out:
|
|
|
|
*psn_tab = sn_tab;
|
|
|
|
|
2011-08-21 07:09:37 +04:00
|
|
|
g_free(vdi_inuse);
|
2018-05-23 19:07:21 +03:00
|
|
|
g_free(inode);
|
2010-06-21 00:01:00 +04:00
|
|
|
|
2012-05-16 22:15:33 +04:00
|
|
|
if (ret < 0) {
|
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
2010-06-21 00:01:00 +04:00
|
|
|
return found;
|
|
|
|
}
|
|
|
|
|
|
|
|
static int do_load_save_vmstate(BDRVSheepdogState *s, uint8_t *data,
|
|
|
|
int64_t pos, int size, int load)
|
|
|
|
{
|
2014-05-16 13:00:19 +04:00
|
|
|
Error *local_err = NULL;
|
2012-10-06 20:57:14 +04:00
|
|
|
bool create;
|
|
|
|
int fd, ret = 0, remaining = size;
|
2010-06-21 00:01:00 +04:00
|
|
|
unsigned int data_len;
|
|
|
|
uint64_t vmstate_oid;
|
|
|
|
uint64_t offset;
|
2013-06-07 21:54:26 +04:00
|
|
|
uint32_t vdi_index;
|
|
|
|
uint32_t vdi_id = load ? s->inode.parent_vdi_id : s->inode.vdi_id;
|
2015-02-13 12:20:53 +03:00
|
|
|
uint32_t object_size = (UINT32_C(1) << s->inode.block_size_shift);
|
2010-06-21 00:01:00 +04:00
|
|
|
|
2014-05-16 13:00:19 +04:00
|
|
|
fd = connect_to_sdog(s, &local_err);
|
2010-06-21 00:01:00 +04:00
|
|
|
if (fd < 0) {
|
2015-02-12 15:55:05 +03:00
|
|
|
error_report_err(local_err);
|
2012-05-16 22:15:33 +04:00
|
|
|
return fd;
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
|
2012-05-29 20:05:15 +04:00
|
|
|
while (remaining) {
|
2015-02-13 12:20:53 +03:00
|
|
|
vdi_index = pos / object_size;
|
|
|
|
offset = pos % object_size;
|
2010-06-21 00:01:00 +04:00
|
|
|
|
2015-02-13 12:20:53 +03:00
|
|
|
data_len = MIN(remaining, object_size - offset);
|
2010-06-21 00:01:00 +04:00
|
|
|
|
2013-06-07 21:54:26 +04:00
|
|
|
vmstate_oid = vid_to_vmstate_oid(vdi_id, vdi_index);
|
2010-06-21 00:01:00 +04:00
|
|
|
|
|
|
|
create = (offset == 0);
|
|
|
|
if (load) {
|
2016-10-27 13:48:58 +03:00
|
|
|
ret = read_object(fd, s->bs, (char *)data, vmstate_oid,
|
2012-04-04 00:03:58 +04:00
|
|
|
s->inode.nr_copies, data_len, offset,
|
2013-01-10 12:03:47 +04:00
|
|
|
s->cache_flags);
|
2010-06-21 00:01:00 +04:00
|
|
|
} else {
|
2016-10-27 13:48:58 +03:00
|
|
|
ret = write_object(fd, s->bs, (char *)data, vmstate_oid,
|
2012-04-04 00:03:58 +04:00
|
|
|
s->inode.nr_copies, data_len, offset, create,
|
2013-01-10 12:03:47 +04:00
|
|
|
s->cache_flags);
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
|
|
|
|
if (ret < 0) {
|
2011-06-22 16:03:54 +04:00
|
|
|
error_report("failed to %s vmstate: %s", load ? "load" : "save",
             strerror(-ret));
|
2010-06-21 00:01:00 +04:00
|
|
|
goto cleanup;
|
|
|
|
}
|
|
|
|
|
|
|
|
pos += data_len;
|
2012-08-29 22:39:45 +04:00
|
|
|
data += data_len;
|
2012-05-29 20:05:15 +04:00
|
|
|
remaining -= data_len;
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
2012-05-29 20:05:15 +04:00
|
|
|
ret = size;
|
2010-06-21 00:01:00 +04:00
|
|
|
cleanup:
|
|
|
|
closesocket(fd);
|
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
2013-04-05 23:27:53 +04:00
|
|
|
static int sd_save_vmstate(BlockDriverState *bs, QEMUIOVector *qiov,
|
|
|
|
int64_t pos)
|
2010-06-21 00:01:00 +04:00
|
|
|
{
|
|
|
|
BDRVSheepdogState *s = bs->opaque;
|
2013-04-05 23:27:53 +04:00
|
|
|
void *buf;
|
|
|
|
int ret;
|
2010-06-21 00:01:00 +04:00
|
|
|
|
2013-04-05 23:27:53 +04:00
|
|
|
buf = qemu_blockalign(bs, qiov->size);
|
|
|
|
qemu_iovec_to_buf(qiov, 0, buf, qiov->size);
|
|
|
|
ret = do_load_save_vmstate(s, (uint8_t *) buf, pos, qiov->size, 0);
|
|
|
|
qemu_vfree(buf);
|
|
|
|
|
|
|
|
return ret;
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
|
2016-06-09 17:50:16 +03:00
|
|
|
static int sd_load_vmstate(BlockDriverState *bs, QEMUIOVector *qiov,
|
|
|
|
int64_t pos)
|
2010-06-21 00:01:00 +04:00
|
|
|
{
|
|
|
|
BDRVSheepdogState *s = bs->opaque;
|
2016-06-09 17:50:16 +03:00
|
|
|
void *buf;
|
|
|
|
int ret;
|
2010-06-21 00:01:00 +04:00
|
|
|
|
2016-06-09 17:50:16 +03:00
|
|
|
buf = qemu_blockalign(bs, qiov->size);
|
|
|
|
ret = do_load_save_vmstate(s, buf, pos, qiov->size, 1);
|
|
|
|
qemu_iovec_from_buf(qiov, 0, buf, qiov->size);
|
|
|
|
qemu_vfree(buf);
|
|
|
|
|
|
|
|
return ret;
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
|
|
|
|
|
2016-07-16 02:23:05 +03:00
|
|
|
static coroutine_fn int sd_co_pdiscard(BlockDriverState *bs, int64_t offset,
|
2017-06-09 13:18:08 +03:00
|
|
|
int bytes)
|
2013-04-23 10:03:33 +04:00
|
|
|
{
|
2016-11-29 14:32:43 +03:00
|
|
|
SheepdogAIOCB acb;
|
2013-04-23 10:03:33 +04:00
|
|
|
BDRVSheepdogState *s = bs->opaque;
|
2015-09-01 06:03:10 +03:00
|
|
|
QEMUIOVector discard_iov;
|
|
|
|
struct iovec iov;
|
|
|
|
uint32_t zero = 0;
|
2013-04-23 10:03:33 +04:00
|
|
|
|
|
|
|
if (!s->discard_supported) {
|
2016-07-16 02:23:05 +03:00
|
|
|
return 0;
|
2013-04-23 10:03:33 +04:00
|
|
|
}
|
|
|
|
|
2015-09-01 06:03:10 +03:00
|
|
|
memset(&discard_iov, 0, sizeof(discard_iov));
|
|
|
|
memset(&iov, 0, sizeof(iov));
|
|
|
|
iov.iov_base = &zero;
|
|
|
|
iov.iov_len = sizeof(zero);
|
|
|
|
discard_iov.iov = &iov;
|
|
|
|
discard_iov.niov = 1;
|
2017-06-09 13:18:08 +03:00
|
|
|
if (!QEMU_IS_ALIGNED(offset | bytes, BDRV_SECTOR_SIZE)) {
|
2016-11-17 23:13:57 +03:00
|
|
|
return -ENOTSUP;
|
|
|
|
}
|
2016-11-29 14:32:43 +03:00
|
|
|
sd_aio_setup(&acb, s, &discard_iov, offset >> BDRV_SECTOR_BITS,
|
2017-06-09 13:18:08 +03:00
|
|
|
bytes >> BDRV_SECTOR_BITS, AIOCB_DISCARD_OBJ);
|
2016-11-29 14:32:43 +03:00
|
|
|
sd_co_rw_vector(&acb);
|
2016-11-29 14:32:45 +03:00
|
|
|
sd_aio_complete(&acb);
|
2013-04-23 10:03:33 +04:00
|
|
|
|
2016-11-29 14:32:43 +03:00
|
|
|
return acb.ret;
|
2013-04-23 10:03:33 +04:00
|
|
|
}
|
|
|
|
|
2018-02-13 23:26:55 +03:00
|
|
|
static coroutine_fn int
|
|
|
|
sd_co_block_status(BlockDriverState *bs, bool want_zero, int64_t offset,
|
|
|
|
int64_t bytes, int64_t *pnum, int64_t *map,
|
|
|
|
BlockDriverState **file)
|
2013-04-23 10:03:35 +04:00
|
|
|
{
|
|
|
|
BDRVSheepdogState *s = bs->opaque;
|
|
|
|
SheepdogInode *inode = &s->inode;
|
2015-02-13 12:20:53 +03:00
|
|
|
uint32_t object_size = (UINT32_C(1) << inode->block_size_shift);
|
|
|
|
unsigned long start = offset / object_size,
|
2018-02-13 23:26:55 +03:00
|
|
|
end = DIV_ROUND_UP(offset + bytes, object_size);
|
2013-04-23 10:03:35 +04:00
|
|
|
unsigned long idx;
|
2018-02-13 23:26:55 +03:00
|
|
|
int ret = BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID;
|
|
|
|
*map = offset;
|
2013-04-23 10:03:35 +04:00
|
|
|
|
|
|
|
for (idx = start; idx < end; idx++) {
|
|
|
|
if (inode->data_vdi_id[idx] == 0) {
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
if (idx == start) {
|
|
|
|
/* Find the length of the leading run of unallocated objects */
|
|
|
|
ret = 0;
|
|
|
|
for (idx = start + 1; idx < end; idx++) {
|
|
|
|
if (inode->data_vdi_id[idx] != 0) {
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2018-02-13 23:26:55 +03:00
|
|
|
*pnum = (idx - start) * object_size;
|
|
|
|
if (*pnum > bytes) {
|
|
|
|
*pnum = bytes;
|
2013-04-23 10:03:35 +04:00
|
|
|
}
|
2016-01-26 06:58:55 +03:00
|
|
|
if (ret > 0 && ret & BDRV_BLOCK_OFFSET_VALID) {
|
|
|
|
*file = bs;
|
|
|
|
}
|
2013-04-23 10:03:35 +04:00
|
|
|
return ret;
|
|
|
|
}
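
sd_co_block_status() above reports the status of the first object in the range and how far that same status extends. A minimal standalone sketch of the run-length scan follows, assuming a plain uint32_t array in place of inode->data_vdi_id and a 1/0 return value in place of the BDRV_BLOCK_* flags. The name `alloc_run` is hypothetical. The sketch keeps the same arithmetic as the function above: the run is counted in whole objects from the start object and then clamped to the requested byte count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Report whether the object containing @offset is allocated (1) or
 * not (0), and store in *pnum how many bytes share that status,
 * clamped to @bytes. ids[] must cover every object touched by
 * [offset, offset + bytes).
 */
static int alloc_run(const uint32_t *ids, uint64_t object_size,
                     uint64_t offset, uint64_t bytes, uint64_t *pnum)
{
    size_t start, end, idx;
    int allocated;

    assert(ids != NULL && pnum != NULL && object_size > 0 && bytes > 0);

    start = (size_t)(offset / object_size);
    /* Round up so that a partially covered last object is included. */
    end = (size_t)((offset + bytes + object_size - 1) / object_size);
    allocated = ids[start] != 0;

    /* Walk forward while the allocation status stays the same. */
    for (idx = start; idx < end; idx++) {
        if ((ids[idx] != 0) != allocated) {
            break;
        }
    }

    *pnum = (uint64_t)(idx - start) * object_size;
    if (*pnum > bytes) {
        *pnum = bytes;
    }
    return allocated;
}
```

A single loop covers both cases here because the status of the first object decides which kind of run is being measured.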
|
|
|
|
|
2013-08-07 12:59:53 +04:00
|
|
|
static int64_t sd_get_allocated_file_size(BlockDriverState *bs)
|
|
|
|
{
|
|
|
|
BDRVSheepdogState *s = bs->opaque;
|
|
|
|
SheepdogInode *inode = &s->inode;
|
2015-02-13 12:20:53 +03:00
|
|
|
uint32_t object_size = (UINT32_C(1) << inode->block_size_shift);
|
|
|
|
unsigned long i, last = DIV_ROUND_UP(inode->vdi_size, object_size);
|
2013-08-07 12:59:53 +04:00
|
|
|
uint64_t size = 0;
|
|
|
|
|
|
|
|
for (i = 0; i < last; i++) {
|
|
|
|
if (inode->data_vdi_id[i] == 0) {
|
|
|
|
continue;
|
|
|
|
}
|
2015-02-13 12:20:53 +03:00
|
|
|
size += object_size;
|
2013-08-07 12:59:53 +04:00
|
|
|
}
|
|
|
|
return size;
|
|
|
|
}
|
|
|
|
|
2014-06-05 13:21:05 +04:00
|
|
|
static QemuOptsList sd_create_opts = {
|
|
|
|
.name = "sheepdog-create-opts",
|
|
|
|
.head = QTAILQ_HEAD_INITIALIZER(sd_create_opts.head),
|
|
|
|
.desc = {
|
|
|
|
{
|
|
|
|
.name = BLOCK_OPT_SIZE,
|
|
|
|
.type = QEMU_OPT_SIZE,
|
|
|
|
.help = "Virtual disk size"
|
|
|
|
},
|
|
|
|
{
|
|
|
|
.name = BLOCK_OPT_BACKING_FILE,
|
|
|
|
.type = QEMU_OPT_STRING,
|
|
|
|
.help = "File name of a base image"
|
|
|
|
},
|
|
|
|
{
|
|
|
|
.name = BLOCK_OPT_PREALLOC,
|
|
|
|
.type = QEMU_OPT_STRING,
|
|
|
|
.help = "Preallocation mode (allowed values: off, full)"
|
|
|
|
},
|
|
|
|
{
|
|
|
|
.name = BLOCK_OPT_REDUNDANCY,
|
|
|
|
.type = QEMU_OPT_STRING,
|
|
|
|
.help = "Redundancy of the image"
|
|
|
|
},
|
2015-02-13 12:20:53 +03:00
|
|
|
{
|
|
|
|
.name = BLOCK_OPT_OBJECT_SIZE,
|
|
|
|
.type = QEMU_OPT_SIZE,
|
|
|
|
.help = "Object size of the image"
|
|
|
|
},
|
2014-06-05 13:21:05 +04:00
|
|
|
{ /* end of list */ }
|
|
|
|
}
|
2010-06-21 00:01:00 +04:00
|
|
|
};
|
|
|
|
|
2019-02-01 22:29:25 +03:00
|
|
|
static const char *const sd_strong_runtime_opts[] = {
|
|
|
|
"vdi",
|
|
|
|
"snap-id",
|
|
|
|
"tag",
|
|
|
|
"server.",
|
|
|
|
|
|
|
|
NULL
|
|
|
|
};
|
|
|
|
|
2013-02-22 07:39:51 +04:00
|
|
|
static BlockDriver bdrv_sheepdog = {
|
2017-11-08 01:27:21 +03:00
|
|
|
.format_name = "sheepdog",
|
|
|
|
.protocol_name = "sheepdog",
|
|
|
|
.instance_size = sizeof(BDRVSheepdogState),
|
|
|
|
.bdrv_parse_filename = sd_parse_filename,
|
|
|
|
.bdrv_file_open = sd_open,
|
|
|
|
.bdrv_reopen_prepare = sd_reopen_prepare,
|
|
|
|
.bdrv_reopen_commit = sd_reopen_commit,
|
|
|
|
.bdrv_reopen_abort = sd_reopen_abort,
|
|
|
|
.bdrv_close = sd_close,
|
2018-02-01 15:20:44 +03:00
|
|
|
.bdrv_co_create = sd_co_create,
|
2018-01-18 15:43:45 +03:00
|
|
|
.bdrv_co_create_opts = sd_co_create_opts,
|
2017-11-08 01:27:21 +03:00
|
|
|
.bdrv_has_zero_init = bdrv_has_zero_init_1,
|
2019-07-24 20:12:32 +03:00
|
|
|
.bdrv_has_zero_init_truncate = bdrv_has_zero_init_1,
|
2017-11-08 01:27:21 +03:00
|
|
|
.bdrv_getlength = sd_getlength,
|
2013-08-07 12:59:53 +04:00
|
|
|
.bdrv_get_allocated_file_size = sd_get_allocated_file_size,
|
block: Convert .bdrv_truncate callback to coroutine_fn
bdrv_truncate() is an operation that can block (even for a quite long
time, depending on the PreallocMode) in I/O paths that shouldn't block.
Convert it to a coroutine_fn so that we have the infrastructure for
drivers to make their .bdrv_co_truncate implementation asynchronous.
This change could potentially introduce new race conditions because
bdrv_truncate() isn't necessarily executed atomically any more. Whether
this is a problem needs to be evaluated for each block driver that
supports truncate:
* file-posix/win32, gluster, iscsi, nfs, rbd, ssh, sheepdog: The
protocol drivers are trivially safe because they don't actually yield
yet, so there is no change in behaviour.
* copy-on-read, crypto, raw-format: Essentially just filter drivers that
pass the request to a child node, no problem.
* qcow2: The implementation modifies metadata, so it needs to hold
s->lock to be safe with concurrent I/O requests. In order to avoid
double locking, this requires pulling the locking out into
preallocate_co() and using qcow2_write_caches() instead of
bdrv_flush().
* qed: Does a single header update, this is fine without locking.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2018-06-21 18:54:35 +03:00
|
|
|
.bdrv_co_truncate = sd_co_truncate,
|
2010-06-21 00:01:00 +04:00
|
|
|
|
2017-11-08 01:27:21 +03:00
|
|
|
.bdrv_co_readv = sd_co_readv,
|
|
|
|
.bdrv_co_writev = sd_co_writev,
|
|
|
|
.bdrv_co_flush_to_disk = sd_co_flush_to_disk,
|
|
|
|
.bdrv_co_pdiscard = sd_co_pdiscard,
|
2018-02-13 23:26:55 +03:00
|
|
|
.bdrv_co_block_status = sd_co_block_status,
|
2010-06-21 00:01:00 +04:00
|
|
|
|
2017-11-08 01:27:21 +03:00
|
|
|
.bdrv_snapshot_create = sd_snapshot_create,
|
|
|
|
.bdrv_snapshot_goto = sd_snapshot_goto,
|
|
|
|
.bdrv_snapshot_delete = sd_snapshot_delete,
|
|
|
|
.bdrv_snapshot_list = sd_snapshot_list,
|
2010-06-21 00:01:00 +04:00
|
|
|
|
2017-11-08 01:27:21 +03:00
|
|
|
.bdrv_save_vmstate = sd_save_vmstate,
|
|
|
|
.bdrv_load_vmstate = sd_load_vmstate,
|
2010-06-21 00:01:00 +04:00
|
|
|
|
2017-11-08 01:27:21 +03:00
|
|
|
.bdrv_detach_aio_context = sd_detach_aio_context,
|
|
|
|
.bdrv_attach_aio_context = sd_attach_aio_context,
|
2014-05-08 18:34:52 +04:00
|
|
|
|
2017-11-08 01:27:21 +03:00
|
|
|
.create_opts = &sd_create_opts,
|
2019-02-01 22:29:25 +03:00
|
|
|
.strong_runtime_opts = sd_strong_runtime_opts,
|
2010-06-21 00:01:00 +04:00
|
|
|
};
|
|
|
|
|
2013-02-22 07:39:51 +04:00
|
|
|
static BlockDriver bdrv_sheepdog_tcp = {
|
2017-11-08 01:27:21 +03:00
|
|
|
.format_name = "sheepdog",
|
|
|
|
.protocol_name = "sheepdog+tcp",
|
|
|
|
.instance_size = sizeof(BDRVSheepdogState),
|
|
|
|
.bdrv_parse_filename = sd_parse_filename,
|
|
|
|
.bdrv_file_open = sd_open,
|
|
|
|
.bdrv_reopen_prepare = sd_reopen_prepare,
|
|
|
|
.bdrv_reopen_commit = sd_reopen_commit,
|
|
|
|
.bdrv_reopen_abort = sd_reopen_abort,
|
|
|
|
.bdrv_close = sd_close,
|
2018-02-01 15:20:44 +03:00
|
|
|
.bdrv_co_create = sd_co_create,
|
2018-01-18 15:43:45 +03:00
|
|
|
.bdrv_co_create_opts = sd_co_create_opts,
|
2017-11-08 01:27:21 +03:00
|
|
|
.bdrv_has_zero_init = bdrv_has_zero_init_1,
|
|
|
|
.bdrv_getlength = sd_getlength,
|
2013-08-07 12:59:53 +04:00
|
|
|
.bdrv_get_allocated_file_size = sd_get_allocated_file_size,
|
2018-06-21 18:54:35 +03:00
|
|
|
.bdrv_co_truncate = sd_co_truncate,
|
2013-02-22 07:39:51 +04:00
|
|
|
|
2017-11-08 01:27:21 +03:00
|
|
|
.bdrv_co_readv = sd_co_readv,
|
|
|
|
.bdrv_co_writev = sd_co_writev,
|
|
|
|
.bdrv_co_flush_to_disk = sd_co_flush_to_disk,
|
|
|
|
.bdrv_co_pdiscard = sd_co_pdiscard,
|
2018-02-13 23:26:55 +03:00
|
|
|
.bdrv_co_block_status = sd_co_block_status,
|
2013-02-22 07:39:51 +04:00
|
|
|
|
2017-11-08 01:27:21 +03:00
|
|
|
.bdrv_snapshot_create = sd_snapshot_create,
|
|
|
|
.bdrv_snapshot_goto = sd_snapshot_goto,
|
|
|
|
.bdrv_snapshot_delete = sd_snapshot_delete,
|
|
|
|
.bdrv_snapshot_list = sd_snapshot_list,
|
2013-02-22 07:39:51 +04:00
|
|
|
|
2017-11-08 01:27:21 +03:00
|
|
|
.bdrv_save_vmstate = sd_save_vmstate,
|
|
|
|
.bdrv_load_vmstate = sd_load_vmstate,
|
2013-02-22 07:39:51 +04:00
|
|
|
|
2017-11-08 01:27:21 +03:00
|
|
|
.bdrv_detach_aio_context = sd_detach_aio_context,
|
|
|
|
.bdrv_attach_aio_context = sd_attach_aio_context,
|
2014-05-08 18:34:52 +04:00
|
|
|
|
2017-11-08 01:27:21 +03:00
|
|
|
.create_opts = &sd_create_opts,
|
2019-02-01 22:29:25 +03:00
|
|
|
.strong_runtime_opts = sd_strong_runtime_opts,
|
2013-02-22 07:39:51 +04:00
|
|
|
};
|
|
|
|
|
2013-02-22 07:39:53 +04:00
|
|
|
static BlockDriver bdrv_sheepdog_unix = {
|
2017-11-08 01:27:21 +03:00
|
|
|
.format_name = "sheepdog",
|
|
|
|
.protocol_name = "sheepdog+unix",
|
|
|
|
.instance_size = sizeof(BDRVSheepdogState),
|
|
|
|
.bdrv_parse_filename = sd_parse_filename,
|
|
|
|
.bdrv_file_open = sd_open,
|
|
|
|
.bdrv_reopen_prepare = sd_reopen_prepare,
|
|
|
|
.bdrv_reopen_commit = sd_reopen_commit,
|
|
|
|
.bdrv_reopen_abort = sd_reopen_abort,
|
|
|
|
.bdrv_close = sd_close,
|
2018-02-01 15:20:44 +03:00
|
|
|
.bdrv_co_create = sd_co_create,
|
2018-01-18 15:43:45 +03:00
|
|
|
.bdrv_co_create_opts = sd_co_create_opts,
|
2017-11-08 01:27:21 +03:00
|
|
|
.bdrv_has_zero_init = bdrv_has_zero_init_1,
|
|
|
|
.bdrv_getlength = sd_getlength,
|
2013-08-07 12:59:53 +04:00
|
|
|
.bdrv_get_allocated_file_size = sd_get_allocated_file_size,
|
2018-06-21 18:54:35 +03:00
|
|
|
.bdrv_co_truncate = sd_co_truncate,
|
2013-02-22 07:39:53 +04:00
|
|
|
|
2017-11-08 01:27:21 +03:00
|
|
|
.bdrv_co_readv = sd_co_readv,
|
|
|
|
.bdrv_co_writev = sd_co_writev,
|
|
|
|
.bdrv_co_flush_to_disk = sd_co_flush_to_disk,
|
|
|
|
.bdrv_co_pdiscard = sd_co_pdiscard,
|
2018-02-13 23:26:55 +03:00
|
|
|
.bdrv_co_block_status = sd_co_block_status,
|
2013-02-22 07:39:53 +04:00
|
|
|
|
2017-11-08 01:27:21 +03:00
|
|
|
.bdrv_snapshot_create = sd_snapshot_create,
|
|
|
|
.bdrv_snapshot_goto = sd_snapshot_goto,
|
|
|
|
.bdrv_snapshot_delete = sd_snapshot_delete,
|
|
|
|
.bdrv_snapshot_list = sd_snapshot_list,
|
2013-02-22 07:39:53 +04:00
|
|
|
|
2017-11-08 01:27:21 +03:00
|
|
|
.bdrv_save_vmstate = sd_save_vmstate,
|
|
|
|
.bdrv_load_vmstate = sd_load_vmstate,
|
2013-02-22 07:39:53 +04:00
|
|
|
|
2017-11-08 01:27:21 +03:00
|
|
|
.bdrv_detach_aio_context = sd_detach_aio_context,
|
|
|
|
.bdrv_attach_aio_context = sd_attach_aio_context,
|
2014-05-08 18:34:52 +04:00
|
|
|
|
2017-11-08 01:27:21 +03:00
|
|
|
.create_opts = &sd_create_opts,
|
2019-02-01 22:29:25 +03:00
|
|
|
.strong_runtime_opts = sd_strong_runtime_opts,
|
2013-02-22 07:39:53 +04:00
|
|
|
};
|
|
|
|
|
2010-06-21 00:01:00 +04:00
|
|
|
static void bdrv_sheepdog_init(void)
|
|
|
|
{
|
|
|
|
bdrv_register(&bdrv_sheepdog);
|
2013-02-22 07:39:51 +04:00
|
|
|
bdrv_register(&bdrv_sheepdog_tcp);
|
2013-02-22 07:39:53 +04:00
|
|
|
bdrv_register(&bdrv_sheepdog_unix);
|
2010-06-21 00:01:00 +04:00
|
|
|
}
|
|
|
|
block_init(bdrv_sheepdog_init);
|