/*
 * QEMU NVM Express Controller
 *
 * Copyright (c) 2012, Intel Corporation
 *
 * Written by Keith Busch <keith.busch@intel.com>
 *
 * This code is licensed under the GNU GPL v2 or later.
 */

/**
 * Reference Specs: http://www.nvmexpress.org, 1.4, 1.3, 1.2, 1.1, 1.0e
 *
 *  https://nvmexpress.org/developers/nvme-specification/
 */

/**
 * Usage: add options:
 *      -drive file=<file>,if=none,id=<drive_id>
 *      -device nvme,serial=<serial>,id=<bus_name>, \
 *              cmb_size_mb=<cmb_size_mb[optional]>, \
 *              [pmrdev=<mem_backend_file_id>,] \
 *              max_ioqpairs=<N[optional]>, \
 *              aerl=<N[optional]>, aer_max_queued=<N[optional]>, \
 *              mdts=<N[optional]>,zoned.append_size_limit=<N[optional]> \
 *      -device nvme-ns,drive=<drive_id>,bus=<bus_name>,nsid=<nsid>,\
 *              zoned=<true|false[optional]>
 *
 * Note that cmb_size_mb denotes the size of the CMB in MB. The CMB is assumed
 * to be at offset 0 in BAR2 and supports only WDS, RDS and SQS for now. By
 * default, the device will use the "v1.4 CMB scheme" - use the `legacy-cmb`
 * parameter to always enable the CMBLOC and CMBSZ registers (v1.3 behavior).
 *
 * PMR emulation can be enabled by pointing to a memory-backend-file.
 * For example:
 * -object memory-backend-file,id=<mem_id>,share=on,mem-path=<file_path>, \
 *  size=<size> .... -device nvme,...,pmrdev=<mem_id>
 *
 * The PMR will use BAR 4/5 exclusively.
 *
 *
 * nvme device parameters
 * ~~~~~~~~~~~~~~~~~~~~~~
 * - `aerl`
 *   The Asynchronous Event Request Limit (AERL). Indicates the maximum number
 *   of concurrently outstanding Asynchronous Event Request commands supported
 *   by the controller. This is a 0's based value.
 *
 * - `aer_max_queued`
 *   This is the maximum number of events that the device will enqueue for
 *   completion when there are no outstanding AERs. When the maximum number of
 *   enqueued events is reached, subsequent events will be dropped.
 *
 * - `zoned.append_size_limit`
 *   The maximum I/O size in bytes that is allowed in a Zone Append command.
 *   The default is 128KiB. Since internally this value is maintained as
 *   ZASL = log2(<maximum append size> / <page size>), some values assigned
 *   to this property may be rounded down and result in a lower maximum ZA
 *   data size being in effect. By setting this property to 0, users can make
 *   ZASL equal to MDTS. This property only affects zoned namespaces.
 *
 * Setting `zoned` to true selects the Zoned Command Set for the namespace.
 * In this case, the following namespace properties are available to configure
 * zoned operation:
 *     zoned.zone_size=<zone size in bytes, default: 128MiB>
 *         The number may be followed by K, M, G as in kilo-, mega- or giga-.
 *
 *     zoned.zone_capacity=<zone capacity in bytes, default: zone size>
 *         The value 0 (default) forces zone capacity to be the same as zone
 *         size. The value of this property may not exceed zone size.
 *
 *     zoned.descr_ext_size=<zone descriptor extension size, default 0>
 *         This value needs to be specified in 64B units. If it is zero,
 *         namespace(s) will not support zone descriptor extensions.
 *
 *     zoned.max_active=<Maximum Active Resources (zones), default: 0>
 *         The default value means there is no limit to the number of
 *         concurrently active zones.
 *
 *     zoned.max_open=<Maximum Open Resources (zones), default: 0>
 *         The default value means there is no limit to the number of
 *         concurrently open zones.
 *
 *     zoned.cross_zone_read=<enable RAZB, default: false>
 *         Setting this property to true enables Read Across Zone Boundaries.
 */
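
/*
 * Illustration only (not part of the device model): a minimal sketch of the
 * ZASL rounding described for `zoned.append_size_limit`, assuming a
 * power-of-two page size. The helper names zasl_from_append_size() and
 * zasl_to_bytes() are hypothetical, and the special value 0 (ZASL equal to
 * MDTS) is deliberately not handled here.
 */

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: ZASL = floor(log2(append_size / page_size)). */
static inline uint8_t zasl_from_append_size(uint32_t append_size,
                                            uint32_t page_size)
{
    uint32_t pages;
    uint8_t zasl = 0;

    /* The sketch only covers limits of at least one page. */
    assert(page_size != 0 && append_size >= page_size);

    pages = append_size / page_size;
    while ((pages >>= 1) != 0) {
        zasl++;
    }

    return zasl;
}

/* Hypothetical helper: the append size limit that a given ZASL implies. */
static inline uint64_t zasl_to_bytes(uint8_t zasl, uint32_t page_size)
{
    return (uint64_t)page_size << zasl;
}
```

/*
 * With 4 KiB pages, the default of 128 KiB maps to ZASL 5 and round-trips
 * exactly, whereas a limit of 100 KiB maps to ZASL 4, so the limit in effect
 * drops to 64 KiB.
 */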

#include "qemu/osdep.h"
#include "qemu/units.h"
#include "qemu/error-report.h"
#include "hw/block/block.h"
#include "hw/pci/msix.h"
#include "hw/pci/pci.h"
#include "hw/qdev-properties.h"
#include "migration/vmstate.h"
#include "sysemu/sysemu.h"
#include "qapi/error.h"
#include "qapi/visitor.h"
#include "sysemu/hostmem.h"
#include "sysemu/block-backend.h"
#include "exec/memory.h"
#include "qemu/log.h"
#include "qemu/module.h"
#include "qemu/cutils.h"
#include "trace.h"
#include "nvme.h"
#include "nvme-ns.h"

#define NVME_MAX_IOQPAIRS 0xffff
#define NVME_DB_SIZE 4
#define NVME_SPEC_VER 0x00010400
#define NVME_CMB_BIR 2
#define NVME_PMR_BIR 4
#define NVME_TEMPERATURE 0x143
#define NVME_TEMPERATURE_WARNING 0x157
#define NVME_TEMPERATURE_CRITICAL 0x175
#define NVME_NUM_FW_SLOTS 1

#define NVME_GUEST_ERR(trace, fmt, ...) \
    do { \
        (trace_##trace)(__VA_ARGS__); \
        qemu_log_mask(LOG_GUEST_ERROR, #trace \
            " in %s: " fmt "\n", __func__, ## __VA_ARGS__); \
    } while (0)

static const bool nvme_feature_support[NVME_FID_MAX] = {
    [NVME_ARBITRATION] = true,
    [NVME_POWER_MANAGEMENT] = true,
    [NVME_TEMPERATURE_THRESHOLD] = true,
    [NVME_ERROR_RECOVERY] = true,
    [NVME_VOLATILE_WRITE_CACHE] = true,
    [NVME_NUMBER_OF_QUEUES] = true,
    [NVME_INTERRUPT_COALESCING] = true,
    [NVME_INTERRUPT_VECTOR_CONF] = true,
    [NVME_WRITE_ATOMICITY] = true,
    [NVME_ASYNCHRONOUS_EVENT_CONF] = true,
    [NVME_TIMESTAMP] = true,
};

static const uint32_t nvme_feature_cap[NVME_FID_MAX] = {
    [NVME_TEMPERATURE_THRESHOLD] = NVME_FEAT_CAP_CHANGE,
    [NVME_ERROR_RECOVERY] = NVME_FEAT_CAP_CHANGE | NVME_FEAT_CAP_NS,
    [NVME_VOLATILE_WRITE_CACHE] = NVME_FEAT_CAP_CHANGE,
    [NVME_NUMBER_OF_QUEUES] = NVME_FEAT_CAP_CHANGE,
    [NVME_ASYNCHRONOUS_EVENT_CONF] = NVME_FEAT_CAP_CHANGE,
    [NVME_TIMESTAMP] = NVME_FEAT_CAP_CHANGE,
};

static const uint32_t nvme_cse_acs[256] = {
    [NVME_ADM_CMD_DELETE_SQ] = NVME_CMD_EFF_CSUPP,
    [NVME_ADM_CMD_CREATE_SQ] = NVME_CMD_EFF_CSUPP,
    [NVME_ADM_CMD_GET_LOG_PAGE] = NVME_CMD_EFF_CSUPP,
    [NVME_ADM_CMD_DELETE_CQ] = NVME_CMD_EFF_CSUPP,
    [NVME_ADM_CMD_CREATE_CQ] = NVME_CMD_EFF_CSUPP,
    [NVME_ADM_CMD_IDENTIFY] = NVME_CMD_EFF_CSUPP,
    [NVME_ADM_CMD_ABORT] = NVME_CMD_EFF_CSUPP,
    [NVME_ADM_CMD_SET_FEATURES] = NVME_CMD_EFF_CSUPP,
    [NVME_ADM_CMD_GET_FEATURES] = NVME_CMD_EFF_CSUPP,
    [NVME_ADM_CMD_ASYNC_EV_REQ] = NVME_CMD_EFF_CSUPP,
};

static const uint32_t nvme_cse_iocs_none[256];

static const uint32_t nvme_cse_iocs_nvm[256] = {
    [NVME_CMD_FLUSH] = NVME_CMD_EFF_CSUPP | NVME_CMD_EFF_LBCC,
    [NVME_CMD_WRITE_ZEROES] = NVME_CMD_EFF_CSUPP | NVME_CMD_EFF_LBCC,
    [NVME_CMD_WRITE] = NVME_CMD_EFF_CSUPP | NVME_CMD_EFF_LBCC,
    [NVME_CMD_READ] = NVME_CMD_EFF_CSUPP,
    [NVME_CMD_DSM] = NVME_CMD_EFF_CSUPP | NVME_CMD_EFF_LBCC,
    [NVME_CMD_COMPARE] = NVME_CMD_EFF_CSUPP,
};

static const uint32_t nvme_cse_iocs_zoned[256] = {
    [NVME_CMD_FLUSH] = NVME_CMD_EFF_CSUPP | NVME_CMD_EFF_LBCC,
    [NVME_CMD_WRITE_ZEROES] = NVME_CMD_EFF_CSUPP | NVME_CMD_EFF_LBCC,
    [NVME_CMD_WRITE] = NVME_CMD_EFF_CSUPP | NVME_CMD_EFF_LBCC,
    [NVME_CMD_READ] = NVME_CMD_EFF_CSUPP,
    [NVME_CMD_DSM] = NVME_CMD_EFF_CSUPP | NVME_CMD_EFF_LBCC,
    [NVME_CMD_COMPARE] = NVME_CMD_EFF_CSUPP,
    [NVME_CMD_ZONE_APPEND] = NVME_CMD_EFF_CSUPP | NVME_CMD_EFF_LBCC,
    [NVME_CMD_ZONE_MGMT_SEND] = NVME_CMD_EFF_CSUPP | NVME_CMD_EFF_LBCC,
    [NVME_CMD_ZONE_MGMT_RECV] = NVME_CMD_EFF_CSUPP,
};

static void nvme_process_sq(void *opaque);

static uint16_t nvme_cid(NvmeRequest *req)
{
    if (!req) {
        return 0xffff;
    }

    return le16_to_cpu(req->cqe.cid);
}

static uint16_t nvme_sqid(NvmeRequest *req)
{
    return le16_to_cpu(req->sq->sqid);
}

static void nvme_assign_zone_state(NvmeNamespace *ns, NvmeZone *zone,
                                   NvmeZoneState state)
{
    /* Unlink the zone from the list that tracks its current state, if any. */
    if (QTAILQ_IN_USE(zone, entry)) {
        switch (nvme_get_zone_state(zone)) {
        case NVME_ZONE_STATE_EXPLICITLY_OPEN:
            QTAILQ_REMOVE(&ns->exp_open_zones, zone, entry);
            break;
        case NVME_ZONE_STATE_IMPLICITLY_OPEN:
            QTAILQ_REMOVE(&ns->imp_open_zones, zone, entry);
            break;
        case NVME_ZONE_STATE_CLOSED:
            QTAILQ_REMOVE(&ns->closed_zones, zone, entry);
            break;
        case NVME_ZONE_STATE_FULL:
            QTAILQ_REMOVE(&ns->full_zones, zone, entry);
            /* fall through */
        default:
            ;
        }
    }

    nvme_set_zone_state(zone, state);

    /* Link the zone into the list that tracks its new state, if any. */
    switch (state) {
    case NVME_ZONE_STATE_EXPLICITLY_OPEN:
        QTAILQ_INSERT_TAIL(&ns->exp_open_zones, zone, entry);
        break;
    case NVME_ZONE_STATE_IMPLICITLY_OPEN:
        QTAILQ_INSERT_TAIL(&ns->imp_open_zones, zone, entry);
        break;
    case NVME_ZONE_STATE_CLOSED:
        QTAILQ_INSERT_TAIL(&ns->closed_zones, zone, entry);
        break;
    case NVME_ZONE_STATE_FULL:
        QTAILQ_INSERT_TAIL(&ns->full_zones, zone, entry);
        /* fall through */
    case NVME_ZONE_STATE_READ_ONLY:
        break;
    default:
        zone->d.za = 0;
    }
}

/*
 * Check if we can open a zone without exceeding open/active limits.
 * AOR stands for "Active and Open Resources" (see TP 4053 section 2.5).
 */
static int nvme_aor_check(NvmeNamespace *ns, uint32_t act, uint32_t opn)
{
    if (ns->params.max_active_zones != 0 &&
        ns->nr_active_zones + act > ns->params.max_active_zones) {
        trace_pci_nvme_err_insuff_active_res(ns->params.max_active_zones);
        return NVME_ZONE_TOO_MANY_ACTIVE | NVME_DNR;
    }
    if (ns->params.max_open_zones != 0 &&
        ns->nr_open_zones + opn > ns->params.max_open_zones) {
        trace_pci_nvme_err_insuff_open_res(ns->params.max_open_zones);
        return NVME_ZONE_TOO_MANY_OPEN | NVME_DNR;
    }

    return NVME_SUCCESS;
}
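
/*
 * Illustration only (not part of the device model): a standalone sketch of
 * the limit check in nvme_aor_check(), where a limit of 0 means "unlimited".
 * The struct aor_limits type and aor_would_fit() are hypothetical stand-ins
 * for the namespace fields used above; status codes and tracing are omitted.
 */

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the per-namespace zone accounting. */
struct aor_limits {
    uint32_t max_active; /* 0 means no limit on active zones */
    uint32_t max_open;   /* 0 means no limit on open zones */
    uint32_t nr_active;  /* zones currently active */
    uint32_t nr_open;    /* zones currently open */
};

/* True if 'act' more active and 'opn' more open zones fit the limits. */
static inline bool aor_would_fit(const struct aor_limits *l,
                                 uint32_t act, uint32_t opn)
{
    assert(l != NULL);

    if (l->max_active != 0 && l->nr_active + act > l->max_active) {
        return false;
    }
    if (l->max_open != 0 && l->nr_open + opn > l->max_open) {
        return false;
    }

    return true;
}
```

/*
 * For example, with max_active = 4, max_open = 2, nr_active = 3 and
 * nr_open = 2, one more active zone still fits, but one more open zone
 * does not.
 */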

static bool nvme_addr_is_cmb(NvmeCtrl *n, hwaddr addr)
{
    hwaddr hi, lo;

    if (!n->cmb.cmse) {
        return false;
    }

    lo = n->params.legacy_cmb ? n->cmb.mem.addr : n->cmb.cba;
    hi = lo + int128_get64(n->cmb.mem.size);

    return addr >= lo && addr < hi;
}

static inline void *nvme_addr_to_cmb(NvmeCtrl *n, hwaddr addr)
{
    hwaddr base = n->params.legacy_cmb ? n->cmb.mem.addr : n->cmb.cba;
    return &n->cmb.buf[addr - base];
}

static bool nvme_addr_is_pmr(NvmeCtrl *n, hwaddr addr)
{
    hwaddr hi;

    if (!n->pmr.cmse) {
        return false;
    }

    hi = n->pmr.cba + int128_get64(n->pmr.dev->mr.size);

    return addr >= n->pmr.cba && addr < hi;
}

static inline void *nvme_addr_to_pmr(NvmeCtrl *n, hwaddr addr)
{
    return memory_region_get_ram_ptr(&n->pmr.dev->mr) + (addr - n->pmr.cba);
}

static int nvme_addr_read(NvmeCtrl *n, hwaddr addr, void *buf, int size)
{
    hwaddr hi = addr + size - 1;
    if (hi < addr) {
        return 1;
    }

    if (n->bar.cmbsz && nvme_addr_is_cmb(n, addr) && nvme_addr_is_cmb(n, hi)) {
        memcpy(buf, nvme_addr_to_cmb(n, addr), size);
        return 0;
    }

    if (nvme_addr_is_pmr(n, addr) && nvme_addr_is_pmr(n, hi)) {
        memcpy(buf, nvme_addr_to_pmr(n, addr), size);
        return 0;
    }

    return pci_dma_read(&n->parent_obj, addr, buf, size);
}
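
/*
 * Illustration only (not part of the device model): a standalone sketch of
 * the range test nvme_addr_read() performs before copying from the CMB or
 * PMR. It computes the address of the last byte, rejects ranges that wrap
 * around the 64-bit address space, and then requires both ends to lie inside
 * the region. The helper name range_within() is hypothetical.
 */

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if the 'size' bytes starting at 'addr' lie
 * entirely within the 'region_size' bytes starting at 'base'.
 */
static inline bool range_within(uint64_t addr, uint64_t size,
                                uint64_t base, uint64_t region_size)
{
    uint64_t hi;

    assert(size != 0);

    hi = addr + size - 1; /* address of the last byte */
    if (hi < addr) {
        return false; /* addr + size - 1 wrapped past the end of the space */
    }

    /* addr >= base and hi >= addr, so hi - base cannot underflow. */
    return addr >= base && hi - base < region_size;
}
```

/*
 * A 0x100-byte read at 0x1f00 fits a 0x1000-byte region based at 0x1000,
 * while the same read at 0x1f01 crosses the end of the region and does not.
 */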

static bool nvme_nsid_valid(NvmeCtrl *n, uint32_t nsid)
{
    return nsid && (nsid == NVME_NSID_BROADCAST || nsid <= n->num_namespaces);
}

static int nvme_check_sqid(NvmeCtrl *n, uint16_t sqid)
{
    return sqid < n->params.max_ioqpairs + 1 && n->sq[sqid] != NULL ? 0 : -1;
}

static int nvme_check_cqid(NvmeCtrl *n, uint16_t cqid)
{
    return cqid < n->params.max_ioqpairs + 1 && n->cq[cqid] != NULL ? 0 : -1;
}

static void nvme_inc_cq_tail(NvmeCQueue *cq)
{
    cq->tail++;
    if (cq->tail >= cq->size) {
        cq->tail = 0;
        cq->phase = !cq->phase;
    }
}

static void nvme_inc_sq_head(NvmeSQueue *sq)
{
    sq->head = (sq->head + 1) % sq->size;
}

static uint8_t nvme_cq_full(NvmeCQueue *cq)
{
    return (cq->tail + 1) % cq->size == cq->head;
}

static uint8_t nvme_sq_empty(NvmeSQueue *sq)
{
    return sq->head == sq->tail;
}
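
/*
 * Illustration only (not part of the device model): a miniature of the
 * head/tail arithmetic used by the queue helpers above. The queue is empty
 * when head equals tail and full when advancing the tail would reach the
 * head, so one slot always stays unused. The struct ring type and ring_*()
 * names are hypothetical, and the completion queue phase bit is left out.
 */

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the head/tail/size fields of a queue. */
struct ring {
    uint32_t head;
    uint32_t tail;
    uint32_t size; /* number of slots */
};

static inline bool ring_empty(const struct ring *r)
{
    return r->head == r->tail;
}

static inline bool ring_full(const struct ring *r)
{
    return (r->tail + 1) % r->size == r->head;
}

/* Producer side: claim the slot at the tail. */
static inline void ring_push(struct ring *r)
{
    assert(!ring_full(r));
    r->tail = (r->tail + 1) % r->size;
}

/* Consumer side: release the slot at the head. */
static inline void ring_pop(struct ring *r)
{
    assert(!ring_empty(r));
    r->head = (r->head + 1) % r->size;
}
```

/*
 * A ring with 4 slots therefore reports full after 3 pushes.
 */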

static void nvme_irq_check(NvmeCtrl *n)
{
    if (msix_enabled(&(n->parent_obj))) {
        return;
    }
    if (~n->bar.intms & n->irq_status) {
        pci_irq_assert(&n->parent_obj);
    } else {
        pci_irq_deassert(&n->parent_obj);
    }
}

static void nvme_irq_assert(NvmeCtrl *n, NvmeCQueue *cq)
{
    if (cq->irq_enabled) {
        if (msix_enabled(&(n->parent_obj))) {
            trace_pci_nvme_irq_msix(cq->vector);
            msix_notify(&(n->parent_obj), cq->vector);
        } else {
            trace_pci_nvme_irq_pin();
            assert(cq->vector < 32);
            n->irq_status |= 1 << cq->vector;
            nvme_irq_check(n);
        }
    } else {
        trace_pci_nvme_irq_masked();
    }
}

static void nvme_irq_deassert(NvmeCtrl *n, NvmeCQueue *cq)
{
    if (cq->irq_enabled) {
        if (msix_enabled(&(n->parent_obj))) {
            return;
        } else {
            assert(cq->vector < 32);
            n->irq_status &= ~(1 << cq->vector);
            nvme_irq_check(n);
        }
    }
}
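
/*
 * Illustration only (not part of the device model): a sketch of the
 * pin-based interrupt bookkeeping above. Each vector owns one bit of a
 * 32-bit status word, and the line is raised only while at least one pending
 * bit is not masked. The intx_*() helper names are hypothetical; they use an
 * unsigned shift so that vector 31 is well defined in portable C.
 */

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if any pending vector is not masked by 'intms'. */
static inline bool intx_line_level(uint32_t irq_status, uint32_t intms)
{
    return (~intms & irq_status) != 0;
}

/* Mark 'vector' as pending. */
static inline uint32_t intx_set(uint32_t irq_status, unsigned vector)
{
    assert(vector < 32);
    return irq_status | (UINT32_C(1) << vector);
}

/* Clear the pending bit of 'vector'. */
static inline uint32_t intx_clear(uint32_t irq_status, unsigned vector)
{
    assert(vector < 32);
    return irq_status & ~(UINT32_C(1) << vector);
}
```

/*
 * With vector 3 pending and masked, the line stays low until some other,
 * unmasked vector becomes pending.
 */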

static void nvme_req_clear(NvmeRequest *req)
{
    req->ns = NULL;
    req->opaque = NULL;
    memset(&req->cqe, 0x0, sizeof(req->cqe));
    req->status = NVME_SUCCESS;
}

static void nvme_req_exit(NvmeRequest *req)
{
    if (req->qsg.sg) {
        qemu_sglist_destroy(&req->qsg);
    }

    if (req->iov.iov) {
        qemu_iovec_destroy(&req->iov);
    }
}

static uint16_t nvme_map_addr_cmb(NvmeCtrl *n, QEMUIOVector *iov, hwaddr addr,
                                  size_t len)
{
    if (!len) {
        return NVME_SUCCESS;
    }

    trace_pci_nvme_map_addr_cmb(addr, len);

    if (!nvme_addr_is_cmb(n, addr) || !nvme_addr_is_cmb(n, addr + len - 1)) {
        return NVME_DATA_TRAS_ERROR;
    }

    qemu_iovec_add(iov, nvme_addr_to_cmb(n, addr), len);

    return NVME_SUCCESS;
}

static uint16_t nvme_map_addr_pmr(NvmeCtrl *n, QEMUIOVector *iov, hwaddr addr,
                                  size_t len)
{
    if (!len) {
        return NVME_SUCCESS;
    }

    if (!nvme_addr_is_pmr(n, addr) || !nvme_addr_is_pmr(n, addr + len - 1)) {
        return NVME_DATA_TRAS_ERROR;
    }

    qemu_iovec_add(iov, nvme_addr_to_pmr(n, addr), len);

    return NVME_SUCCESS;
}

static uint16_t nvme_map_addr(NvmeCtrl *n, QEMUSGList *qsg, QEMUIOVector *iov,
                              hwaddr addr, size_t len)
{
    bool cmb = false, pmr = false;

    if (!len) {
        return NVME_SUCCESS;
    }

    trace_pci_nvme_map_addr(addr, len);

    if (nvme_addr_is_cmb(n, addr)) {
        cmb = true;
    } else if (nvme_addr_is_pmr(n, addr)) {
        pmr = true;
    }

    if (cmb || pmr) {
        if (qsg && qsg->sg) {
            return NVME_INVALID_USE_OF_CMB | NVME_DNR;
        }

        assert(iov);

        if (!iov->iov) {
            qemu_iovec_init(iov, 1);
        }

        if (cmb) {
            return nvme_map_addr_cmb(n, iov, addr, len);
        } else {
            return nvme_map_addr_pmr(n, iov, addr, len);
        }
    }

    if (iov && iov->iov) {
        return NVME_INVALID_USE_OF_CMB | NVME_DNR;
    }

    assert(qsg);

    if (!qsg->sg) {
        pci_dma_sglist_init(qsg, &n->parent_obj, 1);
    }

    qemu_sglist_add(qsg, addr, len);

    return NVME_SUCCESS;
}

static uint16_t nvme_map_prp(NvmeCtrl *n, uint64_t prp1, uint64_t prp2,
                             uint32_t len, NvmeRequest *req)
{
    hwaddr trans_len = n->page_size - (prp1 % n->page_size);
    trans_len = MIN(len, trans_len);
    int num_prps = (len >> n->page_bits) + 1;
    uint16_t status;
    int ret;

    QEMUSGList *qsg = &req->qsg;
    QEMUIOVector *iov = &req->iov;

    trace_pci_nvme_map_prp(trans_len, len, prp1, prp2, num_prps);

    if (nvme_addr_is_cmb(n, prp1) || (nvme_addr_is_pmr(n, prp1))) {
        qemu_iovec_init(iov, num_prps);
    } else {
        pci_dma_sglist_init(qsg, &n->parent_obj, num_prps);
    }

    status = nvme_map_addr(n, qsg, iov, prp1, trans_len);
    if (status) {
        return status;
    }

    len -= trans_len;
    if (len) {
        if (len > n->page_size) {
            uint64_t prp_list[n->max_prp_ents];
            uint32_t nents, prp_trans;
            int i = 0;

            nents = (len + n->page_size - 1) >> n->page_bits;
            prp_trans = MIN(n->max_prp_ents, nents) * sizeof(uint64_t);
            ret = nvme_addr_read(n, prp2, (void *)prp_list, prp_trans);
            if (ret) {
                trace_pci_nvme_err_addr_read(prp2);
                return NVME_DATA_TRAS_ERROR;
            }
            while (len != 0) {
                uint64_t prp_ent = le64_to_cpu(prp_list[i]);

                if (i == n->max_prp_ents - 1 && len > n->page_size) {
                    if (unlikely(prp_ent & (n->page_size - 1))) {
                        trace_pci_nvme_err_invalid_prplist_ent(prp_ent);
                        return NVME_INVALID_PRP_OFFSET | NVME_DNR;
                    }

                    i = 0;
                    nents = (len + n->page_size - 1) >> n->page_bits;
                    prp_trans = MIN(n->max_prp_ents, nents) * sizeof(uint64_t);
                    ret = nvme_addr_read(n, prp_ent, (void *)prp_list,
                                         prp_trans);
                    if (ret) {
                        trace_pci_nvme_err_addr_read(prp_ent);
                        return NVME_DATA_TRAS_ERROR;
                    }
                    prp_ent = le64_to_cpu(prp_list[i]);
                }

                if (unlikely(prp_ent & (n->page_size - 1))) {
                    trace_pci_nvme_err_invalid_prplist_ent(prp_ent);
                    return NVME_INVALID_PRP_OFFSET | NVME_DNR;
                }

                trans_len = MIN(len, n->page_size);
                status = nvme_map_addr(n, qsg, iov, prp_ent, trans_len);
                if (status) {
                    return status;
                }

                len -= trans_len;
                i++;
            }
        } else {
            if (unlikely(prp2 & (n->page_size - 1))) {
                trace_pci_nvme_err_invalid_prp2_align(prp2);
                return NVME_INVALID_PRP_OFFSET | NVME_DNR;
            }
            status = nvme_map_addr(n, qsg, iov, prp2, len);
            if (status) {
                return status;
            }
        }
    }

    return NVME_SUCCESS;
}
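
/*
 * Illustration only (not part of the device model): a sketch of the length
 * arithmetic at the top of nvme_map_prp(). PRP1 may start anywhere inside a
 * page and covers the bytes up to the end of that page; the remainder is
 * covered in page-sized steps. If the remainder fits in one page, PRP2
 * points at the data directly, otherwise PRP2 points at a PRP list. The
 * helper names prp1_len() and prp_remaining_pages() are hypothetical.
 */

```c
#include <assert.h>
#include <stdint.h>

/* Bytes of a 'len'-byte transfer that are covered by PRP1 alone. */
static inline uint32_t prp1_len(uint64_t prp1, uint32_t len,
                                uint32_t page_size)
{
    uint32_t first;

    assert(page_size != 0);

    /* Distance from prp1 to the end of the page that contains it. */
    first = page_size - (uint32_t)(prp1 % page_size);

    return len < first ? len : first;
}

/* Number of further pages needed for the bytes that PRP1 does not cover. */
static inline uint32_t prp_remaining_pages(uint64_t prp1, uint32_t len,
                                           uint32_t page_size)
{
    uint64_t rest = len - prp1_len(prp1, len, page_size);

    return (uint32_t)((rest + page_size - 1) / page_size);
}
```

/*
 * With 4 KiB pages, an 8 KiB transfer whose PRP1 sits 512 bytes into a page
 * maps 3584 bytes through PRP1 and needs two further pages, so PRP2 has to
 * reference a PRP list.
 */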

/*
 * Map 'nsgld' data descriptors from 'segment'. The function subtracts the
 * number of bytes mapped from 'len'.
 */
static uint16_t nvme_map_sgl_data(NvmeCtrl *n, QEMUSGList *qsg,
                                  QEMUIOVector *iov,
                                  NvmeSglDescriptor *segment, uint64_t nsgld,
                                  size_t *len, NvmeRequest *req)
{
    dma_addr_t addr, trans_len;
    uint32_t dlen;
    uint16_t status;

    for (int i = 0; i < nsgld; i++) {
        uint8_t type = NVME_SGL_TYPE(segment[i].type);

        switch (type) {
        case NVME_SGL_DESCR_TYPE_BIT_BUCKET:
            if (req->cmd.opcode == NVME_CMD_WRITE) {
                continue;
            }
            /* fall through */
        case NVME_SGL_DESCR_TYPE_DATA_BLOCK:
            break;
        case NVME_SGL_DESCR_TYPE_SEGMENT:
        case NVME_SGL_DESCR_TYPE_LAST_SEGMENT:
            return NVME_INVALID_NUM_SGL_DESCRS | NVME_DNR;
        default:
            return NVME_SGL_DESCR_TYPE_INVALID | NVME_DNR;
        }

        dlen = le32_to_cpu(segment[i].len);

        if (!dlen) {
            continue;
        }

        if (*len == 0) {
            /*
             * All data has been mapped, but the SGL contains additional
             * segments and/or descriptors. The controller might accept
             * ignoring the rest of the SGL.
             */
            uint32_t sgls = le32_to_cpu(n->id_ctrl.sgls);
            if (sgls & NVME_CTRL_SGLS_EXCESS_LENGTH) {
                break;
            }

            trace_pci_nvme_err_invalid_sgl_excess_length(nvme_cid(req));
            return NVME_DATA_SGL_LEN_INVALID | NVME_DNR;
        }

        trans_len = MIN(*len, dlen);

        if (type == NVME_SGL_DESCR_TYPE_BIT_BUCKET) {
            goto next;
        }

        addr = le64_to_cpu(segment[i].addr);

        if (UINT64_MAX - addr < dlen) {
            return NVME_DATA_SGL_LEN_INVALID | NVME_DNR;
        }

        status = nvme_map_addr(n, qsg, iov, addr, trans_len);
        if (status) {
            return status;
        }

next:
        *len -= trans_len;
    }

    return NVME_SUCCESS;
}
|
|
|
|
|
|
|
|
static uint16_t nvme_map_sgl(NvmeCtrl *n, QEMUSGList *qsg, QEMUIOVector *iov,
|
|
|
|
NvmeSglDescriptor sgl, size_t len,
|
2020-02-23 20:34:34 +03:00
|
|
|
NvmeRequest *req)
|
2019-04-12 21:53:16 +03:00
|
|
|
{
|
|
|
|
/*
|
|
|
|
* Read the segment in chunks of 256 descriptors (one 4k page) to avoid
|
|
|
|
* dynamically allocating a potentially huge SGL. The spec allows the SGL
|
|
|
|
* to be larger (as in number of bytes required to describe the SGL
|
|
|
|
* descriptors and segment chain) than the command transfer size, so it is
|
|
|
|
* not bounded by MDTS.
|
|
|
|
*/
|
|
|
|
const int SEG_CHUNK_SIZE = 256;
|
|
|
|
|
|
|
|
NvmeSglDescriptor segment[SEG_CHUNK_SIZE], *sgld, *last_sgld;
|
|
|
|
uint64_t nsgld;
|
|
|
|
uint32_t seg_len;
|
|
|
|
uint16_t status;
|
|
|
|
hwaddr addr;
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
sgld = &sgl;
|
|
|
|
addr = le64_to_cpu(sgl.addr);
|
|
|
|
|
|
|
|
trace_pci_nvme_map_sgl(nvme_cid(req), NVME_SGL_TYPE(sgl.type), len);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If the entire transfer can be described with a single data block it can
|
|
|
|
* be mapped directly.
|
|
|
|
*/
|
|
|
|
if (NVME_SGL_TYPE(sgl.type) == NVME_SGL_DESCR_TYPE_DATA_BLOCK) {
|
|
|
|
status = nvme_map_sgl_data(n, qsg, iov, sgld, 1, &len, req);
|
|
|
|
if (status) {
|
|
|
|
goto unmap;
|
|
|
|
}
|
|
|
|
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
|
|
|
for (;;) {
|
|
|
|
switch (NVME_SGL_TYPE(sgld->type)) {
|
|
|
|
case NVME_SGL_DESCR_TYPE_SEGMENT:
|
|
|
|
case NVME_SGL_DESCR_TYPE_LAST_SEGMENT:
|
|
|
|
break;
|
|
|
|
default:
|
|
|
|
return NVME_INVALID_SGL_SEG_DESCR | NVME_DNR;
|
|
|
|
}
|
|
|
|
|
|
|
|
seg_len = le32_to_cpu(sgld->len);
|
|
|
|
|
|
|
|
/* check the length of the (Last) Segment descriptor */
|
|
|
|
if ((!seg_len || seg_len & 0xf) &&
|
|
|
|
(NVME_SGL_TYPE(sgld->type) != NVME_SGL_DESCR_TYPE_BIT_BUCKET)) {
|
|
|
|
return NVME_INVALID_SGL_SEG_DESCR | NVME_DNR;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (UINT64_MAX - addr < seg_len) {
|
|
|
|
return NVME_DATA_SGL_LEN_INVALID | NVME_DNR;
|
|
|
|
}
|
|
|
|
|
|
|
|
nsgld = seg_len / sizeof(NvmeSglDescriptor);
|
|
|
|
|
|
|
|
while (nsgld > SEG_CHUNK_SIZE) {
|
|
|
|
if (nvme_addr_read(n, addr, segment, sizeof(segment))) {
|
|
|
|
trace_pci_nvme_err_addr_read(addr);
|
|
|
|
status = NVME_DATA_TRAS_ERROR;
|
|
|
|
goto unmap;
|
|
|
|
}
|
|
|
|
|
|
|
|
status = nvme_map_sgl_data(n, qsg, iov, segment, SEG_CHUNK_SIZE,
|
|
|
|
&len, req);
|
|
|
|
if (status) {
|
|
|
|
goto unmap;
|
|
|
|
}
|
|
|
|
|
|
|
|
nsgld -= SEG_CHUNK_SIZE;
|
|
|
|
addr += SEG_CHUNK_SIZE * sizeof(NvmeSglDescriptor);
|
|
|
|
}
|
|
|
|
|
|
|
|
ret = nvme_addr_read(n, addr, segment, nsgld *
|
|
|
|
sizeof(NvmeSglDescriptor));
|
|
|
|
if (ret) {
|
|
|
|
trace_pci_nvme_err_addr_read(addr);
|
|
|
|
status = NVME_DATA_TRAS_ERROR;
|
|
|
|
goto unmap;
|
|
|
|
}
|
|
|
|
|
|
|
|
last_sgld = &segment[nsgld - 1];
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If the segment ends with a Data Block or Bit Bucket Descriptor Type,
|
|
|
|
* then we are done.
|
|
|
|
*/
|
|
|
|
switch (NVME_SGL_TYPE(last_sgld->type)) {
|
|
|
|
case NVME_SGL_DESCR_TYPE_DATA_BLOCK:
|
|
|
|
case NVME_SGL_DESCR_TYPE_BIT_BUCKET:
|
|
|
|
status = nvme_map_sgl_data(n, qsg, iov, segment, nsgld, &len, req);
|
|
|
|
if (status) {
|
|
|
|
goto unmap;
|
|
|
|
}
|
|
|
|
|
|
|
|
goto out;
|
|
|
|
|
|
|
|
default:
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If the last descriptor was not a Data Block or Bit Bucket, then the
|
|
|
|
* current segment must not be a Last Segment.
|
|
|
|
*/
|
|
|
|
if (NVME_SGL_TYPE(sgld->type) == NVME_SGL_DESCR_TYPE_LAST_SEGMENT) {
|
|
|
|
status = NVME_INVALID_SGL_SEG_DESCR | NVME_DNR;
|
|
|
|
goto unmap;
|
|
|
|
}
|
|
|
|
|
|
|
|
sgld = last_sgld;
|
|
|
|
addr = le64_to_cpu(sgld->addr);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Do not map the last descriptor; it will be a Segment or Last Segment
|
|
|
|
* descriptor and is handled by the next iteration.
|
|
|
|
*/
|
|
|
|
status = nvme_map_sgl_data(n, qsg, iov, segment, nsgld - 1, &len, req);
|
|
|
|
if (status) {
|
|
|
|
goto unmap;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
out:
|
|
|
|
/* if there is any residual left in len, the SGL was too short */
|
|
|
|
if (len) {
|
|
|
|
status = NVME_DATA_SGL_LEN_INVALID | NVME_DNR;
|
|
|
|
goto unmap;
|
|
|
|
}
|
|
|
|
|
|
|
|
return NVME_SUCCESS;
|
|
|
|
|
|
|
|
unmap:
|
|
|
|
if (iov->iov) {
|
|
|
|
qemu_iovec_destroy(iov);
|
|
|
|
}
|
|
|
|
|
|
|
|
if (qsg->sg) {
|
|
|
|
qemu_sglist_destroy(qsg);
|
|
|
|
}
|
|
|
|
|
|
|
|
return status;
|
|
|
|
}
|
|
|
|
|
|
|
|
static uint16_t nvme_map_dptr(NvmeCtrl *n, size_t len, NvmeRequest *req)
|
|
|
|
{
|
|
|
|
uint64_t prp1, prp2;
|
|
|
|
|
|
|
|
switch (NVME_CMD_FLAGS_PSDT(req->cmd.flags)) {
|
|
|
|
case NVME_PSDT_PRP:
|
|
|
|
prp1 = le64_to_cpu(req->cmd.dptr.prp1);
|
|
|
|
prp2 = le64_to_cpu(req->cmd.dptr.prp2);
|
|
|
|
|
|
|
|
return nvme_map_prp(n, prp1, prp2, len, req);
|
|
|
|
case NVME_PSDT_SGL_MPTR_CONTIGUOUS:
|
|
|
|
case NVME_PSDT_SGL_MPTR_SGL:
|
|
|
|
/* SGLs shall not be used for Admin commands in NVMe over PCIe */
|
|
|
|
if (!req->sq->sqid) {
|
|
|
|
return NVME_INVALID_FIELD | NVME_DNR;
|
|
|
|
}
|
|
|
|
|
|
|
|
return nvme_map_sgl(n, &req->qsg, &req->iov, req->cmd.dptr.sgl, len,
|
|
|
|
req);
|
|
|
|
default:
|
|
|
|
return NVME_INVALID_FIELD;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
static uint16_t nvme_dma(NvmeCtrl *n, uint8_t *ptr, uint32_t len,
|
|
|
|
DMADirection dir, NvmeRequest *req)
|
|
|
|
{
|
|
|
|
uint16_t status = NVME_SUCCESS;
|
|
|
|
|
|
|
|
status = nvme_map_dptr(n, len, req);
|
|
|
|
if (status) {
|
|
|
|
return status;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* assert that only one of qsg and iov carries data */
|
|
|
|
assert((req->qsg.nsg > 0) != (req->iov.niov > 0));
|
|
|
|
|
|
|
|
if (req->qsg.nsg > 0) {
|
|
|
|
uint64_t residual;
|
|
|
|
|
|
|
|
if (dir == DMA_DIRECTION_TO_DEVICE) {
|
|
|
|
residual = dma_buf_write(ptr, len, &req->qsg);
|
|
|
|
} else {
|
|
|
|
residual = dma_buf_read(ptr, len, &req->qsg);
|
|
|
|
}
|
|
|
|
|
|
|
|
if (unlikely(residual)) {
|
|
|
|
trace_pci_nvme_err_invalid_dma();
|
|
|
|
status = NVME_INVALID_FIELD | NVME_DNR;
|
|
|
|
}
|
|
|
|
} else {
|
|
|
|
size_t bytes;
|
|
|
|
|
|
|
|
if (dir == DMA_DIRECTION_TO_DEVICE) {
|
|
|
|
bytes = qemu_iovec_to_buf(&req->iov, 0, ptr, len);
|
|
|
|
} else {
|
|
|
|
bytes = qemu_iovec_from_buf(&req->iov, 0, ptr, len);
|
|
|
|
}
|
|
|
|
|
|
|
|
if (unlikely(bytes != len)) {
|
|
|
|
trace_pci_nvme_err_invalid_dma();
|
|
|
|
status = NVME_INVALID_FIELD | NVME_DNR;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
return status;
|
|
|
|
}
|
|
|
|
|
|
|
|
static void nvme_post_cqes(void *opaque)
|
|
|
|
{
|
|
|
|
NvmeCQueue *cq = opaque;
|
|
|
|
NvmeCtrl *n = cq->ctrl;
|
|
|
|
NvmeRequest *req, *next;
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
QTAILQ_FOREACH_SAFE(req, &cq->req_list, entry, next) {
|
|
|
|
NvmeSQueue *sq;
|
|
|
|
hwaddr addr;
|
|
|
|
|
|
|
|
if (nvme_cq_full(cq)) {
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
|
|
|
sq = req->sq;
|
|
|
|
req->cqe.status = cpu_to_le16((req->status << 1) | cq->phase);
|
|
|
|
req->cqe.sq_id = cpu_to_le16(sq->sqid);
|
|
|
|
req->cqe.sq_head = cpu_to_le16(sq->head);
|
|
|
|
addr = cq->dma_addr + cq->tail * n->cqe_size;
|
|
|
|
ret = pci_dma_write(&n->parent_obj, addr, (void *)&req->cqe,
|
|
|
|
sizeof(req->cqe));
|
|
|
|
if (ret) {
|
|
|
|
trace_pci_nvme_err_addr_write(addr);
|
|
|
|
trace_pci_nvme_err_cfs();
|
|
|
|
n->bar.csts = NVME_CSTS_FAILED;
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
QTAILQ_REMOVE(&cq->req_list, req, entry);
|
|
|
|
nvme_inc_cq_tail(cq);
|
|
|
|
nvme_req_exit(req);
|
|
|
|
QTAILQ_INSERT_TAIL(&sq->req_list, req, entry);
|
|
|
|
}
|
|
|
|
if (cq->tail != cq->head) {
|
|
|
|
nvme_irq_assert(n, cq);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
static void nvme_enqueue_req_completion(NvmeCQueue *cq, NvmeRequest *req)
|
|
|
|
{
|
|
|
|
assert(cq->cqid == req->sq->cqid);
|
|
|
|
trace_pci_nvme_enqueue_req_completion(nvme_cid(req), cq->cqid,
|
|
|
|
req->status);
|
|
|
|
|
|
|
|
if (req->status) {
|
|
|
|
trace_pci_nvme_err_req_status(nvme_cid(req), nvme_nsid(req->ns),
|
|
|
|
req->status, req->cmd.opcode);
|
|
|
|
}
|
|
|
|
|
|
|
|
QTAILQ_REMOVE(&req->sq->out_req_list, req, entry);
|
|
|
|
QTAILQ_INSERT_TAIL(&cq->req_list, req, entry);
|
|
|
|
timer_mod(cq->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + 500);
|
|
|
|
}
|
|
|
|
|
|
|
|
static void nvme_process_aers(void *opaque)
|
|
|
|
{
|
|
|
|
NvmeCtrl *n = opaque;
|
|
|
|
NvmeAsyncEvent *event, *next;
|
|
|
|
|
|
|
|
trace_pci_nvme_process_aers(n->aer_queued);
|
|
|
|
|
|
|
|
QTAILQ_FOREACH_SAFE(event, &n->aer_queue, entry, next) {
|
|
|
|
NvmeRequest *req;
|
|
|
|
NvmeAerResult *result;
|
|
|
|
|
|
|
|
/* can't post cqe if there is nothing to complete */
|
|
|
|
if (!n->outstanding_aers) {
|
|
|
|
trace_pci_nvme_no_outstanding_aers();
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* ignore if masked (cqe posted, but event not cleared) */
|
|
|
|
if (n->aer_mask & (1 << event->result.event_type)) {
|
|
|
|
trace_pci_nvme_aer_masked(event->result.event_type, n->aer_mask);
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
|
|
|
|
QTAILQ_REMOVE(&n->aer_queue, event, entry);
|
|
|
|
n->aer_queued--;
|
|
|
|
|
|
|
|
n->aer_mask |= 1 << event->result.event_type;
|
|
|
|
n->outstanding_aers--;
|
|
|
|
|
|
|
|
req = n->aer_reqs[n->outstanding_aers];
|
|
|
|
|
|
|
|
result = (NvmeAerResult *) &req->cqe.result;
|
|
|
|
result->event_type = event->result.event_type;
|
|
|
|
result->event_info = event->result.event_info;
|
|
|
|
result->log_page = event->result.log_page;
|
|
|
|
g_free(event);
|
|
|
|
|
|
|
|
trace_pci_nvme_aer_post_cqe(result->event_type, result->event_info,
|
|
|
|
result->log_page);
|
|
|
|
|
|
|
|
nvme_enqueue_req_completion(&n->admin_cq, req);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
static void nvme_enqueue_event(NvmeCtrl *n, uint8_t event_type,
|
|
|
|
uint8_t event_info, uint8_t log_page)
|
|
|
|
{
|
|
|
|
NvmeAsyncEvent *event;
|
|
|
|
|
|
|
|
trace_pci_nvme_enqueue_event(event_type, event_info, log_page);
|
|
|
|
|
|
|
|
if (n->aer_queued == n->params.aer_max_queued) {
|
|
|
|
trace_pci_nvme_enqueue_event_noqueue(n->aer_queued);
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
|
|
|
event = g_new(NvmeAsyncEvent, 1);
|
|
|
|
event->result = (NvmeAerResult) {
|
|
|
|
.event_type = event_type,
|
|
|
|
.event_info = event_info,
|
|
|
|
.log_page = log_page,
|
|
|
|
};
|
|
|
|
|
|
|
|
QTAILQ_INSERT_TAIL(&n->aer_queue, event, entry);
|
|
|
|
n->aer_queued++;
|
|
|
|
|
|
|
|
nvme_process_aers(n);
|
|
|
|
}
|
|
|
|
|
|
|
|
static void nvme_smart_event(NvmeCtrl *n, uint8_t event)
|
|
|
|
{
|
|
|
|
uint8_t aer_info;
|
|
|
|
|
|
|
|
/* Ref SPEC <Asynchronous Event Information - SMART / Health Status> */
|
|
|
|
if (!(NVME_AEC_SMART(n->features.async_config) & event)) {
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
|
|
|
switch (event) {
|
|
|
|
case NVME_SMART_SPARE:
|
|
|
|
aer_info = NVME_AER_INFO_SMART_SPARE_THRESH;
|
|
|
|
break;
|
|
|
|
case NVME_SMART_TEMPERATURE:
|
|
|
|
aer_info = NVME_AER_INFO_SMART_TEMP_THRESH;
|
|
|
|
break;
|
|
|
|
case NVME_SMART_RELIABILITY:
|
|
|
|
case NVME_SMART_MEDIA_READ_ONLY:
|
|
|
|
case NVME_SMART_FAILED_VOLATILE_MEDIA:
|
|
|
|
case NVME_SMART_PMR_UNRELIABLE:
|
|
|
|
aer_info = NVME_AER_INFO_SMART_RELIABILITY;
|
|
|
|
break;
|
|
|
|
default:
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
|
|
|
nvme_enqueue_event(n, NVME_AER_TYPE_SMART, aer_info, NVME_LOG_SMART_INFO);
|
|
|
|
}
|
|
|
|
|
|
|
|
static void nvme_clear_events(NvmeCtrl *n, uint8_t event_type)
|
|
|
|
{
|
|
|
|
n->aer_mask &= ~(1 << event_type);
|
|
|
|
if (!QTAILQ_EMPTY(&n->aer_queue)) {
|
|
|
|
nvme_process_aers(n);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline uint16_t nvme_check_mdts(NvmeCtrl *n, size_t len)
|
|
|
|
{
|
|
|
|
uint8_t mdts = n->params.mdts;
|
|
|
|
|
|
|
|
if (mdts && len > n->page_size << mdts) {
|
|
|
|
return NVME_INVALID_FIELD | NVME_DNR;
|
|
|
|
}
|
|
|
|
|
|
|
|
return NVME_SUCCESS;
|
|
|
|
}
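/*
 * Illustration only, not part of the device model: a minimal stand-alone
 * sketch of the MDTS check in nvme_check_mdts(), assuming a 64-bit size_t.
 * The name mdts_exceeded() and its parameters are hypothetical.
 */
```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

static bool mdts_exceeded(uint32_t page_size, uint8_t mdts, size_t len)
{
    /* As in nvme_check_mdts(), an MDTS of 0 disables the limit. */
    if (!mdts) {
        return false;
    }

    /* The limit is the page size scaled by 2^mdts. */
    return len > ((size_t)page_size << mdts);
}
```
/*
 * With 4 KiB pages and mdts set to 7, the largest transfer this sketch
 * accepts is 4096 << 7 = 524288 bytes (512 KiB).
 */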
|
|
|
|
|
|
|
|
static inline uint16_t nvme_check_bounds(NvmeNamespace *ns, uint64_t slba,
|
|
|
|
uint32_t nlb)
|
|
|
|
{
|
|
|
|
uint64_t nsze = le64_to_cpu(ns->id_ns.nsze);
|
|
|
|
|
|
|
|
if (unlikely(UINT64_MAX - slba < nlb || slba + nlb > nsze)) {
|
|
|
|
return NVME_LBA_RANGE | NVME_DNR;
|
|
|
|
}
|
|
|
|
|
|
|
|
return NVME_SUCCESS;
|
|
|
|
}
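/*
 * Illustration only, not part of the device model: a stand-alone sketch of
 * why nvme_check_bounds() tests for wrap-around before comparing against
 * the namespace size. The name lba_range_ok() is hypothetical.
 */
```c
#include <stdbool.h>
#include <stdint.h>

static bool lba_range_ok(uint64_t nsze, uint64_t slba, uint32_t nlb)
{
    /* Reject the range first if slba + nlb would wrap past UINT64_MAX. */
    if (UINT64_MAX - slba < nlb) {
        return false;
    }

    /* Only now is the sum safe to compute and compare. */
    return slba + nlb <= nsze;
}
```
/*
 * Without the first test, slba = UINT64_MAX and nlb = 2 would wrap to 1
 * and wrongly pass the comparison against nsze.
 */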
|
|
|
|
|
|
|
|
static uint16_t nvme_check_dulbe(NvmeNamespace *ns, uint64_t slba,
|
|
|
|
uint32_t nlb)
|
|
|
|
{
|
|
|
|
BlockDriverState *bs = blk_bs(ns->blkconf.blk);
|
|
|
|
|
|
|
|
int64_t pnum = 0, bytes = nvme_l2b(ns, nlb);
|
|
|
|
int64_t offset = nvme_l2b(ns, slba);
|
|
|
|
bool zeroed;
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
Error *local_err = NULL;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* `pnum` holds the number of bytes after offset that share the same
|
|
|
|
* allocation status as the byte at offset. If `pnum` is different from
|
|
|
|
* `bytes`, we should check the allocation status of the next range and
|
|
|
|
* continue this until all bytes have been checked.
|
|
|
|
*/
|
|
|
|
do {
|
|
|
|
bytes -= pnum;
|
|
|
|
|
|
|
|
ret = bdrv_block_status(bs, offset, bytes, &pnum, NULL, NULL);
|
|
|
|
if (ret < 0) {
|
|
|
|
error_setg_errno(&local_err, -ret, "unable to get block status");
|
|
|
|
error_report_err(local_err);
|
|
|
|
|
|
|
|
return NVME_INTERNAL_DEV_ERROR;
|
|
|
|
}
|
|
|
|
|
|
|
|
zeroed = !!(ret & BDRV_BLOCK_ZERO);
|
|
|
|
|
|
|
|
trace_pci_nvme_block_status(offset, bytes, pnum, ret, zeroed);
|
|
|
|
|
|
|
|
if (zeroed) {
|
|
|
|
return NVME_DULB;
|
|
|
|
}
|
|
|
|
|
|
|
|
offset += pnum;
|
|
|
|
} while (pnum != bytes);
|
|
|
|
|
|
|
|
return NVME_SUCCESS;
|
|
|
|
}
|
|
|
|
|
|
|
|
static void nvme_aio_err(NvmeRequest *req, int ret)
|
|
|
|
{
|
|
|
|
uint16_t status = NVME_SUCCESS;
|
|
|
|
Error *local_err = NULL;
|
|
|
|
|
|
|
|
switch (req->cmd.opcode) {
|
|
|
|
case NVME_CMD_READ:
|
|
|
|
status = NVME_UNRECOVERED_READ;
|
|
|
|
break;
|
|
|
|
case NVME_CMD_FLUSH:
|
|
|
|
case NVME_CMD_WRITE:
|
|
|
|
case NVME_CMD_WRITE_ZEROES:
|
|
|
|
case NVME_CMD_ZONE_APPEND:
|
|
|
|
status = NVME_WRITE_FAULT;
|
|
|
|
break;
|
|
|
|
default:
|
|
|
|
status = NVME_INTERNAL_DEV_ERROR;
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
|
|
|
trace_pci_nvme_err_aio(nvme_cid(req), strerror(-ret), status);
|
|
|
|
|
|
|
|
error_setg_errno(&local_err, -ret, "aio failed");
|
|
|
|
error_report_err(local_err);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Set the command status code to the first encountered error but allow a
|
|
|
|
* subsequent Internal Device Error to trump it.
|
|
|
|
*/
|
|
|
|
if (req->status && status != NVME_INTERNAL_DEV_ERROR) {
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
|
|
|
req->status = status;
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline uint32_t nvme_zone_idx(NvmeNamespace *ns, uint64_t slba)
|
|
|
|
{
|
|
|
|
return ns->zone_size_log2 > 0 ? slba >> ns->zone_size_log2 :
|
|
|
|
slba / ns->zone_size;
|
|
|
|
}
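/*
 * Illustration only, not part of the device model: a stand-alone sketch of
 * the index computation in nvme_zone_idx(). It assumes, as the code above
 * suggests, that zone_size_log2 is nonzero only when the zone size is a
 * power of two. The name zone_idx() and its parameters are hypothetical.
 */
```c
#include <stdint.h>

static uint32_t zone_idx(uint64_t slba, uint64_t zone_size,
                         uint32_t zone_size_log2)
{
    /* Power-of-two zone size: a shift replaces the division. */
    if (zone_size_log2 > 0) {
        return (uint32_t)(slba >> zone_size_log2);
    }

    /* Any other zone size: fall back to integer division. */
    return (uint32_t)(slba / zone_size);
}
```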
|
|
|
|
|
|
|
|
static inline NvmeZone *nvme_get_zone_by_slba(NvmeNamespace *ns, uint64_t slba)
|
|
|
|
{
|
|
|
|
uint32_t zone_idx = nvme_zone_idx(ns, slba);
|
|
|
|
|
|
|
|
assert(zone_idx < ns->num_zones);
|
|
|
|
return &ns->zone_array[zone_idx];
|
|
|
|
}
|
|
|
|
|
|
|
|
static uint16_t nvme_check_zone_state_for_write(NvmeZone *zone)
|
|
|
|
{
|
|
|
|
uint16_t status;
|
|
|
|
|
|
|
|
switch (nvme_get_zone_state(zone)) {
|
|
|
|
case NVME_ZONE_STATE_EMPTY:
|
|
|
|
case NVME_ZONE_STATE_IMPLICITLY_OPEN:
|
|
|
|
case NVME_ZONE_STATE_EXPLICITLY_OPEN:
|
|
|
|
case NVME_ZONE_STATE_CLOSED:
|
|
|
|
status = NVME_SUCCESS;
|
|
|
|
break;
|
|
|
|
case NVME_ZONE_STATE_FULL:
|
|
|
|
status = NVME_ZONE_FULL;
|
|
|
|
break;
|
|
|
|
case NVME_ZONE_STATE_OFFLINE:
|
|
|
|
status = NVME_ZONE_OFFLINE;
|
|
|
|
break;
|
|
|
|
case NVME_ZONE_STATE_READ_ONLY:
|
|
|
|
status = NVME_ZONE_READ_ONLY;
|
|
|
|
break;
|
|
|
|
default:
|
|
|
|
assert(false);
|
|
|
|
}
|
|
|
|
|
|
|
|
return status;
|
|
|
|
}
|
|
|
|
|
|
|
|
static uint16_t nvme_check_zone_write(NvmeCtrl *n, NvmeNamespace *ns,
|
|
|
|
NvmeZone *zone, uint64_t slba,
|
|
|
|
uint32_t nlb, bool append)
|
|
|
|
{
|
|
|
|
uint16_t status;
|
|
|
|
|
|
|
|
if (unlikely((slba + nlb) > nvme_zone_wr_boundary(zone))) {
|
|
|
|
status = NVME_ZONE_BOUNDARY_ERROR;
|
|
|
|
} else {
|
|
|
|
status = nvme_check_zone_state_for_write(zone);
|
|
|
|
}
|
|
|
|
|
|
|
|
if (status != NVME_SUCCESS) {
|
|
|
|
trace_pci_nvme_err_zone_write_not_ok(slba, nlb, status);
|
|
|
|
} else {
|
|
|
|
assert(nvme_wp_is_valid(zone));
|
|
|
|
if (append) {
|
|
|
|
if (unlikely(slba != zone->d.zslba)) {
|
|
|
|
trace_pci_nvme_err_append_not_at_start(slba, zone->d.zslba);
|
|
|
|
status = NVME_INVALID_FIELD;
|
|
|
|
}
|
|
|
|
if (nvme_l2b(ns, nlb) > (n->page_size << n->zasl)) {
|
|
|
|
trace_pci_nvme_err_append_too_large(slba, nlb, n->zasl);
|
|
|
|
status = NVME_INVALID_FIELD;
|
|
|
|
}
|
|
|
|
} else if (unlikely(slba != zone->w_ptr)) {
|
|
|
|
trace_pci_nvme_err_write_not_at_wp(slba, zone->d.zslba,
|
|
|
|
zone->w_ptr);
|
|
|
|
status = NVME_ZONE_INVALID_WRITE;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
return status;
|
|
|
|
}
|
|
|
|
|
|
|
|
static uint16_t nvme_check_zone_state_for_read(NvmeZone *zone)
|
|
|
|
{
|
|
|
|
uint16_t status;
|
|
|
|
|
|
|
|
switch (nvme_get_zone_state(zone)) {
|
|
|
|
case NVME_ZONE_STATE_EMPTY:
|
|
|
|
case NVME_ZONE_STATE_IMPLICITLY_OPEN:
|
|
|
|
case NVME_ZONE_STATE_EXPLICITLY_OPEN:
|
|
|
|
case NVME_ZONE_STATE_FULL:
|
|
|
|
case NVME_ZONE_STATE_CLOSED:
|
|
|
|
case NVME_ZONE_STATE_READ_ONLY:
|
|
|
|
status = NVME_SUCCESS;
|
|
|
|
break;
|
|
|
|
case NVME_ZONE_STATE_OFFLINE:
|
|
|
|
status = NVME_ZONE_OFFLINE;
|
|
|
|
break;
|
|
|
|
default:
|
|
|
|
assert(false);
|
|
|
|
}
|
|
|
|
|
|
|
|
return status;
|
|
|
|
}
|
|
|
|
|
|
|
|
static uint16_t nvme_check_zone_read(NvmeNamespace *ns, uint64_t slba,
|
|
|
|
uint32_t nlb)
|
|
|
|
{
|
|
|
|
NvmeZone *zone = nvme_get_zone_by_slba(ns, slba);
|
|
|
|
uint64_t bndry = nvme_zone_rd_boundary(ns, zone);
|
|
|
|
uint64_t end = slba + nlb;
|
|
|
|
uint16_t status;
|
|
|
|
|
|
|
|
status = nvme_check_zone_state_for_read(zone);
|
|
|
|
if (status != NVME_SUCCESS) {
|
|
|
|
;
|
|
|
|
} else if (unlikely(end > bndry)) {
|
|
|
|
if (!ns->params.cross_zone_read) {
|
|
|
|
status = NVME_ZONE_BOUNDARY_ERROR;
|
|
|
|
} else {
|
|
|
|
/*
|
|
|
|
* Read across zone boundary - check that all subsequent
|
|
|
|
* zones that are being read have an appropriate state.
|
|
|
|
*/
|
|
|
|
do {
|
|
|
|
zone++;
|
|
|
|
status = nvme_check_zone_state_for_read(zone);
|
|
|
|
if (status != NVME_SUCCESS) {
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
} while (end > nvme_zone_rd_boundary(ns, zone));
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
return status;
|
|
|
|
}
|
|
|
|
|
|
|
|
static void nvme_auto_transition_zone(NvmeNamespace *ns)
|
|
|
|
{
|
|
|
|
NvmeZone *zone;
|
|
|
|
|
|
|
|
if (ns->params.max_open_zones &&
|
|
|
|
ns->nr_open_zones == ns->params.max_open_zones) {
|
|
|
|
zone = QTAILQ_FIRST(&ns->imp_open_zones);
|
|
|
|
if (zone) {
|
|
|
|
/*
|
|
|
|
* Automatically close this implicitly open zone.
|
|
|
|
*/
|
|
|
|
QTAILQ_REMOVE(&ns->imp_open_zones, zone, entry);
|
|
|
|
nvme_aor_dec_open(ns);
|
|
|
|
nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_CLOSED);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
static uint16_t nvme_auto_open_zone(NvmeNamespace *ns, NvmeZone *zone)
|
|
|
|
{
|
|
|
|
uint16_t status = NVME_SUCCESS;
|
|
|
|
uint8_t zs = nvme_get_zone_state(zone);
|
|
|
|
|
|
|
|
if (zs == NVME_ZONE_STATE_EMPTY) {
|
|
|
|
nvme_auto_transition_zone(ns);
|
|
|
|
status = nvme_aor_check(ns, 1, 1);
|
|
|
|
} else if (zs == NVME_ZONE_STATE_CLOSED) {
|
|
|
|
nvme_auto_transition_zone(ns);
|
|
|
|
status = nvme_aor_check(ns, 0, 1);
|
|
|
|
}
|
|
|
|
|
|
|
|
return status;
|
|
|
|
}
|
|
|
|
|
|
|
|
static void nvme_finalize_zoned_write(NvmeNamespace *ns, NvmeRequest *req,
|
|
|
|
bool failed)
|
|
|
|
{
|
|
|
|
NvmeRwCmd *rw = (NvmeRwCmd *)&req->cmd;
|
|
|
|
NvmeZone *zone;
|
|
|
|
NvmeZonedResult *res = (NvmeZonedResult *)&req->cqe;
|
|
|
|
uint64_t slba;
|
|
|
|
uint32_t nlb;
|
|
|
|
|
|
|
|
slba = le64_to_cpu(rw->slba);
|
|
|
|
nlb = le16_to_cpu(rw->nlb) + 1;
|
|
|
|
zone = nvme_get_zone_by_slba(ns, slba);
|
|
|
|
|
|
|
|
zone->d.wp += nlb;
|
|
|
|
|
|
|
|
if (failed) {
|
|
|
|
res->slba = 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (zone->d.wp == nvme_zone_wr_boundary(zone)) {
|
|
|
|
switch (nvme_get_zone_state(zone)) {
|
|
|
|
case NVME_ZONE_STATE_IMPLICITLY_OPEN:
|
|
|
|
case NVME_ZONE_STATE_EXPLICITLY_OPEN:
|
|
|
|
nvme_aor_dec_open(ns);
|
|
|
|
/* fall through */
|
|
|
|
case NVME_ZONE_STATE_CLOSED:
|
|
|
|
nvme_aor_dec_active(ns);
|
|
|
|
/* fall through */
|
|
|
|
case NVME_ZONE_STATE_EMPTY:
|
|
|
|
nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_FULL);
|
|
|
|
/* fall through */
|
|
|
|
case NVME_ZONE_STATE_FULL:
|
|
|
|
break;
|
|
|
|
default:
|
|
|
|
assert(false);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
static uint64_t nvme_advance_zone_wp(NvmeNamespace *ns, NvmeZone *zone,
|
|
|
|
uint32_t nlb)
|
|
|
|
{
|
|
|
|
uint64_t result = zone->w_ptr;
|
|
|
|
uint8_t zs;
|
|
|
|
|
|
|
|
zone->w_ptr += nlb;
|
|
|
|
|
|
|
|
if (zone->w_ptr < nvme_zone_wr_boundary(zone)) {
|
|
|
|
zs = nvme_get_zone_state(zone);
|
|
|
|
switch (zs) {
|
|
|
|
case NVME_ZONE_STATE_EMPTY:
|
|
|
|
nvme_aor_inc_active(ns);
|
|
|
|
/* fall through */
|
|
|
|
case NVME_ZONE_STATE_CLOSED:
|
|
|
|
nvme_aor_inc_open(ns);
|
|
|
|
nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_IMPLICITLY_OPEN);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
return result;
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline bool nvme_is_write(NvmeRequest *req)
|
|
|
|
{
|
|
|
|
NvmeRwCmd *rw = (NvmeRwCmd *)&req->cmd;
|
|
|
|
|
|
|
|
return rw->opcode == NVME_CMD_WRITE ||
|
|
|
|
rw->opcode == NVME_CMD_ZONE_APPEND ||
|
|
|
|
rw->opcode == NVME_CMD_WRITE_ZEROES;
|
|
|
|
}
|
|
|
|
|
|
|
|
static void nvme_rw_cb(void *opaque, int ret)
|
|
|
|
{
|
|
|
|
NvmeRequest *req = opaque;
|
|
|
|
NvmeNamespace *ns = req->ns;
|
|
|
|
|
|
|
|
BlockBackend *blk = ns->blkconf.blk;
|
|
|
|
BlockAcctCookie *acct = &req->acct;
|
|
|
|
BlockAcctStats *stats = blk_get_stats(blk);
|
|
|
|
|
|
|
|
trace_pci_nvme_rw_cb(nvme_cid(req), blk_name(blk));
|
|
|
|
|
|
|
|
if (ns->params.zoned && nvme_is_write(req)) {
|
|
|
|
nvme_finalize_zoned_write(ns, req, ret != 0);
|
|
|
|
}
|
|
|
|
|
|
|
|
if (!ret) {
|
|
|
|
block_acct_done(stats, acct);
|
|
|
|
} else {
|
|
|
|
block_acct_failed(stats, acct);
|
|
|
|
nvme_aio_err(req, ret);
|
|
|
|
}
|
|
|
|
|
|
|
|
nvme_enqueue_req_completion(nvme_cq(req), req);
|
|
|
|
}
|
|
|
|
|
|
|
|
static void nvme_aio_discard_cb(void *opaque, int ret)
|
|
|
|
{
|
|
|
|
NvmeRequest *req = opaque;
|
|
|
|
uintptr_t *discards = (uintptr_t *)&req->opaque;
|
|
|
|
|
|
|
|
trace_pci_nvme_aio_discard_cb(nvme_cid(req));
|
|
|
|
|
|
|
|
if (ret) {
|
|
|
|
nvme_aio_err(req, ret);
|
|
|
|
}
|
|
|
|
|
|
|
|
(*discards)--;
|
|
|
|
|
|
|
|
if (*discards) {
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
|
|
|
nvme_enqueue_req_completion(nvme_cq(req), req);
|
|
|
|
}
|
|
|
|
|
|
|
|
struct nvme_zone_reset_ctx {
|
|
|
|
NvmeRequest *req;
|
|
|
|
NvmeZone *zone;
|
|
|
|
};
|
|
|
|
|
|
|
|
static void nvme_aio_zone_reset_cb(void *opaque, int ret)
|
|
|
|
{
|
|
|
|
struct nvme_zone_reset_ctx *ctx = opaque;
|
|
|
|
NvmeRequest *req = ctx->req;
|
|
|
|
NvmeNamespace *ns = req->ns;
|
|
|
|
NvmeZone *zone = ctx->zone;
|
|
|
|
uintptr_t *resets = (uintptr_t *)&req->opaque;
|
|
|
|
|
|
|
|
g_free(ctx);
|
|
|
|
|
|
|
|
trace_pci_nvme_aio_zone_reset_cb(nvme_cid(req), zone->d.zslba);
|
|
|
|
|
|
|
|
if (!ret) {
|
|
|
|
switch (nvme_get_zone_state(zone)) {
|
|
|
|
case NVME_ZONE_STATE_EXPLICITLY_OPEN:
|
|
|
|
case NVME_ZONE_STATE_IMPLICITLY_OPEN:
|
|
|
|
nvme_aor_dec_open(ns);
|
|
|
|
/* fall through */
|
|
|
|
case NVME_ZONE_STATE_CLOSED:
|
|
|
|
nvme_aor_dec_active(ns);
|
|
|
|
/* fall through */
|
|
|
|
case NVME_ZONE_STATE_FULL:
|
|
|
|
zone->w_ptr = zone->d.zslba;
|
|
|
|
zone->d.wp = zone->w_ptr;
|
|
|
|
nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_EMPTY);
|
|
|
|
/* fall through */
|
|
|
|
default:
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
} else {
|
|
|
|
nvme_aio_err(req, ret);
|
|
|
|
}
|
|
|
|
|
|
|
|
(*resets)--;
|
|
|
|
|
|
|
|
if (*resets) {
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
|
|
|
nvme_enqueue_req_completion(nvme_cq(req), req);
|
|
|
|
}
|
|
|
|
|
|
|
|
struct nvme_compare_ctx {
|
|
|
|
QEMUIOVector iov;
|
|
|
|
uint8_t *bounce;
|
|
|
|
size_t len;
|
|
|
|
};
|
|
|
|
|
|
|
|
static void nvme_compare_cb(void *opaque, int ret)
|
|
|
|
{
|
|
|
|
NvmeRequest *req = opaque;
|
|
|
|
NvmeNamespace *ns = req->ns;
|
|
|
|
struct nvme_compare_ctx *ctx = req->opaque;
|
|
|
|
g_autofree uint8_t *buf = NULL;
|
|
|
|
uint16_t status;
|
|
|
|
|
|
|
|
trace_pci_nvme_compare_cb(nvme_cid(req));
|
|
|
|
|
|
|
|
if (!ret) {
|
|
|
|
block_acct_done(blk_get_stats(ns->blkconf.blk), &req->acct);
|
|
|
|
} else {
|
|
|
|
block_acct_failed(blk_get_stats(ns->blkconf.blk), &req->acct);
|
|
|
|
nvme_aio_err(req, ret);
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
|
|
|
buf = g_malloc(ctx->len);
|
|
|
|
|
|
|
|
status = nvme_dma(nvme_ctrl(req), buf, ctx->len, DMA_DIRECTION_TO_DEVICE,
|
|
|
|
req);
|
|
|
|
if (status) {
|
|
|
|
req->status = status;
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (memcmp(buf, ctx->bounce, ctx->len)) {
|
|
|
|
req->status = NVME_CMP_FAILURE;
|
|
|
|
}
|
|
|
|
|
|
|
|
out:
|
|
|
|
qemu_iovec_destroy(&ctx->iov);
|
|
|
|
g_free(ctx->bounce);
|
|
|
|
g_free(ctx);
|
|
|
|
|
|
|
|
nvme_enqueue_req_completion(nvme_cq(req), req);
|
|
|
|
}
|
|
|
|
|
|
|
|
static uint16_t nvme_dsm(NvmeCtrl *n, NvmeRequest *req)
|
|
|
|
{
|
|
|
|
NvmeNamespace *ns = req->ns;
|
|
|
|
NvmeDsmCmd *dsm = (NvmeDsmCmd *) &req->cmd;
|
|
|
|
|
|
|
|
uint32_t attr = le32_to_cpu(dsm->attributes);
|
|
|
|
uint32_t nr = (le32_to_cpu(dsm->nr) & 0xff) + 1;
|
|
|
|
|
|
|
|
uint16_t status = NVME_SUCCESS;
|
|
|
|
|
|
|
|
trace_pci_nvme_dsm(nvme_cid(req), nvme_nsid(ns), nr, attr);
|
|
|
|
|
|
|
|
if (attr & NVME_DSMGMT_AD) {
|
|
|
|
int64_t offset;
|
|
|
|
size_t len;
|
|
|
|
NvmeDsmRange range[nr];
|
|
|
|
uintptr_t *discards = (uintptr_t *)&req->opaque;
|
|
|
|
|
|
|
|
status = nvme_dma(n, (uint8_t *)range, sizeof(range),
|
|
|
|
DMA_DIRECTION_TO_DEVICE, req);
|
|
|
|
if (status) {
|
|
|
|
return status;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* AIO callbacks may be called immediately, so initialize discards to 1
|
|
|
|
* to make sure the callback does not complete the request before
|
|
|
|
* all discards have been issued.
|
|
|
|
*/
|
|
|
|
*discards = 1;
|
|
|
|
|
|
|
|
for (int i = 0; i < nr; i++) {
|
|
|
|
uint64_t slba = le64_to_cpu(range[i].slba);
|
|
|
|
uint32_t nlb = le32_to_cpu(range[i].nlb);
|
|
|
|
|
|
|
|
if (nvme_check_bounds(ns, slba, nlb)) {
|
|
|
|
trace_pci_nvme_err_invalid_lba_range(slba, nlb,
|
|
|
|
ns->id_ns.nsze);
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
|
|
|
|
trace_pci_nvme_dsm_deallocate(nvme_cid(req), nvme_nsid(ns), slba,
|
|
|
|
nlb);
|
|
|
|
|
|
|
|
offset = nvme_l2b(ns, slba);
|
|
|
|
len = nvme_l2b(ns, nlb);
|
|
|
|
|
|
|
|
while (len) {
|
|
|
|
size_t bytes = MIN(BDRV_REQUEST_MAX_BYTES, len);
|
|
|
|
|
|
|
|
(*discards)++;
|
|
|
|
|
|
|
|
blk_aio_pdiscard(ns->blkconf.blk, offset, bytes,
|
|
|
|
nvme_aio_discard_cb, req);
|
|
|
|
|
|
|
|
offset += bytes;
|
|
|
|
len -= bytes;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
/* account for the 1-initialization */
|
|
|
|
(*discards)--;
|
|
|
|
|
|
|
|
if (*discards) {
|
|
|
|
status = NVME_NO_COMPLETE;
|
|
|
|
} else {
|
|
|
|
status = req->status;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
return status;
|
|
|
|
}
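/*
 * Illustration only, not part of the device model: a stand-alone sketch of
 * the 1-initialized counter used by nvme_dsm() to keep a callback from
 * completing the request while sub-operations are still being issued. All
 * names are hypothetical. Unlike nvme_dsm(), which returns the status to
 * its caller when it drops the last reference, this sketch completes the
 * request through the same helper in both cases.
 */
```c
#include <stdint.h>

/* Hypothetical stand-in for a request with a pending-operation counter. */
struct sketch_req {
    uintptr_t pending;   /* outstanding sub-operations plus the guard */
    int completions;     /* how many times the request was completed */
};

/* Drop one reference; complete the request when the last one goes away. */
static void sketch_put(struct sketch_req *r)
{
    if (--r->pending == 0) {
        r->completions++;
    }
}

/*
 * Issue n sub-operations. If immediate is nonzero, each callback runs
 * synchronously during submission, which is the case the guard covers.
 */
static void sketch_submit(struct sketch_req *r, int n, int immediate)
{
    r->pending = 1; /* guard reference held while submitting */

    for (int i = 0; i < n; i++) {
        r->pending++;
        if (immediate) {
            sketch_put(r);
        }
    }

    sketch_put(r); /* account for the 1-initialization */
}
```
/*
 * Starting the counter at 0 instead would let the first synchronous
 * callback see it reach 0 and complete the request before the remaining
 * sub-operations were issued.
 */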
|
|
|
|
|
|
|
|
static uint16_t nvme_compare(NvmeCtrl *n, NvmeRequest *req)
|
|
|
|
{
|
|
|
|
NvmeRwCmd *rw = (NvmeRwCmd *)&req->cmd;
|
|
|
|
NvmeNamespace *ns = req->ns;
|
|
|
|
BlockBackend *blk = ns->blkconf.blk;
|
|
|
|
uint64_t slba = le64_to_cpu(rw->slba);
|
|
|
|
uint32_t nlb = le16_to_cpu(rw->nlb) + 1;
|
|
|
|
size_t len = nvme_l2b(ns, nlb);
|
|
|
|
int64_t offset = nvme_l2b(ns, slba);
|
|
|
|
uint8_t *bounce = NULL;
|
|
|
|
struct nvme_compare_ctx *ctx = NULL;
|
|
|
|
uint16_t status;
|
|
|
|
|
|
|
|
trace_pci_nvme_compare(nvme_cid(req), nvme_nsid(ns), slba, nlb);
|
|
|
|
|
|
|
|
status = nvme_check_mdts(n, len);
|
|
|
|
if (status) {
|
|
|
|
trace_pci_nvme_err_mdts(nvme_cid(req), len);
|
|
|
|
return status;
|
|
|
|
}
|
|
|
|
|
|
|
|
status = nvme_check_bounds(ns, slba, nlb);
|
|
|
|
if (status) {
|
|
|
|
trace_pci_nvme_err_invalid_lba_range(slba, nlb, ns->id_ns.nsze);
|
|
|
|
return status;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (NVME_ERR_REC_DULBE(ns->features.err_rec)) {
|
|
|
|
status = nvme_check_dulbe(ns, slba, nlb);
|
|
|
|
if (status) {
|
|
|
|
return status;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
bounce = g_malloc(len);
|
|
|
|
|
|
|
|
ctx = g_new(struct nvme_compare_ctx, 1);
|
|
|
|
ctx->bounce = bounce;
|
|
|
|
ctx->len = len;
|
|
|
|
|
|
|
|
req->opaque = ctx;
|
|
|
|
|
|
|
|
qemu_iovec_init(&ctx->iov, 1);
|
|
|
|
qemu_iovec_add(&ctx->iov, bounce, len);
|
|
|
|
|
|
|
|
block_acct_start(blk_get_stats(blk), &req->acct, len, BLOCK_ACCT_READ);
|
|
|
|
blk_aio_preadv(blk, offset, &ctx->iov, 0, nvme_compare_cb, req);
|
|
|
|
|
|
|
|
return NVME_NO_COMPLETE;
|
|
|
|
}
|
|
|
|
|
2020-08-24 13:43:38 +03:00
|
|
|
static uint16_t nvme_flush(NvmeCtrl *n, NvmeRequest *req)
|
|
|
|
{
|
2020-09-30 22:22:27 +03:00
|
|
|
block_acct_start(blk_get_stats(req->ns->blkconf.blk), &req->acct, 0,
|
|
|
|
BLOCK_ACCT_FLUSH);
|
|
|
|
req->aiocb = blk_aio_flush(req->ns->blkconf.blk, nvme_rw_cb, req);
|
|
|
|
return NVME_NO_COMPLETE;
|
2020-08-24 13:43:38 +03:00
|
|
|
}
|
|
|
|
|
2020-12-08 23:04:00 +03:00
|
|
|
static uint16_t nvme_read(NvmeCtrl *n, NvmeRequest *req)
{
    NvmeRwCmd *rw = (NvmeRwCmd *)&req->cmd;
    NvmeNamespace *ns = req->ns;
    uint64_t slba = le64_to_cpu(rw->slba);
    uint32_t nlb = (uint32_t)le16_to_cpu(rw->nlb) + 1;
    uint64_t data_size = nvme_l2b(ns, nlb);
    uint64_t data_offset;
    BlockBackend *blk = ns->blkconf.blk;
    uint16_t status;

    trace_pci_nvme_read(nvme_cid(req), nvme_nsid(ns), nlb, data_size, slba);

    status = nvme_check_mdts(n, data_size);
    if (status) {
        trace_pci_nvme_err_mdts(nvme_cid(req), data_size);
        goto invalid;
    }

    status = nvme_check_bounds(ns, slba, nlb);
    if (status) {
        trace_pci_nvme_err_invalid_lba_range(slba, nlb, ns->id_ns.nsze);
        goto invalid;
    }

    if (ns->params.zoned) {
        status = nvme_check_zone_read(ns, slba, nlb);
        if (status != NVME_SUCCESS) {
            trace_pci_nvme_err_zone_read_not_ok(slba, nlb, status);
            goto invalid;
        }
    }

    status = nvme_map_dptr(n, data_size, req);
    if (status) {
        goto invalid;
    }

    if (NVME_ERR_REC_DULBE(ns->features.err_rec)) {
        status = nvme_check_dulbe(ns, slba, nlb);
        if (status) {
            goto invalid;
        }
    }

    data_offset = nvme_l2b(ns, slba);

    block_acct_start(blk_get_stats(blk), &req->acct, data_size,
                     BLOCK_ACCT_READ);
    if (req->qsg.sg) {
        req->aiocb = dma_blk_read(blk, &req->qsg, data_offset,
                                  BDRV_SECTOR_SIZE, nvme_rw_cb, req);
    } else {
        req->aiocb = blk_aio_preadv(blk, data_offset, &req->iov, 0,
                                    nvme_rw_cb, req);
    }
    return NVME_NO_COMPLETE;

invalid:
    block_acct_invalid(blk_get_stats(blk), BLOCK_ACCT_READ);
    return status | NVME_DNR;
}

static uint16_t nvme_do_write(NvmeCtrl *n, NvmeRequest *req, bool append,
                              bool wrz)
{
    NvmeRwCmd *rw = (NvmeRwCmd *)&req->cmd;
    NvmeNamespace *ns = req->ns;
    uint64_t slba = le64_to_cpu(rw->slba);
    uint32_t nlb = (uint32_t)le16_to_cpu(rw->nlb) + 1;
    uint64_t data_size = nvme_l2b(ns, nlb);
    uint64_t data_offset;
    NvmeZone *zone;
    NvmeZonedResult *res = (NvmeZonedResult *)&req->cqe;
    BlockBackend *blk = ns->blkconf.blk;
    uint16_t status;

    trace_pci_nvme_write(nvme_cid(req), nvme_io_opc_str(rw->opcode),
                         nvme_nsid(ns), nlb, data_size, slba);

    if (!wrz) {
        status = nvme_check_mdts(n, data_size);
        if (status) {
            trace_pci_nvme_err_mdts(nvme_cid(req), data_size);
            goto invalid;
        }
    }

    status = nvme_check_bounds(ns, slba, nlb);
    if (status) {
        trace_pci_nvme_err_invalid_lba_range(slba, nlb, ns->id_ns.nsze);
        goto invalid;
    }

    if (ns->params.zoned) {
        zone = nvme_get_zone_by_slba(ns, slba);

        status = nvme_check_zone_write(n, ns, zone, slba, nlb, append);
        if (status != NVME_SUCCESS) {
            goto invalid;
        }

        status = nvme_auto_open_zone(ns, zone);
        if (status != NVME_SUCCESS) {
            goto invalid;
        }

        if (append) {
            slba = zone->w_ptr;
        }

        res->slba = nvme_advance_zone_wp(ns, zone, nlb);
    }

    data_offset = nvme_l2b(ns, slba);

    if (!wrz) {
        status = nvme_map_dptr(n, data_size, req);
        if (status) {
            goto invalid;
        }

        block_acct_start(blk_get_stats(blk), &req->acct, data_size,
                         BLOCK_ACCT_WRITE);
        if (req->qsg.sg) {
            req->aiocb = dma_blk_write(blk, &req->qsg, data_offset,
                                       BDRV_SECTOR_SIZE, nvme_rw_cb, req);
        } else {
            req->aiocb = blk_aio_pwritev(blk, data_offset, &req->iov, 0,
                                         nvme_rw_cb, req);
        }
    } else {
        block_acct_start(blk_get_stats(blk), &req->acct, 0, BLOCK_ACCT_WRITE);
        req->aiocb = blk_aio_pwrite_zeroes(blk, data_offset, data_size,
                                           BDRV_REQ_MAY_UNMAP, nvme_rw_cb,
                                           req);
    }
    return NVME_NO_COMPLETE;

invalid:
    block_acct_invalid(blk_get_stats(blk), BLOCK_ACCT_WRITE);
    return status | NVME_DNR;
}

static inline uint16_t nvme_write(NvmeCtrl *n, NvmeRequest *req)
{
    return nvme_do_write(n, req, false, false);
}

static inline uint16_t nvme_write_zeroes(NvmeCtrl *n, NvmeRequest *req)
{
    return nvme_do_write(n, req, false, true);
}

static inline uint16_t nvme_zone_append(NvmeCtrl *n, NvmeRequest *req)
{
    return nvme_do_write(n, req, true, false);
}

static uint16_t nvme_get_mgmt_zone_slba_idx(NvmeNamespace *ns, NvmeCmd *c,
                                            uint64_t *slba, uint32_t *zone_idx)
{
    uint32_t dw10 = le32_to_cpu(c->cdw10);
    uint32_t dw11 = le32_to_cpu(c->cdw11);

    if (!ns->params.zoned) {
        trace_pci_nvme_err_invalid_opc(c->opcode);
        return NVME_INVALID_OPCODE | NVME_DNR;
    }

    *slba = ((uint64_t)dw11) << 32 | dw10;
    if (unlikely(*slba >= ns->id_ns.nsze)) {
        trace_pci_nvme_err_invalid_lba_range(*slba, 0, ns->id_ns.nsze);
        *slba = 0;
        return NVME_LBA_RANGE | NVME_DNR;
    }

    *zone_idx = nvme_zone_idx(ns, *slba);
    assert(*zone_idx < ns->num_zones);

    return NVME_SUCCESS;
}

typedef uint16_t (*op_handler_t)(NvmeNamespace *, NvmeZone *, NvmeZoneState,
                                 NvmeRequest *);

enum NvmeZoneProcessingMask {
    NVME_PROC_CURRENT_ZONE    = 0,
    NVME_PROC_OPENED_ZONES    = 1 << 0,
    NVME_PROC_CLOSED_ZONES    = 1 << 1,
    NVME_PROC_READ_ONLY_ZONES = 1 << 2,
    NVME_PROC_FULL_ZONES      = 1 << 3,
};

static uint16_t nvme_open_zone(NvmeNamespace *ns, NvmeZone *zone,
                               NvmeZoneState state, NvmeRequest *req)
{
    uint16_t status;

    switch (state) {
    case NVME_ZONE_STATE_EMPTY:
        status = nvme_aor_check(ns, 1, 0);
        if (status != NVME_SUCCESS) {
            return status;
        }
        nvme_aor_inc_active(ns);
        /* fall through */
    case NVME_ZONE_STATE_CLOSED:
        status = nvme_aor_check(ns, 0, 1);
        if (status != NVME_SUCCESS) {
            if (state == NVME_ZONE_STATE_EMPTY) {
                nvme_aor_dec_active(ns);
            }
            return status;
        }
        nvme_aor_inc_open(ns);
        /* fall through */
    case NVME_ZONE_STATE_IMPLICITLY_OPEN:
        nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_EXPLICITLY_OPEN);
        /* fall through */
    case NVME_ZONE_STATE_EXPLICITLY_OPEN:
        return NVME_SUCCESS;
    default:
        return NVME_ZONE_INVAL_TRANSITION;
    }
}

static uint16_t nvme_close_zone(NvmeNamespace *ns, NvmeZone *zone,
                                NvmeZoneState state, NvmeRequest *req)
{
    switch (state) {
    case NVME_ZONE_STATE_EXPLICITLY_OPEN:
    case NVME_ZONE_STATE_IMPLICITLY_OPEN:
        nvme_aor_dec_open(ns);
        nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_CLOSED);
        /* fall through */
    case NVME_ZONE_STATE_CLOSED:
        return NVME_SUCCESS;
    default:
        return NVME_ZONE_INVAL_TRANSITION;
    }
}

static uint16_t nvme_finish_zone(NvmeNamespace *ns, NvmeZone *zone,
                                 NvmeZoneState state, NvmeRequest *req)
{
    switch (state) {
    case NVME_ZONE_STATE_EXPLICITLY_OPEN:
    case NVME_ZONE_STATE_IMPLICITLY_OPEN:
        nvme_aor_dec_open(ns);
        /* fall through */
    case NVME_ZONE_STATE_CLOSED:
        nvme_aor_dec_active(ns);
        /* fall through */
    case NVME_ZONE_STATE_EMPTY:
        zone->w_ptr = nvme_zone_wr_boundary(zone);
        zone->d.wp = zone->w_ptr;
        nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_FULL);
        /* fall through */
    case NVME_ZONE_STATE_FULL:
        return NVME_SUCCESS;
    default:
        return NVME_ZONE_INVAL_TRANSITION;
    }
}

static uint16_t nvme_reset_zone(NvmeNamespace *ns, NvmeZone *zone,
                                NvmeZoneState state, NvmeRequest *req)
{
    uintptr_t *resets = (uintptr_t *)&req->opaque;
    struct nvme_zone_reset_ctx *ctx;

    switch (state) {
    case NVME_ZONE_STATE_EMPTY:
        return NVME_SUCCESS;
    case NVME_ZONE_STATE_EXPLICITLY_OPEN:
    case NVME_ZONE_STATE_IMPLICITLY_OPEN:
    case NVME_ZONE_STATE_CLOSED:
    case NVME_ZONE_STATE_FULL:
        break;
    default:
        return NVME_ZONE_INVAL_TRANSITION;
    }

    /*
     * The zone reset aio callback needs to know the zone that is being reset
     * in order to transition the zone on completion.
     */
    ctx = g_new(struct nvme_zone_reset_ctx, 1);
    ctx->req = req;
    ctx->zone = zone;

    (*resets)++;

    blk_aio_pwrite_zeroes(ns->blkconf.blk, nvme_l2b(ns, zone->d.zslba),
                          nvme_l2b(ns, ns->zone_size), BDRV_REQ_MAY_UNMAP,
                          nvme_aio_zone_reset_cb, ctx);

    return NVME_NO_COMPLETE;
}

static uint16_t nvme_offline_zone(NvmeNamespace *ns, NvmeZone *zone,
                                  NvmeZoneState state, NvmeRequest *req)
{
    switch (state) {
    case NVME_ZONE_STATE_READ_ONLY:
        nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_OFFLINE);
        /* fall through */
    case NVME_ZONE_STATE_OFFLINE:
        return NVME_SUCCESS;
    default:
        return NVME_ZONE_INVAL_TRANSITION;
    }
}

static uint16_t nvme_set_zd_ext(NvmeNamespace *ns, NvmeZone *zone)
{
    uint16_t status;
    uint8_t state = nvme_get_zone_state(zone);

    if (state == NVME_ZONE_STATE_EMPTY) {
        status = nvme_aor_check(ns, 1, 0);
        if (status != NVME_SUCCESS) {
            return status;
        }
        nvme_aor_inc_active(ns);
        zone->d.za |= NVME_ZA_ZD_EXT_VALID;
        nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_CLOSED);
        return NVME_SUCCESS;
    }

    return NVME_ZONE_INVAL_TRANSITION;
}

static uint16_t nvme_bulk_proc_zone(NvmeNamespace *ns, NvmeZone *zone,
                                    enum NvmeZoneProcessingMask proc_mask,
                                    op_handler_t op_hndlr, NvmeRequest *req)
{
    uint16_t status = NVME_SUCCESS;
    NvmeZoneState zs = nvme_get_zone_state(zone);
    bool proc_zone;

    switch (zs) {
    case NVME_ZONE_STATE_IMPLICITLY_OPEN:
    case NVME_ZONE_STATE_EXPLICITLY_OPEN:
        proc_zone = proc_mask & NVME_PROC_OPENED_ZONES;
        break;
    case NVME_ZONE_STATE_CLOSED:
        proc_zone = proc_mask & NVME_PROC_CLOSED_ZONES;
        break;
    case NVME_ZONE_STATE_READ_ONLY:
        proc_zone = proc_mask & NVME_PROC_READ_ONLY_ZONES;
        break;
    case NVME_ZONE_STATE_FULL:
        proc_zone = proc_mask & NVME_PROC_FULL_ZONES;
        break;
    default:
        proc_zone = false;
    }

    if (proc_zone) {
        status = op_hndlr(ns, zone, zs, req);
    }

    return status;
}

static uint16_t nvme_do_zone_op(NvmeNamespace *ns, NvmeZone *zone,
                                enum NvmeZoneProcessingMask proc_mask,
                                op_handler_t op_hndlr, NvmeRequest *req)
{
    NvmeZone *next;
    uint16_t status = NVME_SUCCESS;
    int i;

    if (!proc_mask) {
        status = op_hndlr(ns, zone, nvme_get_zone_state(zone), req);
    } else {
        if (proc_mask & NVME_PROC_CLOSED_ZONES) {
            QTAILQ_FOREACH_SAFE(zone, &ns->closed_zones, entry, next) {
                status = nvme_bulk_proc_zone(ns, zone, proc_mask, op_hndlr,
                                             req);
                if (status && status != NVME_NO_COMPLETE) {
                    goto out;
                }
            }
        }
        if (proc_mask & NVME_PROC_OPENED_ZONES) {
            QTAILQ_FOREACH_SAFE(zone, &ns->imp_open_zones, entry, next) {
                status = nvme_bulk_proc_zone(ns, zone, proc_mask, op_hndlr,
                                             req);
                if (status && status != NVME_NO_COMPLETE) {
                    goto out;
                }
            }

            QTAILQ_FOREACH_SAFE(zone, &ns->exp_open_zones, entry, next) {
                status = nvme_bulk_proc_zone(ns, zone, proc_mask, op_hndlr,
                                             req);
                if (status && status != NVME_NO_COMPLETE) {
                    goto out;
                }
            }
        }
        if (proc_mask & NVME_PROC_FULL_ZONES) {
            QTAILQ_FOREACH_SAFE(zone, &ns->full_zones, entry, next) {
                status = nvme_bulk_proc_zone(ns, zone, proc_mask, op_hndlr,
                                             req);
                if (status && status != NVME_NO_COMPLETE) {
                    goto out;
                }
            }
        }

        if (proc_mask & NVME_PROC_READ_ONLY_ZONES) {
            for (i = 0; i < ns->num_zones; i++, zone++) {
                status = nvme_bulk_proc_zone(ns, zone, proc_mask, op_hndlr,
                                             req);
                if (status && status != NVME_NO_COMPLETE) {
                    goto out;
                }
            }
        }
    }

out:
    return status;
}

static uint16_t nvme_zone_mgmt_send(NvmeCtrl *n, NvmeRequest *req)
{
    NvmeCmd *cmd = (NvmeCmd *)&req->cmd;
    NvmeNamespace *ns = req->ns;
    NvmeZone *zone;
    uintptr_t *resets;
    uint8_t *zd_ext;
    uint32_t dw13 = le32_to_cpu(cmd->cdw13);
    uint64_t slba = 0;
    uint32_t zone_idx = 0;
    uint16_t status;
    uint8_t action;
    bool all;
    enum NvmeZoneProcessingMask proc_mask = NVME_PROC_CURRENT_ZONE;

    action = dw13 & 0xff;
    all = dw13 & 0x100;

    req->status = NVME_SUCCESS;

    if (!all) {
        status = nvme_get_mgmt_zone_slba_idx(ns, cmd, &slba, &zone_idx);
        if (status) {
            return status;
        }
    }

    zone = &ns->zone_array[zone_idx];
    if (slba != zone->d.zslba) {
        trace_pci_nvme_err_unaligned_zone_cmd(action, slba, zone->d.zslba);
        return NVME_INVALID_FIELD | NVME_DNR;
    }

    switch (action) {

    case NVME_ZONE_ACTION_OPEN:
        if (all) {
            proc_mask = NVME_PROC_CLOSED_ZONES;
        }
        trace_pci_nvme_open_zone(slba, zone_idx, all);
        status = nvme_do_zone_op(ns, zone, proc_mask, nvme_open_zone, req);
        break;

    case NVME_ZONE_ACTION_CLOSE:
        if (all) {
            proc_mask = NVME_PROC_OPENED_ZONES;
        }
        trace_pci_nvme_close_zone(slba, zone_idx, all);
        status = nvme_do_zone_op(ns, zone, proc_mask, nvme_close_zone, req);
        break;

    case NVME_ZONE_ACTION_FINISH:
        if (all) {
            proc_mask = NVME_PROC_OPENED_ZONES | NVME_PROC_CLOSED_ZONES;
        }
        trace_pci_nvme_finish_zone(slba, zone_idx, all);
        status = nvme_do_zone_op(ns, zone, proc_mask, nvme_finish_zone, req);
        break;

    case NVME_ZONE_ACTION_RESET:
        resets = (uintptr_t *)&req->opaque;

        if (all) {
            proc_mask = NVME_PROC_OPENED_ZONES | NVME_PROC_CLOSED_ZONES |
                NVME_PROC_FULL_ZONES;
        }
        trace_pci_nvme_reset_zone(slba, zone_idx, all);

        *resets = 1;

        status = nvme_do_zone_op(ns, zone, proc_mask, nvme_reset_zone, req);

        (*resets)--;

        return *resets ? NVME_NO_COMPLETE : req->status;

    case NVME_ZONE_ACTION_OFFLINE:
        if (all) {
            proc_mask = NVME_PROC_READ_ONLY_ZONES;
        }
        trace_pci_nvme_offline_zone(slba, zone_idx, all);
        status = nvme_do_zone_op(ns, zone, proc_mask, nvme_offline_zone, req);
        break;

    case NVME_ZONE_ACTION_SET_ZD_EXT:
        trace_pci_nvme_set_descriptor_extension(slba, zone_idx);
        if (all || !ns->params.zd_extension_size) {
            return NVME_INVALID_FIELD | NVME_DNR;
        }
        zd_ext = nvme_get_zd_extension(ns, zone_idx);
        status = nvme_dma(n, zd_ext, ns->params.zd_extension_size,
                          DMA_DIRECTION_TO_DEVICE, req);
        if (status) {
            trace_pci_nvme_err_zd_extension_map_error(zone_idx);
            return status;
        }

        status = nvme_set_zd_ext(ns, zone);
        if (status == NVME_SUCCESS) {
            trace_pci_nvme_zd_extension_set(zone_idx);
            return status;
        }
        break;

    default:
        trace_pci_nvme_err_invalid_mgmt_action(action);
        status = NVME_INVALID_FIELD;
    }

    if (status == NVME_ZONE_INVAL_TRANSITION) {
        trace_pci_nvme_err_invalid_zone_state_transition(action, slba,
                                                         zone->d.za);
    }
    if (status) {
        status |= NVME_DNR;
    }

    return status;
}

static bool nvme_zone_matches_filter(uint32_t zafs, NvmeZone *zl)
{
    NvmeZoneState zs = nvme_get_zone_state(zl);

    switch (zafs) {
    case NVME_ZONE_REPORT_ALL:
        return true;
    case NVME_ZONE_REPORT_EMPTY:
        return zs == NVME_ZONE_STATE_EMPTY;
    case NVME_ZONE_REPORT_IMPLICITLY_OPEN:
        return zs == NVME_ZONE_STATE_IMPLICITLY_OPEN;
    case NVME_ZONE_REPORT_EXPLICITLY_OPEN:
        return zs == NVME_ZONE_STATE_EXPLICITLY_OPEN;
    case NVME_ZONE_REPORT_CLOSED:
        return zs == NVME_ZONE_STATE_CLOSED;
    case NVME_ZONE_REPORT_FULL:
        return zs == NVME_ZONE_STATE_FULL;
    case NVME_ZONE_REPORT_READ_ONLY:
        return zs == NVME_ZONE_STATE_READ_ONLY;
    case NVME_ZONE_REPORT_OFFLINE:
        return zs == NVME_ZONE_STATE_OFFLINE;
    default:
        return false;
    }
}

static uint16_t nvme_zone_mgmt_recv(NvmeCtrl *n, NvmeRequest *req)
{
    NvmeCmd *cmd = (NvmeCmd *)&req->cmd;
    NvmeNamespace *ns = req->ns;
    /* cdw12 is zero-based number of dwords to return. Convert to bytes */
    uint32_t data_size = (le32_to_cpu(cmd->cdw12) + 1) << 2;
    uint32_t dw13 = le32_to_cpu(cmd->cdw13);
    uint32_t zone_idx, zra, zrasf, partial;
    uint64_t max_zones, nr_zones = 0;
    uint16_t status;
    uint64_t slba, capacity = nvme_ns_nlbas(ns);
    NvmeZoneDescr *z;
    NvmeZone *zone;
    NvmeZoneReportHeader *header;
    void *buf, *buf_p;
    size_t zone_entry_sz;

    req->status = NVME_SUCCESS;

    status = nvme_get_mgmt_zone_slba_idx(ns, cmd, &slba, &zone_idx);
    if (status) {
        return status;
    }

    zra = dw13 & 0xff;
    if (zra != NVME_ZONE_REPORT && zra != NVME_ZONE_REPORT_EXTENDED) {
        return NVME_INVALID_FIELD | NVME_DNR;
    }
    if (zra == NVME_ZONE_REPORT_EXTENDED && !ns->params.zd_extension_size) {
        return NVME_INVALID_FIELD | NVME_DNR;
    }

    zrasf = (dw13 >> 8) & 0xff;
    if (zrasf > NVME_ZONE_REPORT_OFFLINE) {
        return NVME_INVALID_FIELD | NVME_DNR;
    }

    if (data_size < sizeof(NvmeZoneReportHeader)) {
        return NVME_INVALID_FIELD | NVME_DNR;
    }

    status = nvme_check_mdts(n, data_size);
    if (status) {
        trace_pci_nvme_err_mdts(nvme_cid(req), data_size);
        return status;
    }

    partial = (dw13 >> 16) & 0x01;

    zone_entry_sz = sizeof(NvmeZoneDescr);
    if (zra == NVME_ZONE_REPORT_EXTENDED) {
        zone_entry_sz += ns->params.zd_extension_size;
    }

    max_zones = (data_size - sizeof(NvmeZoneReportHeader)) / zone_entry_sz;
    buf = g_malloc0(data_size);

    zone = &ns->zone_array[zone_idx];
    for (; slba < capacity; slba += ns->zone_size) {
        if (partial && nr_zones >= max_zones) {
            break;
        }
        if (nvme_zone_matches_filter(zrasf, zone++)) {
            nr_zones++;
        }
    }
    header = (NvmeZoneReportHeader *)buf;
    header->nr_zones = cpu_to_le64(nr_zones);

    buf_p = buf + sizeof(NvmeZoneReportHeader);
    for (; zone_idx < ns->num_zones && max_zones > 0; zone_idx++) {
        zone = &ns->zone_array[zone_idx];
        if (nvme_zone_matches_filter(zrasf, zone)) {
            z = (NvmeZoneDescr *)buf_p;
            buf_p += sizeof(NvmeZoneDescr);

            z->zt = zone->d.zt;
            z->zs = zone->d.zs;
            z->zcap = cpu_to_le64(zone->d.zcap);
            z->zslba = cpu_to_le64(zone->d.zslba);
            z->za = zone->d.za;

            if (nvme_wp_is_valid(zone)) {
                z->wp = cpu_to_le64(zone->d.wp);
            } else {
                z->wp = cpu_to_le64(~0ULL);
            }

            if (zra == NVME_ZONE_REPORT_EXTENDED) {
                if (zone->d.za & NVME_ZA_ZD_EXT_VALID) {
                    memcpy(buf_p, nvme_get_zd_extension(ns, zone_idx),
                           ns->params.zd_extension_size);
                }
                buf_p += ns->params.zd_extension_size;
            }

            max_zones--;
        }
    }

    status = nvme_dma(n, (uint8_t *)buf, data_size,
                      DMA_DIRECTION_FROM_DEVICE, req);

    g_free(buf);

    return status;
}

static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeRequest *req)
{
    uint32_t nsid = le32_to_cpu(req->cmd.nsid);

    trace_pci_nvme_io_cmd(nvme_cid(req), nsid, nvme_sqid(req),
                          req->cmd.opcode, nvme_io_opc_str(req->cmd.opcode));

    if (!nvme_nsid_valid(n, nsid)) {
        return NVME_INVALID_NSID | NVME_DNR;
    }

    req->ns = nvme_ns(n, nsid);
    if (unlikely(!req->ns)) {
        return NVME_INVALID_FIELD | NVME_DNR;
    }

    if (!(req->ns->iocs[req->cmd.opcode] & NVME_CMD_EFF_CSUPP)) {
        trace_pci_nvme_err_invalid_opc(req->cmd.opcode);
        return NVME_INVALID_OPCODE | NVME_DNR;
    }

    switch (req->cmd.opcode) {
    case NVME_CMD_FLUSH:
        return nvme_flush(n, req);
    case NVME_CMD_WRITE_ZEROES:
        return nvme_write_zeroes(n, req);
    case NVME_CMD_ZONE_APPEND:
        return nvme_zone_append(n, req);
    case NVME_CMD_WRITE:
        return nvme_write(n, req);
    case NVME_CMD_READ:
        return nvme_read(n, req);
    case NVME_CMD_COMPARE:
        return nvme_compare(n, req);
    case NVME_CMD_DSM:
        return nvme_dsm(n, req);
    case NVME_CMD_ZONE_MGMT_SEND:
        return nvme_zone_mgmt_send(n, req);
    case NVME_CMD_ZONE_MGMT_RECV:
        return nvme_zone_mgmt_recv(n, req);
    default:
        assert(false);
    }

    return NVME_INVALID_OPCODE | NVME_DNR;
}

static void nvme_free_sq(NvmeSQueue *sq, NvmeCtrl *n)
{
    n->sq[sq->sqid] = NULL;
    timer_free(sq->timer);
    g_free(sq->io_req);
    if (sq->sqid) {
        g_free(sq);
    }
}

static uint16_t nvme_del_sq(NvmeCtrl *n, NvmeRequest *req)
{
    NvmeDeleteQ *c = (NvmeDeleteQ *)&req->cmd;
    NvmeRequest *r, *next;
    NvmeSQueue *sq;
    NvmeCQueue *cq;
    uint16_t qid = le16_to_cpu(c->qid);

    if (unlikely(!qid || nvme_check_sqid(n, qid))) {
        trace_pci_nvme_err_invalid_del_sq(qid);
        return NVME_INVALID_QID | NVME_DNR;
    }

    trace_pci_nvme_del_sq(qid);

    sq = n->sq[qid];
    while (!QTAILQ_EMPTY(&sq->out_req_list)) {
        r = QTAILQ_FIRST(&sq->out_req_list);
        assert(r->aiocb);
        blk_aio_cancel(r->aiocb);
    }
    if (!nvme_check_cqid(n, sq->cqid)) {
        cq = n->cq[sq->cqid];
        QTAILQ_REMOVE(&cq->sq_list, sq, entry);

        nvme_post_cqes(cq);
        QTAILQ_FOREACH_SAFE(r, &cq->req_list, entry, next) {
            if (r->sq == sq) {
                QTAILQ_REMOVE(&cq->req_list, r, entry);
                QTAILQ_INSERT_TAIL(&sq->req_list, r, entry);
            }
        }
    }

    nvme_free_sq(sq, n);
    return NVME_SUCCESS;
}

static void nvme_init_sq(NvmeSQueue *sq, NvmeCtrl *n, uint64_t dma_addr,
                         uint16_t sqid, uint16_t cqid, uint16_t size)
{
    int i;
    NvmeCQueue *cq;

    sq->ctrl = n;
    sq->dma_addr = dma_addr;
    sq->sqid = sqid;
    sq->size = size;
    sq->cqid = cqid;
    sq->head = sq->tail = 0;
    sq->io_req = g_new0(NvmeRequest, sq->size);

    QTAILQ_INIT(&sq->req_list);
    QTAILQ_INIT(&sq->out_req_list);
    for (i = 0; i < sq->size; i++) {
        sq->io_req[i].sq = sq;
        QTAILQ_INSERT_TAIL(&(sq->req_list), &sq->io_req[i], entry);
    }
    sq->timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, nvme_process_sq, sq);

    assert(n->cq[cqid]);
    cq = n->cq[cqid];
    QTAILQ_INSERT_TAIL(&(cq->sq_list), sq, entry);
    n->sq[sqid] = sq;
}

static uint16_t nvme_create_sq(NvmeCtrl *n, NvmeRequest *req)
|
2013-06-04 19:17:10 +04:00
|
|
|
{
|
|
|
|
NvmeSQueue *sq;
|
2020-07-20 13:44:01 +03:00
|
|
|
NvmeCreateSq *c = (NvmeCreateSq *)&req->cmd;
|
2013-06-04 19:17:10 +04:00
|
|
|
|
|
|
|
uint16_t cqid = le16_to_cpu(c->cqid);
|
|
|
|
uint16_t sqid = le16_to_cpu(c->sqid);
|
|
|
|
uint16_t qsize = le16_to_cpu(c->qsize);
|
|
|
|
uint16_t qflags = le16_to_cpu(c->sq_flags);
|
|
|
|
uint64_t prp1 = le64_to_cpu(c->prp1);
|
|
|
|
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_create_sq(prp1, sqid, cqid, qsize, qflags);
|
2017-11-03 16:37:53 +03:00
|
|
|
|
|
|
|
if (unlikely(!cqid || nvme_check_cqid(n, cqid))) {
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_err_invalid_create_sq_cqid(cqid);
|
2013-06-04 19:17:10 +04:00
|
|
|
return NVME_INVALID_CQID | NVME_DNR;
|
|
|
|
}
|
2020-10-22 12:07:08 +03:00
|
|
|
if (unlikely(!sqid || sqid > n->params.max_ioqpairs ||
|
|
|
|
n->sq[sqid] != NULL)) {
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_err_invalid_create_sq_sqid(sqid);
|
2013-06-04 19:17:10 +04:00
|
|
|
return NVME_INVALID_QID | NVME_DNR;
|
|
|
|
}
|
2017-11-03 16:37:53 +03:00
|
|
|
if (unlikely(!qsize || qsize > NVME_CAP_MQES(n->bar.cap))) {
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_err_invalid_create_sq_size(qsize);
|
2013-06-04 19:17:10 +04:00
|
|
|
return NVME_MAX_QSIZE_EXCEEDED | NVME_DNR;
|
|
|
|
}
|
2020-10-22 09:28:46 +03:00
|
|
|
if (unlikely(prp1 & (n->page_size - 1))) {
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_err_invalid_create_sq_addr(prp1);
|
2020-10-22 09:28:46 +03:00
|
|
|
return NVME_INVALID_PRP_OFFSET | NVME_DNR;
|
2013-06-04 19:17:10 +04:00
|
|
|
}
|
2017-11-03 16:37:53 +03:00
|
|
|
if (unlikely(!(NVME_SQ_FLAGS_PC(qflags)))) {
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_err_invalid_create_sq_qflags(NVME_SQ_FLAGS_PC(qflags));
|
2013-06-04 19:17:10 +04:00
|
|
|
return NVME_INVALID_FIELD | NVME_DNR;
|
|
|
|
}
|
|
|
|
sq = g_malloc0(sizeof(*sq));
|
|
|
|
/* QSIZE is a 0's based value, so the queue holds qsize + 1 entries */
nvme_init_sq(sq, n, prp1, sqid, cqid, qsize + 1);
|
|
|
|
return NVME_SUCCESS;
|
|
|
|
}
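/*
 * Illustrative sketch only, not part of the device model: a standalone
 * version of two checks made in nvme_create_sq() above, namely the page
 * alignment test on PRP1 and the 0's based queue size. The helper names
 * prp_is_page_aligned() and queue_entries() are hypothetical, and the page
 * size is assumed to be a power of two.
 */

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if addr has no offset within a page.
 * page_size is assumed to be a non-zero power of two.
 */
bool prp_is_page_aligned(uint64_t addr, uint32_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return (addr & ((uint64_t)page_size - 1)) == 0;
}

/*
 * Hypothetical helper: QSIZE is a 0's based value, so a queue created with
 * qsize holds qsize + 1 entries. The result is widened to 32 bits so that
 * qsize == 0xffff does not wrap.
 */
uint32_t queue_entries(uint16_t qsize)
{
    return (uint32_t)qsize + 1;
}
```

/*
 * With a 4096 byte page, 0x10000 passes the alignment test while 0x10200
 * would be rejected, which corresponds to the NVME_INVALID_PRP_OFFSET path.
 */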
|
|
|
|
|
2020-09-30 20:15:50 +03:00
|
|
|
struct nvme_stats {
|
|
|
|
uint64_t units_read;
|
|
|
|
uint64_t units_written;
|
|
|
|
uint64_t read_commands;
|
|
|
|
uint64_t write_commands;
|
|
|
|
};
|
|
|
|
|
|
|
|
static void nvme_set_blk_stats(NvmeNamespace *ns, struct nvme_stats *stats)
|
|
|
|
{
|
|
|
|
BlockAcctStats *s = blk_get_stats(ns->blkconf.blk);
|
|
|
|
|
|
|
|
stats->units_read += s->nr_bytes[BLOCK_ACCT_READ] >> BDRV_SECTOR_BITS;
|
|
|
|
stats->units_written += s->nr_bytes[BLOCK_ACCT_WRITE] >> BDRV_SECTOR_BITS;
|
|
|
|
stats->read_commands += s->nr_ops[BLOCK_ACCT_READ];
|
|
|
|
stats->write_commands += s->nr_ops[BLOCK_ACCT_WRITE];
|
|
|
|
}
|
|
|
|
|
2020-07-20 13:44:01 +03:00
|
|
|
static uint16_t nvme_smart_info(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
|
|
|
|
uint64_t off, NvmeRequest *req)
|
2020-07-06 09:12:52 +03:00
|
|
|
{
|
2019-04-12 21:53:16 +03:00
|
|
|
uint32_t nsid = le32_to_cpu(req->cmd.nsid);
|
2020-09-30 20:15:50 +03:00
|
|
|
struct nvme_stats stats = { 0 };
|
|
|
|
NvmeSmartLog smart = { 0 };
|
2020-07-06 09:12:52 +03:00
|
|
|
uint32_t trans_len;
|
2020-09-30 20:15:50 +03:00
|
|
|
NvmeNamespace *ns;
|
2020-07-06 09:12:52 +03:00
|
|
|
time_t current_ms;
|
|
|
|
|
2020-09-30 20:01:02 +03:00
|
|
|
if (off >= sizeof(smart)) {
|
|
|
|
return NVME_INVALID_FIELD | NVME_DNR;
|
|
|
|
}
|
|
|
|
|
2020-09-30 20:15:50 +03:00
|
|
|
if (nsid != 0xffffffff) {
|
|
|
|
ns = nvme_ns(n, nsid);
|
2019-06-26 09:51:06 +03:00
|
|
|
if (!ns) {
|
2020-09-30 20:15:50 +03:00
|
|
|
return NVME_INVALID_NSID | NVME_DNR;
|
2019-06-26 09:51:06 +03:00
|
|
|
}
|
2020-09-30 20:15:50 +03:00
|
|
|
nvme_set_blk_stats(ns, &stats);
|
|
|
|
} else {
|
|
|
|
int i;
|
2020-07-06 09:12:52 +03:00
|
|
|
|
2020-09-30 20:15:50 +03:00
|
|
|
for (i = 1; i <= n->num_namespaces; i++) {
|
|
|
|
ns = nvme_ns(n, i);
|
|
|
|
if (!ns) {
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
nvme_set_blk_stats(ns, &stats);
|
|
|
|
}
|
2019-06-26 09:51:06 +03:00
|
|
|
}
|
2020-07-06 09:12:52 +03:00
|
|
|
|
|
|
|
trans_len = MIN(sizeof(smart) - off, buf_len);
|
2021-01-15 06:27:01 +03:00
|
|
|
smart.critical_warning = n->smart_critical_warning;
|
2020-07-06 09:12:52 +03:00
|
|
|
|
2020-09-30 20:15:50 +03:00
|
|
|
/* data units are reported in thousands of 512 byte units, rounded up */
smart.data_units_read[0] = cpu_to_le64(DIV_ROUND_UP(stats.units_read,
|
|
|
|
1000));
|
|
|
|
smart.data_units_written[0] = cpu_to_le64(DIV_ROUND_UP(stats.units_written,
|
2020-07-06 09:12:52 +03:00
|
|
|
1000));
|
2020-09-30 20:15:50 +03:00
|
|
|
smart.host_read_commands[0] = cpu_to_le64(stats.read_commands);
|
|
|
|
smart.host_write_commands[0] = cpu_to_le64(stats.write_commands);
|
2020-07-06 09:12:52 +03:00
|
|
|
|
|
|
|
smart.temperature = cpu_to_le16(n->temperature);
|
|
|
|
|
|
|
|
if ((n->temperature >= n->features.temp_thresh_hi) ||
|
|
|
|
(n->temperature <= n->features.temp_thresh_low)) {
|
|
|
|
smart.critical_warning |= NVME_SMART_TEMPERATURE;
|
|
|
|
}
|
|
|
|
|
|
|
|
current_ms = qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL);
|
|
|
|
smart.power_on_hours[0] =
|
|
|
|
cpu_to_le64((((current_ms - n->starttime_ms) / 1000) / 60) / 60);
|
|
|
|
|
2020-07-06 09:12:53 +03:00
|
|
|
if (!rae) {
|
|
|
|
nvme_clear_events(n, NVME_AER_TYPE_SMART);
|
|
|
|
}
|
|
|
|
|
2019-04-12 21:53:16 +03:00
|
|
|
return nvme_dma(n, (uint8_t *) &smart + off, trans_len,
|
|
|
|
DMA_DIRECTION_FROM_DEVICE, req);
|
2020-07-06 09:12:52 +03:00
|
|
|
}
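/*
 * Illustrative sketch only, not part of the device model: the unit
 * conversions used by nvme_smart_info() above, written as standalone
 * functions. The names smart_data_units() and smart_power_on_hours() are
 * hypothetical, and SECTOR_BITS is assumed to equal BDRV_SECTOR_BITS (9,
 * i.e. 512 byte units).
 */

```c
#include <stdint.h>

#define SECTOR_BITS 9 /* assumption: same value as BDRV_SECTOR_BITS */

/* Hypothetical helper: bytes -> thousands of 512 byte units, rounded up. */
uint64_t smart_data_units(uint64_t nr_bytes)
{
    uint64_t units = nr_bytes >> SECTOR_BITS;

    return (units + 999) / 1000;
}

/* Hypothetical helper: elapsed milliseconds -> whole hours, rounded down. */
uint64_t smart_power_on_hours(uint64_t now_ms, uint64_t start_ms)
{
    return (now_ms - start_ms) / 1000 / 60 / 60;
}
```

/*
 * Because of the round-up, a single 512 byte transfer already reports one
 * data unit, while the power-on hours only advance after a full hour.
 */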
|
|
|
|
|
2020-07-20 13:44:01 +03:00
|
|
|
static uint16_t nvme_fw_log_info(NvmeCtrl *n, uint32_t buf_len, uint64_t off,
|
|
|
|
NvmeRequest *req)
|
2020-07-06 09:12:52 +03:00
|
|
|
{
|
|
|
|
uint32_t trans_len;
|
|
|
|
NvmeFwSlotInfoLog fw_log = {
|
|
|
|
.afi = 0x1,
|
|
|
|
};
|
|
|
|
|
2020-09-30 20:01:02 +03:00
|
|
|
if (off >= sizeof(fw_log)) {
|
2020-07-06 09:12:52 +03:00
|
|
|
return NVME_INVALID_FIELD | NVME_DNR;
|
|
|
|
}
|
|
|
|
|
2020-09-30 20:01:02 +03:00
|
|
|
strpadcpy((char *)&fw_log.frs1, sizeof(fw_log.frs1), "1.0", ' ');
|
2020-07-06 09:12:52 +03:00
|
|
|
trans_len = MIN(sizeof(fw_log) - off, buf_len);
|
|
|
|
|
2019-04-12 21:53:16 +03:00
|
|
|
return nvme_dma(n, (uint8_t *) &fw_log + off, trans_len,
|
|
|
|
DMA_DIRECTION_FROM_DEVICE, req);
|
2020-07-06 09:12:52 +03:00
|
|
|
}
|
|
|
|
|
2020-07-20 13:44:01 +03:00
|
|
|
static uint16_t nvme_error_info(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
|
|
|
|
uint64_t off, NvmeRequest *req)
|
2020-07-06 09:12:52 +03:00
|
|
|
{
|
|
|
|
uint32_t trans_len;
|
|
|
|
NvmeErrorLog errlog;
|
|
|
|
|
2020-09-30 20:01:02 +03:00
|
|
|
if (off >= sizeof(errlog)) {
|
|
|
|
return NVME_INVALID_FIELD | NVME_DNR;
|
2020-07-06 09:12:53 +03:00
|
|
|
}
|
|
|
|
|
2020-09-30 20:01:02 +03:00
|
|
|
if (!rae) {
|
|
|
|
nvme_clear_events(n, NVME_AER_TYPE_ERROR);
|
2020-07-06 09:12:52 +03:00
|
|
|
}
|
|
|
|
|
|
|
|
memset(&errlog, 0x0, sizeof(errlog));
|
|
|
|
trans_len = MIN(sizeof(errlog) - off, buf_len);
|
|
|
|
|
2019-04-12 21:53:16 +03:00
|
|
|
return nvme_dma(n, (uint8_t *)&errlog, trans_len,
|
|
|
|
DMA_DIRECTION_FROM_DEVICE, req);
|
2020-07-06 09:12:52 +03:00
|
|
|
}
|
|
|
|
|
2020-12-08 23:04:03 +03:00
|
|
|
static uint16_t nvme_cmd_effects(NvmeCtrl *n, uint8_t csi, uint32_t buf_len,
|
2020-12-08 23:04:02 +03:00
|
|
|
uint64_t off, NvmeRequest *req)
|
|
|
|
{
|
|
|
|
NvmeEffectsLog log = {};
|
|
|
|
const uint32_t *src_iocs = NULL;
|
|
|
|
uint32_t trans_len;
|
|
|
|
|
|
|
|
if (off >= sizeof(log)) {
|
|
|
|
trace_pci_nvme_err_invalid_log_page_offset(off, sizeof(log));
|
|
|
|
return NVME_INVALID_FIELD | NVME_DNR;
|
|
|
|
}
|
|
|
|
|
|
|
|
switch (NVME_CC_CSS(n->bar.cc)) {
|
|
|
|
case NVME_CC_CSS_NVM:
|
|
|
|
src_iocs = nvme_cse_iocs_nvm;
|
2020-12-08 23:04:03 +03:00
|
|
|
/* fall through */
|
2020-12-08 23:04:02 +03:00
|
|
|
case NVME_CC_CSS_ADMIN_ONLY:
|
|
|
|
break;
|
2020-12-08 23:04:03 +03:00
|
|
|
case NVME_CC_CSS_CSI:
|
|
|
|
switch (csi) {
|
|
|
|
case NVME_CSI_NVM:
|
|
|
|
src_iocs = nvme_cse_iocs_nvm;
|
|
|
|
break;
|
2020-12-08 23:04:06 +03:00
|
|
|
case NVME_CSI_ZONED:
|
|
|
|
src_iocs = nvme_cse_iocs_zoned;
|
|
|
|
break;
|
2020-12-08 23:04:03 +03:00
|
|
|
}
|
2020-12-08 23:04:02 +03:00
|
|
|
}
|
|
|
|
|
|
|
|
memcpy(log.acs, nvme_cse_acs, sizeof(nvme_cse_acs));
|
|
|
|
|
|
|
|
if (src_iocs) {
|
|
|
|
memcpy(log.iocs, src_iocs, sizeof(log.iocs));
|
|
|
|
}
|
|
|
|
|
|
|
|
trans_len = MIN(sizeof(log) - off, buf_len);
|
|
|
|
|
|
|
|
return nvme_dma(n, ((uint8_t *)&log) + off, trans_len,
|
|
|
|
DMA_DIRECTION_FROM_DEVICE, req);
|
|
|
|
}
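/*
 * Illustrative sketch only, not part of the device model: the log page
 * helpers above share one pattern, which is to reject an offset at or past
 * the end of the page and otherwise transfer MIN(size - off, buf_len)
 * bytes starting at that offset. The function log_page_read() below is
 * hypothetical and copies into a plain buffer instead of calling
 * nvme_dma().
 */

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: copy up to buf_len bytes of a log page, starting at
 * byte offset off. Returns the number of bytes copied, or -1 if off lies at
 * or beyond the end of the page.
 */
long log_page_read(const void *log, size_t log_size, uint64_t off,
                   void *buf, size_t buf_len)
{
    size_t trans_len;

    if (off >= log_size) {
        return -1;
    }

    /* clamp the transfer to what remains of the page and to the buffer */
    trans_len = log_size - (size_t)off;
    if (trans_len > buf_len) {
        trans_len = buf_len;
    }

    memcpy(buf, (const uint8_t *)log + off, trans_len);
    return (long)trans_len;
}
```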
|
|
|
|
|
2020-07-20 13:44:01 +03:00
|
|
|
static uint16_t nvme_get_log(NvmeCtrl *n, NvmeRequest *req)
|
2020-07-06 09:12:52 +03:00
|
|
|
{
|
2020-07-20 13:44:01 +03:00
|
|
|
NvmeCmd *cmd = &req->cmd;
|
|
|
|
|
2020-07-06 09:12:52 +03:00
|
|
|
uint32_t dw10 = le32_to_cpu(cmd->cdw10);
|
|
|
|
uint32_t dw11 = le32_to_cpu(cmd->cdw11);
|
|
|
|
uint32_t dw12 = le32_to_cpu(cmd->cdw12);
|
|
|
|
uint32_t dw13 = le32_to_cpu(cmd->cdw13);
|
|
|
|
uint8_t lid = dw10 & 0xff;
|
|
|
|
uint8_t lsp = (dw10 >> 8) & 0xf;
|
|
|
|
uint8_t rae = (dw10 >> 15) & 0x1;
|
2020-12-08 23:04:03 +03:00
|
|
|
uint8_t csi = le32_to_cpu(cmd->cdw14) >> 24;
|
2020-07-06 09:12:52 +03:00
|
|
|
uint32_t numdl, numdu;
|
|
|
|
uint64_t off, lpol, lpou;
|
|
|
|
size_t len;
|
2020-02-23 19:38:22 +03:00
|
|
|
uint16_t status;
|
2020-07-06 09:12:52 +03:00
|
|
|
|
|
|
|
numdl = (dw10 >> 16);
|
|
|
|
numdu = (dw11 & 0xffff);
|
|
|
|
lpol = dw12;
|
|
|
|
lpou = dw13;
|
|
|
|
|
|
|
|
/* NUMD is a 0's based dword count; convert it to a length in bytes */
len = (((numdu << 16) | numdl) + 1) << 2;
|
|
|
|
off = (lpou << 32ULL) | lpol;
|
|
|
|
|
|
|
|
/* the log page offset must be dword aligned */
if (off & 0x3) {
|
|
|
|
return NVME_INVALID_FIELD | NVME_DNR;
|
|
|
|
}
|
|
|
|
|
|
|
|
trace_pci_nvme_get_log(nvme_cid(req), lid, lsp, rae, len, off);
|
|
|
|
|
2020-02-23 19:38:22 +03:00
|
|
|
status = nvme_check_mdts(n, len);
|
|
|
|
if (status) {
|
|
|
|
trace_pci_nvme_err_mdts(nvme_cid(req), len);
|
|
|
|
return status;
|
|
|
|
}
|
|
|
|
|
2020-07-06 09:12:52 +03:00
|
|
|
switch (lid) {
|
|
|
|
case NVME_LOG_ERROR_INFO:
|
2020-07-20 13:44:01 +03:00
|
|
|
return nvme_error_info(n, rae, len, off, req);
|
2020-07-06 09:12:52 +03:00
|
|
|
case NVME_LOG_SMART_INFO:
|
2020-07-20 13:44:01 +03:00
|
|
|
return nvme_smart_info(n, rae, len, off, req);
|
2020-07-06 09:12:52 +03:00
|
|
|
case NVME_LOG_FW_SLOT_INFO:
|
2020-07-20 13:44:01 +03:00
|
|
|
return nvme_fw_log_info(n, len, off, req);
|
2020-12-08 23:04:02 +03:00
|
|
|
case NVME_LOG_CMD_EFFECTS:
|
2020-12-08 23:04:03 +03:00
|
|
|
return nvme_cmd_effects(n, csi, len, off, req);
|
2020-07-06 09:12:52 +03:00
|
|
|
default:
|
|
|
|
trace_pci_nvme_err_invalid_log_page(nvme_cid(req), lid);
|
|
|
|
return NVME_INVALID_FIELD | NVME_DNR;
|
|
|
|
}
|
|
|
|
}
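/*
 * Illustrative sketch only, not part of the device model: how
 * nvme_get_log() above derives the transfer length and the page offset from
 * command dwords 10 to 13. The names get_log_len() and get_log_off() are
 * hypothetical. Unlike the code above, this sketch widens the dword count
 * to 64 bits before shifting, which is a choice made here for illustration.
 */

```c
#include <stdint.h>

/*
 * Hypothetical helper: length in bytes from NUMDL (dw10 bits 31:16) and
 * NUMDU (dw11 bits 15:0). NUMD is a 0's based count of dwords.
 */
uint64_t get_log_len(uint32_t dw10, uint32_t dw11)
{
    uint32_t numdl = dw10 >> 16;
    uint32_t numdu = dw11 & 0xffff;
    uint64_t numd = ((uint64_t)numdu << 16) | numdl;

    return (numd + 1) << 2;
}

/* Hypothetical helper: 64-bit offset from LPOL (dw12) and LPOU (dw13). */
uint64_t get_log_off(uint32_t dw12, uint32_t dw13)
{
    return ((uint64_t)dw13 << 32) | dw12;
}
```

/*
 * For example, NUMDL = 0x7f with NUMDU = 0 requests 128 dwords, that is,
 * 512 bytes.
 */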
|
|
|
|
|
2013-06-04 19:17:10 +04:00
|
|
|
static void nvme_free_cq(NvmeCQueue *cq, NvmeCtrl *n)
|
|
|
|
{
|
|
|
|
n->cq[cq->cqid] = NULL;
|
2013-08-21 19:03:08 +04:00
|
|
|
timer_free(cq->timer);
|
2021-01-12 15:30:26 +03:00
|
|
|
if (msix_enabled(&n->parent_obj)) {
|
|
|
|
msix_vector_unuse(&n->parent_obj, cq->vector);
|
|
|
|
}
|
2013-06-04 19:17:10 +04:00
|
|
|
if (cq->cqid) {
|
|
|
|
g_free(cq);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2020-07-20 13:44:01 +03:00
|
|
|
static uint16_t nvme_del_cq(NvmeCtrl *n, NvmeRequest *req)
|
2013-06-04 19:17:10 +04:00
|
|
|
{
|
2020-07-20 13:44:01 +03:00
|
|
|
NvmeDeleteQ *c = (NvmeDeleteQ *)&req->cmd;
|
2013-06-04 19:17:10 +04:00
|
|
|
NvmeCQueue *cq;
|
|
|
|
uint16_t qid = le16_to_cpu(c->qid);
|
|
|
|
|
2017-11-03 16:37:53 +03:00
|
|
|
if (unlikely(!qid || nvme_check_cqid(n, qid))) {
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_err_invalid_del_cq_cqid(qid);
|
2013-06-04 19:17:10 +04:00
|
|
|
return NVME_INVALID_CQID | NVME_DNR;
|
|
|
|
}
|
|
|
|
|
|
|
|
cq = n->cq[qid];
|
2017-11-03 16:37:53 +03:00
|
|
|
if (unlikely(!QTAILQ_EMPTY(&cq->sq_list))) {
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_err_invalid_del_cq_notempty(qid);
|
2013-06-04 19:17:10 +04:00
|
|
|
return NVME_INVALID_QUEUE_DEL;
|
|
|
|
}
|
2018-11-21 21:10:13 +03:00
|
|
|
nvme_irq_deassert(n, cq);
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_del_cq(qid);
|
2013-06-04 19:17:10 +04:00
|
|
|
nvme_free_cq(cq, n);
|
|
|
|
return NVME_SUCCESS;
|
|
|
|
}
|
|
|
|
|
|
|
|
static void nvme_init_cq(NvmeCQueue *cq, NvmeCtrl *n, uint64_t dma_addr,
|
2020-08-24 09:58:56 +03:00
|
|
|
uint16_t cqid, uint16_t vector, uint16_t size,
|
|
|
|
uint16_t irq_enabled)
|
2013-06-04 19:17:10 +04:00
|
|
|
{
|
2020-06-09 22:03:31 +03:00
|
|
|
int ret;
|
|
|
|
|
2021-01-12 15:30:26 +03:00
|
|
|
if (msix_enabled(&n->parent_obj)) {
|
|
|
|
ret = msix_vector_use(&n->parent_obj, vector);
|
|
|
|
assert(ret == 0);
|
|
|
|
}
|
2013-06-04 19:17:10 +04:00
|
|
|
cq->ctrl = n;
|
|
|
|
cq->cqid = cqid;
|
|
|
|
cq->size = size;
|
|
|
|
cq->dma_addr = dma_addr;
|
|
|
|
cq->phase = 1;
|
|
|
|
cq->irq_enabled = irq_enabled;
|
|
|
|
cq->vector = vector;
|
|
|
|
cq->head = cq->tail = 0;
|
|
|
|
QTAILQ_INIT(&cq->req_list);
|
|
|
|
QTAILQ_INIT(&cq->sq_list);
|
|
|
|
n->cq[cqid] = cq;
|
2013-08-21 19:03:08 +04:00
|
|
|
cq->timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, nvme_post_cqes, cq);
|
2013-06-04 19:17:10 +04:00
|
|
|
}
|
|
|
|
|
2020-07-20 13:44:01 +03:00
|
|
|
static uint16_t nvme_create_cq(NvmeCtrl *n, NvmeRequest *req)
|
2013-06-04 19:17:10 +04:00
|
|
|
{
|
|
|
|
NvmeCQueue *cq;
|
2020-07-20 13:44:01 +03:00
|
|
|
NvmeCreateCq *c = (NvmeCreateCq *)&req->cmd;
|
2013-06-04 19:17:10 +04:00
|
|
|
uint16_t cqid = le16_to_cpu(c->cqid);
|
|
|
|
uint16_t vector = le16_to_cpu(c->irq_vector);
|
|
|
|
uint16_t qsize = le16_to_cpu(c->qsize);
|
|
|
|
uint16_t qflags = le16_to_cpu(c->cq_flags);
|
|
|
|
uint64_t prp1 = le64_to_cpu(c->prp1);
|
|
|
|
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_create_cq(prp1, cqid, vector, qsize, qflags,
|
|
|
|
NVME_CQ_FLAGS_IEN(qflags) != 0);
|
2017-11-03 16:37:53 +03:00
|
|
|
|
2020-10-22 12:07:08 +03:00
|
|
|
if (unlikely(!cqid || cqid > n->params.max_ioqpairs ||
|
|
|
|
n->cq[cqid] != NULL)) {
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_err_invalid_create_cq_cqid(cqid);
|
2020-10-22 09:28:46 +03:00
|
|
|
return NVME_INVALID_QID | NVME_DNR;
|
2013-06-04 19:17:10 +04:00
|
|
|
}
|
2017-11-03 16:37:53 +03:00
|
|
|
if (unlikely(!qsize || qsize > NVME_CAP_MQES(n->bar.cap))) {
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_err_invalid_create_cq_size(qsize);
|
2013-06-04 19:17:10 +04:00
|
|
|
return NVME_MAX_QSIZE_EXCEEDED | NVME_DNR;
|
|
|
|
}
|
2020-10-22 09:28:46 +03:00
|
|
|
if (unlikely(prp1 & (n->page_size - 1))) {
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_err_invalid_create_cq_addr(prp1);
|
2020-10-22 09:28:46 +03:00
|
|
|
return NVME_INVALID_PRP_OFFSET | NVME_DNR;
|
2013-06-04 19:17:10 +04:00
|
|
|
}
|
2020-06-09 22:03:18 +03:00
|
|
|
if (unlikely(!msix_enabled(&n->parent_obj) && vector)) {
|
|
|
|
trace_pci_nvme_err_invalid_create_cq_vector(vector);
|
|
|
|
return NVME_INVALID_IRQ_VECTOR | NVME_DNR;
|
|
|
|
}
|
2020-06-09 22:03:32 +03:00
|
|
|
if (unlikely(vector >= n->params.msix_qsize)) {
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_err_invalid_create_cq_vector(vector);
|
2013-06-04 19:17:10 +04:00
|
|
|
return NVME_INVALID_IRQ_VECTOR | NVME_DNR;
|
|
|
|
}
|
2017-11-03 16:37:53 +03:00
|
|
|
if (unlikely(!(NVME_CQ_FLAGS_PC(qflags)))) {
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_err_invalid_create_cq_qflags(NVME_CQ_FLAGS_PC(qflags));
|
2013-06-04 19:17:10 +04:00
|
|
|
return NVME_INVALID_FIELD | NVME_DNR;
|
|
|
|
}
|
|
|
|
|
|
|
|
cq = g_malloc0(sizeof(*cq));
|
|
|
|
nvme_init_cq(cq, n, prp1, cqid, vector, qsize + 1,
|
2020-08-24 09:58:56 +03:00
|
|
|
NVME_CQ_FLAGS_IEN(qflags));
|
2020-07-06 09:13:01 +03:00
|
|
|
|
|
|
|
/*
|
|
|
|
* It is only required to set qs_created when creating a completion queue;
|
|
|
|
* creating a submission queue without a matching completion queue will
|
|
|
|
* fail.
|
|
|
|
*/
|
|
|
|
n->qs_created = true;
|
2013-06-04 19:17:10 +04:00
|
|
|
return NVME_SUCCESS;
|
|
|
|
}
|
|
|
|
|
2020-12-08 23:04:03 +03:00
|
|
|
static uint16_t nvme_rpt_empty_id_struct(NvmeCtrl *n, NvmeRequest *req)
|
|
|
|
{
|
|
|
|
uint8_t id[NVME_IDENTIFY_DATA_SIZE] = {};
|
|
|
|
|
|
|
|
return nvme_dma(n, id, sizeof(id), DMA_DIRECTION_FROM_DEVICE, req);
|
|
|
|
}
|
|
|
|
|
2020-12-08 23:04:06 +03:00
|
|
|
static inline bool nvme_csi_has_nvm_support(NvmeNamespace *ns)
|
|
|
|
{
|
|
|
|
switch (ns->csi) {
|
|
|
|
case NVME_CSI_NVM:
|
|
|
|
case NVME_CSI_ZONED:
|
|
|
|
return true;
|
|
|
|
}
|
|
|
|
return false;
|
|
|
|
}
|
|
|
|
|
2020-07-20 13:44:01 +03:00
|
|
|
static uint16_t nvme_identify_ctrl(NvmeCtrl *n, NvmeRequest *req)
|
2016-08-04 22:42:14 +03:00
|
|
|
{
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_identify_ctrl();
|
2017-11-03 16:37:53 +03:00
|
|
|
|
2019-04-12 21:53:16 +03:00
|
|
|
return nvme_dma(n, (uint8_t *)&n->id_ctrl, sizeof(n->id_ctrl),
|
|
|
|
DMA_DIRECTION_FROM_DEVICE, req);
|
2016-08-04 22:42:14 +03:00
|
|
|
}
|
|
|
|
|
2020-12-08 23:04:03 +03:00
|
|
|
static uint16_t nvme_identify_ctrl_csi(NvmeCtrl *n, NvmeRequest *req)
|
|
|
|
{
|
|
|
|
NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
|
2020-12-08 23:04:06 +03:00
|
|
|
NvmeIdCtrlZoned id = {};
|
2020-12-08 23:04:03 +03:00
|
|
|
|
|
|
|
trace_pci_nvme_identify_ctrl_csi(c->csi);
|
|
|
|
|
|
|
|
if (c->csi == NVME_CSI_NVM) {
|
|
|
|
return nvme_rpt_empty_id_struct(n, req);
|
2020-12-08 23:04:06 +03:00
|
|
|
} else if (c->csi == NVME_CSI_ZONED) {
|
|
|
|
if (n->params.zasl_bs) {
|
|
|
|
id.zasl = n->zasl;
|
|
|
|
}
|
|
|
|
return nvme_dma(n, (uint8_t *)&id, sizeof(id),
|
|
|
|
DMA_DIRECTION_FROM_DEVICE, req);
|
2020-12-08 23:04:03 +03:00
|
|
|
}
|
|
|
|
|
|
|
|
return NVME_INVALID_FIELD | NVME_DNR;
|
|
|
|
}
|
|
|
|
|
2020-07-20 13:44:01 +03:00
|
|
|
static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest *req)
|
2013-06-04 19:17:10 +04:00
|
|
|
{
|
|
|
|
NvmeNamespace *ns;
|
2020-07-20 13:44:01 +03:00
|
|
|
NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
|
2013-06-04 19:17:10 +04:00
|
|
|
uint32_t nsid = le32_to_cpu(c->nsid);
|
|
|
|
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_identify_ns(nsid);
|
2017-11-03 16:37:53 +03:00
|
|
|
|
2019-06-26 09:51:06 +03:00
|
|
|
if (!nvme_nsid_valid(n, nsid) || nsid == NVME_NSID_BROADCAST) {
|
2013-06-04 19:17:10 +04:00
|
|
|
return NVME_INVALID_NSID | NVME_DNR;
|
|
|
|
}
|
|
|
|
|
2019-06-26 09:51:06 +03:00
|
|
|
ns = nvme_ns(n, nsid);
|
|
|
|
if (unlikely(!ns)) {
|
2020-12-08 23:04:03 +03:00
|
|
|
return nvme_rpt_empty_id_struct(n, req);
|
2019-06-26 09:51:06 +03:00
|
|
|
}
|
2017-11-03 16:37:53 +03:00
|
|
|
|
2020-12-08 23:04:06 +03:00
|
|
|
if (c->csi == NVME_CSI_NVM && nvme_csi_has_nvm_support(ns)) {
|
|
|
|
return nvme_dma(n, (uint8_t *)&ns->id_ns, sizeof(NvmeIdNs),
|
|
|
|
DMA_DIRECTION_FROM_DEVICE, req);
|
|
|
|
}
|
|
|
|
|
|
|
|
return NVME_INVALID_CMD_SET | NVME_DNR;
|
2013-06-04 19:17:10 +04:00
|
|
|
}
|
|
|
|
|
2020-12-08 23:04:03 +03:00
|
|
|
static uint16_t nvme_identify_ns_csi(NvmeCtrl *n, NvmeRequest *req)
|
|
|
|
{
|
|
|
|
NvmeNamespace *ns;
|
|
|
|
NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
|
|
|
|
uint32_t nsid = le32_to_cpu(c->nsid);
|
|
|
|
|
|
|
|
trace_pci_nvme_identify_ns_csi(nsid, c->csi);
|
|
|
|
|
|
|
|
if (!nvme_nsid_valid(n, nsid) || nsid == NVME_NSID_BROADCAST) {
|
|
|
|
return NVME_INVALID_NSID | NVME_DNR;
|
|
|
|
}
|
|
|
|
|
|
|
|
ns = nvme_ns(n, nsid);
|
|
|
|
if (unlikely(!ns)) {
|
|
|
|
return nvme_rpt_empty_id_struct(n, req);
|
|
|
|
}
|
|
|
|
|
2020-12-08 23:04:06 +03:00
|
|
|
if (c->csi == NVME_CSI_NVM && nvme_csi_has_nvm_support(ns)) {
|
2020-12-08 23:04:03 +03:00
|
|
|
return nvme_rpt_empty_id_struct(n, req);
|
2020-12-08 23:04:06 +03:00
|
|
|
} else if (c->csi == NVME_CSI_ZONED && ns->csi == NVME_CSI_ZONED) {
|
|
|
|
return nvme_dma(n, (uint8_t *)ns->id_ns_zoned, sizeof(NvmeIdNsZoned),
|
|
|
|
DMA_DIRECTION_FROM_DEVICE, req);
|
2020-12-08 23:04:03 +03:00
|
|
|
}
|
|
|
|
|
|
|
|
return NVME_INVALID_FIELD | NVME_DNR;
|
|
|
|
}
|
|
|
|
|
2020-07-20 13:44:01 +03:00
|
|
|
static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeRequest *req)
|
2016-08-04 22:42:14 +03:00
|
|
|
{
|
2020-12-08 23:04:03 +03:00
|
|
|
NvmeNamespace *ns;
|
2020-07-20 13:44:01 +03:00
|
|
|
NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
|
2016-08-04 22:42:14 +03:00
|
|
|
uint32_t min_nsid = le32_to_cpu(c->nsid);
|
2020-12-08 23:04:03 +03:00
|
|
|
uint8_t list[NVME_IDENTIFY_DATA_SIZE] = {};
|
|
|
|
static const int data_len = sizeof(list);
|
|
|
|
uint32_t *list_ptr = (uint32_t *)list;
|
|
|
|
int i, j = 0;
|
2016-08-04 22:42:14 +03:00
|
|
|
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_identify_nslist(min_nsid);
|
2017-11-03 16:37:53 +03:00
|
|
|
|
2020-07-06 09:13:00 +03:00
|
|
|
/*
|
|
|
|
* Both 0xffffffff (NVME_NSID_BROADCAST) and 0xfffffffe are invalid values
|
|
|
|
* since the Active Namespace ID List should return namespaces with ids
|
|
|
|
* *higher* than the NSID specified in the command. This is also specified
|
|
|
|
* in the spec (NVM Express v1.3d, Section 5.15.4).
|
|
|
|
*/
|
|
|
|
if (min_nsid >= NVME_NSID_BROADCAST - 1) {
|
|
|
|
return NVME_INVALID_NSID | NVME_DNR;
|
|
|
|
}
|
|
|
|
|
2020-12-08 23:04:03 +03:00
|
|
|
for (i = 1; i <= n->num_namespaces; i++) {
|
|
|
|
ns = nvme_ns(n, i);
|
|
|
|
if (!ns) {
|
2016-08-04 22:42:14 +03:00
|
|
|
continue;
|
|
|
|
}
|
2020-12-08 23:04:03 +03:00
|
|
|
if (ns->params.nsid <= min_nsid) {
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
list_ptr[j++] = cpu_to_le32(ns->params.nsid);
|
2016-08-04 22:42:14 +03:00
|
|
|
if (j == data_len / sizeof(uint32_t)) {
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
2020-12-08 23:04:03 +03:00
|
|
|
|
|
|
|
return nvme_dma(n, list, data_len, DMA_DIRECTION_FROM_DEVICE, req);
|
|
|
|
}
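/*
 * Illustrative sketch only, not part of the device model: the filtering
 * done by nvme_identify_nslist() above, reduced to plain arrays. Only IDs
 * strictly greater than min_nsid are reported, and the scan stops once the
 * output list is full. The name active_nsid_list() is hypothetical, the
 * input is assumed to be sorted in ascending order, and the cpu_to_le32()
 * conversion is left out.
 */

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: fill out[] with the entries of ids[] that are
 * strictly greater than min_nsid, up to out_cap entries. Returns the
 * number of entries written.
 */
size_t active_nsid_list(const uint32_t *ids, size_t nr_ids, uint32_t min_nsid,
                        uint32_t *out, size_t out_cap)
{
    size_t i, j = 0;

    for (i = 0; i < nr_ids && j < out_cap; i++) {
        if (ids[i] <= min_nsid) {
            continue;
        }
        out[j++] = ids[i];
    }

    return j;
}
```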
|
|
|
|
|
|
|
|
static uint16_t nvme_identify_nslist_csi(NvmeCtrl *n, NvmeRequest *req)
|
|
|
|
{
|
|
|
|
NvmeNamespace *ns;
|
|
|
|
NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
|
|
|
|
uint32_t min_nsid = le32_to_cpu(c->nsid);
|
|
|
|
uint8_t list[NVME_IDENTIFY_DATA_SIZE] = {};
|
|
|
|
static const int data_len = sizeof(list);
|
|
|
|
uint32_t *list_ptr = (uint32_t *)list;
|
|
|
|
int i, j = 0;
|
|
|
|
|
|
|
|
trace_pci_nvme_identify_nslist_csi(min_nsid, c->csi);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Same as in nvme_identify_nslist(), 0xffffffff/0xfffffffe are invalid.
|
|
|
|
*/
|
|
|
|
if (min_nsid >= NVME_NSID_BROADCAST - 1) {
|
|
|
|
return NVME_INVALID_NSID | NVME_DNR;
|
|
|
|
}
|
|
|
|
|
2020-12-08 23:04:06 +03:00
|
|
|
if (c->csi != NVME_CSI_NVM && c->csi != NVME_CSI_ZONED) {
|
2020-12-08 23:04:03 +03:00
|
|
|
return NVME_INVALID_FIELD | NVME_DNR;
|
|
|
|
}
|
|
|
|
|
|
|
|
for (i = 1; i <= n->num_namespaces; i++) {
|
|
|
|
ns = nvme_ns(n, i);
|
|
|
|
if (!ns) {
|
|
|
|
continue;
|
|
|
|
}
|
2020-12-08 23:04:06 +03:00
|
|
|
if (ns->params.nsid <= min_nsid || c->csi != ns->csi) {
|
2020-12-08 23:04:03 +03:00
|
|
|
continue;
|
|
|
|
}
|
|
|
|
list_ptr[j++] = cpu_to_le32(ns->params.nsid);
|
|
|
|
if (j == data_len / sizeof(uint32_t)) {
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
return nvme_dma(n, list, data_len, DMA_DIRECTION_FROM_DEVICE, req);
|
2016-08-04 22:42:14 +03:00
|
|
|
}
|
|
|
|
|
2020-07-20 13:44:01 +03:00
|
|
|
static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeRequest *req)
|
2020-07-06 09:12:59 +03:00
|
|
|
{
|
2020-12-08 23:03:59 +03:00
|
|
|
NvmeNamespace *ns;
|
2020-07-20 13:44:01 +03:00
|
|
|
NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
|
2020-07-06 09:12:59 +03:00
|
|
|
uint32_t nsid = le32_to_cpu(c->nsid);
|
2020-12-08 23:04:03 +03:00
|
|
|
uint8_t list[NVME_IDENTIFY_DATA_SIZE] = {};
|
2020-07-06 09:12:59 +03:00
|
|
|
|
|
|
|
struct data {
|
|
|
|
struct {
|
|
|
|
NvmeIdNsDescr hdr;
|
2020-12-08 23:04:03 +03:00
|
|
|
uint8_t v[NVME_NIDL_UUID];
|
2020-07-06 09:12:59 +03:00
|
|
|
} uuid;
|
2020-12-08 23:04:03 +03:00
|
|
|
struct {
|
|
|
|
NvmeIdNsDescr hdr;
|
|
|
|
uint8_t v;
|
|
|
|
} csi;
|
2020-07-06 09:12:59 +03:00
|
|
|
};
|
|
|
|
|
|
|
|
struct data *ns_descrs = (struct data *)list;
|
|
|
|
|
|
|
|
trace_pci_nvme_identify_ns_descr_list(nsid);
|
|
|
|
|
2019-06-26 09:51:06 +03:00
|
|
|
if (!nvme_nsid_valid(n, nsid) || nsid == NVME_NSID_BROADCAST) {
|
2020-07-06 09:12:59 +03:00
|
|
|
return NVME_INVALID_NSID | NVME_DNR;
|
|
|
|
}
|
|
|
|
|
2020-12-08 23:03:59 +03:00
|
|
|
ns = nvme_ns(n, nsid);
|
|
|
|
if (unlikely(!ns)) {
|
2019-06-26 09:51:06 +03:00
|
|
|
return NVME_INVALID_FIELD | NVME_DNR;
|
|
|
|
}
|
|
|
|
|
2020-07-06 09:12:59 +03:00
|
|
|
/*
|
|
|
|
* Because the NGUID and EUI64 fields are 0 in the Identify Namespace data
|
|
|
|
* structure, a Namespace UUID (nidt = 0x3) must be reported in the
|
2020-12-08 23:03:59 +03:00
|
|
|
* Namespace Identification Descriptor. Add the namespace UUID here.
|
2020-07-06 09:12:59 +03:00
|
|
|
*/
|
|
|
|
ns_descrs->uuid.hdr.nidt = NVME_NIDT_UUID;
|
2020-12-08 23:04:03 +03:00
|
|
|
ns_descrs->uuid.hdr.nidl = NVME_NIDL_UUID;
|
|
|
|
memcpy(&ns_descrs->uuid.v, ns->params.uuid.data, NVME_NIDL_UUID);
|
2020-07-06 09:12:59 +03:00
|
|
|
|
2020-12-08 23:04:03 +03:00
|
|
|
ns_descrs->csi.hdr.nidt = NVME_NIDT_CSI;
|
|
|
|
ns_descrs->csi.hdr.nidl = NVME_NIDL_CSI;
|
|
|
|
ns_descrs->csi.v = ns->csi;
|
|
|
|
|
|
|
|
return nvme_dma(n, list, sizeof(list), DMA_DIRECTION_FROM_DEVICE, req);
|
|
|
|
}
|
|
|
|
|
|
|
|
static uint16_t nvme_identify_cmd_set(NvmeCtrl *n, NvmeRequest *req)
|
|
|
|
{
|
|
|
|
uint8_t list[NVME_IDENTIFY_DATA_SIZE] = {};
|
|
|
|
static const int data_len = sizeof(list);
|
|
|
|
|
|
|
|
trace_pci_nvme_identify_cmd_set();
|
|
|
|
|
|
|
|
NVME_SET_CSI(*list, NVME_CSI_NVM);
|
2020-12-08 23:04:06 +03:00
|
|
|
NVME_SET_CSI(*list, NVME_CSI_ZONED);
|
|
|
|
|
2020-12-08 23:04:03 +03:00
|
|
|
return nvme_dma(n, list, data_len, DMA_DIRECTION_FROM_DEVICE, req);
|
2020-07-06 09:12:59 +03:00
|
|
|
}
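The two NVME_SET_CSI calls above build a bit vector of supported I/O command sets. A minimal sketch of that idea follows. It assumes the macro sets bit `csi` in the first byte of the buffer and that the NVM and Zoned identifiers are 0x0 and 0x2; every `ex_`/`EX_` name is a hypothetical stand-in for illustration, not the QEMU definition.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the command set identifiers (assumed values). */
enum { EX_CSI_NVM = 0x0, EX_CSI_ZONED = 0x2 };

/* Set the bit for command set `csi` in a one-byte vector. */
static inline uint8_t ex_set_csi(uint8_t vec, unsigned csi)
{
    assert(csi < 8);
    return (uint8_t)(vec | (1u << csi));
}

/* Build the vector for a controller supporting NVM and Zoned namespaces. */
static inline uint8_t ex_cmd_set_vector(void)
{
    uint8_t vec = 0;

    vec = ex_set_csi(vec, EX_CSI_NVM);
    vec = ex_set_csi(vec, EX_CSI_ZONED);
    return vec;
}
```

Under these assumptions the first byte of the returned data would have bits 0 and 2 set, and the rest of the buffer stays zero.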
|
|
|
|
|
2020-07-20 13:44:01 +03:00
|
|
|
static uint16_t nvme_identify(NvmeCtrl *n, NvmeRequest *req)
|
2016-08-04 22:42:14 +03:00
|
|
|
{
|
2020-07-20 13:44:01 +03:00
|
|
|
NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
|
2016-08-04 22:42:14 +03:00
|
|
|
|
|
|
|
switch (le32_to_cpu(c->cns)) {
|
2020-06-09 22:03:16 +03:00
|
|
|
case NVME_ID_CNS_NS:
|
2020-12-08 23:04:04 +03:00
|
|
|
/* fall through */
|
|
|
|
case NVME_ID_CNS_NS_PRESENT:
|
2020-07-20 13:44:01 +03:00
|
|
|
return nvme_identify_ns(n, req);
|
2020-12-08 23:04:03 +03:00
|
|
|
case NVME_ID_CNS_CS_NS:
|
2020-12-08 23:04:04 +03:00
|
|
|
/* fall through */
|
|
|
|
case NVME_ID_CNS_CS_NS_PRESENT:
|
2020-12-08 23:04:03 +03:00
|
|
|
return nvme_identify_ns_csi(n, req);
|
2020-06-09 22:03:16 +03:00
|
|
|
case NVME_ID_CNS_CTRL:
|
2020-07-20 13:44:01 +03:00
|
|
|
return nvme_identify_ctrl(n, req);
|
2020-12-08 23:04:03 +03:00
|
|
|
case NVME_ID_CNS_CS_CTRL:
|
|
|
|
return nvme_identify_ctrl_csi(n, req);
|
2020-06-09 22:03:16 +03:00
|
|
|
case NVME_ID_CNS_NS_ACTIVE_LIST:
|
2020-12-08 23:04:04 +03:00
|
|
|
/* fall through */
|
|
|
|
case NVME_ID_CNS_NS_PRESENT_LIST:
|
2020-07-20 13:44:01 +03:00
|
|
|
return nvme_identify_nslist(n, req);
|
2020-12-08 23:04:03 +03:00
|
|
|
case NVME_ID_CNS_CS_NS_ACTIVE_LIST:
|
2020-12-08 23:04:04 +03:00
|
|
|
/* fall through */
|
|
|
|
case NVME_ID_CNS_CS_NS_PRESENT_LIST:
|
2020-12-08 23:04:03 +03:00
|
|
|
return nvme_identify_nslist_csi(n, req);
|
2020-07-06 09:12:59 +03:00
|
|
|
case NVME_ID_CNS_NS_DESCR_LIST:
|
2020-07-20 13:44:01 +03:00
|
|
|
return nvme_identify_ns_descr_list(n, req);
|
2020-12-08 23:04:03 +03:00
|
|
|
case NVME_ID_CNS_IO_COMMAND_SET:
|
|
|
|
return nvme_identify_cmd_set(n, req);
|
2016-08-04 22:42:14 +03:00
|
|
|
default:
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_err_invalid_identify_cns(le32_to_cpu(c->cns));
|
2016-08-04 22:42:14 +03:00
|
|
|
return NVME_INVALID_FIELD | NVME_DNR;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2020-07-20 13:44:01 +03:00
|
|
|
static uint16_t nvme_abort(NvmeCtrl *n, NvmeRequest *req)
|
2020-07-06 09:12:49 +03:00
|
|
|
{
|
2020-07-20 13:44:01 +03:00
|
|
|
uint16_t sqid = le32_to_cpu(req->cmd.cdw10) & 0xffff;
|
2020-07-06 09:12:49 +03:00
|
|
|
|
|
|
|
req->cqe.result = 1;
|
|
|
|
if (nvme_check_sqid(n, sqid)) {
|
|
|
|
return NVME_INVALID_FIELD | NVME_DNR;
|
|
|
|
}
|
|
|
|
|
|
|
|
return NVME_SUCCESS;
|
|
|
|
}
|
|
|
|
|
2019-05-20 20:40:30 +03:00
|
|
|
static inline void nvme_set_timestamp(NvmeCtrl *n, uint64_t ts)
|
|
|
|
{
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_setfeat_timestamp(ts);
|
2019-05-20 20:40:30 +03:00
|
|
|
|
|
|
|
n->host_timestamp = le64_to_cpu(ts);
|
|
|
|
n->timestamp_set_qemu_clock_ms = qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL);
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline uint64_t nvme_get_timestamp(const NvmeCtrl *n)
|
|
|
|
{
|
|
|
|
uint64_t current_time = qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL);
|
|
|
|
uint64_t elapsed_time = current_time - n->timestamp_set_qemu_clock_ms;
|
|
|
|
|
|
|
|
union nvme_timestamp {
|
|
|
|
struct {
|
|
|
|
uint64_t timestamp:48;
|
|
|
|
uint64_t sync:1;
|
|
|
|
uint64_t origin:3;
|
|
|
|
uint64_t rsvd1:12;
|
|
|
|
};
|
|
|
|
uint64_t all;
|
|
|
|
};
|
|
|
|
|
|
|
|
union nvme_timestamp ts;
|
|
|
|
ts.all = 0;
|
2020-10-02 10:57:16 +03:00
|
|
|
ts.timestamp = n->host_timestamp + elapsed_time;
|
2019-05-20 20:40:30 +03:00
|
|
|
|
|
|
|
/* If the host timestamp is non-zero, set the timestamp origin */
|
|
|
|
ts.origin = n->host_timestamp ? 0x01 : 0x00;
|
|
|
|
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_getfeat_timestamp(ts.all);
|
2019-05-20 20:40:30 +03:00
|
|
|
|
|
|
|
return cpu_to_le64(ts.all);
|
|
|
|
}
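The union above relies on the compiler's bit-field layout, which is implementation-defined. As a rough sketch of the same packing with explicit shifts, assuming the layout the bit-field order implies (bits 47:0 timestamp in milliseconds, bit 48 synch, bits 51:49 origin); `ex_pack_timestamp` is a hypothetical helper, not part of the driver:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pack a timestamp value with explicit shifts:
 *   bits 47:0  timestamp in milliseconds
 *   bit  48    synch
 *   bits 51:49 origin
 * The remaining upper bits are reserved and left as zero.
 */
static inline uint64_t ex_pack_timestamp(uint64_t ms, unsigned sync,
                                         unsigned origin)
{
    assert(sync <= 1 && origin <= 7);
    return (ms & 0xffffffffffffULL) |
           ((uint64_t)sync << 48) |
           ((uint64_t)origin << 49);
}
```

The shift form produces the same value regardless of how a given compiler allocates bit-fields, at the cost of spelling out the masks by hand.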
|
|
|
|
|
2020-07-20 13:44:01 +03:00
|
|
|
static uint16_t nvme_get_feature_timestamp(NvmeCtrl *n, NvmeRequest *req)
|
2019-05-20 20:40:30 +03:00
|
|
|
{
|
|
|
|
uint64_t timestamp = nvme_get_timestamp(n);
|
|
|
|
|
2019-04-12 21:53:16 +03:00
|
|
|
return nvme_dma(n, (uint8_t *)&timestamp, sizeof(timestamp),
|
|
|
|
DMA_DIRECTION_FROM_DEVICE, req);
|
2019-05-20 20:40:30 +03:00
|
|
|
}
|
|
|
|
|
2020-07-20 13:44:01 +03:00
|
|
|
static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeRequest *req)
|
2013-06-04 19:17:10 +04:00
|
|
|
{
|
2020-07-20 13:44:01 +03:00
|
|
|
NvmeCmd *cmd = &req->cmd;
|
2013-06-04 19:17:10 +04:00
|
|
|
uint32_t dw10 = le32_to_cpu(cmd->cdw10);
|
2020-07-06 09:12:50 +03:00
|
|
|
uint32_t dw11 = le32_to_cpu(cmd->cdw11);
|
2020-07-06 09:12:57 +03:00
|
|
|
uint32_t nsid = le32_to_cpu(cmd->nsid);
|
2015-06-11 13:01:39 +03:00
|
|
|
uint32_t result;
|
2020-07-06 09:12:56 +03:00
|
|
|
uint8_t fid = NVME_GETSETFEAT_FID(dw10);
|
2020-07-06 09:12:57 +03:00
|
|
|
NvmeGetFeatureSelect sel = NVME_GETFEAT_SELECT(dw10);
|
2020-07-06 09:12:56 +03:00
|
|
|
uint16_t iv;
|
2020-10-14 10:55:08 +03:00
|
|
|
NvmeNamespace *ns;
|
2021-01-17 17:53:32 +03:00
|
|
|
int i;
|
2020-07-06 09:12:56 +03:00
|
|
|
|
|
|
|
static const uint32_t nvme_feature_default[NVME_FID_MAX] = {
|
|
|
|
[NVME_ARBITRATION] = NVME_ARB_AB_NOLIMIT,
|
|
|
|
};
|
|
|
|
|
2020-09-30 02:19:04 +03:00
|
|
|
trace_pci_nvme_getfeat(nvme_cid(req), nsid, fid, sel, dw11);
|
2020-07-06 09:12:56 +03:00
|
|
|
|
|
|
|
if (!nvme_feature_support[fid]) {
|
|
|
|
return NVME_INVALID_FIELD | NVME_DNR;
|
|
|
|
}
|
2013-06-04 19:17:10 +04:00
|
|
|
|
2020-07-06 09:12:57 +03:00
|
|
|
if (nvme_feature_cap[fid] & NVME_FEAT_CAP_NS) {
|
2019-06-26 09:51:06 +03:00
|
|
|
if (!nvme_nsid_valid(n, nsid) || nsid == NVME_NSID_BROADCAST) {
|
2020-07-06 09:12:57 +03:00
|
|
|
/*
|
|
|
|
* The Reservation Notification Mask and Reservation Persistence
|
|
|
|
* features require a status code of Invalid Field in Command when
|
|
|
|
* NSID is 0xFFFFFFFF. Since the device does not support those
|
|
|
|
* features we can always return Invalid Namespace or Format as we
|
|
|
|
* should do for all other features.
|
|
|
|
*/
|
|
|
|
return NVME_INVALID_NSID | NVME_DNR;
|
|
|
|
}
|
2019-06-26 09:51:06 +03:00
|
|
|
|
|
|
|
if (!nvme_ns(n, nsid)) {
|
|
|
|
return NVME_INVALID_FIELD | NVME_DNR;
|
|
|
|
}
|
2020-07-06 09:12:57 +03:00
|
|
|
}
|
|
|
|
|
|
|
|
switch (sel) {
|
|
|
|
case NVME_GETFEAT_SELECT_CURRENT:
|
|
|
|
break;
|
|
|
|
case NVME_GETFEAT_SELECT_SAVED:
|
|
|
|
/* no features are saveable by the controller; fallthrough */
|
|
|
|
case NVME_GETFEAT_SELECT_DEFAULT:
|
|
|
|
goto defaults;
|
|
|
|
case NVME_GETFEAT_SELECT_CAP:
|
|
|
|
result = nvme_feature_cap[fid];
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
2020-07-06 09:12:56 +03:00
|
|
|
switch (fid) {
|
2020-07-06 09:12:50 +03:00
|
|
|
case NVME_TEMPERATURE_THRESHOLD:
|
|
|
|
result = 0;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* The controller only implements the Composite Temperature sensor, so
|
|
|
|
* return 0 for all other sensors.
|
|
|
|
*/
|
|
|
|
if (NVME_TEMP_TMPSEL(dw11) != NVME_TEMP_TMPSEL_COMPOSITE) {
|
2020-07-06 09:12:57 +03:00
|
|
|
goto out;
|
2020-07-06 09:12:50 +03:00
|
|
|
}
|
|
|
|
|
|
|
|
switch (NVME_TEMP_THSEL(dw11)) {
|
|
|
|
case NVME_TEMP_THSEL_OVER:
|
|
|
|
result = n->features.temp_thresh_hi;
|
2020-07-06 09:12:57 +03:00
|
|
|
goto out;
|
2020-07-06 09:12:50 +03:00
|
|
|
case NVME_TEMP_THSEL_UNDER:
|
|
|
|
result = n->features.temp_thresh_low;
|
2020-07-06 09:12:57 +03:00
|
|
|
goto out;
|
2020-07-06 09:12:50 +03:00
|
|
|
}
|
|
|
|
|
2020-07-06 09:12:57 +03:00
|
|
|
return NVME_INVALID_FIELD | NVME_DNR;
|
2020-10-14 10:55:08 +03:00
|
|
|
case NVME_ERROR_RECOVERY:
|
|
|
|
if (!nvme_nsid_valid(n, nsid)) {
|
|
|
|
return NVME_INVALID_NSID | NVME_DNR;
|
|
|
|
}
|
|
|
|
|
|
|
|
ns = nvme_ns(n, nsid);
|
|
|
|
if (unlikely(!ns)) {
|
|
|
|
return NVME_INVALID_FIELD | NVME_DNR;
|
|
|
|
}
|
|
|
|
|
|
|
|
result = ns->features.err_rec;
|
|
|
|
goto out;
|
2015-04-30 12:44:17 +03:00
|
|
|
case NVME_VOLATILE_WRITE_CACHE:
|
2021-01-17 17:53:32 +03:00
|
|
|
for (i = 1; i <= n->num_namespaces; i++) {
|
|
|
|
ns = nvme_ns(n, i);
|
|
|
|
if (!ns) {
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
|
|
|
|
result = blk_enable_write_cache(ns->blkconf.blk);
|
|
|
|
if (result) {
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_getfeat_vwcache(result ? "enabled" : "disabled");
|
2020-07-06 09:12:57 +03:00
|
|
|
goto out;
|
|
|
|
case NVME_ASYNCHRONOUS_EVENT_CONF:
|
|
|
|
result = n->features.async_config;
|
|
|
|
goto out;
|
|
|
|
case NVME_TIMESTAMP:
|
2020-07-20 13:44:01 +03:00
|
|
|
return nvme_get_feature_timestamp(n, req);
|
2020-07-06 09:12:57 +03:00
|
|
|
default:
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
|
|
|
defaults:
|
|
|
|
switch (fid) {
|
|
|
|
case NVME_TEMPERATURE_THRESHOLD:
|
|
|
|
result = 0;
|
|
|
|
|
|
|
|
if (NVME_TEMP_TMPSEL(dw11) != NVME_TEMP_TMPSEL_COMPOSITE) {
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (NVME_TEMP_THSEL(dw11) == NVME_TEMP_THSEL_OVER) {
|
|
|
|
result = NVME_TEMPERATURE_WARNING;
|
|
|
|
}
|
|
|
|
|
2015-06-11 13:01:39 +03:00
|
|
|
break;
|
|
|
|
case NVME_NUMBER_OF_QUEUES:
|
2020-07-06 09:12:47 +03:00
|
|
|
result = (n->params.max_ioqpairs - 1) |
|
|
|
|
((n->params.max_ioqpairs - 1) << 16);
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_getfeat_numq(result);
|
2020-07-06 09:12:56 +03:00
|
|
|
break;
|
|
|
|
case NVME_INTERRUPT_VECTOR_CONF:
|
|
|
|
iv = dw11 & 0xffff;
|
|
|
|
if (iv >= n->params.max_ioqpairs + 1) {
|
|
|
|
return NVME_INVALID_FIELD | NVME_DNR;
|
|
|
|
}
|
|
|
|
|
|
|
|
result = iv;
|
|
|
|
if (iv == n->admin_cq.vector) {
|
|
|
|
result |= NVME_INTVC_NOCOALESCING;
|
|
|
|
}
|
2020-12-08 23:04:03 +03:00
|
|
|
break;
|
|
|
|
case NVME_COMMAND_SET_PROFILE:
|
|
|
|
result = 0;
|
2015-04-30 12:44:17 +03:00
|
|
|
break;
|
2013-06-04 19:17:10 +04:00
|
|
|
default:
|
2020-07-06 09:12:56 +03:00
|
|
|
result = nvme_feature_default[fid];
|
|
|
|
break;
|
2013-06-04 19:17:10 +04:00
|
|
|
}
|
2015-06-11 13:01:39 +03:00
|
|
|
|
2020-07-06 09:12:57 +03:00
|
|
|
out:
|
2020-07-06 09:12:47 +03:00
|
|
|
req->cqe.result = cpu_to_le32(result);
|
2013-06-04 19:17:10 +04:00
|
|
|
return NVME_SUCCESS;
|
|
|
|
}
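The NVME_NUMBER_OF_QUEUES case above reports the queue counts as zero-based values, one per 16-bit half of the result. A small sketch of that encoding and its inverse; the `ex_` helpers are hypothetical names introduced here for illustration only:

```c
#include <assert.h>
#include <stdint.h>

/* Encode a queue-pair count as two zero-based 16-bit fields. */
static inline uint32_t ex_numq_encode(uint32_t max_ioqpairs)
{
    assert(max_ioqpairs >= 1 && max_ioqpairs <= 0xffff);
    uint32_t zero_based = max_ioqpairs - 1;

    return zero_based | (zero_based << 16);
}

/* Low half: number of submission queues allocated (one-based). */
static inline uint32_t ex_numq_sq_count(uint32_t result)
{
    return (result & 0xffff) + 1;
}

/* High half: number of completion queues allocated (one-based). */
static inline uint32_t ex_numq_cq_count(uint32_t result)
{
    return (result >> 16) + 1;
}
```

For example, a controller configured with 64 I/O queue pairs would report 0x003f003f, which a host decodes back to 64 submission and 64 completion queues.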
|
|
|
|
|
2020-07-20 13:44:01 +03:00
|
|
|
static uint16_t nvme_set_feature_timestamp(NvmeCtrl *n, NvmeRequest *req)
|
2019-05-20 20:40:30 +03:00
|
|
|
{
|
|
|
|
uint16_t ret;
|
|
|
|
uint64_t timestamp;
|
|
|
|
|
2019-04-12 21:53:16 +03:00
|
|
|
ret = nvme_dma(n, (uint8_t *)&timestamp, sizeof(timestamp),
|
|
|
|
DMA_DIRECTION_TO_DEVICE, req);
|
2019-05-20 20:40:30 +03:00
|
|
|
if (ret != NVME_SUCCESS) {
|
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
|
|
|
nvme_set_timestamp(n, timestamp);
|
|
|
|
|
|
|
|
return NVME_SUCCESS;
|
|
|
|
}
|
|
|
|
|
2020-07-20 13:44:01 +03:00
|
|
|
static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeRequest *req)
|
2013-06-04 19:17:10 +04:00
|
|
|
{
|
2020-10-14 10:55:08 +03:00
|
|
|
NvmeNamespace *ns = NULL;
|
2019-06-26 09:51:06 +03:00
|
|
|
|
2020-07-20 13:44:01 +03:00
|
|
|
NvmeCmd *cmd = &req->cmd;
|
2013-06-04 19:17:10 +04:00
|
|
|
uint32_t dw10 = le32_to_cpu(cmd->cdw10);
|
2015-06-11 13:01:39 +03:00
|
|
|
uint32_t dw11 = le32_to_cpu(cmd->cdw11);
|
2020-07-06 09:12:57 +03:00
|
|
|
uint32_t nsid = le32_to_cpu(cmd->nsid);
|
2020-07-06 09:12:56 +03:00
|
|
|
uint8_t fid = NVME_GETSETFEAT_FID(dw10);
|
2020-07-06 09:12:57 +03:00
|
|
|
uint8_t save = NVME_SETFEAT_SAVE(dw10);
|
2020-10-14 10:55:08 +03:00
|
|
|
int i;
|
2020-07-06 09:12:57 +03:00
|
|
|
|
2020-09-30 02:19:04 +03:00
|
|
|
trace_pci_nvme_setfeat(nvme_cid(req), nsid, fid, save, dw11);
|
2013-06-04 19:17:10 +04:00
|
|
|
|
2021-01-24 18:35:32 +03:00
|
|
|
if (save && !(nvme_feature_cap[fid] & NVME_FEAT_CAP_SAVE)) {
|
2020-07-06 09:12:57 +03:00
|
|
|
return NVME_FID_NOT_SAVEABLE | NVME_DNR;
|
|
|
|
}
|
2020-07-06 09:12:56 +03:00
|
|
|
|
|
|
|
if (!nvme_feature_support[fid]) {
|
|
|
|
return NVME_INVALID_FIELD | NVME_DNR;
|
|
|
|
}
|
|
|
|
|
2020-07-06 09:12:57 +03:00
|
|
|
if (nvme_feature_cap[fid] & NVME_FEAT_CAP_NS) {
|
2019-06-26 09:51:06 +03:00
|
|
|
if (nsid != NVME_NSID_BROADCAST) {
|
|
|
|
if (!nvme_nsid_valid(n, nsid)) {
|
|
|
|
return NVME_INVALID_NSID | NVME_DNR;
|
|
|
|
}
|
|
|
|
|
|
|
|
ns = nvme_ns(n, nsid);
|
|
|
|
if (unlikely(!ns)) {
|
|
|
|
return NVME_INVALID_FIELD | NVME_DNR;
|
|
|
|
}
|
2020-07-06 09:12:57 +03:00
|
|
|
}
|
|
|
|
} else if (nsid && nsid != NVME_NSID_BROADCAST) {
|
2019-06-26 09:51:06 +03:00
|
|
|
if (!nvme_nsid_valid(n, nsid)) {
|
2020-07-06 09:12:57 +03:00
|
|
|
return NVME_INVALID_NSID | NVME_DNR;
|
|
|
|
}
|
|
|
|
|
|
|
|
return NVME_FEAT_NOT_NS_SPEC | NVME_DNR;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (!(nvme_feature_cap[fid] & NVME_FEAT_CAP_CHANGE)) {
|
|
|
|
return NVME_FEAT_NOT_CHANGEABLE | NVME_DNR;
|
|
|
|
}
|
|
|
|
|
2020-07-06 09:12:56 +03:00
|
|
|
switch (fid) {
|
2020-07-06 09:12:50 +03:00
|
|
|
case NVME_TEMPERATURE_THRESHOLD:
|
|
|
|
if (NVME_TEMP_TMPSEL(dw11) != NVME_TEMP_TMPSEL_COMPOSITE) {
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
|
|
|
switch (NVME_TEMP_THSEL(dw11)) {
|
|
|
|
case NVME_TEMP_THSEL_OVER:
|
|
|
|
n->features.temp_thresh_hi = NVME_TEMP_TMPTH(dw11);
|
|
|
|
break;
|
|
|
|
case NVME_TEMP_THSEL_UNDER:
|
|
|
|
n->features.temp_thresh_low = NVME_TEMP_TMPTH(dw11);
|
|
|
|
break;
|
|
|
|
default:
|
|
|
|
return NVME_INVALID_FIELD | NVME_DNR;
|
|
|
|
}
|
|
|
|
|
2021-01-15 06:27:02 +03:00
|
|
|
if ((n->temperature >= n->features.temp_thresh_hi) ||
|
|
|
|
(n->temperature <= n->features.temp_thresh_low)) {
|
|
|
|
nvme_smart_event(n, NVME_AER_INFO_SMART_TEMP_THRESH);
|
2020-07-06 09:12:53 +03:00
|
|
|
}
|
|
|
|
|
2020-10-14 10:55:08 +03:00
|
|
|
break;
|
|
|
|
case NVME_ERROR_RECOVERY:
|
|
|
|
if (nsid == NVME_NSID_BROADCAST) {
|
|
|
|
for (i = 1; i <= n->num_namespaces; i++) {
|
|
|
|
ns = nvme_ns(n, i);
|
|
|
|
|
|
|
|
if (!ns) {
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (NVME_ID_NS_NSFEAT_DULBE(ns->id_ns.nsfeat)) {
|
|
|
|
ns->features.err_rec = dw11;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
|
|
|
assert(ns);
|
2021-01-24 18:54:40 +03:00
|
|
|
if (NVME_ID_NS_NSFEAT_DULBE(ns->id_ns.nsfeat)) {
|
|
|
|
ns->features.err_rec = dw11;
|
|
|
|
}
|
2020-07-06 09:12:50 +03:00
|
|
|
break;
|
2015-06-11 13:01:39 +03:00
|
|
|
case NVME_VOLATILE_WRITE_CACHE:
|
2020-10-14 10:55:08 +03:00
|
|
|
for (i = 1; i <= n->num_namespaces; i++) {
|
2019-06-26 09:51:06 +03:00
|
|
|
ns = nvme_ns(n, i);
|
|
|
|
if (!ns) {
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (!(dw11 & 0x1) && blk_enable_write_cache(ns->blkconf.blk)) {
|
|
|
|
blk_flush(ns->blkconf.blk);
|
|
|
|
}
|
|
|
|
|
|
|
|
blk_set_enable_write_cache(ns->blkconf.blk, dw11 & 1);
|
2020-07-06 09:12:55 +03:00
|
|
|
}
|
|
|
|
|
2015-06-11 13:01:39 +03:00
|
|
|
break;
|
2019-06-26 09:51:06 +03:00
|
|
|
|
2013-06-04 19:17:10 +04:00
|
|
|
case NVME_NUMBER_OF_QUEUES:
|
2020-07-06 09:13:01 +03:00
|
|
|
if (n->qs_created) {
|
|
|
|
return NVME_CMD_SEQ_ERROR | NVME_DNR;
|
|
|
|
}
|
|
|
|
|
2020-07-06 09:12:58 +03:00
|
|
|
/*
|
|
|
|
* NVMe v1.3, Section 5.21.1.7: 0xffff is not an allowed value for NCQR
|
|
|
|
* and NSQR.
|
|
|
|
*/
|
|
|
|
if ((dw11 & 0xffff) == 0xffff || ((dw11 >> 16) & 0xffff) == 0xffff) {
|
|
|
|
return NVME_INVALID_FIELD | NVME_DNR;
|
|
|
|
}
|
|
|
|
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_setfeat_numq((dw11 & 0xFFFF) + 1,
|
|
|
|
((dw11 >> 16) & 0xFFFF) + 1,
|
2020-06-09 22:03:19 +03:00
|
|
|
n->params.max_ioqpairs,
|
|
|
|
n->params.max_ioqpairs);
|
|
|
|
req->cqe.result = cpu_to_le32((n->params.max_ioqpairs - 1) |
|
|
|
|
((n->params.max_ioqpairs - 1) << 16));
|
2013-06-04 19:17:10 +04:00
|
|
|
break;
|
2020-07-06 09:12:53 +03:00
|
|
|
case NVME_ASYNCHRONOUS_EVENT_CONF:
|
|
|
|
n->features.async_config = dw11;
|
|
|
|
break;
|
2019-05-20 20:40:30 +03:00
|
|
|
case NVME_TIMESTAMP:
|
2020-07-20 13:44:01 +03:00
|
|
|
return nvme_set_feature_timestamp(n, req);
|
2020-12-08 23:04:03 +03:00
|
|
|
case NVME_COMMAND_SET_PROFILE:
|
|
|
|
if (dw11 & 0x1ff) {
|
|
|
|
trace_pci_nvme_err_invalid_iocsci(dw11 & 0x1ff);
|
|
|
|
return NVME_CMD_SET_CMB_REJECTED | NVME_DNR;
|
|
|
|
}
|
|
|
|
break;
|
2013-06-04 19:17:10 +04:00
|
|
|
default:
|
2020-07-06 09:12:56 +03:00
|
|
|
return NVME_FEAT_NOT_CHANGEABLE | NVME_DNR;
|
2013-06-04 19:17:10 +04:00
|
|
|
}
|
|
|
|
return NVME_SUCCESS;
|
|
|
|
}
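The NVME_NUMBER_OF_QUEUES case in the function above rejects a request when either 16-bit half of dw11 is 0xffff. A minimal sketch of that check in isolation, assuming the low half carries the requested submission queue count and the high half the completion queue count; `ex_numq_request_valid` is a hypothetical helper:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true when neither requested queue count uses the reserved
 * value 0xffff. <assert.h> is only needed for the checks in the note below.
 */
static inline bool ex_numq_request_valid(uint32_t dw11)
{
    uint32_t nsqr = dw11 & 0xffff;          /* requested submission queues */
    uint32_t ncqr = (dw11 >> 16) & 0xffff;  /* requested completion queues */

    return nsqr != 0xffff && ncqr != 0xffff;
}
```

A caller could exercise it with `assert(ex_numq_request_valid(0x003f003f))` and `assert(!ex_numq_request_valid(0xffff0000))`.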
|
|
|
|
|
2020-07-20 13:44:01 +03:00
|
|
|
static uint16_t nvme_aer(NvmeCtrl *n, NvmeRequest *req)
|
2020-07-06 09:12:53 +03:00
|
|
|
{
|
|
|
|
trace_pci_nvme_aer(nvme_cid(req));
|
|
|
|
|
|
|
|
if (n->outstanding_aers > n->params.aerl) {
|
|
|
|
trace_pci_nvme_aer_aerl_exceeded();
|
|
|
|
return NVME_AER_LIMIT_EXCEEDED;
|
|
|
|
}
|
|
|
|
|
|
|
|
n->aer_reqs[n->outstanding_aers] = req;
|
|
|
|
n->outstanding_aers++;
|
|
|
|
|
|
|
|
if (!QTAILQ_EMPTY(&n->aer_queue)) {
|
|
|
|
nvme_process_aers(n);
|
|
|
|
}
|
|
|
|
|
|
|
|
return NVME_NO_COMPLETE;
|
|
|
|
}
|
|
|
|
|
2020-07-20 13:44:01 +03:00
|
|
|
static uint16_t nvme_admin_cmd(NvmeCtrl *n, NvmeRequest *req)
|
2013-06-04 19:17:10 +04:00
|
|
|
{
|
2020-08-24 23:11:33 +03:00
|
|
|
trace_pci_nvme_admin_cmd(nvme_cid(req), nvme_sqid(req), req->cmd.opcode,
|
|
|
|
nvme_adm_opc_str(req->cmd.opcode));
|
2020-07-06 09:12:48 +03:00
|
|
|
|
2020-12-08 23:04:02 +03:00
|
|
|
if (!(nvme_cse_acs[req->cmd.opcode] & NVME_CMD_EFF_CSUPP)) {
|
|
|
|
trace_pci_nvme_err_invalid_admin_opc(req->cmd.opcode);
|
|
|
|
return NVME_INVALID_OPCODE | NVME_DNR;
|
|
|
|
}
|
|
|
|
|
2020-07-20 13:44:01 +03:00
|
|
|
switch (req->cmd.opcode) {
|
2013-06-04 19:17:10 +04:00
|
|
|
case NVME_ADM_CMD_DELETE_SQ:
|
2020-07-20 13:44:01 +03:00
|
|
|
return nvme_del_sq(n, req);
|
2013-06-04 19:17:10 +04:00
|
|
|
case NVME_ADM_CMD_CREATE_SQ:
|
2020-07-20 13:44:01 +03:00
|
|
|
return nvme_create_sq(n, req);
|
2020-07-06 09:12:52 +03:00
|
|
|
case NVME_ADM_CMD_GET_LOG_PAGE:
|
2020-07-20 13:44:01 +03:00
|
|
|
return nvme_get_log(n, req);
|
2013-06-04 19:17:10 +04:00
|
|
|
case NVME_ADM_CMD_DELETE_CQ:
|
2020-07-20 13:44:01 +03:00
|
|
|
return nvme_del_cq(n, req);
|
2013-06-04 19:17:10 +04:00
|
|
|
case NVME_ADM_CMD_CREATE_CQ:
|
2020-07-20 13:44:01 +03:00
|
|
|
return nvme_create_cq(n, req);
|
2013-06-04 19:17:10 +04:00
|
|
|
case NVME_ADM_CMD_IDENTIFY:
|
2020-07-20 13:44:01 +03:00
|
|
|
return nvme_identify(n, req);
|
2020-07-06 09:12:49 +03:00
|
|
|
case NVME_ADM_CMD_ABORT:
|
2020-07-20 13:44:01 +03:00
|
|
|
return nvme_abort(n, req);
|
2013-06-04 19:17:10 +04:00
|
|
|
case NVME_ADM_CMD_SET_FEATURES:
|
2020-07-20 13:44:01 +03:00
|
|
|
return nvme_set_feature(n, req);
|
2013-06-04 19:17:10 +04:00
|
|
|
case NVME_ADM_CMD_GET_FEATURES:
|
2020-07-20 13:44:01 +03:00
|
|
|
return nvme_get_feature(n, req);
|
2020-07-06 09:12:53 +03:00
|
|
|
case NVME_ADM_CMD_ASYNC_EV_REQ:
|
2020-07-20 13:44:01 +03:00
|
|
|
return nvme_aer(n, req);
|
2013-06-04 19:17:10 +04:00
|
|
|
default:
|
2020-12-08 23:04:02 +03:00
|
|
|
assert(false);
|
2013-06-04 19:17:10 +04:00
|
|
|
}
|
2020-12-08 23:04:02 +03:00
|
|
|
|
|
|
|
return NVME_INVALID_OPCODE | NVME_DNR;
|
2013-06-04 19:17:10 +04:00
|
|
|
}
|
|
|
|
|
|
|
|
static void nvme_process_sq(void *opaque)
|
|
|
|
{
|
|
|
|
NvmeSQueue *sq = opaque;
|
|
|
|
NvmeCtrl *n = sq->ctrl;
|
|
|
|
NvmeCQueue *cq = n->cq[sq->cqid];
|
|
|
|
|
|
|
|
uint16_t status;
|
|
|
|
hwaddr addr;
|
|
|
|
NvmeCmd cmd;
|
|
|
|
NvmeRequest *req;
|
|
|
|
|
|
|
|
while (!(nvme_sq_empty(sq) || QTAILQ_EMPTY(&sq->req_list))) {
|
|
|
|
addr = sq->dma_addr + sq->head * n->sqe_size;
|
2019-10-11 09:32:00 +03:00
|
|
|
if (nvme_addr_read(n, addr, (void *)&cmd, sizeof(cmd))) {
|
|
|
|
trace_pci_nvme_err_addr_read(addr);
|
|
|
|
trace_pci_nvme_err_cfs();
|
|
|
|
n->bar.csts = NVME_CSTS_FAILED;
|
|
|
|
break;
|
|
|
|
}
|
2013-06-04 19:17:10 +04:00
|
|
|
nvme_inc_sq_head(sq);
|
|
|
|
|
|
|
|
req = QTAILQ_FIRST(&sq->req_list);
|
|
|
|
QTAILQ_REMOVE(&sq->req_list, req, entry);
|
|
|
|
QTAILQ_INSERT_TAIL(&sq->out_req_list, req, entry);
|
2020-07-20 13:44:01 +03:00
|
|
|
nvme_req_clear(req);
|
2013-06-04 19:17:10 +04:00
|
|
|
req->cqe.cid = cmd.cid;
|
2020-07-20 13:44:01 +03:00
|
|
|
memcpy(&req->cmd, &cmd, sizeof(NvmeCmd));
|
2013-06-04 19:17:10 +04:00
|
|
|
|
2020-07-20 13:44:01 +03:00
|
|
|
status = sq->sqid ? nvme_io_cmd(n, req) :
|
|
|
|
nvme_admin_cmd(n, req);
|
2013-06-04 19:17:10 +04:00
|
|
|
if (status != NVME_NO_COMPLETE) {
|
|
|
|
req->status = status;
|
|
|
|
nvme_enqueue_req_completion(cq, req);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
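The loop above reads one submission queue entry at `dma_addr + head * sqe_size` and then advances the head. A simplified sketch of that ring arithmetic follows. `ExSQueue` and the `ex_` helpers are hypothetical, and the wrap-around and emptiness rules are assumptions about how the real `nvme_inc_sq_head` and `nvme_sq_empty` behave.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down submission queue state. */
typedef struct {
    uint64_t dma_addr;  /* guest address of the first entry */
    uint32_t head;      /* next entry the controller will consume */
    uint32_t tail;      /* next entry the host will fill */
    uint32_t size;      /* number of entries in the ring */
} ExSQueue;

/* Address of the entry at the current head. */
static inline uint64_t ex_sq_entry_addr(const ExSQueue *sq, uint32_t sqe_size)
{
    return sq->dma_addr + (uint64_t)sq->head * sqe_size;
}

/* The ring is empty when head has caught up with tail. */
static inline bool ex_sq_empty(const ExSQueue *sq)
{
    return sq->head == sq->tail;
}

/* Advance the head by one entry, wrapping at the end of the ring. */
static inline void ex_sq_inc_head(ExSQueue *sq)
{
    assert(sq->size > 0);
    sq->head = (sq->head + 1) % sq->size;
}
```

With a four-entry ring whose head is at index 3, one increment wraps the head back to index 0.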
|
|
|
|
|
2020-12-09 15:10:45 +03:00
|
|
|
static void nvme_ctrl_reset(NvmeCtrl *n)
|
2013-06-04 19:17:10 +04:00
|
|
|
{
|
2019-06-26 09:51:06 +03:00
|
|
|
NvmeNamespace *ns;
|
2013-06-04 19:17:10 +04:00
|
|
|
int i;
|
|
|
|
|
2019-06-26 09:51:06 +03:00
|
|
|
for (i = 1; i <= n->num_namespaces; i++) {
|
|
|
|
ns = nvme_ns(n, i);
|
|
|
|
if (!ns) {
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
|
|
|
|
nvme_ns_drain(ns);
|
|
|
|
}
|
2018-11-06 15:16:55 +03:00
|
|
|
|
2020-06-09 22:03:19 +03:00
|
|
|
for (i = 0; i < n->params.max_ioqpairs + 1; i++) {
|
2013-06-04 19:17:10 +04:00
|
|
|
if (n->sq[i] != NULL) {
|
|
|
|
nvme_free_sq(n->sq[i], n);
|
|
|
|
}
|
|
|
|
}
|
2020-06-09 22:03:19 +03:00
|
|
|
for (i = 0; i < n->params.max_ioqpairs + 1; i++) {
|
2013-06-04 19:17:10 +04:00
|
|
|
if (n->cq[i] != NULL) {
|
|
|
|
nvme_free_cq(n->cq[i], n);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2020-07-06 09:12:53 +03:00
|
|
|
while (!QTAILQ_EMPTY(&n->aer_queue)) {
|
|
|
|
NvmeAsyncEvent *event = QTAILQ_FIRST(&n->aer_queue);
|
|
|
|
QTAILQ_REMOVE(&n->aer_queue, event, entry);
|
|
|
|
g_free(event);
|
|
|
|
}
|
|
|
|
|
|
|
|
n->aer_queued = 0;
|
|
|
|
n->outstanding_aers = 0;
|
2020-07-06 09:13:01 +03:00
|
|
|
n->qs_created = false;
|
2020-12-08 23:03:58 +03:00
|
|
|
|
|
|
|
n->bar.cc = 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
static void nvme_ctrl_shutdown(NvmeCtrl *n)
|
|
|
|
{
|
|
|
|
NvmeNamespace *ns;
|
|
|
|
int i;
|
|
|
|
|
2020-11-13 08:30:05 +03:00
|
|
|
if (n->pmr.dev) {
|
|
|
|
memory_region_msync(&n->pmr.dev->mr, 0, n->pmr.dev->size);
|
2020-12-09 15:10:45 +03:00
|
|
|
}
|
2020-07-06 09:12:53 +03:00
|
|
|
|
2019-06-26 09:51:06 +03:00
|
|
|
for (i = 1; i <= n->num_namespaces; i++) {
|
|
|
|
ns = nvme_ns(n, i);
|
|
|
|
if (!ns) {
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
|
2020-12-08 23:03:58 +03:00
|
|
|
nvme_ns_shutdown(ns);
|
|
|
|
}
|
2013-06-04 19:17:10 +04:00
|
|
|
}
|
|
|
|
|
2020-12-08 23:04:02 +03:00
|
|
|
static void nvme_select_ns_iocs(NvmeCtrl *n)
|
|
|
|
{
|
|
|
|
NvmeNamespace *ns;
|
|
|
|
int i;
|
|
|
|
|
|
|
|
for (i = 1; i <= n->num_namespaces; i++) {
|
|
|
|
ns = nvme_ns(n, i);
|
|
|
|
if (!ns) {
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
ns->iocs = nvme_cse_iocs_none;
|
2020-12-08 23:04:03 +03:00
|
|
|
switch (ns->csi) {
|
|
|
|
case NVME_CSI_NVM:
|
|
|
|
if (NVME_CC_CSS(n->bar.cc) != NVME_CC_CSS_ADMIN_ONLY) {
|
|
|
|
ns->iocs = nvme_cse_iocs_nvm;
|
|
|
|
}
|
|
|
|
break;
|
2020-12-08 23:04:06 +03:00
|
|
|
case NVME_CSI_ZONED:
|
|
|
|
if (NVME_CC_CSS(n->bar.cc) == NVME_CC_CSS_CSI) {
|
|
|
|
ns->iocs = nvme_cse_iocs_zoned;
|
|
|
|
} else if (NVME_CC_CSS(n->bar.cc) == NVME_CC_CSS_NVM) {
|
|
|
|
ns->iocs = nvme_cse_iocs_nvm;
|
|
|
|
}
|
|
|
|
break;
|
2020-12-08 23:04:02 +03:00
|
|
|
}
|
|
|
|
}
|
|
|
|
}
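/*
 * Illustrative sketch, not part of the driver: the selection rule in
 * nvme_select_ns_iocs() reduced to a pure function of the namespace's
 * command set identifier and the CC.CSS value. All enum and function
 * names below are hypothetical; the numeric encodings are assumed from
 * the NVMe specification and are not taken from this file.
 */

```c
#include <assert.h>

/* Assumed CC.CSS encodings (hypothetical names). */
enum sketch_css { SK_CSS_NVM = 0x0, SK_CSS_CSI = 0x6, SK_CSS_ADMIN_ONLY = 0x7 };
/* Assumed command set identifiers (hypothetical names). */
enum sketch_csi { SK_CSI_NVM = 0x00, SK_CSI_ZONED = 0x02 };
/* Which I/O command table a namespace ends up with. */
enum sketch_iocs { SK_IOCS_NONE, SK_IOCS_NVM, SK_IOCS_ZONED };

static enum sketch_iocs sketch_select_iocs(enum sketch_csi csi,
                                           enum sketch_css css)
{
    switch (csi) {
    case SK_CSI_NVM:
        /* NVM namespaces accept I/O unless only admin commands are enabled. */
        return css != SK_CSS_ADMIN_ONLY ? SK_IOCS_NVM : SK_IOCS_NONE;
    case SK_CSI_ZONED:
        /* Zoned namespaces fall back to plain NVM commands under CSS=NVM. */
        if (css == SK_CSS_CSI) {
            return SK_IOCS_ZONED;
        }
        if (css == SK_CSS_NVM) {
            return SK_IOCS_NVM;
        }
        return SK_IOCS_NONE;
    }
    return SK_IOCS_NONE;
}
```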
|
|
|
|
|
2013-06-04 19:17:10 +04:00
|
|
|
static int nvme_start_ctrl(NvmeCtrl *n)
|
|
|
|
{
|
|
|
|
uint32_t page_bits = NVME_CC_MPS(n->bar.cc) + 12;
|
|
|
|
uint32_t page_size = 1 << page_bits;
|
|
|
|
|
2017-11-03 16:37:53 +03:00
|
|
|
if (unlikely(n->cq[0])) {
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_err_startfail_cq();
|
2017-11-03 16:37:53 +03:00
|
|
|
return -1;
|
|
|
|
}
|
|
|
|
if (unlikely(n->sq[0])) {
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_err_startfail_sq();
|
2017-11-03 16:37:53 +03:00
|
|
|
return -1;
|
|
|
|
}
|
|
|
|
if (unlikely(!n->bar.asq)) {
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_err_startfail_nbarasq();
|
2017-11-03 16:37:53 +03:00
|
|
|
return -1;
|
|
|
|
}
|
|
|
|
if (unlikely(!n->bar.acq)) {
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_err_startfail_nbaracq();
|
2017-11-03 16:37:53 +03:00
|
|
|
return -1;
|
|
|
|
}
|
|
|
|
if (unlikely(n->bar.asq & (page_size - 1))) {
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_err_startfail_asq_misaligned(n->bar.asq);
|
2017-11-03 16:37:53 +03:00
|
|
|
return -1;
|
|
|
|
}
|
|
|
|
if (unlikely(n->bar.acq & (page_size - 1))) {
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_err_startfail_acq_misaligned(n->bar.acq);
|
2017-11-03 16:37:53 +03:00
|
|
|
return -1;
|
|
|
|
}
|
2020-09-30 20:54:05 +03:00
|
|
|
if (unlikely(!(NVME_CAP_CSS(n->bar.cap) & (1 << NVME_CC_CSS(n->bar.cc))))) {
|
|
|
|
trace_pci_nvme_err_startfail_css(NVME_CC_CSS(n->bar.cc));
|
|
|
|
return -1;
|
|
|
|
}
|
2017-11-03 16:37:53 +03:00
|
|
|
if (unlikely(NVME_CC_MPS(n->bar.cc) <
|
|
|
|
NVME_CAP_MPSMIN(n->bar.cap))) {
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_err_startfail_page_too_small(
|
2017-11-03 16:37:53 +03:00
|
|
|
NVME_CC_MPS(n->bar.cc),
|
|
|
|
NVME_CAP_MPSMIN(n->bar.cap));
|
|
|
|
return -1;
|
|
|
|
}
|
|
|
|
if (unlikely(NVME_CC_MPS(n->bar.cc) >
|
|
|
|
NVME_CAP_MPSMAX(n->bar.cap))) {
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_err_startfail_page_too_large(
|
2017-11-03 16:37:53 +03:00
|
|
|
NVME_CC_MPS(n->bar.cc),
|
|
|
|
NVME_CAP_MPSMAX(n->bar.cap));
|
|
|
|
return -1;
|
|
|
|
}
|
|
|
|
if (unlikely(NVME_CC_IOCQES(n->bar.cc) <
|
|
|
|
NVME_CTRL_CQES_MIN(n->id_ctrl.cqes))) {
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_err_startfail_cqent_too_small(
|
2017-11-03 16:37:53 +03:00
|
|
|
NVME_CC_IOCQES(n->bar.cc),
|
|
|
|
NVME_CTRL_CQES_MIN(n->id_ctrl.cqes));
|
|
|
|
return -1;
|
|
|
|
}
|
|
|
|
if (unlikely(NVME_CC_IOCQES(n->bar.cc) >
|
|
|
|
NVME_CTRL_CQES_MAX(n->id_ctrl.cqes))) {
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_err_startfail_cqent_too_large(
|
2017-11-03 16:37:53 +03:00
|
|
|
NVME_CC_IOCQES(n->bar.cc),
|
|
|
|
NVME_CTRL_CQES_MAX(n->id_ctrl.cqes));
|
|
|
|
return -1;
|
|
|
|
}
|
|
|
|
if (unlikely(NVME_CC_IOSQES(n->bar.cc) <
|
|
|
|
NVME_CTRL_SQES_MIN(n->id_ctrl.sqes))) {
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_err_startfail_sqent_too_small(
|
2017-11-03 16:37:53 +03:00
|
|
|
NVME_CC_IOSQES(n->bar.cc),
|
|
|
|
NVME_CTRL_SQES_MIN(n->id_ctrl.sqes));
|
|
|
|
return -1;
|
|
|
|
}
|
|
|
|
if (unlikely(NVME_CC_IOSQES(n->bar.cc) >
|
|
|
|
NVME_CTRL_SQES_MAX(n->id_ctrl.sqes))) {
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_err_startfail_sqent_too_large(
|
2017-11-03 16:37:53 +03:00
|
|
|
NVME_CC_IOSQES(n->bar.cc),
|
|
|
|
NVME_CTRL_SQES_MAX(n->id_ctrl.sqes));
|
|
|
|
return -1;
|
|
|
|
}
|
|
|
|
if (unlikely(!NVME_AQA_ASQS(n->bar.aqa))) {
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_err_startfail_asqent_sz_zero();
|
2017-11-03 16:37:53 +03:00
|
|
|
return -1;
|
|
|
|
}
|
|
|
|
if (unlikely(!NVME_AQA_ACQS(n->bar.aqa))) {
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_err_startfail_acqent_sz_zero();
|
2013-06-04 19:17:10 +04:00
|
|
|
return -1;
|
|
|
|
}
|
|
|
|
|
|
|
|
n->page_bits = page_bits;
|
|
|
|
n->page_size = page_size;
|
|
|
|
n->max_prp_ents = n->page_size / sizeof(uint64_t);
|
|
|
|
n->cqe_size = 1 << NVME_CC_IOCQES(n->bar.cc);
|
|
|
|
n->sqe_size = 1 << NVME_CC_IOSQES(n->bar.cc);
|
|
|
|
nvme_init_cq(&n->admin_cq, n, n->bar.acq, 0, 0,
|
2020-08-24 09:58:56 +03:00
|
|
|
NVME_AQA_ACQS(n->bar.aqa) + 1, 1);
|
2013-06-04 19:17:10 +04:00
|
|
|
nvme_init_sq(&n->admin_sq, n, n->bar.asq, 0, 0,
|
2020-08-24 09:58:56 +03:00
|
|
|
NVME_AQA_ASQS(n->bar.aqa) + 1);
|
2013-06-04 19:17:10 +04:00
|
|
|
|
2020-12-08 23:04:06 +03:00
|
|
|
if (!n->params.zasl_bs) {
|
|
|
|
n->zasl = n->params.mdts;
|
|
|
|
} else {
|
|
|
|
if (n->params.zasl_bs < n->page_size) {
|
|
|
|
trace_pci_nvme_err_startfail_zasl_too_small(n->params.zasl_bs,
|
|
|
|
n->page_size);
|
|
|
|
return -1;
|
|
|
|
}
|
|
|
|
n->zasl = 31 - clz32(n->params.zasl_bs / n->page_size);
|
|
|
|
}
|
|
|
|
|
2019-05-20 20:40:30 +03:00
|
|
|
nvme_set_timestamp(n, 0ULL);
|
|
|
|
|
2020-07-06 09:12:53 +03:00
|
|
|
QTAILQ_INIT(&n->aer_queue);
|
|
|
|
|
2020-12-08 23:04:02 +03:00
|
|
|
nvme_select_ns_iocs(n);
|
|
|
|
|
2013-06-04 19:17:10 +04:00
|
|
|
return 0;
|
|
|
|
}
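/*
 * Illustrative sketch, not part of the driver: how nvme_start_ctrl()
 * derives the memory page size from CC.MPS and turns a byte limit
 * (zasl_bs) into the power-of-two exponent stored in n->zasl. The
 * helper names are hypothetical, and sketch_ilog2() stands in for the
 * "31 - clz32(x)" expression used above. The zasl_bs == 0 case, where
 * the driver falls back to mdts, is deliberately left out.
 */

```c
#include <assert.h>
#include <stdint.h>

/* floor(log2(v)) for v > 0; same result as 31 - clz32(v). */
static unsigned sketch_ilog2(uint32_t v)
{
    unsigned r = 0;

    assert(v != 0);
    while (v >>= 1) {
        r++;
    }
    return r;
}

/* CC.MPS encodes the page size as 2^(12 + MPS) bytes. */
static uint32_t sketch_page_size(unsigned mps)
{
    return UINT32_C(1) << (mps + 12);
}

/*
 * Exponent such that (page_size << exponent) covers zasl_bs, or -1 when
 * the limit is smaller than one page (the start-up failure case above).
 */
static int sketch_zasl(uint32_t zasl_bs, uint32_t page_size)
{
    if (zasl_bs < page_size) {
        return -1;
    }
    return (int)sketch_ilog2(zasl_bs / page_size);
}
```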
|
|
|
|
|
2020-12-18 02:32:16 +03:00
|
|
|
static void nvme_cmb_enable_regs(NvmeCtrl *n)
|
|
|
|
{
|
2020-12-18 02:32:57 +03:00
|
|
|
NVME_CMBLOC_SET_CDPCILS(n->bar.cmbloc, 1);
|
|
|
|
NVME_CMBLOC_SET_CDPMLS(n->bar.cmbloc, 1);
|
2020-12-18 02:32:16 +03:00
|
|
|
NVME_CMBLOC_SET_BIR(n->bar.cmbloc, NVME_CMB_BIR);
|
|
|
|
|
|
|
|
NVME_CMBSZ_SET_SQS(n->bar.cmbsz, 1);
|
|
|
|
NVME_CMBSZ_SET_CQS(n->bar.cmbsz, 0);
|
|
|
|
NVME_CMBSZ_SET_LISTS(n->bar.cmbsz, 1);
|
|
|
|
NVME_CMBSZ_SET_RDS(n->bar.cmbsz, 1);
|
|
|
|
NVME_CMBSZ_SET_WDS(n->bar.cmbsz, 1);
|
|
|
|
NVME_CMBSZ_SET_SZU(n->bar.cmbsz, 2); /* MBs */
|
|
|
|
NVME_CMBSZ_SET_SZ(n->bar.cmbsz, n->params.cmb_size_mb);
|
|
|
|
}
|
|
|
|
|
2013-06-04 19:17:10 +04:00
|
|
|
static void nvme_write_bar(NvmeCtrl *n, hwaddr offset, uint64_t data,
|
2020-08-24 09:58:56 +03:00
|
|
|
unsigned size)
|
2013-06-04 19:17:10 +04:00
|
|
|
{
|
2017-11-03 16:37:53 +03:00
|
|
|
if (unlikely(offset & (sizeof(uint32_t) - 1))) {
|
2020-06-09 22:03:13 +03:00
|
|
|
NVME_GUEST_ERR(pci_nvme_ub_mmiowr_misaligned32,
|
2017-11-03 16:37:53 +03:00
|
|
|
"MMIO write not 32-bit aligned,"
|
|
|
|
" offset=0x%"PRIx64"", offset);
|
|
|
|
/* should be ignored, fall through for now */
|
|
|
|
}
|
|
|
|
|
|
|
|
if (unlikely(size < sizeof(uint32_t))) {
|
2020-06-09 22:03:13 +03:00
|
|
|
NVME_GUEST_ERR(pci_nvme_ub_mmiowr_toosmall,
|
2017-11-03 16:37:53 +03:00
|
|
|
"MMIO write smaller than 32-bits,"
|
|
|
|
" offset=0x%"PRIx64", size=%u",
|
|
|
|
offset, size);
|
|
|
|
/* should be ignored, fall through for now */
|
|
|
|
}
|
|
|
|
|
2013-06-04 19:17:10 +04:00
|
|
|
switch (offset) {
|
2017-11-03 16:37:53 +03:00
|
|
|
case 0xc: /* INTMS */
|
|
|
|
if (unlikely(msix_enabled(&(n->parent_obj)))) {
|
2020-06-09 22:03:13 +03:00
|
|
|
NVME_GUEST_ERR(pci_nvme_ub_mmiowr_intmask_with_msix,
|
2017-11-03 16:37:53 +03:00
|
|
|
"undefined access to interrupt mask set"
|
|
|
|
" when MSI-X is enabled");
|
|
|
|
/* should be ignored, fall through for now */
|
|
|
|
}
|
2013-06-04 19:17:10 +04:00
|
|
|
n->bar.intms |= data & 0xffffffff;
|
|
|
|
n->bar.intmc = n->bar.intms;
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_mmio_intm_set(data & 0xffffffff, n->bar.intmc);
|
2017-12-18 08:00:43 +03:00
|
|
|
nvme_irq_check(n);
|
2013-06-04 19:17:10 +04:00
|
|
|
break;
|
2017-11-03 16:37:53 +03:00
|
|
|
case 0x10: /* INTMC */
|
|
|
|
if (unlikely(msix_enabled(&(n->parent_obj)))) {
|
2020-06-09 22:03:13 +03:00
|
|
|
NVME_GUEST_ERR(pci_nvme_ub_mmiowr_intmask_with_msix,
|
2017-11-03 16:37:53 +03:00
|
|
|
"undefined access to interrupt mask clr"
|
|
|
|
" when MSI-X is enabled");
|
|
|
|
/* should be ignored, fall through for now */
|
|
|
|
}
|
2013-06-04 19:17:10 +04:00
|
|
|
n->bar.intms &= ~(data & 0xffffffff);
|
|
|
|
n->bar.intmc = n->bar.intms;
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_mmio_intm_clr(data & 0xffffffff, n->bar.intmc);
|
2017-12-18 08:00:43 +03:00
|
|
|
nvme_irq_check(n);
|
2013-06-04 19:17:10 +04:00
|
|
|
break;
|
2017-11-03 16:37:53 +03:00
|
|
|
case 0x14: /* CC */
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_mmio_cfg(data & 0xffffffff);
|
2015-04-24 21:55:42 +03:00
|
|
|
/* Windows first sends data, then sends enable bit */
|
|
|
|
if (!NVME_CC_EN(data) && !NVME_CC_EN(n->bar.cc) &&
|
|
|
|
!NVME_CC_SHN(data) && !NVME_CC_SHN(n->bar.cc))
|
|
|
|
{
|
|
|
|
n->bar.cc = data;
|
|
|
|
}
|
|
|
|
|
2013-06-04 19:17:10 +04:00
|
|
|
if (NVME_CC_EN(data) && !NVME_CC_EN(n->bar.cc)) {
|
|
|
|
n->bar.cc = data;
|
2017-11-03 16:37:53 +03:00
|
|
|
if (unlikely(nvme_start_ctrl(n))) {
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_err_startfail();
|
2013-06-04 19:17:10 +04:00
|
|
|
n->bar.csts = NVME_CSTS_FAILED;
|
|
|
|
} else {
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_mmio_start_success();
|
2013-06-04 19:17:10 +04:00
|
|
|
n->bar.csts = NVME_CSTS_READY;
|
|
|
|
}
|
|
|
|
} else if (!NVME_CC_EN(data) && NVME_CC_EN(n->bar.cc)) {
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_mmio_stopped();
|
2020-12-08 23:03:58 +03:00
|
|
|
nvme_ctrl_reset(n);
|
2013-06-04 19:17:10 +04:00
|
|
|
n->bar.csts &= ~NVME_CSTS_READY;
|
|
|
|
}
|
|
|
|
if (NVME_CC_SHN(data) && !(NVME_CC_SHN(n->bar.cc))) {
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_mmio_shutdown_set();
|
2020-12-08 23:03:58 +03:00
|
|
|
nvme_ctrl_shutdown(n);
|
2017-11-03 16:37:53 +03:00
|
|
|
n->bar.cc = data;
|
|
|
|
n->bar.csts |= NVME_CSTS_SHST_COMPLETE;
|
2013-06-04 19:17:10 +04:00
|
|
|
} else if (!NVME_CC_SHN(data) && NVME_CC_SHN(n->bar.cc)) {
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_mmio_shutdown_cleared();
|
2017-11-03 16:37:53 +03:00
|
|
|
n->bar.csts &= ~NVME_CSTS_SHST_COMPLETE;
|
|
|
|
n->bar.cc = data;
|
|
|
|
}
|
|
|
|
break;
|
|
|
|
case 0x1C: /* CSTS */
|
|
|
|
if (data & (1 << 4)) {
|
2020-06-09 22:03:13 +03:00
|
|
|
NVME_GUEST_ERR(pci_nvme_ub_mmiowr_ssreset_w1c_unsupported,
|
2017-11-03 16:37:53 +03:00
|
|
|
"attempted to W1C CSTS.NSSRO"
|
|
|
|
" but CAP.NSSRS is zero (not supported)");
|
|
|
|
} else if (data != 0) {
|
2020-06-09 22:03:13 +03:00
|
|
|
NVME_GUEST_ERR(pci_nvme_ub_mmiowr_ro_csts,
|
2017-11-03 16:37:53 +03:00
|
|
|
"attempted to set a read only bit"
|
|
|
|
" of controller status");
|
|
|
|
}
|
|
|
|
break;
|
|
|
|
case 0x20: /* NSSR */
|
|
|
|
if (data == 0x4E564D65) {
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_ub_mmiowr_ssreset_unsupported();
|
2017-11-03 16:37:53 +03:00
|
|
|
} else {
|
|
|
|
/* The spec says that writes of other values have no effect */
|
|
|
|
return;
|
2013-06-04 19:17:10 +04:00
|
|
|
}
|
|
|
|
break;
|
2017-11-03 16:37:53 +03:00
|
|
|
case 0x24: /* AQA */
|
2013-06-04 19:17:10 +04:00
|
|
|
n->bar.aqa = data & 0xffffffff;
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_mmio_aqattr(data & 0xffffffff);
|
2013-06-04 19:17:10 +04:00
|
|
|
break;
|
2017-11-03 16:37:53 +03:00
|
|
|
case 0x28: /* ASQ */
|
2021-01-18 09:31:45 +03:00
|
|
|
n->bar.asq = size == 8 ? data :
|
|
|
|
(n->bar.asq & ~0xffffffffULL) | (data & 0xffffffff);
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_mmio_asqaddr(data);
|
2013-06-04 19:17:10 +04:00
|
|
|
break;
|
2017-11-03 16:37:53 +03:00
|
|
|
case 0x2c: /* ASQ hi */
|
2021-01-18 09:31:45 +03:00
|
|
|
n->bar.asq = (n->bar.asq & 0xffffffff) | (data << 32);
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_mmio_asqaddr_hi(data, n->bar.asq);
|
2013-06-04 19:17:10 +04:00
|
|
|
break;
|
2017-11-03 16:37:53 +03:00
|
|
|
case 0x30: /* ACQ */
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_mmio_acqaddr(data);
|
2021-01-18 09:31:45 +03:00
|
|
|
n->bar.acq = size == 8 ? data :
|
|
|
|
(n->bar.acq & ~0xffffffffULL) | (data & 0xffffffff);
|
2013-06-04 19:17:10 +04:00
|
|
|
break;
|
2017-11-03 16:37:53 +03:00
|
|
|
case 0x34: /* ACQ hi */
|
2021-01-18 09:31:45 +03:00
|
|
|
n->bar.acq = (n->bar.acq & 0xffffffff) | (data << 32);
|
2020-06-09 22:03:13 +03:00
|
|
|
trace_pci_nvme_mmio_acqaddr_hi(data, n->bar.acq);
|
2013-06-04 19:17:10 +04:00
|
|
|
break;
|
2017-11-03 16:37:53 +03:00
|
|
|
case 0x38: /* CMBLOC */
|
2020-06-09 22:03:13 +03:00
|
|
|
NVME_GUEST_ERR(pci_nvme_ub_mmiowr_cmbloc_reserved,
|
2017-11-03 16:37:53 +03:00
|
|
|
"invalid write to reserved CMBLOC"
|
|
|
|
" when CMBSZ is zero, ignored");
|
|
|
|
return;
|
|
|
|
case 0x3C: /* CMBSZ */
|
2020-06-09 22:03:13 +03:00
|
|
|
NVME_GUEST_ERR(pci_nvme_ub_mmiowr_cmbsz_readonly,
|
2017-11-03 16:37:53 +03:00
|
|
|
"invalid write to read only CMBSZ, ignored");
|
|
|
|
return;
|
2020-12-18 02:32:16 +03:00
|
|
|
case 0x50: /* CMBMSC */
|
|
|
|
if (!NVME_CAP_CMBS(n->bar.cap)) {
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
|
|
|
n->bar.cmbmsc = size == 8 ? data :
|
|
|
|
(n->bar.cmbmsc & ~0xffffffffULL) | (data & 0xffffffff);
|
|
|
|
n->cmb.cmse = false;
|
|
|
|
|
|
|
|
if (NVME_CMBMSC_CRE(data)) {
|
|
|
|
nvme_cmb_enable_regs(n);
|
|
|
|
|
|
|
|
if (NVME_CMBMSC_CMSE(data)) {
|
|
|
|
hwaddr cba = NVME_CMBMSC_CBA(data) << CMBMSC_CBA_SHIFT;
|
|
|
|
if (cba + int128_get64(n->cmb.mem.size) < cba) {
|
|
|
|
NVME_CMBSTS_SET_CBAI(n->bar.cmbsts, 1);
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
|
|
|
n->cmb.cba = cba;
|
|
|
|
n->cmb.cmse = true;
|
|
|
|
}
|
|
|
|
} else {
|
|
|
|
n->bar.cmbsz = 0;
|
|
|
|
n->bar.cmbloc = 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
return;
|
|
|
|
case 0x54: /* CMBMSC hi */
|
|
|
|
n->bar.cmbmsc = (n->bar.cmbmsc & 0xffffffff) | (data << 32);
|
|
|
|
return;
|
|
|
|
|
2020-03-30 19:46:56 +03:00
|
|
|
case 0xE00: /* PMRCAP */
|
2020-06-09 22:03:13 +03:00
|
|
|
NVME_GUEST_ERR(pci_nvme_ub_mmiowr_pmrcap_readonly,
|
2020-03-30 19:46:56 +03:00
|
|
|
"invalid write to PMRCAP register, ignored");
|
|
|
|
return;
|
2020-12-18 15:04:19 +03:00
|
|
|
case 0xE04: /* PMRCTL */
|
|
|
|
n->bar.pmrctl = data;
|
|
|
|
if (NVME_PMRCTL_EN(data)) {
|
2020-11-13 08:30:05 +03:00
|
|
|
memory_region_set_enabled(&n->pmr.dev->mr, true);
|
2020-12-18 15:04:19 +03:00
|
|
|
n->bar.pmrsts = 0;
|
|
|
|
} else {
|
2020-11-13 08:30:05 +03:00
|
|
|
memory_region_set_enabled(&n->pmr.dev->mr, false);
|
2020-12-18 15:04:19 +03:00
|
|
|
NVME_PMRSTS_SET_NRDY(n->bar.pmrsts, 1);
|
2020-11-13 08:30:05 +03:00
|
|
|
n->pmr.cmse = false;
|
2020-12-18 15:04:19 +03:00
|
|
|
}
|
|
|
|
return;
|
2020-03-30 19:46:56 +03:00
|
|
|
case 0xE08: /* PMRSTS */
|
2020-06-09 22:03:13 +03:00
|
|
|
NVME_GUEST_ERR(pci_nvme_ub_mmiowr_pmrsts_readonly,
|
2020-03-30 19:46:56 +03:00
|
|
|
"invalid write to PMRSTS register, ignored");
|
|
|
|
return;
|
|
|
|
case 0xE0C: /* PMREBS */
|
2020-06-09 22:03:13 +03:00
|
|
|
NVME_GUEST_ERR(pci_nvme_ub_mmiowr_pmrebs_readonly,
|
2020-03-30 19:46:56 +03:00
|
|
|
"invalid write to PMREBS register, ignored");
|
|
|
|
return;
|
|
|
|
case 0xE10: /* PMRSWTP */
|
2020-06-09 22:03:13 +03:00
|
|
|
NVME_GUEST_ERR(pci_nvme_ub_mmiowr_pmrswtp_readonly,
|
2020-03-30 19:46:56 +03:00
|
|
|
"invalid write to PMRSWTP register, ignored");
|
|
|
|
return;
|
2020-11-13 08:30:05 +03:00
|
|
|
case 0xE14: /* PMRMSCL */
|
|
|
|
if (!NVME_CAP_PMRS(n->bar.cap)) {
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
|
|
|
n->bar.pmrmsc = (n->bar.pmrmsc & ~0xffffffffULL) | (data & 0xffffffff);
|
|
|
|
n->pmr.cmse = false;
|
|
|
|
|
|
|
|
if (NVME_PMRMSC_CMSE(n->bar.pmrmsc)) {
|
|
|
|
hwaddr cba = NVME_PMRMSC_CBA(n->bar.pmrmsc) << PMRMSC_CBA_SHIFT;
|
|
|
|
if (cba + int128_get64(n->pmr.dev->mr.size) < cba) {
|
|
|
|
NVME_PMRSTS_SET_CBAI(n->bar.pmrsts, 1);
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
|
|
|
n->pmr.cmse = true;
|
|
|
|
n->pmr.cba = cba;
|
|
|
|
}
|
|
|
|
|
|
|
|
return;
|
|
|
|
case 0xE18: /* PMRMSCU */
|
|
|
|
if (!NVME_CAP_PMRS(n->bar.cap)) {
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
|
|
|
n->bar.pmrmsc = (n->bar.pmrmsc & 0xffffffff) | (data << 32);
|
|
|
|
return;
|
2013-06-04 19:17:10 +04:00
|
|
|
default:
|
2020-06-09 22:03:13 +03:00
|
|
|
NVME_GUEST_ERR(pci_nvme_ub_mmiowr_invalid,
|
2017-11-03 16:37:53 +03:00
|
|
|
"invalid MMIO write,"
|
|
|
|
" offset=0x%"PRIx64", data=%"PRIx64"",
|
|
|
|
offset, data);
|
2013-06-04 19:17:10 +04:00
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
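/*
 * Illustrative sketch, not part of the driver: how a 64-bit register
 * such as ASQ or ACQ is assembled from two 32-bit MMIO writes, as done
 * in the 0x28/0x2c and 0x30/0x34 cases above. The function names are
 * hypothetical. The mask must be a 64-bit constant: a plain
 * ~0xffffffff is evaluated as a 32-bit unsigned value, yields 0, and
 * would clear the upper half instead of preserving it.
 */

```c
#include <assert.h>
#include <stdint.h>

/* Write to the low dword; an 8-byte access replaces the whole register. */
static uint64_t sketch_reg64_write_lo(uint64_t reg, uint64_t data,
                                      unsigned size)
{
    if (size == 8) {
        return data;
    }
    return (reg & ~UINT64_C(0xffffffff)) | (data & UINT64_C(0xffffffff));
}

/* Write to the high dword; the low dword is preserved. */
static uint64_t sketch_reg64_write_hi(uint64_t reg, uint64_t data)
{
    return (reg & UINT64_C(0xffffffff)) | (data << 32);
}
```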
|
|
|
|
|
|
|
|
static uint64_t nvme_mmio_read(void *opaque, hwaddr addr, unsigned size)
|
|
|
|
{
|
|
|
|
NvmeCtrl *n = (NvmeCtrl *)opaque;
|
|
|
|
uint8_t *ptr = (uint8_t *)&n->bar;
|
|
|
|
uint64_t val = 0;
|
|
|
|
|
2021-01-18 09:30:50 +03:00
|
|
|
trace_pci_nvme_mmio_read(addr, size);
|
2020-07-06 09:12:48 +03:00
|
|
|
|
2017-11-03 16:37:53 +03:00
|
|
|
if (unlikely(addr & (sizeof(uint32_t) - 1))) {
|
2020-06-09 22:03:13 +03:00
|
|
|
NVME_GUEST_ERR(pci_nvme_ub_mmiord_misaligned32,
|
2017-11-03 16:37:53 +03:00
|
|
|
"MMIO read not 32-bit aligned,"
|
|
|
|
" offset=0x%"PRIx64"", addr);
|
|
|
|
/* should RAZ, fall through for now */
|
|
|
|
} else if (unlikely(size < sizeof(uint32_t))) {
|
2020-06-09 22:03:13 +03:00
|
|
|
NVME_GUEST_ERR(pci_nvme_ub_mmiord_toosmall,
|
2017-11-03 16:37:53 +03:00
|
|
|
"MMIO read smaller than 32-bits,"
|
|
|
|
" offset=0x%"PRIx64"", addr);
|
|
|
|
/* should RAZ, fall through for now */
|
|
|
|
}
|
|
|
|
|
2013-06-04 19:17:10 +04:00
|
|
|
if (addr < sizeof(n->bar)) {
|
2020-03-30 19:46:56 +03:00
|
|
|
/*
|
|
|
|
* When PMRWBM bit 1 is set then read from
|
|
|
|
* PMRSTS should ensure prior writes
|
|
|
|
* made it to persistent media
|
|
|
|
*/
|
|
|
|
if (addr == 0xE08 &&
|
|
|
|
(NVME_PMRCAP_PMRWBM(n->bar.pmrcap) & 0x02)) {
|
2020-11-13 08:30:05 +03:00
|
|
|
memory_region_msync(&n->pmr.dev->mr, 0, n->pmr.dev->size);
|
2020-03-30 19:46:56 +03:00
|
|
|
}
|
2013-06-04 19:17:10 +04:00
|
|
|
memcpy(&val, ptr + addr, size);
|
2017-11-03 16:37:53 +03:00
|
|
|
} else {
|
2020-06-09 22:03:13 +03:00
|
|
|
NVME_GUEST_ERR(pci_nvme_ub_mmiord_invalid_ofs,
|
2017-11-03 16:37:53 +03:00
|
|
|
"MMIO read beyond last register,"
|
|
|
|
" offset=0x%"PRIx64", returning 0", addr);
|
2013-06-04 19:17:10 +04:00
|
|
|
}
|
2017-11-03 16:37:53 +03:00
|
|
|
|
2013-06-04 19:17:10 +04:00
|
|
|
return val;
|
|
|
|
}
|
|
|
|
|
|
|
|
static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
|
|
|
|
{
|
|
|
|
uint32_t qid;
|
|
|
|
|
2017-11-03 16:37:53 +03:00
|
|
|
if (unlikely(addr & ((1 << 2) - 1))) {
|
2020-06-09 22:03:13 +03:00
|
|
|
NVME_GUEST_ERR(pci_nvme_ub_db_wr_misaligned,
|
2017-11-03 16:37:53 +03:00
|
|
|
"doorbell write not 32-bit aligned,"
|
|
|
|
" offset=0x%"PRIx64", ignoring", addr);
|
2013-06-04 19:17:10 +04:00
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (((addr - 0x1000) >> 2) & 1) {
|
2017-11-03 16:37:53 +03:00
|
|
|
/* Completion queue doorbell write */
|
|
|
|
|
2013-06-04 19:17:10 +04:00
|
|
|
uint16_t new_head = val & 0xffff;
|
|
|
|
int start_sqs;
|
|
|
|
NvmeCQueue *cq;
|
|
|
|
|
|
|
|
qid = (addr - (0x1000 + (1 << 2))) >> 3;
|
2017-11-03 16:37:53 +03:00
|
|
|
if (unlikely(nvme_check_cqid(n, qid))) {
|
2020-06-09 22:03:13 +03:00
|
|
|
NVME_GUEST_ERR(pci_nvme_ub_db_wr_invalid_cq,
|
2017-11-03 16:37:53 +03:00
|
|
|
"completion queue doorbell write"
|
|
|
|
" for nonexistent queue,"
|
|
|
|
" cqid=%"PRIu32", ignoring", qid);
|
2020-07-06 09:12:53 +03:00
|
|
|
|
|
|
|
/*
|
|
|
|
* NVM Express v1.3d, Section 4.1 states: "If host software writes
|
|
|
|
* an invalid value to the Submission Queue Tail Doorbell or
|
|
|
|
* Completion Queue Head Doorbell register and an Asynchronous Event
|
|
|
|
* Request command is outstanding, then an asynchronous event is
|
|
|
|
* posted to the Admin Completion Queue with a status code of
|
|
|
|
* Invalid Doorbell Write Value."
|
|
|
|
*
|
|
|
|
* Also note that the spec includes the "Invalid Doorbell Register"
|
|
|
|
* status code, but nowhere does it specify when to use it.
|
|
|
|
* However, it seems reasonable to use it here in a similar
|
|
|
|
* fashion.
|
|
|
|
*/
|
|
|
|
if (n->outstanding_aers) {
|
|
|
|
nvme_enqueue_event(n, NVME_AER_TYPE_ERROR,
|
|
|
|
NVME_AER_INFO_ERR_INVALID_DB_REGISTER,
|
|
|
|
NVME_LOG_ERROR_INFO);
|
|
|
|
}
|
|
|
|
|
2013-06-04 19:17:10 +04:00
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
|
|
|
cq = n->cq[qid];
|
2017-11-03 16:37:53 +03:00
|
|
|
if (unlikely(new_head >= cq->size)) {
|
2020-06-09 22:03:13 +03:00
|
|
|
NVME_GUEST_ERR(pci_nvme_ub_db_wr_invalid_cqhead,
|
2017-11-03 16:37:53 +03:00
|
|
|
"completion queue doorbell write value"
|
|
|
|
" beyond queue size, cqid=%"PRIu32","
|
|
|
|
" new_head=%"PRIu16", ignoring",
|
|
|
|
qid, new_head);
|
2020-07-06 09:12:53 +03:00
|
|
|
|
|
|
|
if (n->outstanding_aers) {
|
|
|
|
nvme_enqueue_event(n, NVME_AER_TYPE_ERROR,
|
|
|
|
NVME_AER_INFO_ERR_INVALID_DB_VALUE,
|
|
|
|
NVME_LOG_ERROR_INFO);
|
|
|
|
}
|
|
|
|
|
2013-06-04 19:17:10 +04:00
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
2020-07-06 09:12:48 +03:00
|
|
|
trace_pci_nvme_mmio_doorbell_cq(cq->cqid, new_head);
|
|
|
|
|
2013-06-04 19:17:10 +04:00
|
|
|
start_sqs = nvme_cq_full(cq) ? 1 : 0;
|
|
|
|
cq->head = new_head;
|
|
|
|
if (start_sqs) {
|
|
|
|
NvmeSQueue *sq;
|
|
|
|
QTAILQ_FOREACH(sq, &cq->sq_list, entry) {
|
2013-08-21 19:03:08 +04:00
|
|
|
timer_mod(sq->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + 500);
|
2013-06-04 19:17:10 +04:00
|
|
|
}
|
2013-08-21 19:03:08 +04:00
|
|
|
timer_mod(cq->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + 500);
|
2013-06-04 19:17:10 +04:00
|
|
|
}
|
|
|
|
|
2017-12-18 08:00:43 +03:00
|
|
|
if (cq->tail == cq->head) {
|
|
|
|
nvme_irq_deassert(n, cq);
|
2013-06-04 19:17:10 +04:00
|
|
|
}
|
|
|
|
} else {
|
2017-11-03 16:37:53 +03:00
|
|
|
/* Submission queue doorbell write */
|
|
|
|
|
2013-06-04 19:17:10 +04:00
|
|
|
uint16_t new_tail = val & 0xffff;
|
|
|
|
NvmeSQueue *sq;
|
|
|
|
|
|
|
|
qid = (addr - 0x1000) >> 3;
|
2017-11-03 16:37:53 +03:00
|
|
|
if (unlikely(nvme_check_sqid(n, qid))) {
|
2020-06-09 22:03:13 +03:00
|
|
|
NVME_GUEST_ERR(pci_nvme_ub_db_wr_invalid_sq,
|
2017-11-03 16:37:53 +03:00
|
|
|
"submission queue doorbell write"
|
|
|
|
" for nonexistent queue,"
|
|
|
|
" sqid=%"PRIu32", ignoring", qid);
|
2020-07-06 09:12:53 +03:00
|
|
|
|
|
|
|
if (n->outstanding_aers) {
|
|
|
|
nvme_enqueue_event(n, NVME_AER_TYPE_ERROR,
|
|
|
|
NVME_AER_INFO_ERR_INVALID_DB_REGISTER,
|
|
|
|
NVME_LOG_ERROR_INFO);
|
|
|
|
}
|
|
|
|
|
2013-06-04 19:17:10 +04:00
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
|
|
|
sq = n->sq[qid];
|
2017-11-03 16:37:53 +03:00
|
|
|
if (unlikely(new_tail >= sq->size)) {
|
2020-06-09 22:03:13 +03:00
|
|
|
NVME_GUEST_ERR(pci_nvme_ub_db_wr_invalid_sqtail,
|
2017-11-03 16:37:53 +03:00
|
|
|
"submission queue doorbell write value"
|
|
|
|
" beyond queue size, sqid=%"PRIu32","
|
|
|
|
" new_tail=%"PRIu16", ignoring",
|
|
|
|
qid, new_tail);
|
2020-07-06 09:12:53 +03:00
|
|
|
|
|
|
|
if (n->outstanding_aers) {
|
|
|
|
nvme_enqueue_event(n, NVME_AER_TYPE_ERROR,
|
|
|
|
NVME_AER_INFO_ERR_INVALID_DB_VALUE,
|
|
|
|
NVME_LOG_ERROR_INFO);
|
|
|
|
}
|
|
|
|
|
2013-06-04 19:17:10 +04:00
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
2020-07-06 09:12:48 +03:00
|
|
|
trace_pci_nvme_mmio_doorbell_sq(sq->sqid, new_tail);
|
|
|
|
|
2013-06-04 19:17:10 +04:00
|
|
|
sq->tail = new_tail;
|
2013-08-21 19:03:08 +04:00
|
|
|
timer_mod(sq->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + 500);
|
2013-06-04 19:17:10 +04:00
|
|
|
}
|
|
|
|
}
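/*
 * Illustrative sketch, not part of the driver: the address arithmetic
 * nvme_process_db() uses to tell submission queue tail doorbells from
 * completion queue head doorbells. It assumes the 4-byte doorbell
 * stride and the 0x1000 doorbell base used above; the struct and
 * function names are hypothetical. Shifting by 3 drops the SQ/CQ
 * selector bit, so the result matches the driver's
 * "(addr - (0x1000 + (1 << 2))) >> 3" for completion queues.
 */

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_DB_BASE 0x1000u /* first doorbell register offset */

struct sketch_db_target {
    bool is_cq;   /* true: CQ head doorbell, false: SQ tail doorbell */
    uint32_t qid; /* queue identifier */
};

static struct sketch_db_target sketch_decode_doorbell(uint64_t addr)
{
    struct sketch_db_target t;

    /* Doorbells are 32-bit aligned and start at the doorbell base. */
    assert(addr >= SKETCH_DB_BASE && (addr & 3) == 0);

    /* Even dwords are SQ tail doorbells, odd dwords are CQ head doorbells. */
    t.is_cq = ((addr - SKETCH_DB_BASE) >> 2) & 1;
    /* Each queue pair occupies 8 bytes. */
    t.qid = (uint32_t)((addr - SKETCH_DB_BASE) >> 3);
    return t;
}
```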
|
|
|
|
|
|
|
|
static void nvme_mmio_write(void *opaque, hwaddr addr, uint64_t data,
|
2020-08-24 09:58:56 +03:00
|
|
|
unsigned size)
|
2013-06-04 19:17:10 +04:00
|
|
|
{
|
|
|
|
NvmeCtrl *n = (NvmeCtrl *)opaque;
|
2020-07-06 09:12:48 +03:00
|
|
|
|
2021-01-18 09:30:50 +03:00
|
|
|
trace_pci_nvme_mmio_write(addr, data, size);
|
2020-07-06 09:12:48 +03:00
|
|
|
|
2013-06-04 19:17:10 +04:00
|
|
|
if (addr < sizeof(n->bar)) {
|
|
|
|
nvme_write_bar(n, addr, data, size);
|
2020-06-30 14:04:29 +03:00
|
|
|
} else {
|
2013-06-04 19:17:10 +04:00
|
|
|
nvme_process_db(n, addr, data);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
static const MemoryRegionOps nvme_mmio_ops = {
|
|
|
|
.read = nvme_mmio_read,
|
|
|
|
.write = nvme_mmio_write,
|
|
|
|
.endianness = DEVICE_LITTLE_ENDIAN,
|
|
|
|
.impl = {
|
|
|
|
.min_access_size = 2,
|
|
|
|
.max_access_size = 8,
|
|
|
|
},
|
|
|
|
};
|
|
|
|
|
2017-05-16 22:10:59 +03:00
|
|
|
static void nvme_cmb_write(void *opaque, hwaddr addr, uint64_t data,
|
2020-08-24 09:58:56 +03:00
|
|
|
unsigned size)
|
2017-05-16 22:10:59 +03:00
|
|
|
{
|
|
|
|
NvmeCtrl *n = (NvmeCtrl *)opaque;
|
2020-12-18 02:32:16 +03:00
|
|
|
stn_le_p(&n->cmb.buf[addr], size, data);
|
2017-05-16 22:10:59 +03:00
|
|
|
}
|
|
|
|
|
|
|
|
static uint64_t nvme_cmb_read(void *opaque, hwaddr addr, unsigned size)
|
|
|
|
{
|
|
|
|
NvmeCtrl *n = (NvmeCtrl *)opaque;
|
2020-12-18 02:32:16 +03:00
|
|
|
return ldn_le_p(&n->cmb.buf[addr], size);
|
2017-05-16 22:10:59 +03:00
|
|
|
}
|
|
|
|
|
|
|
|
static const MemoryRegionOps nvme_cmb_ops = {
|
|
|
|
.read = nvme_cmb_read,
|
|
|
|
.write = nvme_cmb_write,
|
|
|
|
.endianness = DEVICE_LITTLE_ENDIAN,
|
|
|
|
.impl = {
|
2018-11-20 21:41:48 +03:00
|
|
|
.min_access_size = 1,
|
2017-05-16 22:10:59 +03:00
|
|
|
.max_access_size = 8,
|
|
|
|
},
|
|
|
|
};
|
|
|
|
|
2020-06-09 22:03:21 +03:00
|
|
|
static void nvme_check_constraints(NvmeCtrl *n, Error **errp)
|
2013-06-04 19:17:10 +04:00
|
|
|
{
|
2020-06-09 22:03:21 +03:00
|
|
|
NvmeParams *params = &n->params;
|
2013-06-04 19:17:10 +04:00
|
|
|
|
2020-06-09 22:03:21 +03:00
|
|
|
if (params->num_queues) {
|
2020-06-09 22:03:19 +03:00
|
|
|
warn_report("num_queues is deprecated; please use max_ioqpairs "
|
|
|
|
"instead");
|
|
|
|
|
2020-06-09 22:03:21 +03:00
|
|
|
params->max_ioqpairs = params->num_queues - 1;
|
2020-06-09 22:03:19 +03:00
|
|
|
}
|
|
|
|
|
|
|
|
if (n->conf.blk) {
|
|
|
|
warn_report("drive property is deprecated; "
|
|
|
|
"please use an nvme-ns device instead");
|
|
|
|
}
|
|
|
|
|
2020-06-09 22:03:21 +03:00
|
|
|
if (params->max_ioqpairs < 1 ||
|
2020-06-09 22:03:32 +03:00
|
|
|
params->max_ioqpairs > NVME_MAX_IOQPAIRS) {
|
2020-06-09 22:03:19 +03:00
|
|
|
error_setg(errp, "max_ioqpairs must be between 1 and %d",
|
2020-06-09 22:03:32 +03:00
|
|
|
NVME_MAX_IOQPAIRS);
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (params->msix_qsize < 1 ||
|
|
|
|
params->msix_qsize > PCI_MSIX_FLAGS_QSIZE + 1) {
|
|
|
|
error_setg(errp, "msix_qsize must be between 1 and %d",
|
|
|
|
PCI_MSIX_FLAGS_QSIZE + 1);
|
nvme: ensure the num_queues is not zero
When it is zero, it causes segv.
Using following command:
"-drive file=//home/test/test1.img,if=none,id=id0
-device nvme,drive=id0,serial=test,num_queues=0"
causes following Backtrack:
Thread 4 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffe9735700 (LWP 30952)]
0x0000555555a7a77c in nvme_start_ctrl (n=0x5555577473f0) at hw/block/nvme.c:825
825 if (unlikely(n->cq[0])) {
(gdb) bt
0 0x0000555555a7a77c in nvme_start_ctrl (n=0x5555577473f0)
at hw/block/nvme.c:825
1 0x0000555555a7af7f in nvme_write_bar (n=0x5555577473f0, offset=20,
data=4587521, size=4) at hw/block/nvme.c:969
2 0x0000555555a7b81a in nvme_mmio_write (opaque=0x5555577473f0, addr=20,
data=4587521, size=4) at hw/block/nvme.c:1163
3 0x0000555555869236 in memory_region_write_accessor (mr=0x555557747cd0,
addr=20, value=0x7fffe97320f8, size=4, shift=0, mask=4294967295, attrs=...)
at /home/test/qemu1/qemu/memory.c:502
4 0x0000555555869446 in access_with_adjusted_size (addr=20,
value=0x7fffe97320f8, size=4, access_size_min=2, access_size_max=8,
access_fn=0x55555586914d <memory_region_write_accessor>,
mr=0x555557747cd0, attrs=...) at /home/test/qemu1/qemu/memory.c:568
5 0x000055555586c479 in memory_region_dispatch_write (mr=0x555557747cd0,
addr=20, data=4587521, size=4, attrs=...)
at /home/test/qemu1/qemu/memory.c:1499
6 0x00005555558030af in flatview_write_continue (fv=0x7fffe0061130,
addr=4273930260, attrs=..., buf=0x7ffff7ff0028 "\001", len=4, addr1=20,
l=4, mr=0x555557747cd0) at /home/test/qemu1/qemu/exec.c:3234
7 0x00005555558031f9 in flatview_write (fv=0x7fffe0061130, addr=4273930260,
attrs=..., buf=0x7ffff7ff0028 "\001", len=4)
at /home/test/qemu1/qemu/exec.c:3273
8 0x00005555558034ff in address_space_write (
---Type <return> to continue, or q <return> to quit---
as=0x555556758480 <address_space_memory>, addr=4273930260, attrs=...,
buf=0x7ffff7ff0028 "\001", len=4) at /home/test/qemu1/qemu/exec.c:3363
9 0x0000555555803550 in address_space_rw (
as=0x555556758480 <address_space_memory>, addr=4273930260, attrs=...,
buf=0x7ffff7ff0028 "\001", len=4, is_write=true)
at /home/test/qemu1/qemu/exec.c:3374
10 0x00005555558884a1 in kvm_cpu_exec (cpu=0x555556920e40)
at /home/test/qemu1/qemu/accel/kvm/kvm-all.c:2031
11 0x000055555584cd9d in qemu_kvm_cpu_thread_fn (arg=0x555556920e40)
at /home/test/qemu1/qemu/cpus.c:1281
12 0x0000555555dbaf6d in qemu_thread_start (args=0x5555569438a0)
at util/qemu-thread-posix.c:502
13 0x00007ffff5dc86db in start_thread (arg=0x7fffe9735700)
at pthread_create.c:463
14 0x00007ffff5af188f in clone ()
at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Signed-off-by: Li Qiang <liq3ea@163.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190120055558.32984-3-liq3ea@163.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2019-01-20 08:55:57 +03:00
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
2020-06-09 22:03:21 +03:00
|
|
|
if (!params->serial) {
|
2017-11-22 06:08:43 +03:00
|
|
|
error_setg(errp, "serial property not set");
|
|
|
|
return;
|
2013-06-04 19:17:10 +04:00
|
|
|
}
|
2020-03-30 19:46:56 +03:00
|
|
|
|
2020-11-13 08:30:05 +03:00
|
|
|
if (n->pmr.dev) {
|
|
|
|
if (host_memory_backend_is_mapped(n->pmr.dev)) {
|
2020-07-14 19:02:00 +03:00
|
|
|
error_setg(errp, "can't use already busy memdev: %s",
|
2020-11-13 08:30:05 +03:00
|
|
|
object_get_canonical_path_component(OBJECT(n->pmr.dev)));
|
2020-03-30 19:46:56 +03:00
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
2020-11-13 08:30:05 +03:00
|
|
|
if (!is_power_of_2(n->pmr.dev->size)) {
|
2020-03-30 19:46:56 +03:00
|
|
|
error_setg(errp, "pmr backend size needs to be a power of 2");
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
2020-11-13 08:30:05 +03:00
|
|
|
host_memory_backend_set_mapped(n->pmr.dev, true);
|
2020-03-30 19:46:56 +03:00
|
|
|
}
|
2020-12-08 23:04:06 +03:00
|
|
|
|
|
|
|
if (n->params.zasl_bs) {
|
|
|
|
if (!is_power_of_2(n->params.zasl_bs)) {
|
|
|
|
error_setg(errp, "zone append size limit has to be a power of 2");
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
}
|
2020-06-09 22:03:21 +03:00
|
|
|
}
|
|
|
|
|
2020-06-09 22:03:22 +03:00
|
|
|
static void nvme_init_state(NvmeCtrl *n)
|
|
|
|
{
|
2019-06-26 09:51:06 +03:00
|
|
|
n->num_namespaces = NVME_MAX_NAMESPACES;
|
2020-06-09 22:03:22 +03:00
|
|
|
/* add one to max_ioqpairs to account for the admin queue pair */
|
2020-06-30 14:04:29 +03:00
|
|
|
n->reg_size = pow2ceil(sizeof(NvmeBar) +
|
2020-06-09 22:03:22 +03:00
|
|
|
2 * (n->params.max_ioqpairs + 1) * NVME_DB_SIZE);
|
|
|
|
n->sq = g_new0(NvmeSQueue *, n->params.max_ioqpairs + 1);
|
|
|
|
n->cq = g_new0(NvmeCQueue *, n->params.max_ioqpairs + 1);
|
2020-07-06 09:12:52 +03:00
|
|
|
n->temperature = NVME_TEMPERATURE;
|
2020-07-06 09:12:50 +03:00
|
|
|
n->features.temp_thresh_hi = NVME_TEMPERATURE_WARNING;
|
2020-07-06 09:12:52 +03:00
|
|
|
n->starttime_ms = qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL);
|
2020-07-06 09:12:53 +03:00
|
|
|
n->aer_reqs = g_new0(NvmeRequest *, n->params.aerl + 1);
|
2020-06-09 22:03:22 +03:00
|
|
|
}
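The register-region sizing in nvme_init_state() can be illustrated with a small standalone sketch. It is not QEMU code: `pow2_ceil` and `reg_size` are hypothetical helpers, and the register-block size and doorbell size are passed in as parameters because the values of `sizeof(NvmeBar)` and `NVME_DB_SIZE` are not shown in this file excerpt.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Smallest power of two that is >= x (hypothetical stand-in for pow2ceil). */
static uint64_t pow2_ceil(uint64_t x)
{
    uint64_t p = 1;

    assert(x >= 1 && x <= (UINT64_C(1) << 63));
    while (p < x) {
        p <<= 1;
    }
    return p;
}

/*
 * Size of the register region: the fixed register block plus one submission
 * and one completion doorbell per queue pair. One extra queue pair is added
 * for the admin queue, mirroring the "+ 1" in nvme_init_state().
 */
static uint64_t reg_size(size_t bar_struct_size, uint32_t max_ioqpairs,
                         uint32_t db_size)
{
    uint64_t queue_pairs = (uint64_t)max_ioqpairs + 1;

    return pow2_ceil(bar_struct_size + 2 * queue_pairs * db_size);
}
```

With an assumed 4096-byte register block, 64 I/O queue pairs and 4-byte doorbells, the raw size is 4096 + 2 * 65 * 4 = 4616 bytes, which rounds up to 8192.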
|
|
|
|
|
2019-06-26 09:51:06 +03:00
|
|
|
int nvme_register_namespace(NvmeCtrl *n, NvmeNamespace *ns, Error **errp)
|
2020-06-09 22:03:23 +03:00
|
|
|
{
|
2019-06-26 09:51:06 +03:00
|
|
|
uint32_t nsid = nvme_nsid(ns);
|
|
|
|
|
|
|
|
if (nsid > NVME_MAX_NAMESPACES) {
|
|
|
|
error_setg(errp, "invalid namespace id (must be between 0 and %d)",
|
|
|
|
NVME_MAX_NAMESPACES);
|
|
|
|
return -1;
|
2020-05-29 01:55:10 +03:00
|
|
|
}
|
2020-06-09 22:03:23 +03:00
|
|
|
|
2019-06-26 09:51:06 +03:00
|
|
|
if (!nsid) {
|
|
|
|
for (int i = 1; i <= n->num_namespaces; i++) {
|
2020-11-04 13:22:46 +03:00
|
|
|
if (!nvme_ns(n, i)) {
|
2020-10-02 00:37:20 +03:00
|
|
|
nsid = ns->params.nsid = i;
|
2019-06-26 09:51:06 +03:00
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
2020-06-09 22:03:25 +03:00
|
|
|
|
2019-06-26 09:51:06 +03:00
|
|
|
if (!nsid) {
|
|
|
|
error_setg(errp, "no free namespace id");
|
|
|
|
return -1;
|
|
|
|
}
|
|
|
|
} else {
|
|
|
|
if (n->namespaces[nsid - 1]) {
|
|
|
|
error_setg(errp, "namespace id '%d' is already in use", nsid);
|
|
|
|
return -1;
|
|
|
|
}
|
2020-06-09 22:03:25 +03:00
|
|
|
}
|
|
|
|
|
2019-06-26 09:51:06 +03:00
|
|
|
trace_pci_nvme_register_namespace(nsid);
|
2020-06-09 22:03:25 +03:00
|
|
|
|
2019-06-26 09:51:06 +03:00
|
|
|
n->namespaces[nsid - 1] = ns;
|
2020-06-09 22:03:25 +03:00
|
|
|
|
2019-06-26 09:51:06 +03:00
|
|
|
return 0;
|
2020-06-09 22:03:25 +03:00
|
|
|
}
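The namespace ID handling in nvme_register_namespace() follows a common pattern: an ID of 0 means "pick the first free slot", and any other ID must be in range and unused. The sketch below reproduces that logic in isolation. `struct ctrl`, `register_ns` and the `MAX_NS` limit of 4 are hypothetical names and values chosen for illustration, and failure is reported by returning 0 rather than through an `Error **`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_NS 4 /* hypothetical limit, standing in for NVME_MAX_NAMESPACES */

struct ctrl {
    const void *namespaces[MAX_NS]; /* slot i holds namespace ID i + 1 */
};

/*
 * Register ns under the requested nsid. An nsid of 0 requests automatic
 * assignment of the lowest free ID. Returns the assigned ID (1-based), or 0
 * if the ID is out of range, already in use, or no ID is free.
 */
static uint32_t register_ns(struct ctrl *c, const void *ns, uint32_t nsid)
{
    assert(c != NULL && ns != NULL);

    if (nsid > MAX_NS) {
        return 0;
    }

    if (nsid == 0) {
        for (uint32_t i = 1; i <= MAX_NS; i++) {
            if (!c->namespaces[i - 1]) {
                nsid = i;
                break;
            }
        }
        if (nsid == 0) {
            return 0; /* no free namespace id */
        }
    } else if (c->namespaces[nsid - 1]) {
        return 0; /* namespace id already in use */
    }

    c->namespaces[nsid - 1] = ns;
    return nsid;
}
```

Because the array is indexed with `nsid - 1`, the range check has to run before any slot is touched, which is the same ordering the function above uses.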
|
|
|
|
|
2020-06-09 22:03:27 +03:00
|
|
|
static void nvme_init_cmb(NvmeCtrl *n, PCIDevice *pci_dev)
|
|
|
|
{
|
2020-12-18 02:32:16 +03:00
|
|
|
uint64_t cmb_size = n->params.cmb_size_mb * MiB;
|
2020-06-09 22:03:27 +03:00
|
|
|
|
2020-12-18 02:32:16 +03:00
|
|
|
n->cmb.buf = g_malloc0(cmb_size);
|
|
|
|
memory_region_init_io(&n->cmb.mem, OBJECT(n), &nvme_cmb_ops, n,
|
|
|
|
"nvme-cmb", cmb_size);
|
|
|
|
pci_register_bar(pci_dev, NVME_CMB_BIR,
|
2020-06-09 22:03:27 +03:00
|
|
|
PCI_BASE_ADDRESS_SPACE_MEMORY |
|
|
|
|
PCI_BASE_ADDRESS_MEM_TYPE_64 |
|
2020-12-18 02:32:16 +03:00
|
|
|
PCI_BASE_ADDRESS_MEM_PREFETCH, &n->cmb.mem);
|
|
|
|
|
|
|
|
NVME_CAP_SET_CMBS(n->bar.cap, 1);
|
|
|
|
|
|
|
|
if (n->params.legacy_cmb) {
|
|
|
|
nvme_cmb_enable_regs(n);
|
|
|
|
n->cmb.cmse = true;
|
|
|
|
}
|
2020-06-09 22:03:27 +03:00
|
|
|
}
|
|
|
|
|
2020-06-09 22:03:28 +03:00
|
|
|
static void nvme_init_pmr(NvmeCtrl *n, PCIDevice *pci_dev)
|
|
|
|
{
|
2020-11-13 08:30:05 +03:00
|
|
|
NVME_PMRCAP_SET_RDS(n->bar.pmrcap, 1);
|
|
|
|
NVME_PMRCAP_SET_WDS(n->bar.pmrcap, 1);
|
2020-06-09 22:03:28 +03:00
|
|
|
NVME_PMRCAP_SET_BIR(n->bar.pmrcap, NVME_PMR_BIR);
|
|
|
|
/* Turn on bit 1 support */
|
|
|
|
NVME_PMRCAP_SET_PMRWBM(n->bar.pmrcap, 0x02);
|
2020-11-13 08:30:05 +03:00
|
|
|
NVME_PMRCAP_SET_CMSS(n->bar.pmrcap, 1);
|
2020-06-09 22:03:28 +03:00
|
|
|
|
|
|
|
pci_register_bar(pci_dev, NVME_PMRCAP_BIR(n->bar.pmrcap),
|
|
|
|
PCI_BASE_ADDRESS_SPACE_MEMORY |
|
|
|
|
PCI_BASE_ADDRESS_MEM_TYPE_64 |
|
2020-11-13 08:30:05 +03:00
|
|
|
PCI_BASE_ADDRESS_MEM_PREFETCH, &n->pmr.dev->mr);
|
2020-12-18 15:04:19 +03:00
|
|
|
|
2020-11-13 08:30:05 +03:00
|
|
|
memory_region_set_enabled(&n->pmr.dev->mr, false);
|
2020-06-09 22:03:28 +03:00
|
|
|
}
|
|
|
|
|
2021-01-12 15:30:26 +03:00
|
|
|
static int nvme_init_pci(NvmeCtrl *n, PCIDevice *pci_dev, Error **errp)
|
2020-06-09 22:03:26 +03:00
|
|
|
{
|
|
|
|
uint8_t *pci_conf = pci_dev->config;
|
2020-11-13 11:50:33 +03:00
|
|
|
uint64_t bar_size, msix_table_size, msix_pba_size;
|
|
|
|
unsigned msix_table_offset, msix_pba_offset;
|
2021-01-12 15:30:26 +03:00
|
|
|
int ret;
|
|
|
|
|
|
|
|
Error *err = NULL;
|
2020-06-09 22:03:26 +03:00
|
|
|
|
|
|
|
pci_conf[PCI_INTERRUPT_PIN] = 1;
|
|
|
|
pci_config_set_prog_interface(pci_conf, 0x2);
|
2019-09-27 12:43:12 +03:00
|
|
|
|
|
|
|
if (n->params.use_intel_id) {
|
|
|
|
pci_config_set_vendor_id(pci_conf, PCI_VENDOR_ID_INTEL);
|
|
|
|
pci_config_set_device_id(pci_conf, 0x5845);
|
|
|
|
} else {
|
|
|
|
pci_config_set_vendor_id(pci_conf, PCI_VENDOR_ID_REDHAT);
|
|
|
|
pci_config_set_device_id(pci_conf, PCI_DEVICE_ID_REDHAT_NVME);
|
|
|
|
}
|
|
|
|
|
2020-06-09 22:03:26 +03:00
|
|
|
pci_config_set_class(pci_conf, PCI_CLASS_STORAGE_EXPRESS);
|
|
|
|
pcie_endpoint_cap_init(pci_dev, 0x80);
|
|
|
|
|
2020-11-13 11:50:33 +03:00
|
|
|
bar_size = QEMU_ALIGN_UP(n->reg_size, 4 * KiB);
|
|
|
|
msix_table_offset = bar_size;
|
|
|
|
msix_table_size = PCI_MSIX_ENTRY_SIZE * n->params.msix_qsize;
|
|
|
|
|
|
|
|
bar_size += msix_table_size;
|
|
|
|
bar_size = QEMU_ALIGN_UP(bar_size, 4 * KiB);
|
|
|
|
msix_pba_offset = bar_size;
|
|
|
|
msix_pba_size = QEMU_ALIGN_UP(n->params.msix_qsize, 64) / 8;
|
|
|
|
|
|
|
|
bar_size += msix_pba_size;
|
|
|
|
bar_size = pow2ceil(bar_size);
|
|
|
|
|
|
|
|
memory_region_init(&n->bar0, OBJECT(n), "nvme-bar0", bar_size);
|
2020-06-09 22:03:26 +03:00
|
|
|
memory_region_init_io(&n->iomem, OBJECT(n), &nvme_mmio_ops, n, "nvme",
|
|
|
|
n->reg_size);
|
2020-11-13 11:50:33 +03:00
|
|
|
memory_region_add_subregion(&n->bar0, 0, &n->iomem);
|
|
|
|
|
2020-06-09 22:03:26 +03:00
|
|
|
pci_register_bar(pci_dev, 0, PCI_BASE_ADDRESS_SPACE_MEMORY |
|
2020-11-13 11:50:33 +03:00
|
|
|
PCI_BASE_ADDRESS_MEM_TYPE_64, &n->bar0);
|
|
|
|
ret = msix_init(pci_dev, n->params.msix_qsize,
|
|
|
|
&n->bar0, 0, msix_table_offset,
|
|
|
|
&n->bar0, 0, msix_pba_offset, 0, &err);
|
2021-01-12 15:30:26 +03:00
|
|
|
if (ret < 0) {
|
|
|
|
if (ret == -ENOTSUP) {
|
|
|
|
warn_report_err(err);
|
|
|
|
} else {
|
|
|
|
error_propagate(errp, err);
|
|
|
|
return ret;
|
|
|
|
}
|
2020-06-09 22:03:33 +03:00
|
|
|
}
|
2020-06-09 22:03:29 +03:00
|
|
|
|
|
|
|
if (n->params.cmb_size_mb) {
|
|
|
|
nvme_init_cmb(n, pci_dev);
|
2020-11-13 11:57:13 +03:00
|
|
|
}
|
|
|
|
|
2020-11-13 08:30:05 +03:00
|
|
|
if (n->pmr.dev) {
|
2020-06-09 22:03:29 +03:00
|
|
|
nvme_init_pmr(n, pci_dev);
|
|
|
|
}
|
2021-01-12 15:30:26 +03:00
|
|
|
|
|
|
|
return 0;
|
2020-06-09 22:03:26 +03:00
|
|
|
}
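The BAR0 layout arithmetic in nvme_init_pci() places the MSI-X table and the pending bit array (PBA) after the register region, each on a 4 KiB boundary, and then rounds the whole BAR up to a power of two. The following is a minimal standalone sketch of that calculation. `align_up`, `pow2_ceil`, `struct bar_layout` and `bar0_layout` are hypothetical names; the 16-byte MSI-X entry size is the value defined by the PCI specification.

```c
#include <assert.h>
#include <stdint.h>

#define KIB 1024u
#define MSIX_ENTRY_SIZE 16u /* bytes per MSI-X table entry */

/* Round x up to the next multiple of align (align must be non-zero). */
static uint64_t align_up(uint64_t x, uint64_t align)
{
    assert(align != 0);
    return (x + align - 1) / align * align;
}

/* Smallest power of two that is >= x. */
static uint64_t pow2_ceil(uint64_t x)
{
    uint64_t p = 1;

    assert(x >= 1 && x <= (UINT64_C(1) << 63));
    while (p < x) {
        p <<= 1;
    }
    return p;
}

struct bar_layout {
    uint64_t table_offset; /* start of the MSI-X table */
    uint64_t pba_offset;   /* start of the pending bit array */
    uint64_t bar_size;     /* total BAR size, a power of two */
};

static struct bar_layout bar0_layout(uint64_t reg_size, uint32_t msix_qsize)
{
    struct bar_layout l;
    uint64_t size = align_up(reg_size, 4 * KIB);

    l.table_offset = size;
    size += (uint64_t)MSIX_ENTRY_SIZE * msix_qsize;
    size = align_up(size, 4 * KIB);

    l.pba_offset = size;
    /* one pending bit per vector, rounded up to whole 64-bit words */
    size += align_up(msix_qsize, 64) / 8;

    l.bar_size = pow2_ceil(size);
    return l;
}
```

For an assumed 8192-byte register region and 65 vectors, the table starts at 8192, the PBA at 12288, and the BAR is 16384 bytes.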
|
|
|
|
|
2020-06-09 22:03:30 +03:00
|
|
|
static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)
|
2020-06-09 22:03:21 +03:00
|
|
|
{
|
|
|
|
NvmeIdCtrl *id = &n->id_ctrl;
|
2020-06-09 22:03:30 +03:00
|
|
|
uint8_t *pci_conf = pci_dev->config;
|
2020-07-06 09:13:02 +03:00
|
|
|
char *subnqn;
|
2013-06-04 19:17:10 +04:00
|
|
|
|
|
|
|
id->vid = cpu_to_le16(pci_get_word(pci_conf + PCI_VENDOR_ID));
|
|
|
|
id->ssvid = cpu_to_le16(pci_get_word(pci_conf + PCI_SUBSYSTEM_VENDOR_ID));
|
|
|
|
strpadcpy((char *)id->mn, sizeof(id->mn), "QEMU NVMe Ctrl", ' ');
|
|
|
|
strpadcpy((char *)id->fr, sizeof(id->fr), "1.0", ' ');
|
2020-06-09 22:03:15 +03:00
|
|
|
strpadcpy((char *)id->sn, sizeof(id->sn), n->params.serial, ' ');
|
2013-06-04 19:17:10 +04:00
|
|
|
id->rab = 6;
|
|
|
|
id->ieee[0] = 0x00;
|
|
|
|
id->ieee[1] = 0x02;
|
|
|
|
id->ieee[2] = 0xb3;
|
2020-02-23 19:38:22 +03:00
|
|
|
id->mdts = n->params.mdts;
|
2020-07-06 09:13:03 +03:00
|
|
|
id->ver = cpu_to_le32(NVME_SPEC_VER);
|
2013-06-04 19:17:10 +04:00
|
|
|
id->oacs = cpu_to_le16(0);
|
2021-01-13 12:19:44 +03:00
|
|
|
id->cntrltype = 0x1;
|
2020-07-06 09:12:49 +03:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Because the controller always completes the Abort command immediately,
|
|
|
|
* there can never be more than one concurrently executing Abort command,
|
|
|
|
* so this value is never used for anything. Note that there can easily be
|
|
|
|
* many Abort commands in the queues, but they are not considered
|
|
|
|
* "executing" until processed by nvme_abort.
|
|
|
|
*
|
|
|
|
* The specification recommends a value of 3 for Abort Command Limit (four
|
|
|
|
* concurrently outstanding Abort commands), so let's use that though it is
|
|
|
|
* inconsequential.
|
|
|
|
*/
|
|
|
|
id->acl = 3;
|
2020-07-06 09:12:53 +03:00
|
|
|
id->aerl = n->params.aerl;
|
2020-07-06 09:12:51 +03:00
|
|
|
id->frmw = (NVME_NUM_FW_SLOTS << 1) | NVME_FRMW_SLOT1_RO;
|
2020-12-08 23:04:02 +03:00
|
|
|
id->lpa = NVME_LPA_NS_SMART | NVME_LPA_CSE | NVME_LPA_EXTENDED;
|
2020-07-06 09:12:50 +03:00
|
|
|
|
|
|
|
/* recommended default value (~70 C) */
|
|
|
|
id->wctemp = cpu_to_le16(NVME_TEMPERATURE_WARNING);
|
|
|
|
id->cctemp = cpu_to_le16(NVME_TEMPERATURE_CRITICAL);
|
|
|
|
|
2013-06-04 19:17:10 +04:00
|
|
|
id->sqes = (0x6 << 4) | 0x6;
|
|
|
|
id->cqes = (0x4 << 4) | 0x4;
|
|
|
|
id->nn = cpu_to_le32(n->num_namespaces);
|
2020-03-31 00:10:13 +03:00
|
|
|
id->oncs = cpu_to_le16(NVME_ONCS_WRITE_ZEROES | NVME_ONCS_TIMESTAMP |
|
2020-11-16 13:14:02 +03:00
|
|
|
NVME_ONCS_FEATURES | NVME_ONCS_DSM |
|
|
|
|
NVME_ONCS_COMPARE);
|
2019-06-26 09:51:06 +03:00
|
|
|
|
2021-01-13 12:19:44 +03:00
|
|
|
id->vwc = (0x2 << 1) | 0x1;
|
2020-03-18 11:41:19 +03:00
|
|
|
id->sgls = cpu_to_le32(NVME_CTRL_SGLS_SUPPORT_NO_ALIGN |
|
|
|
|
NVME_CTRL_SGLS_BITBUCKET);
|
2020-07-06 09:12:57 +03:00
|
|
|
|
2020-07-06 09:13:02 +03:00
|
|
|
subnqn = g_strdup_printf("nqn.2019-08.org.qemu:%s", n->params.serial);
|
|
|
|
strpadcpy((char *)id->subnqn, sizeof(id->subnqn), subnqn, '\0');
|
|
|
|
g_free(subnqn);
|
|
|
|
|
2013-06-04 19:17:10 +04:00
|
|
|
id->psd[0].mp = cpu_to_le16(0x9c4);
|
|
|
|
id->psd[0].enlat = cpu_to_le32(0x10);
|
|
|
|
id->psd[0].exlat = cpu_to_le32(0x4);
|
|
|
|
|
|
|
|
NVME_CAP_SET_MQES(n->bar.cap, 0x7ff);
|
|
|
|
NVME_CAP_SET_CQR(n->bar.cap, 1);
|
|
|
|
NVME_CAP_SET_TO(n->bar.cap, 0xf);
|
2020-09-30 20:54:05 +03:00
|
|
|
NVME_CAP_SET_CSS(n->bar.cap, NVME_CAP_CSS_NVM);
|
2020-12-08 23:04:03 +03:00
|
|
|
NVME_CAP_SET_CSS(n->bar.cap, NVME_CAP_CSS_CSI_SUPP);
|
2020-09-30 20:58:03 +03:00
|
|
|
NVME_CAP_SET_CSS(n->bar.cap, NVME_CAP_CSS_ADMIN_ONLY);
|
2014-11-27 06:39:21 +03:00
|
|
|
NVME_CAP_SET_MPSMAX(n->bar.cap, 4);
|
2020-11-13 10:00:47 +03:00
|
|
|
NVME_CAP_SET_CMBS(n->bar.cap, n->params.cmb_size_mb ? 1 : 0);
|
2020-11-13 08:30:05 +03:00
|
|
|
NVME_CAP_SET_PMRS(n->bar.cap, n->pmr.dev ? 1 : 0);
|
2013-06-04 19:17:10 +04:00
|
|
|
|
2020-07-06 09:13:03 +03:00
|
|
|
n->bar.vs = NVME_SPEC_VER;
|
2013-06-04 19:17:10 +04:00
|
|
|
n->bar.intmc = n->bar.intms = 0;
|
2020-06-09 22:03:30 +03:00
|
|
|
}
|
|
|
|
|
|
|
|
static void nvme_realize(PCIDevice *pci_dev, Error **errp)
|
|
|
|
{
|
|
|
|
NvmeCtrl *n = NVME(pci_dev);
|
2019-06-26 09:51:06 +03:00
|
|
|
NvmeNamespace *ns;
|
2020-06-09 22:03:30 +03:00
|
|
|
Error *local_err = NULL;
|
|
|
|
|
|
|
|
nvme_check_constraints(n, &local_err);
|
|
|
|
if (local_err) {
|
|
|
|
error_propagate(errp, local_err);
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
2019-06-26 09:51:06 +03:00
|
|
|
qbus_create_inplace(&n->bus, sizeof(NvmeBus), TYPE_NVME_BUS,
|
|
|
|
&pci_dev->qdev, n->parent_obj.qdev.id);
|
2020-06-09 22:03:30 +03:00
|
|
|
|
2019-06-26 09:51:06 +03:00
|
|
|
nvme_init_state(n);
|
2021-01-12 15:30:26 +03:00
|
|
|
if (nvme_init_pci(n, pci_dev, errp)) {
|
2020-06-09 22:03:33 +03:00
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
2020-06-09 22:03:30 +03:00
|
|
|
nvme_init_ctrl(n, pci_dev);
|
2013-06-04 19:17:10 +04:00
|
|
|
|
2019-06-26 09:51:06 +03:00
|
|
|
/* setup a namespace if the controller drive property was given */
|
|
|
|
if (n->namespace.blkconf.blk) {
|
|
|
|
ns = &n->namespace;
|
|
|
|
ns->params.nsid = 1;
|
|
|
|
|
2021-01-17 17:53:35 +03:00
|
|
|
if (nvme_ns_setup(ns, errp)) {
|
2020-06-09 22:03:25 +03:00
|
|
|
return;
|
|
|
|
}
|
2013-06-04 19:17:10 +04:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
static void nvme_exit(PCIDevice *pci_dev)
|
|
|
|
{
|
|
|
|
NvmeCtrl *n = NVME(pci_dev);
|
2020-12-08 23:04:06 +03:00
|
|
|
NvmeNamespace *ns;
|
|
|
|
int i;
|
2013-06-04 19:17:10 +04:00
|
|
|
|
2020-12-09 15:10:45 +03:00
|
|
|
nvme_ctrl_reset(n);
|
2020-12-08 23:04:06 +03:00
|
|
|
|
|
|
|
for (i = 1; i <= n->num_namespaces; i++) {
|
|
|
|
ns = nvme_ns(n, i);
|
|
|
|
if (!ns) {
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
|
|
|
|
nvme_ns_cleanup(ns);
|
|
|
|
}
|
|
|
|
|
2013-06-04 19:17:10 +04:00
|
|
|
g_free(n->cq);
|
|
|
|
g_free(n->sq);
|
2020-07-06 09:12:53 +03:00
|
|
|
g_free(n->aer_reqs);
|
2017-05-16 22:10:59 +03:00
|
|
|
|
2020-06-09 22:03:15 +03:00
|
|
|
if (n->params.cmb_size_mb) {
|
2020-12-18 02:32:16 +03:00
|
|
|
g_free(n->cmb.buf);
|
2018-10-29 09:29:41 +03:00
|
|
|
}
|
2020-03-30 19:46:56 +03:00
|
|
|
|
2020-11-13 08:30:05 +03:00
|
|
|
if (n->pmr.dev) {
|
|
|
|
host_memory_backend_set_mapped(n->pmr.dev, false);
|
2020-03-30 19:46:56 +03:00
|
|
|
}
|
2013-06-04 19:17:10 +04:00
|
|
|
msix_uninit(pci_dev, &n->bar0, &n->bar0);
|
|
|
|
}
|
|
|
|
|
|
|
|
static Property nvme_props[] = {
|
2019-06-26 09:51:06 +03:00
|
|
|
DEFINE_BLOCK_PROPERTIES(NvmeCtrl, namespace.blkconf),
|
2020-11-13 08:30:05 +03:00
|
|
|
DEFINE_PROP_LINK("pmrdev", NvmeCtrl, pmr.dev, TYPE_MEMORY_BACKEND,
|
2020-03-30 19:46:56 +03:00
|
|
|
HostMemoryBackend *),
|
2020-06-09 22:03:15 +03:00
|
|
|
DEFINE_PROP_STRING("serial", NvmeCtrl, params.serial),
|
|
|
|
DEFINE_PROP_UINT32("cmb_size_mb", NvmeCtrl, params.cmb_size_mb, 0),
|
2020-06-09 22:03:19 +03:00
|
|
|
DEFINE_PROP_UINT32("num_queues", NvmeCtrl, params.num_queues, 0),
|
|
|
|
DEFINE_PROP_UINT32("max_ioqpairs", NvmeCtrl, params.max_ioqpairs, 64),
|
2020-06-09 22:03:32 +03:00
|
|
|
DEFINE_PROP_UINT16("msix_qsize", NvmeCtrl, params.msix_qsize, 65),
|
2020-07-06 09:12:53 +03:00
|
|
|
DEFINE_PROP_UINT8("aerl", NvmeCtrl, params.aerl, 3),
|
|
|
|
DEFINE_PROP_UINT32("aer_max_queued", NvmeCtrl, params.aer_max_queued, 64),
|
2020-02-23 19:38:22 +03:00
|
|
|
DEFINE_PROP_UINT8("mdts", NvmeCtrl, params.mdts, 7),
|
2019-09-27 12:43:12 +03:00
|
|
|
DEFINE_PROP_BOOL("use-intel-id", NvmeCtrl, params.use_intel_id, false),
|
2020-12-18 02:32:16 +03:00
|
|
|
DEFINE_PROP_BOOL("legacy-cmb", NvmeCtrl, params.legacy_cmb, false),
|
2020-12-08 23:04:06 +03:00
|
|
|
DEFINE_PROP_SIZE32("zoned.append_size_limit", NvmeCtrl, params.zasl_bs,
|
|
|
|
NVME_DEFAULT_MAX_ZA_SIZE),
|
2013-06-04 19:17:10 +04:00
|
|
|
DEFINE_PROP_END_OF_LIST(),
|
|
|
|
};
|
|
|
|
|
2021-01-15 06:27:01 +03:00
|
|
|
static void nvme_get_smart_warning(Object *obj, Visitor *v, const char *name,
|
|
|
|
void *opaque, Error **errp)
|
|
|
|
{
|
|
|
|
NvmeCtrl *n = NVME(obj);
|
|
|
|
uint8_t value = n->smart_critical_warning;
|
|
|
|
|
|
|
|
visit_type_uint8(v, name, &value, errp);
|
|
|
|
}
|
|
|
|
|
|
|
|
static void nvme_set_smart_warning(Object *obj, Visitor *v, const char *name,
|
|
|
|
void *opaque, Error **errp)
|
|
|
|
{
|
|
|
|
NvmeCtrl *n = NVME(obj);
|
2021-01-15 06:27:02 +03:00
|
|
|
uint8_t value, old_value, cap = 0, index, event;
|
2021-01-15 06:27:01 +03:00
|
|
|
|
|
|
|
if (!visit_type_uint8(v, name, &value, errp)) {
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
|
|
|
cap = NVME_SMART_SPARE | NVME_SMART_TEMPERATURE | NVME_SMART_RELIABILITY
|
|
|
|
| NVME_SMART_MEDIA_READ_ONLY | NVME_SMART_FAILED_VOLATILE_MEDIA;
|
2020-12-18 14:54:45 +03:00
|
|
|
if (NVME_CAP_PMRS(n->bar.cap)) {
|
2021-01-15 06:27:01 +03:00
|
|
|
cap |= NVME_SMART_PMR_UNRELIABLE;
|
|
|
|
}
|
|
|
|
|
|
|
|
if ((value & cap) != value) {
|
|
|
|
error_setg(errp, "unsupported smart critical warning bits: 0x%x",
|
|
|
|
value & ~cap);
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
2021-01-15 06:27:02 +03:00
|
|
|
old_value = n->smart_critical_warning;
|
2021-01-15 06:27:01 +03:00
|
|
|
n->smart_critical_warning = value;
|
2021-01-15 06:27:02 +03:00
|
|
|
|
|
|
|
/* only inject new bits of smart critical warning */
|
|
|
|
for (index = 0; index < NVME_SMART_WARN_MAX; index++) {
|
|
|
|
event = 1 << index;
|
|
|
|
if (value & ~old_value & event) {
|
|
|
|
nvme_smart_event(n, event);
|
|
|
|
}
|
|
|
|
}
|
2021-01-15 06:27:01 +03:00
|
|
|
}
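The "only inject new bits" loop in nvme_set_smart_warning() relies on the expression `value & ~old_value`, which keeps exactly the bits that are set now but were clear before. The sketch below isolates that step. `new_warning_events` is a hypothetical helper that collects the event masks into an array instead of calling nvme_smart_event(), and `WARN_BITS` is an assumed stand-in for `NVME_SMART_WARN_MAX`.

```c
#include <assert.h>
#include <stdint.h>

#define WARN_BITS 6 /* assumed number of defined critical warning bits */

/*
 * Store the mask of each warning bit that is set in value but was clear in
 * old_value, in ascending bit order. Returns the number of masks stored.
 */
static int new_warning_events(uint8_t old_value, uint8_t value,
                              uint8_t out[WARN_BITS])
{
    int count = 0;

    assert(out != 0);

    for (int index = 0; index < WARN_BITS; index++) {
        uint8_t event = (uint8_t)(1u << index);

        if (value & ~old_value & event) {
            out[count++] = event;
        }
    }
    return count;
}
```

Bits that were already set do not produce a second event, so writing the same value twice yields no events on the second write.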
|
|
|
|
|
2013-06-04 19:17:10 +04:00
|
|
|
static const VMStateDescription nvme_vmstate = {
|
|
|
|
.name = "nvme",
|
|
|
|
.unmigratable = 1,
|
|
|
|
};
|
|
|
|
|
|
|
|
static void nvme_class_init(ObjectClass *oc, void *data)
{
    DeviceClass *dc = DEVICE_CLASS(oc);
    PCIDeviceClass *pc = PCI_DEVICE_CLASS(oc);

    pc->realize = nvme_realize;
    pc->exit = nvme_exit;
    pc->class_id = PCI_CLASS_STORAGE_EXPRESS;
    pc->revision = 2;

    set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
    dc->desc = "Non-Volatile Memory Express";
    device_class_set_props(dc, nvme_props);
    dc->vmsd = &nvme_vmstate;
}

nvme: generate OpenFirmware device path in the "bootorder" fw_cfg file
Background on QEMU boot indices
-------------------------------
Normally, the "bootindex" property is configured for bootable devices
with:
  DEVICE_instance_init()
    device_add_bootindex_property(..., "bootindex", ...)
      object_property_add(..., device_get_bootindex,
                          device_set_bootindex, ...)
and when the bootindex is set on the QEMU command line, with
-device DEVICE,...,bootindex=N
the setter that was configured above is invoked:
  device_set_bootindex()
    /* parse boot index */
    visit_type_int32()
    /* verify unicity */
    check_boot_index()
    /* store parsed boot index */
    ...
    /* insert device path to boot order */
    add_boot_device_path()
In the last step, add_boot_device_path() ensures that an OpenFirmware
device path will show up in the "bootorder" fw_cfg file, at a position
corresponding to the device's boot index. Thus guest firmware (SeaBIOS and
OVMF) can try to boot off the device with the right priority.
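The setter flow above (verify uniqueness, store the index, insert the device path at the matching position) can be illustrated with a small toy model. This is only a sketch and not QEMU's implementation: the struct names, the fixed-size table and boot_order_add() are hypothetical, and rejecting negative indices is a simplification chosen for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_BOOT_ENTRIES 16

/* Hypothetical stand-in for one line of the "bootorder" file. */
struct boot_entry {
    int32_t bootindex;
    const char *ofw_path;
};

/* Hypothetical boot order table, kept sorted by ascending bootindex. */
struct boot_order {
    struct boot_entry entries[MAX_BOOT_ENTRIES];
    size_t count;
};

/*
 * Insert a device path at the position given by its boot index.
 * Returns false if the index is negative (toy-model simplification),
 * if the index is already in use, or if the table is full.
 */
static bool boot_order_add(struct boot_order *bo, int32_t bootindex,
                           const char *ofw_path)
{
    size_t pos = 0;

    if (bootindex < 0 || bo->count == MAX_BOOT_ENTRIES) {
        return false;
    }

    /* Find the first entry whose index is not smaller than the new one. */
    while (pos < bo->count && bo->entries[pos].bootindex < bootindex) {
        pos++;
    }

    /* Uniqueness check: each boot index may be used only once. */
    if (pos < bo->count && bo->entries[pos].bootindex == bootindex) {
        return false;
    }

    /* Shift later entries up by one slot and store the new entry. */
    for (size_t i = bo->count; i > pos; i--) {
        bo->entries[i] = bo->entries[i - 1];
    }
    bo->entries[pos].bootindex = bootindex;
    bo->entries[pos].ofw_path = ofw_path;
    bo->count++;
    return true;
}
```

In this model, a device whose setter skips the final insertion step never appears in the table, which mirrors why firmware could not see NVMe devices in "bootorder" before this patch.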
NVMe boot index
---------------
In QEMU commit 33739c712982,
nvma: ide: add bootindex to qom property
the following generic setters / getters:
- device_set_bootindex()
- device_get_bootindex()
were open-coded for NVMe, under the names
- nvme_set_bootindex()
- nvme_get_bootindex()
Plus nvme_instance_init() was added to configure the "bootindex" property
manually, designating the open-coded getter & setter, rather than calling
device_add_bootindex_property().
Crucially, nvme_set_bootindex() avoided the final add_boot_device_path()
call. This fact is spelled out in the message of commit 33739c712982, and
it was presumably the entire reason for all of the code duplication.
Now, Vladislav filed an RFE for OVMF
<https://github.com/tianocore/edk2/issues/48>; OVMF should boot off NVMe
devices. It is simple to build edk2's existent NvmExpressDxe driver into
OVMF, but the boot order matching logic in OVMF can only handle NVMe if
the "bootorder" fw_cfg file includes such devices.
Therefore this patch converts the NVMe device model to
device_set_bootindex() all the way.
Device paths
------------
device_set_bootindex() accepts an optional parameter called "suffix". When
present, it is expected to take the form of an OpenFirmware device path
node, and it gets appended as last node to the otherwise auto-generated
OFW path.
For NVMe, the auto-generated part is
  /pci@i0cf8/pci8086,5845@6[,1]
   ^         ^            ^  ^
   |         |            PCI slot and (present when nonzero)
   |         |            function of the NVMe controller, both hex
   |         "driver name" component, built from PCI vendor & device IDs
   PCI root at system bus port, PIO
to which here we append the suffix
  /namespace@1,0
             ^ ^
             | big endian (MSB at lowest address) numeric interpretation
             | of the 64-bit IEEE Extended Unique Identifier, aka EUI-64,
             | hex
             32-bit NVMe namespace identifier, aka NSID, hex
resulting in the OFW device path
/pci@i0cf8/pci8086,5845@6[,1]/namespace@1,0
The reason for including the NSID and the EUI-64 is that an NVMe device
can in theory produce several different namespaces (distinguished by
NSID). Additionally, each of those may (optionally) have an EUI-64 value.
For now, QEMU only provides namespace 1.
Furthermore, QEMU doesn't even represent the EUI-64 as a standalone field;
it is embedded (and left unused) inside the "NvmeIdNs.res30" array, at the
last eight bytes. (Which is fine, since EUI-64 can be left zero-filled if
unsupported by the device.)
Based on the above, we set the "unit address" part of the last
("namespace") node to fixed "1,0".
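As an illustration of how the pieces described above combine, the following sketch formats a complete OFW device path from a PCI slot, a PCI function, an NSID and an EUI-64. The helper sketch_ofw_path() is hypothetical and not part of QEMU (QEMU generates the PCI part itself and only receives the fixed suffix); the "/pci@i0cf8/pci8086,5845" prefix is simply copied from the example above.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper, not part of QEMU: builds the OFW path shown in
 * the text from its variable parts. Returns what snprintf() returns.
 */
static int sketch_ofw_path(char *buf, size_t len, unsigned slot, unsigned func,
                           uint32_t nsid, uint64_t eui64)
{
    char pci_unit[32];

    /* The PCI function is only printed when it is nonzero. */
    if (func != 0) {
        snprintf(pci_unit, sizeof(pci_unit), "%x,%x", slot, func);
    } else {
        snprintf(pci_unit, sizeof(pci_unit), "%x", slot);
    }

    /* NSID and EUI-64 are both printed in hex, separated by a comma. */
    return snprintf(buf, len,
                    "/pci@i0cf8/pci8086,5845@%s/namespace@%" PRIx32 ",%" PRIx64,
                    pci_unit, nsid, eui64);
}
```

With slot 6, function 0, NSID 1 and a zero EUI-64, the namespace node comes out as the fixed "/namespace@1,0" suffix that this patch passes to device_add_bootindex_property().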
OVMF will then map the above OFW device path to the following UEFI device
path fragment, for boot order processing:
  PciRoot(0x0)/Pci(0x6,0x1)/NVMe(0x1,00-00-00-00-00-00-00-00)
  ^                ^   ^    ^    ^   ^
  |                |   |    |    |   octets of the EUI-64 in address order
  |                |   |    |    NSID
  |                |   |    NVMe namespace messaging device path node
  |                PCI slot and function
  PCI root bridge
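The UEFI-side rendering can be sketched in the same way. The function below is a hypothetical illustration, not OVMF code: it hard-codes PciRoot(0x0), takes the EUI-64 as eight octets in address order, and prints lowercase hex digits, which is an assumption about digit case that the example above does not settle.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper, not part of OVMF: renders the UEFI device path
 * fragment from the PCI slot/function, the NSID and the EUI-64 octets.
 */
static int sketch_uefi_path(char *buf, size_t len, unsigned slot, unsigned func,
                            uint32_t nsid, const uint8_t eui64[8])
{
    /* The EUI-64 octets are printed in address order, separated by '-'. */
    return snprintf(buf, len,
                    "PciRoot(0x0)/Pci(0x%x,0x%x)/NVMe(0x%" PRIx32
                    ",%02x-%02x-%02x-%02x-%02x-%02x-%02x-%02x)",
                    slot, func, nsid,
                    (unsigned)eui64[0], (unsigned)eui64[1],
                    (unsigned)eui64[2], (unsigned)eui64[3],
                    (unsigned)eui64[4], (unsigned)eui64[5],
                    (unsigned)eui64[6], (unsigned)eui64[7]);
}
```

Unlike the OFW form, the PCI function is always printed here, and the EUI-64 is shown octet by octet rather than as a single number.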
Cc: Keith Busch <keith.busch@intel.com> (supporter:nvme)
Cc: Kevin Wolf <kwolf@redhat.com> (supporter:Block layer core)
Cc: qemu-block@nongnu.org (open list:nvme)
Cc: Gonglei <arei.gonglei@huawei.com>
Cc: Vladislav Vovchenko <vladislav.vovchenko@sk.com>
Cc: Feng Tian <feng.tian@intel.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Kevin O'Connor <kevin@koconnor.net>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Gonglei <arei.gonglei@huawei.com>
Acked-by: Keith Busch <keith.busch@intel.com>
Tested-by: Vladislav Vovchenko <vladislav.vovchenko@sk.com>
Message-id: 1453850483-27511-1-git-send-email-lersek@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-01-27 02:21:23 +03:00
static void nvme_instance_init(Object *obj)
{
    NvmeCtrl *n = NVME(obj);

    if (n->namespace.blkconf.blk) {
        device_add_bootindex_property(obj, &n->namespace.blkconf.bootindex,
                                      "bootindex", "/namespace@1,0",
                                      DEVICE(obj));
    }

    object_property_add(obj, "smart_critical_warning", "uint8",
                        nvme_get_smart_warning,
                        nvme_set_smart_warning, NULL, NULL);
}

static const TypeInfo nvme_info = {
    .name = TYPE_NVME,
    .parent = TYPE_PCI_DEVICE,
    .instance_size = sizeof(NvmeCtrl),
    .instance_init = nvme_instance_init,
    .class_init = nvme_class_init,
    .interfaces = (InterfaceInfo[]) {
        { INTERFACE_PCIE_DEVICE },
        { }
    },
};

static const TypeInfo nvme_bus_info = {
    .name = TYPE_NVME_BUS,
    .parent = TYPE_BUS,
    .instance_size = sizeof(NvmeBus),
};

static void nvme_register_types(void)
{
    type_register_static(&nvme_info);
    type_register_static(&nvme_bus_info);
}

type_init(nvme_register_types)