Commit Graph

167 Commits

Author SHA1 Message Date
Klaus Jensen
fbba243bc7 hw/nvme: bump firmware revision
The Linux kernel quirks the QEMU NVMe controller pretty heavily because
of the namespace identifier mess. Since this is now fixed, bump the
firmware revision number to allow the quirk to be disabled for this
revision.

As of now, bump the firmware revision number to be equal to the QEMU
release version number.

Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2022-06-03 21:48:24 +02:00
Klaus Jensen
9f2e1acf83 hw/nvme: do not report null uuid
Do not report the "null uuid" (all zeros) in the namespace
identification descriptors.

Reported-by: Luis Chamberlain <mcgrof@kernel.org>
Reported-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2022-06-03 21:48:24 +02:00
Klaus Jensen
bd9f371c6f hw/nvme: do not auto-generate uuid
Do not default to generate an UUID for namespaces if it is not
explicitly specified.

This is a technically a breaking change in behavior. However, since the
UUID changes on every VM launch, it is not spec compliant and is of
little use since the UUID cannot be used reliably anyway and the
behavior prior to this patch must be considered buggy.

Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2022-06-03 21:48:24 +02:00
Klaus Jensen
36d83272d5 hw/nvme: do not auto-generate eui64
We cannot provide auto-generated unique or persistent namespace
identifiers (EUI64, NGUID, UUID) easily. Since 6.1, namespaces have been
assigned a generated EUI64 of the form "52:54:00:<namespace counter>".
This is will be unique within a QEMU instance, but not globally.

Revert that this is assigned automatically and immediately deprecate the
compatibility parameter. Users can opt-in to this with the
`eui64-default=on` device parameter or set it explicitly with
`eui64=UINT64`.

Cc: libvir-list@redhat.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2022-06-03 21:48:24 +02:00
Klaus Jensen
a859eb9f8f hw/nvme: enforce common serial per subsystem
The Identify Controller Serial Number (SN) is the serial number for the
NVM subsystem and must be the same across all controller in the NVM
subsystem.

Enforce this.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2022-06-03 21:48:24 +02:00
Klaus Jensen
9235a72a5d hw/nvme: fix smart aen
Pass the right constant to nvme_smart_event(). The NVME_AER* values hold
the bit position in the SMART byte, not the shifted value that we expect
it to be in nvme_smart_event().

Fixes: c62720f137 ("hw/block/nvme: trigger async event during injecting smart warning")
Acked-by: zhenwei pi <pizhenwei@bytedance.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2022-06-03 21:48:24 +02:00
Dmitry Tikhov
2e8f952ae7 hw/nvme: fix copy cmd for pi enabled namespaces
Current implementation have problem in the read part of copy command.
Because there is no metadata mangling before nvme_dif_check invocation,
reftag error could be thrown for blocks of namespace that have not been
previously written to.

Signed-off-by: Dmitry Tikhov <d.tihov@yadro.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2022-06-03 21:48:24 +02:00
Dmitry Tikhov
51c4532663 hw/nvme: add missing return statement
Since there is no return after nvme_dsm_cb invocation, metadata
associated with non-zero block range is currently zeroed. Also this
behaviour leads to segfault since we schedule iocb->bh two times.
First when entering nvme_dsm_cb with iocb->idx == iocb->nr and
second because of missing return on call stack unwinding by calling
blk_aio_pwrite_zeroes and subsequent nvme_dsm_cb callback.

Fixes: d7d1474fd8 ("hw/nvme: reimplement dsm to allow cancellation")
Signed-off-by: Dmitry Tikhov <d.tihov@yadro.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2022-06-03 21:48:24 +02:00
Dmitry Tikhov
1e64facc01 hw/nvme: fix narrowing conversion
Since nlbas is of type int, it does not work with large namespace size
values, e.g., 9 TB size of file backing namespace and 8 byte metadata
with 4096 bytes lbasz gives negative nlbas value, which is later
promoted to negative int64_t type value and results in negative
ns->moff which breaks namespace

Signed-off-by: Dmitry Tikhov <ddtikhov@gmail.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2022-06-03 21:48:24 +02:00
Markus Armbruster
52581c718c Clean up header guards that don't match their file name
Header guard symbols should match their file name to make guard
collisions less likely.

Cleaned up with scripts/clean-header-guards.pl, followed by some
renaming of new guard symbols picked by the script to better ones.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20220506134911.2856099-2-armbru@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
[Change to generated file ebpf/rss.bpf.skeleton.h backed out]
2022-05-11 16:49:06 +02:00
Markus Armbruster
b21e238037 Use g_new() & friends where that makes obvious sense
g_new(T, n) is neater than g_malloc(sizeof(T) * n).  It's also safer,
for two reasons.  One, it catches multiplication overflowing size_t.
Two, it returns T * rather than void *, which lets the compiler catch
more type errors.

This commit only touches allocations with size arguments of the form
sizeof(T).

Patch created mechanically with:

    $ spatch --in-place --sp-file scripts/coccinelle/use-g_new-etc.cocci \
	     --macro-file scripts/cocci-macro-file.h FILES...

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20220315144156.1595462-4-armbru@redhat.com>
Reviewed-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>
2022-03-21 15:44:44 +01:00
Naveen Nagar
44219b6029 hw/nvme: 64-bit pi support
This adds support for one possible new protection information format
introduced in TP4068 (and integrated in NVMe 2.0): the 64-bit CRC guard
and 48-bit reference tag. This version does not support storage tags.

Like the CRC16 support already present, this uses a software
implementation of CRC64 (so it is naturally pretty slow). But its good
enough for verification purposes.

This may go nicely hand-in-hand with the support that Keith submitted
for the Linux kernel[1].

  [1]: https://lore.kernel.org/linux-nvme/20220126165214.GA1782352@dhcp-10-100-145-180.wdc.com/T/

Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Naveen Nagar <naveen.n1@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2022-03-03 09:30:21 +01:00
Klaus Jensen
ac0b34c58d hw/nvme: add pi tuple size helper
A subsequent patch will introduce a new tuple size; so add a helper and
use that instead of sizeof() and magic numbers.

Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2022-03-03 09:28:49 +01:00
Naveen Nagar
763c05dfb0 hw/nvme: add support for the lbafee hbs feature
Add support for up to 64 LBA formats through the LBAFEE field of the
Host Behavior Support feature.

Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Naveen Nagar <naveen.n1@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2022-03-03 09:28:49 +01:00
Klaus Jensen
a6de6ed509 hw/nvme: move format parameter parsing
There is no need to extract the format command parameters for each
namespace. Move it to the entry point.

Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2022-03-03 09:28:49 +01:00
Naveen Nagar
d0c0697b9e hw/nvme: add host behavior support feature
Add support for getting and setting the Host Behavior Support feature.

Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Naveen Nagar <naveen.n1@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2022-03-03 09:28:48 +01:00
Klaus Jensen
05f7ae45c8 hw/nvme: move dif/pi prototypes into dif.h
Move dif/pi data structures and inlines to dif.h.

Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2022-03-03 09:28:48 +01:00
Klaus Jensen
e321b4cdc2 hw/nvme: add support for zoned random write area
Add support for TP 4076 ("Zoned Random Write Area"), v2021.08.23
("Ratified").

This adds three new namespace parameters: "zoned.numzrwa" (number of
zrwa resources, i.e. number of zones that can have a zrwa),
"zoned.zrwas" (zrwa size in LBAs), "zoned.zrwafg" (granularity in LBAs
for flushes).

Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2022-02-14 08:58:29 +01:00
Klaus Jensen
25872031e1 hw/nvme: add ozcs enum
Add enumeration for OZCS values.

Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2022-02-14 08:58:29 +01:00
Klaus Jensen
6190d92ff7 hw/nvme: add struct for zone management send
Add struct for Zone Management Send in preparation for more zone send
flags.

Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2022-02-14 08:58:29 +01:00
Philippe Mathieu-Daudé
8d3a17be6f hw/nvme/ctrl: Pass buffers as 'void *' types
These buffers can be anything, not an array of chars,
so use the 'void *' type for them.

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2022-02-14 08:58:29 +01:00
Philippe Mathieu-Daudé
e080ce8676 hw/nvme/ctrl: Have nvme_addr_write() take const buffer
The 'buf' argument is not modified, so better pass it as const type.

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2022-02-14 08:58:29 +01:00
Klaus Jensen
736b01642d hw/nvme: fix CVE-2021-3929
This fixes CVE-2021-3929 "locally" by denying DMA to the iomem of the
device itself. This still allows DMA to MMIO regions of other devices
(e.g. doing P2P DMA to the controller memory buffer of another NVMe
device).

Fixes: CVE-2021-3929
Reported-by: Qiuhao Li <Qiuhao.Li@outlook.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2022-02-14 08:58:29 +01:00
Philippe Mathieu-Daudé
f02b664aad hw/dma: Let dma_buf_read() / dma_buf_write() propagate MemTxResult
Since commit 292e13142d, dma_buf_rw() returns a MemTxResult type.
Do not discard it, return it to the caller. Pass the previously
returned value (the QEMUSGList residual size, which was rarely used)
as an optional argument.

With this new API, SCSIRequest::residual might now be accessed via
a pointer. Since the size_t type does not have the same size on
32 and 64-bit host architectures, convert it to a uint64_t, which
is big enough to hold the residual size, and the type is constant
on both 32/64-bit hosts.

Update the few dma_buf_read() / dma_buf_write() callers to the new
API.

Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: Peter Xu <peterx@redhat.com>
Message-Id: <20220117125130.131828-1-f4bug@amsat.org>
2022-01-18 12:56:29 +01:00
Philippe Mathieu-Daudé
bfa30f3903 hw/dma: Use dma_addr_t type definition when relevant
Update the obvious places where dma_addr_t should be used
(instead of uint64_t, hwaddr, size_t, int32_t types).

This allows to have &dma_addr_t type portable on 32/64-bit
hosts.

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20220111184309.28637-11-f4bug@amsat.org>
2022-01-18 12:56:29 +01:00
Philippe Mathieu-Daudé
1e5a3f8b2a dma: Let dma_buf_read() take MemTxAttrs argument
Let devices specify transaction attributes when calling
dma_buf_read().

Keep the default MEMTXATTRS_UNSPECIFIED in the few callers.

Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211223115554.3155328-13-philmd@redhat.com>
2021-12-31 01:05:27 +01:00
Philippe Mathieu-Daudé
392e48af34 dma: Let dma_buf_write() take MemTxAttrs argument
Let devices specify transaction attributes when calling
dma_buf_write().

Keep the default MEMTXATTRS_UNSPECIFIED in the few callers.

Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211223115554.3155328-12-philmd@redhat.com>
2021-12-31 01:05:27 +01:00
Klaus Jensen
e2c57529c9 hw/nvme: fix buffer overrun in nvme_changed_nslist (CVE-2021-3947)
Fix missing offset verification.

Cc: qemu-stable@nongnu.org
Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
Reported-by: Qiuhao Li <Qiuhao.Li@outlook.com>
Fixes: f432fdfa12 ("support changed namespace asynchronous event")
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-11-19 07:32:19 +01:00
Klaus Jensen
916b0f0b52 hw/nvme: change nvme-ns 'shared' default
Change namespaces to be shared namespaces by default (parameter
shared=on). Keep shared=off for older machine types.

Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-11-19 07:31:56 +01:00
Hannes Reinecke
9fc6e86e8b hw/nvme: reattach subsystem namespaces on hotplug
With commit 5ffbaeed16 ("hw/nvme: fix controller hot unplugging")
namespaces get moved from the controller to the subsystem if one
is specified.
That keeps the namespaces alive after a controller hot-unplug, but
after a controller hotplug we have to reconnect the namespaces
from the subsystem to the controller.

Fixes: 5ffbaeed16 ("hw/nvme: fix controller hot unplugging")
Cc: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Hannes Reinecke <hare@suse.de>
[k.jensen: only attach to shared and non-detached namespaces]
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-11-19 07:31:34 +01:00
Peter Maydell
d637e1dc6d qbus: Rename qbus_create_inplace() to qbus_init()
Rename qbus_create_inplace() to qbus_init(); this is more in line
with our usual naming convention for functions that in-place
initialize objects.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 20210923121153.23754-5-peter.maydell@linaro.org
2021-09-30 13:42:10 +01:00
Pankaj Raghav
c53a9a9102 hw/nvme: Return error for fused operations
Currently, FUSED operations are not supported by QEMU. As per the 1.4 SPEC,
controller should abort the command that requested a fused operation with
an INVALID FIELD error code if they are not supported.

Changes from v1:
Added FUSE flag check also to the admin cmd processing as the FUSED
operations are mentioned in the general SQE section in the SPEC.

Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-09-24 08:43:58 +02:00
Naveen Nagar
07a3dfa7c4 hw/nvme: fix verification of select field in namespace attachment
Fix is added to check for reserved value in select field for
namespace attachment

CC: Minwoo Im <minwoo.im.dev@gmail.com>
Signed-off-by: Naveen Nagar <naveen.n1@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-09-24 08:43:52 +02:00
Klaus Jensen
fd761337ac hw/nvme: fix validation of ASQ and ACQ
Address 0x0 is a valid address. Fix the admin submission and completion
queue address validation to not error out on this.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-09-24 08:31:35 +02:00
Klaus Jensen
5f4884c441 hw/nvme: fix missing variable initializers
Coverity found that 'uuid', 'csi' and 'eui64' are uninitialized. While
we set most of the fields, we do not explicitly set the rsvd2 field in
the NvmeIdNsDescr header.

Fix this by explicitly zero-initializing the variables.

Reported-by: Coverity (CID 1458835, 1459295 and 1459580)
Fixes: 6870cfb814 ("hw/nvme: namespace parameter for EUI-64")
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2021-08-09 12:52:16 +02:00
Klaus Jensen
49e03457f1 hw/nvme: fix mmio read
The new PMR test unearthed a long-standing issue with MMIO reads on
big-endian hosts.

Fix this by unconditionally storing all controller registers in little
endian.

Cc: Gollu Appalanaidu <anaidu.gollu@samsung.com>
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2021-07-26 21:09:39 +02:00
Klaus Jensen
5029de44b5 hw/nvme: fix out-of-bounds reads
Peter noticed that mmio access may read into the NvmeParams member in
the NvmeCtrl struct.

Fix the bounds check.

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2021-07-26 21:09:39 +02:00
Klaus Jensen
a316aa50e6 hw/nvme: use symbolic names for registers
Add the NvmeBarRegs enum and use these instead of explicit register
offsets.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Gollu Appalanaidu <anaidu.gollu@samsung.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-07-26 21:09:38 +02:00
Klaus Jensen
5d45edbeac hw/nvme: split pmrmsc register into upper and lower
The specification uses a set of 32 bit PMRMSCL and PMRMSCU registers to
make up the 64 bit logical PMRMSC register.

Make it so.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-07-26 21:09:38 +02:00
Klaus Jensen
5ffbaeed16 hw/nvme: fix controller hot unplugging
Prior to this patch the nvme-ns devices are always children of the
NvmeBus owned by the NvmeCtrl. This causes the namespaces to be
unrealized when the parent device is removed. However, when subsystems
are involved, this is not what we want since the namespaces may be
attached to other controllers as well.

This patch adds an additional NvmeBus on the subsystem device. When
nvme-ns devices are realized, if the parent controller device is linked
to a subsystem, the parent bus is set to the subsystem one instead. This
makes sure that namespaces are kept alive and not unrealized.

Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-07-26 21:09:38 +02:00
Padmakar Kalghatgi
234214734f hw/nvme: error handling for too many mappings
If the number of PRP/SGL mappings exceed 1024, reads and writes will
fail because of an internal QEMU limitation of max 1024 vectors.

Signed-off-by: Padmakar Kalghatgi <p.kalghatgi@samsung.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
[k.jensen: changed the error message to be more generic]
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-07-26 21:09:38 +02:00
Klaus Jensen
b0fde9e861 hw/nvme: unregister controller with subsystem at exit
Make sure the controller is unregistered from the subsystem when device
is removed.

Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-07-26 21:09:38 +02:00
Klaus Jensen
cc6fb6bc50 hw/nvme: mark nvme-subsys non-hotpluggable
We currently lack the infrastructure to handle subsystem hotplugging, so
disable it.

Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-07-26 21:09:38 +02:00
Klaus Jensen
5e4f6bcc29 hw/nvme: remove NvmeCtrl parameter from ns setup/check functions
The nvme_ns_setup and nvme_ns_check_constraints should not depend on the
controller state. Refactor and remove it.

Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-07-26 21:09:38 +02:00
Klaus Jensen
83d7ed5c57 hw/nvme: fix pin-based interrupt behavior (again)
Jakub noticed[1] that, when using pin-based interrupts, the device will
unconditionally deasssert when any CQEs are acknowledged. However, the
pin should not be deasserted if other completion queues still holds
unacknowledged CQEs.

The bug is an artifact of commit ca247d3509 ("hw/block/nvme: fix
pin-based interrupt behavior") which fixed one bug but introduced
another. This is the third time someone tries to fix pin-based
interrupts (see commit 5e9aa92eb1 ("hw/block: Fix pin-based interrupt
behaviour of NVMe"))...

Third time's the charm, so fix it, again, by keeping track of how many
CQs have unacknowledged CQEs and only deassert when all are cleared.

  [1]: <20210610114624.304681-1-jakub.jermar@kernkonzept.com>

Cc: qemu-stable@nongnu.org
Fixes: ca247d3509 ("hw/block/nvme: fix pin-based interrupt behavior")
Reported-by: Jakub Jermář <jakub.jermar@kernkonzept.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-06-29 07:18:10 +02:00
Klaus Jensen
2b02aabc9d hw/nvme: fix missing check for PMR capability
Qiang Liu reported that an access on an unknown address is triggered in
memory_region_set_enabled because a check on CAP.PMRS is missing for the
PMRCTL register write when no PMR is configured.

Cc: qemu-stable@nongnu.org
Fixes: 75c3c9de96 ("hw/block/nvme: disable PMR at boot up")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/362
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-06-29 07:16:25 +02:00
Gollu Appalanaidu
eeef43290d hw/nvme: documentation fix
In the documentation of the '-detached' param "be" and "not" has been
used side by side, fix that.

Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-06-29 07:16:25 +02:00
Gollu Appalanaidu
5f4eb94dbb hw/nvme: fix endianess conversion and add controller list
Add the controller identifiers list CNS 0x13, available list of ctrls
in NVM Subsystem that may or may not be attached to namespaces.

In Identify Ctrl List of the CNS 0x12 and 0x13 no endian conversion
for the nsid field.

These two CNS values shows affect when there exists a Subsystem.
Added condition if there is no Subsystem return invalid field in
command.

Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-06-29 07:16:25 +02:00
Klaus Jensen
e76fb260ca Partially revert "hw/block/nvme: drain namespaces on sq deletion"
This partially reverts commit 98f84f5a4e.

Since all "multi aio" commands are now reimplemented to properly track
the nested aiocbs, we can revert the "hack" that was introduced to make
sure all requests we're properly drained upon sq deletion.

The revert is partial since we keep the assert that no outstanding
requests remain on the submission queue after the explicit cancellation.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-06-29 07:16:25 +02:00
Klaus Jensen
3bcf26d3d6 hw/nvme: reimplement format nvm to allow cancellation
Prior to this patch, the aios associated with broadcast format are
submitted anonymously (no aiocb reference saved from the blk_aio call).

Fix this by formatting the namespaces one after another, saving a
reference to the aiocb for each.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-06-29 07:16:25 +02:00
Klaus Jensen
63d96e4ffd hw/nvme: reimplement zone reset to allow cancellation
Prior to this patch, the aios associated with zone reset are submitted
anonymously (no reference saved to the aiocb from the blk_aio call).

Fix this by resetting the zones one after another, saving a reference to
the aiocb for each reset.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-06-29 07:16:25 +02:00
Klaus Jensen
796d20681d hw/nvme: reimplement the copy command to allow aio cancellation
Before this patch the code would issue several aios simultaneously
without saving a reference to the aiocb. Without the aiocb reference the
individual copies cannot be canceled.

Fix this by issuing copies of the ranges one after another.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-06-29 07:16:25 +02:00
Klaus Jensen
f1c97407c5 hw/nvme: add dw0/1 to the req completion trace event
Some commands report additional useful information in dw0 and dw1 of the
completion queue entry.

Add them to the trace.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-06-29 07:16:25 +02:00
Klaus Jensen
2a132309e4 hw/nvme: use prinfo directly in nvme_check_prinfo and nvme_dif_check
The nvme_check_prinfo() and nvme_dif_check() functions operate on the
16 bit "control" member of the NvmeCmd. These functions do not otherwise
operate on an NvmeCmd or an NvmeRequest, so change them to expect the
actual 4 bit PRINFO field and add constants that work on this field as
well.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-06-29 07:16:25 +02:00
Klaus Jensen
189a8bf7f6 hw/nvme: remove assert from nvme_get_zone_by_slba
Make nvme_get_zone_by_slba() return NULL if the slba is out of range.
This allows the function to be used without guarding the call with a
call to nvme_check_bounds(), in preparation for the next patch.

Add asserts after calling nvme_get_zone_by_slba() instead.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-06-29 07:16:25 +02:00
Klaus Jensen
0ca5c3ccac hw/nvme: save reftag when generating pi
Prepare nvme_dif_pract_generate_dif() and nvme_dif_check() to be
callable in smaller increments by making the reftag a pointer parameter
updated by the function.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-06-29 07:16:25 +02:00
Klaus Jensen
d7d1474fd8 hw/nvme: reimplement dsm to allow cancellation
Prior to this patch, a loop was used to issue multiple "fire and forget"
aios for each range in the command. Without a reference to the aiocb
returned from the blk_aio_pdiscard calls, the aios cannot be canceled.

Fix this by processing the ranges one after another.

As a bonus, this fixes how metadata is cleared (i.e. we only zero it out
if the data was succesfully discarded).

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-06-29 07:16:25 +02:00
Klaus Jensen
ff0ac2c8b8 hw/nvme: add nvme_block_status_all helper
Pull the gist of nvme_check_dulbe() into a helper function. This is in
preparation for dsm refactoring.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-06-29 07:16:25 +02:00
Klaus Jensen
38f4ac65ac hw/nvme: reimplement flush to allow cancellation
Prior to this patch, a broadcast flush would result in submitting
multiple "fire and forget" aios (no reference saved to the aiocbs
returned from the blk_aio_flush calls).

Fix this by issuing the flushes one after another.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-06-29 07:16:25 +02:00
Heinrich Schuchardt
3276dde4f2 hw/nvme: default for namespace EUI-64
On machines with version > 6.0 replace a missing EUI-64 by a generated
value.

Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
2021-06-29 07:16:25 +02:00
Heinrich Schuchardt
6870cfb814 hw/nvme: namespace parameter for EUI-64
The EUI-64 field is the only identifier for NVMe namespaces in UEFI device
paths. Add a new namespace property "eui64", that provides the user the
option to specify the EUI-64.

Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
Acked-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
2021-06-29 07:16:25 +02:00
Gollu Appalanaidu
3553c48fcb hw/nvme: fix csi field for cns 0x00 and 0x11
As per the TP 4056d Namespace types CNS 0x00 and CNS 0x11
CSI field shouldn't use but it is being used for these two
Identify command CNS values, fix that.

Remove 'nvme_csi_has_nvm_support()' helper as suggested by
Klaus we can safely assume NVM command set support for all
namespaces.

Suggested-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-06-29 07:16:25 +02:00
Niklas Cassel
cccc2651f4 hw/nvme: add param to control auto zone transitioning to zone state closed
In the Zoned Namespace Command Set Specification, chapter
2.5.1 Managing resources

"The controller may transition zones in the ZSIO:Implicitly Opened state
to the ZSC:Closed state for resource management purposes."

The word may in this sentence means that automatically transitioning
an implicitly opened zone to closed is completely optional.

Add a new parameter so that the user can control if this automatic
transitioning should be performed or not.

Being able to control this can help with verifying that e.g. a user-space
program behaves properly even without this optional ZNS feature.

The default value is set to true, in order to not change the existing
behavior.

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
[k.jensen: moved parameter to controller]
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-06-29 07:16:25 +02:00
Gollu Appalanaidu
421a309271 hw/nvme: fix lbaf formats initialization
Currently LBAF formats are being intialized based on metadata
size if and only if nvme-ns "ms" parameter is non-zero value.
Since FormatNVM command being supported device parameter "ms"
may not be the criteria to initialize the supported LBAFs.

And make LBAF array as read-only.

Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-06-29 07:16:25 +02:00
Gollu Appalanaidu
18de1526ba hw/nvme: add identify namespace flbas/mc enums
Add enums for the Identify Namespace FLBAS and MC fields.

Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com>
[k.jensen: squashed separate flbas/mc commits into one]
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-06-29 07:16:25 +02:00
Gollu Appalanaidu
f42be416cd hw/nvme: fix style
Identify command related functions style fix.

Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-06-29 07:16:25 +02:00
Klaus Jensen
88eea45c53 hw/nvme: move nvme emulation out of hw/block
With the introduction of the nvme-subsystem device we are really
cluttering up the hw/block directory.

As suggested by Philippe previously, move the nvme emulation to hw/nvme.

Suggested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-05-17 09:19:00 +02:00