Commit Graph

94 Commits

Author SHA1 Message Date
Klaus Jensen
c3f5526d22 hw/block/nvme: factor out pci setup
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Message-Id: <20200609190333.59390-16-its@irrelevant.dk>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-06-17 14:53:40 +02:00
Klaus Jensen
d634d74229 hw/block/nvme: factor out namespace setup
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Maxim Levitsky  <mlevitsk@redhat.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Message-Id: <20200609190333.59390-15-its@irrelevant.dk>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-06-17 14:53:40 +02:00
Klaus Jensen
3adee1c2d3 hw/block/nvme: add namespace helpers
Introduce some small helpers to make the next patches easier on the eye.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Message-Id: <20200609190333.59390-14-its@irrelevant.dk>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-06-17 14:53:40 +02:00
Klaus Jensen
90f4511543 hw/block/nvme: factor out block backend setup
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Message-Id: <20200609190333.59390-13-its@irrelevant.dk>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-06-17 14:53:40 +02:00
Klaus Jensen
a17f50188b hw/block/nvme: factor out device state setup
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Message-Id: <20200609190333.59390-12-its@irrelevant.dk>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-06-17 14:53:40 +02:00
Klaus Jensen
54000c66f0 hw/block/nvme: factor out property/constraint checks
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Message-Id: <20200609190333.59390-11-its@irrelevant.dk>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-06-17 14:53:40 +02:00
Klaus Jensen
e1731e816a hw/block/nvme: remove redundant cmbloc/cmbsz members
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Message-Id: <20200609190333.59390-10-its@irrelevant.dk>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-06-17 14:53:40 +02:00
Klaus Jensen
dce22c8646 hw/block/nvme: add max_ioqpairs device parameter
The num_queues device paramater has a slightly confusing meaning because
it accounts for the admin queue pair which is not really optional.
Secondly, it is really a maximum value of queues allowed.

Add a new max_ioqpairs parameter that only accounts for I/O queue pairs,
but keep num_queues for compatibility.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Message-Id: <20200609190333.59390-9-its@irrelevant.dk>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-06-17 14:53:40 +02:00
Klaus Jensen
ca247d3509 hw/block/nvme: fix pin-based interrupt behavior
First, since the device only supports MSI-X or pin-based interrupt, if
MSI-X is not enabled, it should not accept interrupt vectors different
from 0 when creating completion queues.

Secondly, the irq_status NvmeCtrl member is meant to be compared to the
INTMS register, so it should only be 32 bits wide. And it is really only
useful when used with multi-message MSI.

Third, since we do not force a 1-to-1 correspondence between cqid and
interrupt vector, the irq_status register should not have bits set
according to cqid, but according to the associated interrupt vector.

Fix these issues, but keep irq_status available so we can easily support
multi-message MSI down the line.

Fixes: 5e9aa92eb1 ("hw/block: Fix pin-based interrupt behaviour of NVMe")
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Message-Id: <20200609190333.59390-8-its@irrelevant.dk>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-06-17 14:53:40 +02:00
Klaus Jensen
b4529c5c3a hw/block/nvme: refactor nvme_addr_read
Pull the controller memory buffer check to its own function. The check
will be used on its own in later patches.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Message-Id: <20200609190333.59390-7-its@irrelevant.dk>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-06-17 14:53:40 +02:00
Klaus Jensen
3e829fd438 hw/block/nvme: use constants in identify
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Message-Id: <20200609190333.59390-6-its@irrelevant.dk>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-06-17 14:53:40 +02:00
Klaus Jensen
1065abfbf1 hw/block/nvme: move device parameters to separate struct
Move device configuration parameters to separate struct to make it
explicit what is configurable and what is set internally.

Signed-off-by: Klaus Jensen <klaus.jensen@cnexlabs.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20200609190333.59390-5-its@irrelevant.dk>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-06-17 14:53:40 +02:00
Klaus Jensen
4920786ee6 hw/block/nvme: remove superfluous breaks
These break statements was left over when commit 3036a626e9 ("nvme:
add Get/Set Feature Timestamp support") was merged.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Message-Id: <20200609190333.59390-4-its@irrelevant.dk>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-06-17 14:53:40 +02:00
Klaus Jensen
6f4ee2e9aa hw/block/nvme: rename trace events to pci_nvme
Change the prefix of all nvme device related trace events to 'pci_nvme'
to not clash with trace events from the nvme block driver.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Message-Id: <20200609190333.59390-3-its@irrelevant.dk>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-06-17 14:53:39 +02:00
Klaus Jensen
f7e8c23f39 hw/block/nvme: fix pci doorbell size calculation
The size of the BAR is 0x1000 (main registers) + 8 bytes for each
queue. Currently, the size of the BAR is calculated like so:

    n->reg_size = pow2ceil(0x1004 + 2 * (n->num_queues + 1) * 4);

Since the 'num_queues' parameter already accounts for the admin queue,
this should in any case not need to be incremented by one. Also, the
size should be initialized to (0x1000).

    n->reg_size = pow2ceil(0x1000 + 2 * n->num_queues * 4);

This, with the default value of num_queues (64), we will set aside room
for 1 admin queue and 63 I/O queues (4 bytes per doorbell, 2 doorbells
per queue).

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Message-Id: <20200609190333.59390-2-its@irrelevant.dk>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-06-17 14:53:39 +02:00
Philippe Mathieu-Daudé
bc2a2364b8 hw/block: Let the NVMe emulated device be target-agnostic
Now than the non-target specific memory_region_msync() function
is available, use it to make this device target-agnostic.

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20200508062456.23344-4-philmd@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-06-05 09:54:48 +01:00
Markus Armbruster
40c2281cc3 Drop more @errp parameters after previous commit
Several functions can't fail anymore: ich9_pm_add_properties(),
device_add_bootindex_property(), ppc_compat_add_property(),
spapr_caps_add_properties(), PropertyInfo.create().  Drop their @errp
parameter.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20200505152926.18877-16-armbru@redhat.com>
2020-05-15 07:08:14 +02:00
Andrzej Jakowski
6cf9413229 nvme: introduce PMR support from NVMe 1.4 spec
This patch introduces support for PMR that has been defined as part of NVMe 1.4
spec. User can now specify a pmrdev option that should point to HostMemoryBackend.
pmrdev memory region will subsequently be exposed as PCI BAR 2 in emulated NVMe
device. Guest OS can perform mmio read and writes to the PMR region that will stay
persistent across system reboot.

Signed-off-by: Andrzej Jakowski <andrzej.jakowski@linux.intel.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20200330164656.9348-1-andrzej.jakowski@linux.intel.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-04-30 17:51:07 +02:00
Marc-André Lureau
4f67d30b5e qdev: set properties with device_class_set_props()
The following patch will need to handle properties registration during
class_init time. Let's use a device_class_set_props() setter.

spatch --macro-file scripts/cocci-macro-file.h  --sp-file
./scripts/coccinelle/qdev-set-props.cocci --keep-comments --in-place
--dir .

@@
typedef DeviceClass;
DeviceClass *d;
expression val;
@@
- d->props = val
+ device_class_set_props(d, val)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20200110153039.1379601-20-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-24 20:59:15 +01:00
Markus Armbruster
a27bd6c779 Include hw/qdev-properties.h less
In my "build everything" tree, changing hw/qdev-properties.h triggers
a recompile of some 2700 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h).

Many places including hw/qdev-properties.h (directly or via hw/qdev.h)
actually need only hw/qdev-core.h.  Include hw/qdev-core.h there
instead.

hw/qdev.h is actually pointless: all it does is include hw/qdev-core.h
and hw/qdev-properties.h, which in turn includes hw/qdev-core.h.
Replace the remaining uses of hw/qdev.h by hw/qdev-properties.h.

While there, delete a few superfluous inclusions of hw/qdev-core.h.

Touching hw/qdev-properties.h now recompiles some 1200 objects.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Daniel P. Berrangé" <berrange@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190812052359.30071-22-armbru@redhat.com>
2019-08-16 13:31:53 +02:00
Markus Armbruster
650d103d3e Include hw/hw.h exactly where needed
In my "build everything" tree, changing hw/hw.h triggers a recompile
of some 2600 out of 6600 objects (not counting tests and objects that
don't depend on qemu/osdep.h).

The previous commits have left only the declaration of hw_error() in
hw/hw.h.  This permits dropping most of its inclusions.  Touching it
now recompiles less than 200 objects.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190812052359.30071-19-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-08-16 13:31:52 +02:00
Markus Armbruster
d645427057 Include migration/vmstate.h less
In my "build everything" tree, changing migration/vmstate.h triggers a
recompile of some 2700 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h).

hw/hw.h supposedly includes it for convenience.  Several other headers
include it just to get VMStateDescription.  The previous commit made
that unnecessary.

Include migration/vmstate.h only where it's still needed.  Touching it
now recompiles only some 1600 objects.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190812052359.30071-16-armbru@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-08-16 13:31:52 +02:00
Klaus Birkelund Jensen
1cc354ac98 nvme: do not advertise support for unsupported arbitration mechanism
The device mistakenly reports that the Weighted Round Robin with Urgent
Priority Class arbitration mechanism is supported.

It is not.

Signed-off-by: Klaus Birkelund Jensen <klaus.jensen@cnexlabs.com>
Message-id: 20190606092530.14206-1-klaus@birkelund.eu
Acked-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2019-06-24 15:53:01 +02:00
Markus Armbruster
0b8fa32f55 Include qemu/module.h where needed, drop it from qemu-common.h
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190523143508.25387-4-armbru@redhat.com>
[Rebased with conflicts resolved automatically, except for
hw/usb/dev-hub.c hw/misc/exynos4210_rng.c hw/misc/bcm2835_rng.c
hw/misc/aspeed_scu.c hw/display/virtio-vga.c hw/arm/stm32f205_soc.c;
ui/cocoa.m fixed up]
2019-06-12 13:18:33 +02:00
Kenneth Heitke
3036a626e9 nvme: add Get/Set Feature Timestamp support
Signed-off-by: Kenneth Heitke <kenneth.heitke@intel.com>
Reviewed-by: Klaus Birkelund Jensen <klaus.jensen@cnexlabs.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-06-04 15:22:09 +02:00
Klaus Birkelund Jensen
25349e8250 nvme: fix copy direction in DMA reads going to CMB
`nvme_dma_read_prp` erronously used `qemu_iovec_*to*_buf` instead of
`qemu_iovec_*from*_buf` when the request involved the controller memory
buffer.

Signed-off-by: Klaus Birkelund Jensen <klaus.jensen@cnexlabs.com>
Reviewed-by: Kenneth Heitke <kenneth.heitke@intel.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-05-20 17:08:56 +02:00
Keith Busch
9d6459d21a nvme: fix write zeroes offset and count
The implementation used blocks units rather than the expected bytes.

Fixes: c03e7ef12a ("nvme: Implement Write Zeroes")
Reported-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-03-12 20:30:14 +01:00
Li Qiang
a3d25ddd6f nvme: use pci_dev directly in nvme_realize
There is no need to make another reference.

Signed-off-by: Li Qiang <liq3ea@163.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190120055558.32984-4-liq3ea@163.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2019-01-31 00:38:19 +01:00
Li Qiang
2410e133ec nvme: ensure the num_queues is not zero
When it is zero, it causes segv.
Using following command:

"-drive file=//home/test/test1.img,if=none,id=id0
-device nvme,drive=id0,serial=test,num_queues=0"
causes following Backtrack:

Thread 4 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffe9735700 (LWP 30952)]
0x0000555555a7a77c in nvme_start_ctrl (n=0x5555577473f0) at hw/block/nvme.c:825
825	    if (unlikely(n->cq[0])) {
(gdb) bt
0  0x0000555555a7a77c in nvme_start_ctrl (n=0x5555577473f0)
    at hw/block/nvme.c:825
1  0x0000555555a7af7f in nvme_write_bar (n=0x5555577473f0, offset=20,
    data=4587521, size=4) at hw/block/nvme.c:969
2  0x0000555555a7b81a in nvme_mmio_write (opaque=0x5555577473f0, addr=20,
    data=4587521, size=4) at hw/block/nvme.c:1163
3  0x0000555555869236 in memory_region_write_accessor (mr=0x555557747cd0,
    addr=20, value=0x7fffe97320f8, size=4, shift=0, mask=4294967295, attrs=...)
    at /home/test/qemu1/qemu/memory.c:502
4  0x0000555555869446 in access_with_adjusted_size (addr=20,
    value=0x7fffe97320f8, size=4, access_size_min=2, access_size_max=8,
    access_fn=0x55555586914d <memory_region_write_accessor>,
    mr=0x555557747cd0, attrs=...) at /home/test/qemu1/qemu/memory.c:568
5  0x000055555586c479 in memory_region_dispatch_write (mr=0x555557747cd0,
    addr=20, data=4587521, size=4, attrs=...)
    at /home/test/qemu1/qemu/memory.c:1499
6  0x00005555558030af in flatview_write_continue (fv=0x7fffe0061130,
    addr=4273930260, attrs=..., buf=0x7ffff7ff0028 "\001", len=4, addr1=20,
    l=4, mr=0x555557747cd0) at /home/test/qemu1/qemu/exec.c:3234
7  0x00005555558031f9 in flatview_write (fv=0x7fffe0061130, addr=4273930260,
    attrs=..., buf=0x7ffff7ff0028 "\001", len=4)
    at /home/test/qemu1/qemu/exec.c:3273
8  0x00005555558034ff in address_space_write (
---Type <return> to continue, or q <return> to quit---
    as=0x555556758480 <address_space_memory>, addr=4273930260, attrs=...,
    buf=0x7ffff7ff0028 "\001", len=4) at /home/test/qemu1/qemu/exec.c:3363
9  0x0000555555803550 in address_space_rw (
    as=0x555556758480 <address_space_memory>, addr=4273930260, attrs=...,
    buf=0x7ffff7ff0028 "\001", len=4, is_write=true)
    at /home/test/qemu1/qemu/exec.c:3374
10 0x00005555558884a1 in kvm_cpu_exec (cpu=0x555556920e40)
    at /home/test/qemu1/qemu/accel/kvm/kvm-all.c:2031
11 0x000055555584cd9d in qemu_kvm_cpu_thread_fn (arg=0x555556920e40)
    at /home/test/qemu1/qemu/cpus.c:1281
12 0x0000555555dbaf6d in qemu_thread_start (args=0x5555569438a0)
    at util/qemu-thread-posix.c:502
13 0x00007ffff5dc86db in start_thread (arg=0x7fffe9735700)
    at pthread_create.c:463
14 0x00007ffff5af188f in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Signed-off-by: Li Qiang <liq3ea@163.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190120055558.32984-3-liq3ea@163.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2019-01-31 00:38:19 +01:00
Li Qiang
08db59e188 nvme: use TYPE_NVME instead of constant string
Signed-off-by: Li Qiang <liq3ea@163.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190120055558.32984-2-liq3ea@163.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2019-01-31 00:38:19 +01:00
Keith Busch
6da021815e nvme: Fix spurious interrupts
The code had asserted an interrupt every time it was requested to check
for new completion queue entries.This can result in spurious interrupts
seen by the guest OS.

Fix this by asserting an interrupt only if there are un-acknowledged
completion queue entries available.

Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-11-27 12:59:00 +01:00
Logan Gunthorpe
ad3a7e4555 nvme: fix bug with PCI IRQ pins on teardown
When the submission and completion queues are being torn down
the IRQ will be asserted for the completion queue when the
submsission queue is deleted. Then when the completion queue
is deleted it stays asserted. Thus, on systems that do
not use MSI, no further interrupts can be triggered on the host.

Linux sees this as a long delay when unbinding the nvme device.
Eventually the interrupt timeout occurs and it continues.

To fix this we ensure we deassert the IRQ for a CQ when it is
deleted.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-11-22 19:37:31 +01:00
Paolo Bonzini
71a86ddece nvme: fix CMB endianness confusion
The CMB is marked as DEVICE_LITTLE_ENDIAN, so the data must be
read/written as if it was little-endian output (in the case of
big endian, we get two swaps, one in the memory core and one
in nvme.c).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-11-22 19:37:31 +01:00
Kevin Wolf
2067d39e5e Revert "nvme: fix oob access issue(CVE-2018-16847)"
This reverts commit 5e3c0220d7.
We have a better fix commited for this now.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-11-22 16:43:52 +01:00
Paolo Bonzini
87ad860c62 nvme: fix out-of-bounds access to the CMB
Because the CMB BAR has a min_access_size of 2, if you read the last
byte it will try to memcpy *2* bytes from n->cmbuf, causing an off-by-one
error.  This is CVE-2018-16847.

Another way to fix this might be to register the CMB as a RAM memory
region, which would also be more efficient.  However, that might be a
change for big-endian machines; I didn't think this through and I don't
know how real hardware works.  Add a basic testcase for the CMB in case
somebody does this change later on.

Cc: Keith Busch <keith.busch@intel.com>
Cc: qemu-block@nongnu.org
Reported-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Tested-by: Li Qiang <liq3ea@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-11-22 16:43:52 +01:00
Igor Druzhinin
6bf7463615 nvme: call blk_drain in NVMe reset code to avoid lockups
When blk_flush called in NVMe reset path S/C queues are already freed
which means that re-entering AIO handling loop having some IO requests
unfinished will lockup or crash as their SG structures being potentially
reused. Call blk_drain before freeing the queues to avoid this nasty
scenario.

Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-11-22 16:43:52 +01:00
Li Qiang
5e3c0220d7 nvme: fix oob access issue(CVE-2018-16847)
Currently, the nvme_cmb_ops mr doesn't check the addr and size.
This can lead an oob access issue. This is triggerable in the guest.
Add check to avoid this issue.

Fixes CVE-2018-16847.

Reported-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Li Qiang <liq3ea@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-11-19 12:51:16 +01:00
Li Qiang
a883d6a0bc nvme: free cmbuf in nvme_exit
This avoid a memory leak in unhotplug nvme device.

Signed-off-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-11-12 17:46:57 +01:00
Li Qiang
20faf0f5f8 nvme: don't unref ctrl_mem when device unrealized
Currently, when hotplug/unhotplug nvme device, it will cause an
assert in object.c. Following is the backtrack:

ERROR:qom/object.c:981:object_unref: assertion failed: (obj->ref > 0)

Thread 2 "qemu-system-x86" received signal SIGABRT, Aborted.
[Switching to Thread 0x7fffcbd32700 (LWP 18844)]
0x00007fffdb9e4fff in raise () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
/lib/x86_64-linux-gnu/libglib-2.0.so.0
/lib/x86_64-linux-gnu/libglib-2.0.so.0
qom/object.c:981
/home/liqiang02/qemu-upstream/qemu/memory.c:1732
/home/liqiang02/qemu-upstream/qemu/memory.c:285
util/qemu-thread-posix.c:504
/lib/x86_64-linux-gnu/libpthread.so.0

This is caused by memory_region_unref in nvme_exit.

Remove it to make the PCIdevice refcount correct.

Signed-off-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-11-12 17:46:57 +01:00
Kevin Wolf
572023f7b2 block: Remove deprecated -drive option serial
This reinstates commit b008326744,
which was temporarily reverted for the 3.0 release so that libvirt gets
some extra time to update their command lines.

The -drive option serial was deprecated in QEMU 2.10. It's time to
remove it.

Tests need to be updated to set the serial number with -global instead
of using the -drive option.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
2018-08-15 12:50:39 +02:00
Cornelia Huck
44e8b4689c Revert "block: Remove deprecated -drive option serial"
This reverts commit b008326744.

Hold off removing this for one more QEMU release (current libvirt
release still uses it.)

Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-07-10 14:36:11 +02:00
Philippe Mathieu-Daudé
e8400cf385 hw/block: Use the IEC binary prefix definitions
It eases code review, unit is explicit.

Patch generated using:

  $ git grep -E '(1024|2048|4096|8192|(<<|>>).?(10|20|30))' hw/ include/hw/

and modified manually.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180625124238.25339-14-f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-07-02 15:41:13 +02:00
Weiping Zhang
7c8952697e hw/block/nvme: add optional parameter num_queues for nvme device
Add an optional paramter num_queues for device, and set it
to 64 by default.

Signed-off-by: Weiping Zhang <zhangweiping@didichuxing.com>
Acked-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-06-29 14:20:56 +02:00
Kevin Wolf
b008326744 block: Remove deprecated -drive option serial
The -drive option serial was deprecated in QEMU 2.10. It's time to
remove it.

Tests need to be updated to set the serial number with -global instead
of using the -drive option.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
2018-06-15 14:49:44 +02:00
Philippe Mathieu-Daudé
6b39bad02e hw/block/nvme: Include "qemu/cutils.h" directly in the source file
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20180528232719.4721-16-f4bug@amsat.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-01 14:15:10 +02:00
Yoni Bettan
d61a363d3e pci: removed the is_express field since a uniform interface was inserted
according to Eduardo Habkost's commit fd3b02c889 all PCIEs now implement
INTERFACE_PCIE_DEVICE so we don't need is_express field anymore.

Devices that implements only INTERFACE_PCIE_DEVICE (is_express == 1)
or
devices that implements only INTERFACE_CONVENTIONAL_PCI_DEVICE (is_express == 0)
where not affected by the change.

The only devices that were affected are those that are hybrid and also
had (is_express == 1) - therefor only:
  - hw/vfio/pci.c
  - hw/usb/hcd-xhci.c
  - hw/xen/xen_pt.c

For those 3 I made sure that QEMU_PCI_CAP_EXPRESS is on in instance_init()

Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Yoni Bettan <ybettan@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-02-08 21:06:41 +02:00
Hikaru Nishida
5e9aa92eb1 hw/block: Fix pin-based interrupt behaviour of NVMe
Pin-based interrupt of NVMe controller did not work properly
because using an obsolated function pci_irq_pulse().
To fix this, change to use pci_irq_assert() / pci_irq_deassert()
instead of pci_irq_pulse().

Signed-off-by: Hikaru Nishida <hikarupsp@gmail.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-01-23 12:33:07 +01:00
Doug Gale
1ee24514ae nvme: Add tracing
Add trace output for commands, errors, and undefined behavior.
Add guest error log output for undefined behavior.
Report invalid undefined accesses to MMIO.
Annotate unlikely error checks with unlikely.

Signed-off-by: Doug Gale <doug16k@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-12-22 15:03:41 +01:00
Mao Zhongyi
ceff3e1f01 hw/block: Use errp directly rather than local_err
[Drop virtio_blk_data_plane_create() change that misinterprets return
value when the virtio transport does not support dataplane.
--Stefan]

Cc: John Snow <jsnow@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Max Reitz <mreitz@redhat.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>

Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: e77848d3735ba590f23ffbf8094379c646c33d79.1511317952.git.maozy.fnst@cn.fujitsu.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-12-19 10:25:03 +00:00
Mao Zhongyi
e01d6a415b hw/block/nvme: Convert to realize
Convert nvme_init() to realize and rename it to nvme_realize().

Cc: John Snow <jsnow@redhat.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Max Reitz <mreitz@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>

Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com>
Message-id: 2882e72d795e04cbe2120f569d551aef2467ac60.1511317952.git.maozy.fnst@cn.fujitsu.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-12-19 09:25:27 +00:00