The Non-MDTS DMSRL limit must be recomputed when namespaces are
detached.
Fixes: 645ce1a70c ("hw/block/nvme: support namespace attachment command")
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Gollu Appalanaidu <anaidu.gollu@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Remove the unused BlockConf from the controller structure and remove the
noop constraint checking.
Device works just fine with both legacy drive parameter namespace and
nvme-ns namespace definitions.
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Gollu Appalanaidu <anaidu.gollu@samsung.com>
The `nvme_nsid()` function returns '-1' (FFFFFFFFh) when the given
namespace is NULL. Since FFFFFFFFh is actually a valid namespace
identifier (the "broadcast" value), change this to be '0' since that
actually *is* the invalid value.
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Gollu Appalanaidu <anaidu.gollu@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Add the missing nvme_adm_opc_str entry for the Namespace Attachment
command.
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Gollu Appalanaidu <anaidu.gollu@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Protection Information can only be enabled if there is at least 8 bytes
of metadata.
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Gollu Appalanaidu <anaidu.gollu@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
The same thing as for incoming postcopy - we cannot deal with concurrent
RAM discards when using background snapshot feature in outgoing migration.
Fixes: 8518278a6a (migration: implementation
of background snapshot thread)
Signed-off-by: Andrey Gruzdev <andrey.gruzdev@virtuozzo.com>
Reported-by: David Hildenbrand <david@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210401092226.102804-3-andrey.gruzdev@virtuozzo.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
* ppc/e500 and arm/virt: only add valid dynamic sysbus devices to the
platform bus
* update i.mx31 maintainer list
* Revert "target/arm: Make number of counters in PMCR follow the CPU"
-----BEGIN PGP SIGNATURE-----
iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAmBsU1IZHHBldGVyLm1h
eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3gsJD/48W/dgiW9NkVo0zZclQq7a
kXBREK3JH+weW0Ws9g89hxjOmij6kPQeDC0fxCLfc7OX9bshYps21Q0r/Kl5X30h
1T6eDTHhFPcv2cLty2IYUC4E2FWDjVuLtRtcx1dYULMrIG7xXEqDFl+d+ZgGK5mL
NfwAA3fYqLN6cGxf94h/MosU0C9NJq1g/VDcq2gaR8+bdmGk0Gg7HHloFOzoOlaa
KS+Qt9dcVYa6q9GZBtYi0/w1YlORSaf7sTvqjkZ4H5jTY9NfjVRP87OSaLkgJYt+
OTTZjh9OQv1rL51Egl9sYUJX2dk4mFBE1pPampnwtBEcaQ9r8idR2+3noiTF8lRi
tdyRPDoZU6EdkH0aLJeSRbkhT3z1y+m0qLTCRh5lnyhZKIAmDHXW2FBnFfPnL7EL
C4RqkUZ1PxdixEQ6GOauBTJQbVsjKUTsgFuxZ2S3euKCl2oHnLafcqY9uC711YHb
5R9cvACLHkA/kSgkw5HmJPyX4qNn+9LftRr3YpZ95soo/c8dEa17niu/2I2jMTnt
1EPYap/R1lI3OHaB4Q51FlRPufnwQ9Vh/pOtXyn1cvc2x+ABHB++139LUuYe+5e2
N/0vH5rSQcf9PnTPddOZYaCQx7KE44ZaAXHtqFMB+rWtG/Ss2MJCwCa9sMd6ciEE
M9vZ4ZmccPBEwRv2Dgp84g==
=f0Ad
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20210406' into staging
target-arm queue:
* ppc/e500 and arm/virt: only add valid dynamic sysbus devices to the
platform bus
* update i.mx31 maintainer list
* Revert "target/arm: Make number of counters in PMCR follow the CPU"
# gpg: Signature made Tue 06 Apr 2021 13:25:54 BST
# gpg: using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE
# gpg: issuer "peter.maydell@linaro.org"
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [ultimate]
# gpg: aka "Peter Maydell <pmaydell@gmail.com>" [ultimate]
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [ultimate]
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE
* remotes/pmaydell/tags/pull-target-arm-20210406:
Remove myself as i.mx31 maintainer
Revert "target/arm: Make number of counters in PMCR follow the CPU"
hw/ppc/e500plat: Only try to add valid dynamic sysbus devices to platform bus
hw/arm/virt: Only try to add valid dynamic sysbus devices to platform bus
machine: Provide a function to check the dynamic sysbus allowlist
include/hw/boards.h: Document machine_class_allow_dynamic_sysbus_dev()
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Two last minute bugfixes. They are both designed to prevent
compatibility headaches down the road.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmBsQkMPHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRpUvUH/RPU/bvWjwcfjt0PSySslxleLLZX2ixZRWTl
Ij1+Ibl47Lxa5lI0ZMibac3Q4CC0+20pwGITX6165AXwMUxtHcUij+d6SW//ZrAv
rFOtAgpHGtL82ShwpWt73Qa/iZKAnl0GRaqs/gZJcFViUGPBZsDT1abG+OgI5EDh
l3+j+3CbwrZWqGygaVYNHbx/ZaO6+rq/RVxsQl6bNbP7XItanMj43NyhSngHtt7S
/UNSq9xAVSlVpCbkGqfNhms9ET4FwKoJgBK+FE2OFG5TqJoBYXKmUJx/9GML9R1d
2qwSAT9NCTniSX47mIHbcMgQZTXDkkWmvUhaSREASmlAjzoqAlQ=
=GXh7
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pc,virtio: last minute bugfixes
Two last minute bugfixes. They are both designed to prevent
compatibility headaches down the road.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Tue 06 Apr 2021 12:13:07 BST
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream:
virtio-pci: compat page aligned ATS
x86: rename oem-id and oem-table-id properties
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Commit 4c70875372 ("pci: advertise a page aligned ATS") advertises
the page aligned via ATS capability (RO) to unbrek recent Linux IOMMU
drivers since 5.2. But it forgot the compat the capability which
breaks the migration from old machine type:
(qemu) qemu-kvm: get_pci_config_device: Bad config data: i=0x104 read:
0 device: 20 cmask: ff wmask: 0 w1cmask:0
This patch introduces a new parameter "x-ats-page-aligned" for
virtio-pci device and turns it on for machine type which is newer than
5.1.
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: qemu-stable@nongnu.org
Fixes: 4c70875372 ("pci: advertise a page aligned ATS")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20210406040330.11306-1-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The e500plat machine device plug callback currently calls
platform_bus_link_device() for any sysbus device. This is overly
broad, because platform_bus_link_device() will unconditionally grab
the IRQs and MMIOs of the device it is passed, whether it was
intended for the platform bus or not. Restrict hotpluggability of
sysbus devices to only those devices on the dynamic sysbus allowlist.
We were mostly getting away with this because the board creates the
platform bus as the last device it creates, and so the hotplug
callback did not do anything for all the sysbus devices created by
the board itself. However if the user plugged in a device which
itself uses a sysbus device internally we would have mishandled this
and probably asserted. An example of this is:
qemu-system-ppc64 -M ppce500 -device macio-oldworld
This isn't a sensible command because the macio-oldworld device
is really specific to the 'g3beige' machine, but we now fail
with a reasonable error message rather than asserting:
qemu-system-ppc64: Device heathrow is not supported by this machine yet.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Message-id: 20210325153310.9131-5-peter.maydell@linaro.org
The virt machine device plug callback currently calls
platform_bus_link_device() for any sysbus device. This is overly
broad, because platform_bus_link_device() will unconditionally grab
the IRQs and MMIOs of the device it is passed, whether it was
intended for the platform bus or not. Restrict hotpluggability of
sysbus devices to only those devices on the dynamic sysbus
allowlist.
We were mostly getting away with this because the board creates the
platform bus as the last device it creates, and so the hotplug
callback did not do anything for all the sysbus devices created by
the board itself. However if the user plugged in a device which
itself uses a sysbus device internally we would have mishandled this
and probably asserted.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20210325153310.9131-4-peter.maydell@linaro.org
Provide a new function dynamic_sysbus_dev_allowed() which checks the
per-machine list of permitted dynamic sysbus devices and returns a
boolean result indicating whether the device is allowed. We can use
this in the implementation of validate_sysbus_device(), but we will
also need it so that machine hotplug callbacks can validate devices
rather than assuming that any sysbus device might be hotpluggable
into the platform bus.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20210325153310.9131-3-peter.maydell@linaro.org
The check for `n->namespace.blkconf.blk` always fails because
this is in the initialization function.
Signed-off-by: Joelle van Dyne <j@getutm.app>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
The description was originally removed in commit 578d914b26
("hw/block/nvme: align zoned.zasl with mdts") together with the removal
of the zoned.append_size_limit parameter itself.
However, it was (most likely accidentally), re-added in commit
f7dcd31885 ("hw/block/nvme: add non-mdts command size limit for verify").
Remove the description again, since the parameter it describes,
zoned.append_size_limit, no longer exists.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
After introducing non-scalar machine properties, it would be preferrable
to have a single acpitable property which includes both generic
information (such as the OEM ids) and custom tables currently
passed via -acpitable.
Do not saddle ourselves with legacy oem-id and oem-table-id
properties, instead mark them as experimental.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210402082128.13854-1-pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
If a device model
(a) doesn't set the value to a correct interrupt number and then
(b) triggers an interrupt for itself,
it's device model bug. Add assert on interrupt pin number to catch
this kind of bug more obviously.
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Message-Id: <9cf8ac3b17e162daac0971d7be32deb6a33ae6ec.1616532563.git.isaku.yamahata@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Commit 6be8cf56bc made sure that SCI is enabled in PM1.CNT
on reset in acpi_only mode by modifying acpi_pm1_cnt_reset() and
that worked for q35 as expected.
This patch adds reset ACPI PM related registers on vt82c686 reset time
and de-assert sci.
via_pm_realize() initializes acpi pm tmr, evt, cnt and gpe.
Reset them on device reset.
Cc: BALATON Zoltan <balaton@eik.bme.hu>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: "Philippe Mathieu-Daudé" <f4bug@amsat.org>
Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Message-Id: <0a3fe998525552860919a690ce83dab8f663ab99.1616532563.git.isaku.yamahata@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Commit 6be8cf56bc made sure that SCI is enabled in PM1.CNT
on reset in acpi_only mode by modifying acpi_pm1_cnt_reset() and
that worked for q35 as expected.
The function was introduced by commit
eaba51c573 (acpi, acpi_piix, vt82c686: factor out PM1_CNT logic)
that forgot to actually call it at piix4 reset time and as result
SCI_EN wasn't set as was expected by 6be8cf56bc in acpi_only mode.
So Windows crashes when it notices that SCI_EN is not set and FADT is
not providing information about how to enable it anymore.
Reproducer:
qemu-system-x86_64 -enable-kvm -M pc-i440fx-6.0,smm=off -cdrom any_windows_10x64.iso
Fix it by calling acpi_pm1_cnt_reset() at piix4 reset time.
Occasionally this patch adds reset acpi PM related registers on
piix4 reset time and de-assert sci.
piix4_pm_realize() initializes acpi pm tmr, evt, cnt and gpe.
Reset them on device reset. pm_reset() in ich9.c correctly calls
corresponding reset functions.
Fixes: 6be8cf56bc (acpi/core: always set SCI_EN when SMM isn't supported)
Reported-by: Reinoud Zandijk <reinoud@NetBSD.org>
Co-developed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Message-Id: <8a5bbd19727045ec863523830078dd4ca63f6a9a.1616532563.git.isaku.yamahata@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The value is assigned later in this procedure.
Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com>
Message-Id: <20210315115937.14286-3-yuri.benditovich@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1743098
This commit completes the solution of segfault in hot unplug flow
(by commit ccec7e9603).
Added missing check for vdev in virtio_pci_isr_read.
Typical stack of crash:
virtio_pci_isr_read ../hw/virtio/virtio-pci.c:1365 with proxy-vdev = 0
memory_region_read_accessor at ../softmmu/memory.c:442
access_with_adjusted_size at ../softmmu/memory.c:552
memory_region_dispatch_read1 at ../softmmu/memory.c:1420
memory_region_dispatch_read at ../softmmu/memory.c:1449
flatview_read_continue at ../softmmu/physmem.c:2822
flatview_read at ../softmmu/physmem.c:2862
address_space_read_full at ../softmmu/physmem.c:2875
Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com>
Message-Id: <20210315115937.14286-2-yuri.benditovich@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Qemu crashes on shutdown if the chardev used by vhost-user-blk has been
finalized before the vhost-user-blk.
This happens with char-socket chardev operating in the listening mode (server).
The char-socket chardev emits "close" event at the end of finalizing when
its internal data is destroyed. This calls vhost-user-blk event handler
which in turn tries to manipulate with destroyed chardev by setting an empty
event handler for vhost-user-blk cleanup postponing.
This patch separates the shutdown case from the cleanup postponing removing
the need to set an event handler.
Signed-off-by: Denis Plotnikov <den-plotnikov@yandex-team.ru>
Message-Id: <20210325151217.262793-4-den-plotnikov@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Commit 4bcad76f4c ("vhost-user-blk: delay vhost_user_blk_disconnect")
introduced postponing vhost_dev cleanup aiming to eliminate qemu aborts
because of connection problems with vhost-blk daemon.
However, it introdues a new problem. Now, any communication errors
during execution of vhost_dev_init() called by vhost_user_blk_device_realize()
lead to qemu abort on assert in vhost_dev_get_config().
This happens because vhost_user_blk_disconnect() is postponed but
it should have dropped s->connected flag by the time
vhost_user_blk_device_realize() performs a new connection opening.
On the connection opening, vhost_dev initialization in
vhost_user_blk_connect() relies on s->connection flag and
if it's not dropped, it skips vhost_dev initialization and returns
with success. Then, vhost_user_blk_device_realize()'s execution flow
goes to vhost_dev_get_config() where it's aborted on the assert.
To fix the problem this patch adds immediate cleanup on device
initialization(in vhost_user_blk_device_realize()) using different
event handlers for initialization and operation introduced in the
previous patch.
On initialization (in vhost_user_blk_device_realize()) we fully
control the initialization process. At that point, nobody can use the
device since it isn't initialized and we don't need to postpone any
cleanups, so we can do cleaup right away when there is a communication
problem with the vhost-blk daemon.
On operation we leave it as is, since the disconnect may happen when
the device is in use, so the device users may want to use vhost_dev's data
to do rollback before vhost_dev is re-initialized (e.g. in vhost_dev_set_log()).
Signed-off-by: Denis Plotnikov <den-plotnikov@yandex-team.ru>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Message-Id: <20210325151217.262793-3-den-plotnikov@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
It is useful to use different connect/disconnect event handlers
on device initialization and operation as seen from the further
commit fixing a bug on device initialization.
This patch refactors the code to make use of them: we don't rely any
more on the VM state for choosing how to cleanup the device, instead
we explicitly use the proper event handler depending on whether
the device has been initialized.
Signed-off-by: Denis Plotnikov <den-plotnikov@yandex-team.ru>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Message-Id: <20210325151217.262793-2-den-plotnikov@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
As the comment of tx_padding_and_crc() says: "Never add CRC in QEMU",
min_frame_len should excluce CRC, so it should be 60 instead of 64.
Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
Message-Id: <20210316081505.72898-1-bmeng.cn@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
spapr_memory_unplug() is the last step of the hot unplug sequence.
It is indirectly called by:
spapr_lmb_release()
hotplug_handler_unplug()
and spapr_lmb_release() already buys us that DIMM unplug state is
present : it gets restored with spapr_recover_pending_dimm_state()
if missing.
g_assert() that spapr_pending_dimm_unplugs_find() cannot return NULL
in spapr_memory_unplug() to make this clear and silence Coverity.
Fixes: Coverity CID 1450767
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <161562021166.948373.15092876234470478331.stgit@bahia.lan>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Per devicetree spec v0.3 [1] chapter 2.3.5:
The #address-cells and #size-cells properties are not inherited
from ancestors in the devicetree. They shall be explicitly defined.
If missing, a client program should assume a default value of 2
for #address-cells, and a value of 1 for #size-cells.
These properties are currently missing, causing the <reg> property
of the queue-group subnode to be incorrectly parsed using default
values.
[1] https://github.com/devicetree-org/devicetree-specification/releases/download/v0.3/devicetree-specification-v0.3.pdf
Fixes: fdfb7f2cdb ("e500: Add support for eTSEC in device tree")
Signed-off-by: Bin Meng <bin.meng@windriver.com>
Message-Id: <20210311081608.66891-1-bmeng.cn@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
In commit 81b3ddaf87 we fixed a use of uninitialized data
in read_tcnt(). However this change wasn't enough to placate
Coverity, which is not smart enough to see that if we read a
2 bit field and then handle cases 0, 1, 2 and 3 then there cannot
be a flow of execution through the switch default. Add explicit
default cases which assert that they can't be reached, which
should help silence Coverity.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20210319162458.13760-1-peter.maydell@linaro.org
They were introduced in commit 9bde7f0674 ("hw/arm/smmuv3: Implement
translate callback") but never actually used. Drop them.
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Acked-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20210325142702.790-1-yuzenghui@huawei.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When building with --enable-sanitizers we get:
Direct leak of 16 byte(s) in 1 object(s) allocated from:
#0 0x5618479ec7cf in malloc (qemu-system-aarch64+0x233b7cf)
#1 0x7f675745f958 in g_malloc (/lib64/libglib-2.0.so.0+0x58958)
#2 0x561847c2dcc9 in xlnx_dp_init hw/display/xlnx_dp.c:1259:5
#3 0x56184a5bdab8 in object_init_with_type qom/object.c:375:9
#4 0x56184a5a2bda in object_initialize_with_type qom/object.c:517:5
#5 0x56184a5a24d5 in object_initialize qom/object.c:536:5
#6 0x56184a5a2f6c in object_initialize_child_with_propsv qom/object.c:566:5
#7 0x56184a5a2e60 in object_initialize_child_with_props qom/object.c:549:10
#8 0x56184a5a3a1e in object_initialize_child_internal qom/object.c:603:5
#9 0x5618495aa431 in xlnx_zynqmp_init hw/arm/xlnx-zynqmp.c:273:5
The RX/TX FIFOs are created in xlnx_dp_init(), add xlnx_dp_finalize()
to destroy them.
Fixes: 58ac482a66 ("introduce xlnx-dp")
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20210323182958.277654-1-f4bug@amsat.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Max noticed that since blk_aio_pwrite_zeroes() may invoke the callback
before returning, the callbacks will never see *count == 0 and thus
never free the count variable or decrement num_formats causing a CQE to
never be posted.
Coverity (CID 1451082) also picked up on the fact that count would not
be free'ed if the namespace was of zero size.
Fix both of these issues by explicitly checking *count and finalize for
the given namespace if --(*count) is zero. Enqueing a CQE if there are
no AIOs outstanding after this case is already handled by nvme_format()
by inspecting *num_formats.
Reported-by: Max Reitz <mreitz@redhat.com>
Reported-by: Coverity (CID 1451082)
Fixes: dc04d25e2f ("hw/block/nvme: add support for the format nvm command")
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Gollu Appalanaidu <anaidu.gollu@samsung.com>
If nvme_map_dptr() fails, nvme_dif_rw() will leak the bounce context.
Fix this by using the same error handling as everywhere else in the
function.
Reported-by: Coverity (CID 1451080)
Fixes: 146f720c55 ("hw/block/nvme: end-to-end data protection")
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Gollu Appalanaidu <anaidu.gollu@samsung.com>
QEMU crashes with certain targets when trying to show the help
output of EHCI devices:
$ ./qemu-system-aarch64 -device ich9-usb-ehci1,help
qemu-system-aarch64: ../../devel/qemu/softmmu/physmem.c:1154: phys_section_add:
Assertion `map->sections_nb < TARGET_PAGE_SIZE' failed.
Aborted (core dumped)
This happens because the device is doing things at "instance_init" time
that should be done at "realize" time instead. So move the related code
to the realize() function instead. (NB: This now also matches the
memory_region_del_subregion() calls which are done in usb_ehci_unrealize(),
and not during finalize()).
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210326095155.1994604-1-thuth@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Since the virtio-gpu-ccw device depends on the hw-display-virtio-gpu
module, which provides the type virtio-gpu-device, packaging the
hw-display-virtio-gpu module as a separate package that may or may not
be installed along with the qemu package leads to problems. Namely if
the hw-display-virtio-gpu is absent, qemu continues to advertise
virtio-gpu-ccw, but it aborts not only when one attempts using
virtio-gpu-ccw, but also when libvirtd's capability probing tries
to instantiate the type to introspect it.
Let us thus introduce a module named hw-s390x-virtio-gpu-ccw that
is going to provide the virtio-gpu-ccw device. The hw-s390x prefix
was chosen because it is not a portable device.
With virtio-gpu-ccw built as a module, the correct way to package a
modularized qemu is to require that hw-display-virtio-gpu must be
installed whenever the module hw-s390x-virtio-gpu-ccw.
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Tested-by: Halil Pasic <pasic@linux.ibm.com>
Message-Id: <20210317095622.2839895-4-kraxel@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Introduce a symbol which can be used to prevent ccw modules
being loaded into system emulators without ccw support.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Tested-by: Halil Pasic <pasic@linux.ibm.com>
Message-Id: <20210317095622.2839895-3-kraxel@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
When building with --enable-sanitizers we get:
Direct leak of 32 byte(s) in 2 object(s) allocated from:
#0 0x5618479ec7cf in malloc (qemu-system-aarch64+0x233b7cf)
#1 0x7f675745f958 in g_malloc (/lib64/libglib-2.0.so.0+0x58958)
#2 0x561847f02ca2 in usb_packet_init hw/usb/core.c:531:5
#3 0x561848df4df4 in usb_ehci_init hw/usb/hcd-ehci.c:2575:5
#4 0x561847c119ac in ehci_sysbus_init hw/usb/hcd-ehci-sysbus.c:73:5
#5 0x56184a5bdab8 in object_init_with_type qom/object.c:375:9
#6 0x56184a5bd955 in object_init_with_type qom/object.c:371:9
#7 0x56184a5a2bda in object_initialize_with_type qom/object.c:517:5
#8 0x56184a5a24d5 in object_initialize qom/object.c:536:5
#9 0x56184a5a2f6c in object_initialize_child_with_propsv qom/object.c:566:5
#10 0x56184a5a2e60 in object_initialize_child_with_props qom/object.c:549:10
#11 0x56184a5a3a1e in object_initialize_child_internal qom/object.c:603:5
#12 0x561849542d18 in npcm7xx_init hw/arm/npcm7xx.c:427:5
Similarly to commit d710e1e7bd ("usb: ehci: fix memory leak in
ehci"), fix by calling usb_ehci_finalize() to free the USBPacket.
Fixes: 7341ea075c
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210323183701.281152-1-f4bug@amsat.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
"-usbdevice ccid" was not documented and -usbdevice itself was marked
as deprecated before QEMU v6.0. And searching for "-usbdevice ccid"
in the internet does not show any useful results, so likely nobody
was using the ccid device via the -usbdevice option. Remove it now.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210311092829.1479051-1-thuth@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
This PR includes:
- Fix for vector CSR access
- Improvements to the Ibex UART device
- PMP improvements and bug fixes
- Hypervisor extension bug fixes
- ramfb support for the virt machine
- Fast read support for SST flash
- Improvements to the microchip_pfsoc machine
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEE9sSsRtSTSGjTuM6PIeENKd+XcFQFAmBZSuUACgkQIeENKd+X
cFQfqggAqiTI/KYe5GW9WBrPpGdctuEgUVBbxUhq5zY52mVGTfqEn4LbCfvdaTOp
O1MA+kfi4JN87VVWCfHPH0S725qmaPV7wfpWKi8KcPQ9uR+4RvjX/RDptkZhVjC9
Ok9p6dg2cUbz9Jm/q2V1Z69fRNMrz6pJWPxfZeTpy9bcmFNyYBR9+xiSXaW2fuAM
HsyZbsP8zHmHxfGcZwI6vpSZh3P4Ytd6nL/8v98i1DwWUsf51TsEwaS87gRQSjTz
SHqHZ9xsVM0k3glFi+El39iHzpujlONkYPvZaFP8kRhzXtaWVP1cu+CaKNUPEteq
jWNHLwFQk9VcN8uDG5o0unGEXnLJPA==
=dKa4
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/alistair/tags/pull-riscv-to-apply-20210322-2' into staging
RISC-V PR for 6.0
This PR includes:
- Fix for vector CSR access
- Improvements to the Ibex UART device
- PMP improvements and bug fixes
- Hypervisor extension bug fixes
- ramfb support for the virt machine
- Fast read support for SST flash
- Improvements to the microchip_pfsoc machine
# gpg: Signature made Tue 23 Mar 2021 01:56:53 GMT
# gpg: using RSA key F6C4AC46D4934868D3B8CE8F21E10D29DF977054
# gpg: Good signature from "Alistair Francis <alistair@alistair23.me>" [full]
# Primary key fingerprint: F6C4 AC46 D493 4868 D3B8 CE8F 21E1 0D29 DF97 7054
* remotes/alistair/tags/pull-riscv-to-apply-20210322-2:
target/riscv: Prevent lost illegal instruction exceptions
docs/system: riscv: Add documentation for 'microchip-icicle-kit' machine
hw/riscv: microchip_pfsoc: Map EMMC/SD mux register
hw/block: m25p80: Support fast read for SST flashes
target/riscv: Add proper two-stage lookup exception detection
target/riscv: Fix read and write accesses to vsip and vsie
hw/riscv: allow ramfb on virt
hw/riscv: Add fw_cfg support to virt
target/riscv: Use background registers also for MSTATUS_MPV
target/riscv: Make VSTIP and VSEIP read-only in hip
target/riscv: Adjust privilege level for HLV(X)/HSV instructions
target/riscv: flush TLB pages if PMP permission has been changed
target/riscv: add log of PMP permission checking
target/riscv: propagate PMP permission to TLB page
hw/char: disable ibex uart receive if the buffer is full
target/riscv: fix vs() to return proper error code
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
For accesses to rom blob data before or during reset, we have a
function rom_ptr() which looks for a rom blob that would be loaded to
the specified address, and returns a pointer into the rom blob data
corresponding to that address. This allows board or CPU code to say
"what is the data that is going to be loaded to this address?".
However, this function does not take account of memory region
aliases. If for instance a machine model has RAM at address
0x0000_0000 which is aliased to also appear at 0x1000_0000, a
rom_ptr() query for address 0x0000_0000 will only return a match if
the guest image provided by the user was loaded at 0x0000_0000 and
not if it was loaded at 0x1000_0000, even though they are the same
RAM and a run-time guest CPU read of 0x0000_0000 will read the data
loaded to 0x1000_0000.
Provide a new function rom_ptr_for_as() which takes an AddressSpace
argument, so that it can check whether the MemoryRegion corresponding
to the address is also mapped anywhere else in the AddressSpace and
look for rom blobs that loaded to that alias.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210318174823.18066-5-peter.maydell@linaro.org
A clock is added by commit aac63e0e6e ("hw/char/pl011: add a clock
input") since v5.2.0 which corresponds to virt-5.2 machine type. It
causes backwards migration failure from upstream to downstream (v5.1.0)
when the machine type is specified with virt-5.1.
This fixes the issue by following instructions from section "Connecting
subsections to properties" in docs/devel/migration.rst. With this applied,
the PL011 clock is migrated based on the machine type.
virt-5.2 or newer: migration
virt-5.1 or older: non-migration
Cc: qemu-stable@nongnu.org # v5.2.0+
Fixes: aac63e0e6e ("hw/char/pl011: add a clock input")
Suggested-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-id: 20210318023801.18287-1-gshan@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Windows guests using the "Basic Display Adapter" don't parse the
"Established timings III" block. They also don't parse any edid
extension.
So prefer the "Standard Timings" block to store the display resolutions
in edid_fill_modes(). Also reorder the mode list, so more exotic
resolutions (specifically the ones which are not supported by vgabios)
are moved down and the remaining ones have a better chance to get one of
the eight slots in the "Standard Timings" block.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-Id: <20210316143812.2363588-6-kraxel@redhat.com>
Whenever a Xen block device is detach via xenstore, the image
associated with it remained open by the backend QEMU and an error is
logged:
qemu-system-i386: failed to destroy drive: Node xvdz-qcow2 is in use
This happened since object_unparent() doesn't immediately frees the
object and thus keep a reference to the node we are trying to free.
The reference is hold by the "drive" property and the call
xen_block_drive_destroy() fails.
In order to fix that, we call drain_call_rcu() to run the callback
setup by bus_remove_child() via object_unparent().
Fixes: 2d24a64661 ("device-core: use RCU for list of children of a bus")
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Message-Id: <20210308143232.83388-1-anthony.perard@citrix.com>
Since HSS commit c20a89f8dcac, the Icicle Kit reference design has
been updated to use a register mapped at 0x4f000000 instead of a
GPIO to control whether eMMC or SD card is to be used. With this
support the same HSS image can be used for both eMMC and SD card
boot flow, while previously two different board configurations were
used. This is undocumented but one can take a look at the HSS code
HSS_MMCInit() in services/mmc/mmc_api.c.
With this commit, HSS image built from 2020.12 release boots again.
Signed-off-by: Bin Meng <bin.meng@windriver.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20210322075248.136255-1-bmeng.cn@gmail.com
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Per SST25VF016B datasheet [1], SST flash requires a dummy byte after
the address bytes. Note only SPI mode is supported by SST flashes.
[1] http://ww1.microchip.com/downloads/en/devicedoc/s71271_04.pdf
Signed-off-by: Bin Meng <bin.meng@windriver.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20210306060152.7250-1-bmeng.cn@gmail.com
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Provides fw_cfg for the virt machine on riscv. This enables
using e.g. ramfb later.
Signed-off-by: Asherah Connor <ashe@kivikakk.ee>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20210318235041.17175-2-ashe@kivikakk.ee
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Not disabling the UART leads to QEMU overwriting the UART receive buffer with
the newest received byte. The rx_level variable is added to allow the use of
the existing OpenTitan driver libraries.
Signed-off-by: Alexander Wagner <alexander.wagner@ulal.de>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20210309152130.13038-1-alexander.wagner@ulal.de
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
The code that sets/gets oem fields is duplicated in both PC and MICROVM
variants. This commit moves it to X86MachineState so that all x86
variants can use it and duplication is removed.
Signed-off-by: Marian Postevca <posteuca@mutex.one>
Message-Id: <20210221001737.24499-2-posteuca@mutex.one>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Let's also set a maximum size for "etc/acpi/rsdp", so the maximum
size doesn't get implicitly set based on the initial table size. In my
experiments, the table size was in the range of 22 bytes, so a single
page (== what we used until now) seems to be good enough.
Now that we have defined maximum sizes for all currently used table types,
let's assert that we catch usage with new tables that need a proper maximum
size definition.
Also assert that our initial size does not exceed the maximum size; while
qemu_ram_alloc_internal() properly asserts that the initial RAMBlock size
is <= its maximum size, the result might differ when the host page size
is bigger than 4k.
Suggested-by: Laszlo Ersek <lersek@redhat.com>
Cc: Alistair Francis <alistair.francis@xilinx.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Shannon Zhao <shannon.zhaosl@gmail.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210304105554.121674-5-david@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We want to have safety margins for all tables based on the table type.
Let's move the maximum size logic into acpi_add_rom_blob() and make it
dependent on the table name, so we don't have to replicate for each and
every instance that creates such tables.
Suggested-by: Laszlo Ersek <lersek@redhat.com>
Cc: Alistair Francis <alistair.francis@xilinx.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Shannon Zhao <shannon.zhaosl@gmail.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210304105554.121674-4-david@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Let's just reuse ACPI_BUILD_LOADER_FILE.
Cc: Alistair Francis <alistair.francis@xilinx.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Shannon Zhao <shannon.zhaosl@gmail.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210304105554.121674-3-david@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The resizeable memory region / RAMBlock that is created for the cmd blob
has a maximum size of whole host pages (e.g., 4k), because RAMBlocks
work on full host pages. In addition, in i386 ACPI code:
acpi_align_size(tables->linker->cmd_blob, ACPI_BUILD_ALIGN_SIZE);
makes sure to align to multiples of 4k, padding with 0.
For example, if our cmd_blob is created with a size of 2k, the maximum
size is 4k - we cannot grow beyond that. Growing might be required
due to guest action when rebuilding the tables, but also on incoming
migration.
This automatic generation of the maximum size used to be sufficient,
however, there are cases where we cross host pages now when growing at
runtime: we exceed the maximum size of the RAMBlock and can crash QEMU when
trying to resize the resizeable memory region / RAMBlock:
$ build/qemu-system-x86_64 --enable-kvm \
-machine q35,nvdimm=on \
-smp 1 \
-cpu host \
-m size=2G,slots=8,maxmem=4G \
-object memory-backend-file,id=mem0,mem-path=/tmp/nvdimm,size=256M \
-device nvdimm,label-size=131072,memdev=mem0,id=nvdimm0,slot=1 \
-nodefaults \
-device vmgenid \
-device intel-iommu
Results in:
Unexpected error in qemu_ram_resize() at ../softmmu/physmem.c:1850:
qemu-system-x86_64: Size too large: /rom@etc/table-loader:
0x2000 > 0x1000: Invalid argument
In this configuration, we consume exactly 4k (32 entries, 128 bytes each)
when creating the VM. However, once the guest boots up and maps the MCFG,
we also create the MCFG table and end up consuming 2 additional entries
(pointer + checksum) -- which is where we try resizing the memory region
/ RAMBlock, however, the maximum size does not allow for it.
Currently, we get the following maximum sizes for our different
mutable tables based on behavior of resizeable RAMBlock:
hw table max_size
------- ---------------------------------------------------------
virt "etc/acpi/tables" ACPI_BUILD_TABLE_MAX_SIZE (0x200000)
virt "etc/table-loader" HOST_PAGE_ALIGN(initial_size)
virt "etc/acpi/rsdp" HOST_PAGE_ALIGN(initial_size)
i386 "etc/acpi/tables" ACPI_BUILD_TABLE_MAX_SIZE (0x200000)
i386 "etc/table-loader" HOST_PAGE_ALIGN(initial_size)
i386 "etc/acpi/rsdp" HOST_PAGE_ALIGN(initial_size)
microvm "etc/acpi/tables" ACPI_BUILD_TABLE_MAX_SIZE (0x200000)
microvm "etc/table-loader" HOST_PAGE_ALIGN(initial_size)
microvm "etc/acpi/rsdp" HOST_PAGE_ALIGN(initial_size)
Let's set the maximum table size for "etc/table-loader" to 64k, so we
can properly grow at runtime, which should be good enough for the future.
Migration is not concerned with the maximum size of a RAMBlock, only
with the used size - so existing setups are not affected. Of course, we
cannot migrate a VM that would have crash when started on older QEMU from
new QEMU to older QEMU without failing early on the destination when
synchronizing the RAM state:
qemu-system-x86_64: Size too large: /rom@etc/table-loader: 0x2000 > 0x1000: Invalid argument
qemu-system-x86_64: error while loading state for instance 0x0 of device 'ram'
qemu-system-x86_64: load of migration failed: Invalid argument
We'll refactor the code next, to make sure we get rid of this implicit
behavior for "etc/acpi/rsdp" as well and to make the code easier to
grasp.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Cc: Alistair Francis <alistair.francis@xilinx.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Shannon Zhao <shannon.zhaosl@gmail.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210304105554.121674-2-david@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Implement _DSM according to:
PCI Firmware Specification 3.1
4.6.7. DSM for Naming a PCI or PCI Express Device Under
Operating Systems
and wire it up to cold and hot-plugged PCI devices.
Feature depends on ACPI hotplug being enabled (as that provides
PCI devices descriptions in ACPI and MMIO registers that are
reused to fetch acpi-index).
acpi-index should work for
- cold plugged NICs:
$QEMU -device e1000,acpi-index=100
=> 'eno100'
- hot-plugged
(monitor) device_add e1000,acpi-index=200,id=remove_me
=> 'eno200'
- re-plugged
(monitor) device_del remove_me
(monitor) device_add e1000,acpi-index=1
=> 'eno1'
Windows also sees index under "PCI Label Id" field in properties
dialog but otherwise it doesn't seem to have any effect.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20210315180102.3008391-6-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
it will be used by follow up patches
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20210315180102.3008391-5-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
it helps to avoid device naming conflicts when guest OS is
configured to use acpi-index for naming.
Spec ialso says so:
PCI Firmware Specification Revision 3.2
4.6.7. _DSM for Naming a PCI or PCI Express Device Under Operating Systems
"
Instance number must be unique under \_SB scope. This instance number does not have to
be sequential in a given system configuration.
"
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20210315180102.3008391-4-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In x86/ACPI world, linux distros are using predictable
network interface naming since systemd v197. Which on
QEMU based VMs results into path based naming scheme,
that names network interfaces based on PCI topology.
With itm on has to plug NIC in exactly the same bus/slot,
which was used when disk image was first provisioned/configured
or one risks to loose network configuration due to NIC being
renamed to actually used topology.
That also restricts freedom to reshape PCI configuration of
VM without need to reconfigure used guest image.
systemd also offers "onboard" naming scheme which is
preferred over PCI slot/topology one, provided that
firmware implements:
"
PCI Firmware Specification 3.1
4.6.7. DSM for Naming a PCI or PCI Express Device Under
Operating Systems
"
that allows to assign user defined index to PCI device,
which systemd will use to name NIC. For example, using
-device e1000,acpi-index=100
guest will rename NIC to 'eno100', where 'eno' is default
prefix for "onboard" naming scheme. This doesn't require
any advance configuration on guest side to com in effect
at 'onboard' scheme takes priority over path based naming.
Hope is that 'acpi-index' it will be easier to consume by
management layer, compared to forcing specific PCI topology
and/or having several disk image templates for different
topologies and will help to simplify process of spawning
VM from the same template without need to reconfigure
guest NIC.
This patch adds, 'acpi-index'* property and wires up
a 32bit register on top of pci hotplug register block
to pass index value to AML code at runtime.
Following patch will add corresponding _DSM code and
wire it up to PCI devices described in ACPI.
*) name comes from linux kernel terminology
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20210315180102.3008391-3-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
If the block size is programmed to a different value from the
previous one, reset the data pointer of s->fifo_buffer[] so that
s->fifo_buffer[] can be filled in using the new block size in
the next transfer.
With this fix, the following reproducer:
outl 0xcf8 0x80001010
outl 0xcfc 0xe0000000
outl 0xcf8 0x80001001
outl 0xcfc 0x06000000
write 0xe000002c 0x1 0x05
write 0xe0000005 0x1 0x02
write 0xe0000007 0x1 0x01
write 0xe0000028 0x1 0x10
write 0x0 0x1 0x23
write 0x2 0x1 0x08
write 0xe000000c 0x1 0x01
write 0xe000000e 0x1 0x20
write 0xe000000f 0x1 0x00
write 0xe000000c 0x1 0x32
write 0xe0000004 0x2 0x0200
write 0xe0000028 0x1 0x00
write 0xe0000003 0x1 0x40
cannot be reproduced with the following QEMU command line:
$ qemu-system-x86_64 -nographic -machine accel=qtest -m 512M \
-nodefaults -device sdhci-pci,sd-spec-version=3 \
-drive if=sd,index=0,file=null-co://,format=raw,id=mydrive \
-device sd-card,drive=mydrive -qtest stdio
Cc: qemu-stable@nongnu.org
Fixes: CVE-2020-17380
Fixes: CVE-2020-25085
Fixes: CVE-2021-3409
Fixes: d7dfca0807 ("hw/sdhci: introduce standard SD host controller")
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Reported-by: Cornelius Aschermann (Ruhr-Universität Bochum)
Reported-by: Sergej Schumilo (Ruhr-Universität Bochum)
Reported-by: Simon Wörner (Ruhr-Universität Bochum)
Buglink: https://bugs.launchpad.net/qemu/+bug/1892960
Buglink: https://bugs.launchpad.net/qemu/+bug/1909418
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1928146
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
Message-Id: <20210303122639.20004-6-bmeng.cn@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
The codes to limit the maximum block size is only necessary when
SDHC_BLKSIZE register is writable.
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
Message-Id: <20210303122639.20004-5-bmeng.cn@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
When an ADMA transfer is started, the codes forget to set the
controller status to indicate a transfer is in progress.
With this fix, the following 2 reproducers:
https://paste.debian.net/plain/1185136https://paste.debian.net/plain/1185141
cannot be reproduced with the following QEMU command line:
$ qemu-system-x86_64 -nographic -machine accel=qtest -m 512M \
-nodefaults -device sdhci-pci,sd-spec-version=3 \
-drive if=sd,index=0,file=null-co://,format=raw,id=mydrive \
-device sd-card,drive=mydrive -qtest stdio
Cc: qemu-stable@nongnu.org
Fixes: CVE-2020-17380
Fixes: CVE-2020-25085
Fixes: CVE-2021-3409
Fixes: d7dfca0807 ("hw/sdhci: introduce standard SD host controller")
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Reported-by: Cornelius Aschermann (Ruhr-Universität Bochum)
Reported-by: Sergej Schumilo (Ruhr-Universität Bochum)
Reported-by: Simon Wörner (Ruhr-Universität Bochum)
Buglink: https://bugs.launchpad.net/qemu/+bug/1892960
Buglink: https://bugs.launchpad.net/qemu/+bug/1909418
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1928146
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
Message-Id: <20210303122639.20004-4-bmeng.cn@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
At the end of sdhci_send_command(), it starts a data transfer if the
command register indicates data is associated. But the data transfer
should only be initiated when the command execution has succeeded.
With this fix, the following reproducer:
outl 0xcf8 0x80001810
outl 0xcfc 0xe1068000
outl 0xcf8 0x80001804
outw 0xcfc 0x7
write 0xe106802c 0x1 0x0f
write 0xe1068004 0xc 0x2801d10101fffffbff28a384
write 0xe106800c 0x1f 0x9dacbbcad9e8f7061524334251606f7e8d9cabbac9d8e7f60514233241505f
write 0xe1068003 0x28 0x80d000251480d000252280d000253080d000253e80d000254c80d000255a80d000256880d0002576
write 0xe1068003 0x1 0xfe
cannot be reproduced with the following QEMU command line:
$ qemu-system-x86_64 -nographic -M pc-q35-5.0 \
-device sdhci-pci,sd-spec-version=3 \
-drive if=sd,index=0,file=null-co://,format=raw,id=mydrive \
-device sd-card,drive=mydrive \
-monitor none -serial none -qtest stdio
Cc: qemu-stable@nongnu.org
Fixes: CVE-2020-17380
Fixes: CVE-2020-25085
Fixes: CVE-2021-3409
Fixes: d7dfca0807 ("hw/sdhci: introduce standard SD host controller")
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Reported-by: Cornelius Aschermann (Ruhr-Universität Bochum)
Reported-by: Sergej Schumilo (Ruhr-Universität Bochum)
Reported-by: Simon Wörner (Ruhr-Universität Bochum)
Buglink: https://bugs.launchpad.net/qemu/+bug/1892960
Buglink: https://bugs.launchpad.net/qemu/+bug/1909418
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1928146
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
Message-Id: <20210303122639.20004-2-bmeng.cn@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
At present the sd_erase() does not erase the requested range of card
data to 0xFFs. Let's make the erase operation actually happen.
Signed-off-by: Bin Meng <bin.meng@windriver.com>
Message-Id: <1613811493-58815-1-git-send-email-bmeng.cn@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
"qemu-common.h" should be included to provide the forward declaration
of qemu_hexdump() when DEBUG_SD is on.
Signed-off-by: Bin Meng <bin.meng@windriver.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210228050609.24779-1-bmeng.cn@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
ret in virtio_pmem_resp is a uint32_t variable, which should be assigned
using virtio_stl_p.
The kernel side driver does not guarantee virtio_pmem_resp to be initialized
to zero in advance, So sometimes the flush operation will fail.
Signed-off-by: Wang Liang <wangliangzz@inspur.com>
Message-Id: <20210317024145.271212-1-wangliangzz@126.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta@cloud.ionos.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Now that everything is in place, have the nested event loop to monitor
the slave channel. The source in the main event loop is destroyed and
recreated to ensure any pending even for the slave channel that was
previously detected is purged. This guarantees that the main loop
wont invoke slave_read() based on an event that was already handled
by the nested loop.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20210312092212.782255-7-groug@kaod.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
A deadlock condition potentially exists if a vhost-user process needs
to request something to QEMU on the slave channel while processing a
vhost-user message.
This doesn't seem to affect any vhost-user implementation so far, but
this is currently biting the upcoming enablement of DAX with virtio-fs.
The issue is being observed when the guest does an emergency reboot while
a mapping still exits in the DAX window, which is very easy to get with
a busy enough workload (e.g. as simulated by blogbench [1]) :
- QEMU sends VHOST_USER_GET_VRING_BASE to virtiofsd.
- In order to complete the request, virtiofsd then asks QEMU to remove
the mapping on the slave channel.
All these dialogs are synchronous, hence the deadlock.
As pointed out by Stefan Hajnoczi:
When QEMU's vhost-user master implementation sends a vhost-user protocol
message, vhost_user_read() does a "blocking" read during which slave_fd
is not monitored by QEMU.
The natural solution for this issue is an event loop. The main event
loop cannot be nested though since we have no guarantees that its
fd handlers are prepared for re-entrancy.
Introduce a new event loop that only monitors the chardev I/O for now
in vhost_user_read() and push the actual reading to a one-shot handler.
A subsequent patch will teach the loop to monitor and process messages
from the slave channel as well.
[1] https://github.com/jedisct1/Blogbench
Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20210312092212.782255-6-groug@kaod.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
The slave channel is implemented with socketpair() : QEMU creates
the pair, passes one of the socket to virtiofsd and monitors the
other one with the main event loop using qemu_set_fd_handler().
In order to fix a potential deadlock between QEMU and a vhost-user
external process (e.g. virtiofsd with DAX), we want to be able to
monitor and service the slave channel while handling vhost-user
requests.
Prepare ground for this by converting the slave channel to be a
QIOChannelSocket. This will make monitoring of the slave channel
as simple as calling qio_channel_add_watch_source(). Since the
connection is already established between the two sockets, only
incoming I/O (G_IO_IN) and disconnect (G_IO_HUP) need to be
serviced.
This also allows to get rid of the ancillary data parsing since
QIOChannelSocket can do this for us. Note that the MSG_CTRUNC
check is dropped on the way because QIOChannelSocket ignores this
case. This isn't a problem since slave_read() provisions space for
8 file descriptors, but affected vhost-user slave protocol messages
generally only convey one. If for some reason a buggy implementation
passes more file descriptors, no need to break the connection, just
like we don't break it if some other type of ancillary data is
received : this isn't explicitely violating the protocol per-se so
it seems better to ignore it.
The current code errors out on short reads and writes. Use the
qio_channel_*_all() variants to address this on the way.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20210312092212.782255-5-groug@kaod.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20210312092212.782255-4-groug@kaod.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Some message types, e.g. VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG,
can convey file descriptors. These must be closed before returning
from slave_read() to avoid being leaked. This can currently be done
in two different places:
[1] just after the request has been processed
[2] on the error path, under the goto label err:
These path are supposed to be mutually exclusive but they are not
actually. If the VHOST_USER_NEED_REPLY_MASK flag was passed and the
sending of the reply fails, both [1] and [2] are performed with the
same descriptor values. This can potentially cause subtle bugs if one
of the descriptor was recycled by some other thread in the meantime.
This code duplication complicates rollback for no real good benefit.
Do the closing in a unique place, under a new fdcleanup: goto label
at the end of the function.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20210312092212.782255-3-groug@kaod.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
slave_read() checks EAGAIN when reading or writing to the socket
fails. This gives the impression that the slave channel is in
non-blocking mode, which is certainly not the case with the current
code base. And the rest of the code isn't actually ready to cope
with non-blocking I/O.
Just drop the checks everywhere in this function for the sake of
clarity.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20210312092212.782255-2-groug@kaod.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Both functions don't check the personality of the interface (legacy or
modern) before accessing the configuration memory and always use
virtio_config_readX()/virtio_config_writeX().
With this patch, they now check the personality and in legacy mode
call virtio_config_readX()/virtio_config_writeX(), otherwise call
virtio_config_modern_readX()/virtio_config_modern_writeX().
This change has been tested with virtio-mmio guests (virt stretch/armhf and
virt sid/m68k) and virtio-pci guests (pseries RHEL-7.3/ppc64 and /ppc64le).
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210314200300.3259170-1-laurent@vivier.eu>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
For virtio-net, there is no need to pad the Ethernet frame size to
60 bytes before sending to it.
Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Several QOM type names contain ',':
ARM,bitband-memory
etraxfs,pic
etraxfs,serial
etraxfs,timer
fsl,imx25
fsl,imx31
fsl,imx6
fsl,imx6ul
fsl,imx7
grlib,ahbpnp
grlib,apbpnp
grlib,apbuart
grlib,gptimer
grlib,irqmp
qemu,register
SUNW,bpp
SUNW,CS4231
SUNW,DBRI
SUNW,DBRI.prom
SUNW,fdtwo
SUNW,sx
SUNW,tcx
xilinx,zynq_slcr
xlnx,zynqmp
xlnx,zynqmp-pmu-soc
xlnx,zynq-xadc
These are all device types. They can't be plugged with -device /
device_add, except for xlnx,zynqmp-pmu-soc, and I doubt that one
actually works.
They *can* be used with -device / device_add to request help.
Usability is poor, though: you have to double the comma, like this:
$ qemu-system-x86_64 -device SUNW,,fdtwo,help
Trap for the unwary. The fact that this was broken in
device-introspect-test for more than six years until commit e27bd49876
fixed it demonstrates that "the unwary" includes seasoned developers.
One QOM type name contains ' ': "ICH9 SMB". Because having to
remember just one way to quote would be too easy.
Rename the "SUNW,FOO types to "sun-FOO". Summarily replace ',' and '
' by '-' in the other type names.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20210304140229.575481-2-armbru@redhat.com>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
The previous commit rendered the name fdctrl_connect_drives() somewhat
misleading. Get rid of it by inlining the (now pretty simple)
function into its only caller.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20210309161214.1402527-4-armbru@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
Drop the crap deprecated in commit 4a27a638e7 "fdc: Deprecate
configuring floppies with -global isa-fdc" (v5.1.0).
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20210309161214.1402527-3-armbru@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
Some compiler versions are smart enough to detect a potentially
uninitialized variable, but are not smart enough to detect that this
cannot happen due to the code flow:
../hw/intc/i8259.c: In function ‘pic_read_irq’:
../hw/intc/i8259.c:203:13: error: ‘irq2’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
203 | irq = irq2 + 8;
| ~~~~^~~~~~~~~~
Restrict irq2 variable use to the inner statement.
Fixes: 78ef2b6989 ("i8259: Reorder intack in pic_read_irq")
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210318163059.3686596-1-philmd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
- QAPIfy object-add and --object
- stream: Fail gracefully if permission is denied
- storage-daemon: Fix crash on quit when job is still running
- curl: Fix use after free
- char: Deprecate backend aliases, fix QMP query-chardev-backends
- Fix image creation option defaults that exist in both the format and
the protocol layer (e.g. 'cluster_size' in qcow2 and rbd; the qcow2
default was incorrectly applied to the rbd layer)
-----BEGIN PGP SIGNATURE-----
iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAmBUbF4RHGt3b2xmQHJl
ZGhhdC5jb20ACgkQfwmycsiPL9YX2Q//Ve6++hRulIuJVuh8QDxlmGWERqey/ClX
mUqGDOkSSXfftPTDPCYSUFE7QD6HD25oJmUTix2B2P89AIyDcvqvthMDU/j8clor
X3Kx03ky3NLJilZNdYZ2GOMyljgNP3JSrDHBjc/tZx+1e7C5tPNVxXOUW946wIC9
no6xTAarAANl/GS23ZI+vJ3PBEggzAbu6t/hwT//WAB0WB9wFhkCCzWPXIkdBXwP
QpG8chTwuwFAW1c52F0OeQV5FpM0bcMtxYASuNq0HPL6B8qUdKOusTRgTB/fjLLV
tYMhc6tzPLUlin1mGD4m0P+9tRMBFtF/flZVwbd4S+avcAbV2L5S6Xq0QsiNTbx2
oQUk6N2/IWBOMC6D8aBTBwZ7CCasgEg0imtLUdJ8gKp6T44C7cqg1oZwT2dOyYuI
jS+3T+DcZZn3mHmp61nowL/2/2LDAVaLmOfbsvmvlbuX5j8QHj/Lvt6udRjqpelJ
n0jV9Ay0myu7dSK5ng7WNQUlSrba5I/W3CAjuH0CDp90ADCymWSp2jktvv5rSO/R
bQpz58kRY72y4dEwOy0zkWc/EClqh3p4abq5HDBCIkO+EO8CjJhnEnT+oOrFF5C/
LU93bFPyp6ZJoXzsKnjEjSzMzgDT6XuGTAgrh6upZy52ssjkG8zACbNOmXZTYmAg
hg3OlpdEUvM=
=zZCm
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches and object-add QAPIfication
- QAPIfy object-add and --object
- stream: Fail gracefully if permission is denied
- storage-daemon: Fix crash on quit when job is still running
- curl: Fix use after free
- char: Deprecate backend aliases, fix QMP query-chardev-backends
- Fix image creation option defaults that exist in both the format and
the protocol layer (e.g. 'cluster_size' in qcow2 and rbd; the qcow2
default was incorrectly applied to the rbd layer)
# gpg: Signature made Fri 19 Mar 2021 09:18:22 GMT
# gpg: using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6
# gpg: issuer "kwolf@redhat.com"
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream: (42 commits)
vl: allow passing JSON to -object
qom: move user_creatable_add_opts logic to vl.c and QAPIfy it
tests: convert check-qom-proplist to keyval
qom: Support JSON in HMP object_add and tools --object
char: Simplify chardev_name_foreach()
char: Deprecate backend aliases 'tty' and 'parport'
char: Skip CLI aliases in query-chardev-backends
qom: Add user_creatable_parse_str()
hmp: QAPIfy object_add
qemu-img: Use user_creatable_process_cmdline() for --object
qom: Add user_creatable_add_from_str()
qemu-nbd: Use user_creatable_process_cmdline() for --object
qemu-io: Use user_creatable_process_cmdline() for --object
qom: Factor out user_creatable_process_cmdline()
qom: Remove user_creatable_add_dict()
qemu-storage-daemon: Implement --object with qmp_object_add()
qom: Make "object" QemuOptsList optional
qapi/qom: QAPIfy object-add
qapi/qom: Add ObjectOptions for x-remote-object
qapi/qom: Add ObjectOptions for input-*
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This converts object-add from 'gen': false to the ObjectOptions QAPI
type. As an immediate benefit, clients can now use QAPI schema
introspection for user creatable QOM objects.
It is also the first step towards making the QAPI schema the only
external interface for the creation of user creatable objects. Once all
other places (HMP and command lines of the system emulator and all
tools) go through QAPI, too, some object implementations can be
simplified because some checks (e.g. that mandatory options are set) are
already performed by QAPI, and in another step, QOM boilerplate code
could be generated from the schema.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
* fixes for Coverity CID 1450756, 1450757 and 1450758 (me)
* fix for a bug in zone management receive (me)
* metadata and end-to-end data protection support (me & Gollu Appalanaidu)
* verify support (Gollu Appalanaidu)
* multiple lba formats and format nvm support (Minwoo Im)
and a couple of misc refactorings from me.
v2:
- remove an unintended submodule update. Argh.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEUigzqnXi3OaiR2bATeGvMW1PDekFAmBTP0wACgkQTeGvMW1P
DelZvAf+Ijw77YLz2t0P97CiwoOAG4FADwo61WQJo2AHyS3JMYPVdgNUXF7UGt/S
cNhU1JjdEylgZPKi5t9o/qgy7+vtSL0KKqoXIS0z9ZWZkdFsgObNetGULhaqXgaX
4KcAt7PyJS/33uFqYGSVZxJTO4GtCy34hGw6XrVs388tQD1S+eI+oS5EYBN57rl9
MrV5Z72iMCBx8Y074u81SD/u7n34b5WWbUOJADVI1rAKhilDhnkAhBQ/gK/UBodB
trUUQY/KU+eCnXEqfBeMD6fL75UOAFOjnm7Kfk/Q/g3di3H2fu0RGLrJuSS1dG0n
zmwgE7pQ1jT1UcB3QwAJLR3tGtd6sA==
=HwqP
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/nvme/tags/nvme-next-pull-request' into staging
emulated nvme updates and fixes
* fixes for Coverity CID 1450756, 1450757 and 1450758 (me)
* fix for a bug in zone management receive (me)
* metadata and end-to-end data protection support (me & Gollu Appalanaidu)
* verify support (Gollu Appalanaidu)
* multiple lba formats and format nvm support (Minwoo Im)
and a couple of misc refactorings from me.
v2:
- remove an unintended submodule update. Argh.
# gpg: Signature made Thu 18 Mar 2021 11:53:48 GMT
# gpg: using RSA key 522833AA75E2DCE6A24766C04DE1AF316D4F0DE9
# gpg: Good signature from "Klaus Jensen <its@irrelevant.dk>" [unknown]
# gpg: aka "Klaus Jensen <k.jensen@samsung.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: DDCA 4D9C 9EF9 31CC 3468 4272 63D5 6FC5 E55D A838
# Subkey fingerprint: 5228 33AA 75E2 DCE6 A247 66C0 4DE1 AF31 6D4F 0DE9
* remotes/nvme/tags/nvme-next-pull-request:
hw/block/nvme: add support for the format nvm command
hw/block/nvme: pull lba format initialization
hw/block/nvme: prefer runtime helpers instead of device parameters
hw/block/nvme: support multiple lba formats
hw/block/nvme: add non-mdts command size limit for verify
hw/block/nvme: add verify command
hw/block/nvme: end-to-end data protection
hw/block/nvme: add metadata support
hw/block/nvme: fix zone management receive reporting too many zones
hw/block/nvme: assert namespaces array indices
hw/block/nvme: fix potential overflow
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Format NVM admin command can make a namespace or namespaces to be
with different LBA size and metadata size with protection information
types.
This patch introduces Format NVM command with LBA format, Metadata, and
Protection Information for the device. The secure erase operation things
and support for formatting zoned namespaces are yet to be added.
The parameter checks inside of this patch has been referred from
Keith's old branch.
Signed-off-by: Minwoo Im <minwoo.im@samsung.com>
[anaidu.gollu: rebased on e2e]
Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com>
[k.jensen: rebased for reworked aio tracking]
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Pull lba format initialization code into separate function in
preparation for Format NVM support.
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
In preparation for Format NVM support, use runtime helpers instead of
the constant device parameters when getting lba size information etc.
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
This patch introduces multiple LBA formats supported with the typical
logical block sizes of 512 bytes and 4096 bytes as well as metadata
sizes of 0, 8, 16 and 64 bytes. The format will be chosed based on the
lbads and ms parameters of the nvme-ns device.
Signed-off-by: Minwoo Im <minwoo.im@samsung.com>
[k.jensen: resurrected and rebased]
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Verify is not subject to MDTS, so a single Verify command may result in
excessive amounts of allocated memory. Impose a limit on the data size
by adding support for TP 4040 ("Non-MDTS Command Size Limits").
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Add support for namespaces formatted with protection information. The
type of end-to-end data protection (i.e. Type 1, Type 2 or Type 3) is
selected with the `pi` nvme-ns device parameter. If the number of
metadata bytes is larger than 8, the `pil` nvme-ns device parameter may
be used to control the location of the 8-byte DIF tuple. The default
`pil` value of '0', causes the DIF tuple to be transferred as the last
8 bytes of the metadata. Set to 1 to store this in the first eight bytes
instead.
Co-authored-by: Gollu Appalanaidu <anaidu.gollu@samsung.com>
Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Add support for metadata in the form of extended logical blocks as well
as a separate buffer of data. The new `ms` nvme-ns device parameter
specifies the size of metadata per logical block in bytes. The `mset`
nvme-ns device parameter controls whether metadata is transfered as part
of an extended lba (set to '1') or in a separate buffer (set to '0',
the default).
Regardsless of the scheme chosen with `mset`, metadata is stored at the
end of the namespace backing block device. This requires the user
provided PRP/SGLs to be walked and "split" into data and metadata
scatter/gather lists if the extended logical block scheme is used, but
has the advantage of not breaking the deallocated blocks support.
Co-authored-by: Gollu Appalanaidu <anaidu.gollu@samsung.com>
Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
nvme_zone_mgmt_recv uses nvme_ns_nlbas() to get the number of LBAs in
the namespace and then calculates the number of zones to report by
incrementing slba with ZSZE until exceeding the number of LBAs as
returned by nvme_ns_nlbas().
This is bad because the namespace might be of such as size that some
LBAs are valid, but are not part of any zone, causing zone management
receive to report one additional (but non-existing) zone.
Fix this with a conventional loop on i < ns->num_zones instead.
Fixes: a479335bfa ("hw/block/nvme: Support Zoned Namespace Command Set")
Cc: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Coverity complains about a possible memory corruption in the
nvme_ns_attach and _detach functions. While we should not (famous last
words) be able to reach this function without nsid having previously
been validated, this is still an open door for future misuse.
Make Coverity and maintainers happy by asserting that the index into the
array is valid. Also, while not detected by Coverity (yet), add an
assert in nvme_subsys_ns and nvme_subsys_register_ns as well since a
similar issue is exists there.
Fixes: 037953b5b2 ("hw/block/nvme: support namespace detach")
Fixes: CID 1450757
Fixes: CID 1450758
Cc: Minwoo Im <minwoo.im.dev@gmail.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
page_size is a uint32_t, and zasl is a uint8_t, so the expression
`page_size << zasl` is done using 32-bit arithmetic and might overflow.
Since we then compare this against a 64 bit data_size value, Coverity
complains that we might overflow unintentionally. An MDTS/ZASL value in
excess of 4GiB is probably impractical, but it is not entirely
unrealistic, so add a cast such that we handle that case properly.
Fixes: 578d914b26 ("hw/block/nvme: align zoned.zasl with mdts")
Fixes: CID 1450756
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Rather than having a device specific debug implementation in
pflash_cfi01.c and pflash_cfi02.c, use the standard tracing facility.
Signed-off-by: David Edmondson <david.edmondson@oracle.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210216142721.1985543-2-david.edmondson@oracle.com>
[PMD: Rebased, fixed pflash_write_block_erase trace event format]
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>