Fix code style. Space required before the open parenthesis '('.
Signed-off-by: Xinhao Zhang <zhangxinhao1@huawei.com>
Signed-off-by: Kai Deng <dengkai1@huawei.com>
Reported-by: Euler Robot <euler.robot@huawei.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <20201030043515.1030223-3-zhangxinhao1@huawei.com>
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Fix code style. Open braces for struct should go on the same line.
Signed-off-by: Xinhao Zhang <zhangxinhao1@huawei.com>
Signed-off-by: Kai Deng <dengkai1@huawei.com>
Reported-by: Euler Robot <euler.robot@huawei.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <20201030043515.1030223-2-zhangxinhao1@huawei.com>
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
s390-pci-vfio.c calls into the vfio code, so we need it to be
built conditionally on vfio (which implies CONFIG_LINUX).
Fixes: cd7498d07f ("s390x/pci: Add routine to get the vfio dma available count")
Reported-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Message-Id: <20201103123237.718242-1-cohuck@redhat.com>
Acked-by: Greg Kurz <groug@kaod.org>
Tested-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
HPT resizing is asynchronous: the guest first kicks off the creation of a
new HPT, then it waits for that new HPT to be actually created and finally
it asks the current HPT to be replaced by the new one.
In the case of a userland allocated HPT, this currently relies on calling
qemu_memalign() which aborts on OOM and never returns NULL. Since we seem
to have path to report the failure to the guest with an H_NO_MEM return
value, use qemu_try_memalign() instead of qemu_memalign().
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <160398563636.32380.1747166034877173994.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Sometimes QEMU needs to allocate the HPT in userspace, namely with TCG
or PR KVM. This is performed with qemu_memalign() because of alignment
requirements. Like glib's allocators, its behaviour is to abort on OOM
instead of returning NULL.
This could be changed to qemu_try_memalign(), but in the specific case
of spapr_reallocate_hpt(), the outcome would be to terminate QEMU anyway
since no HPT means no MMU for the guest. Drop the dead code instead.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <160398562892.32380.15006707861753544263.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The source and destination x,y display parameters in ati_2d_blt()
may run off the vga limits if either of s->regs.[src|dst]_[xy] is
zero. Check the parameter values to avoid potential crash.
Reported-by: Gaoning Pan <pgn@zju.edu.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 20201021103818.1704030-1-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Note that whilst the device does not do anything with these values, they are
logged with trace events and stored to allow future implementation.
The default flow control is set to none at reset as documented in the Linux
ftdi_sio.h header file.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Message-id: 20201027150456.24606-9-mark.cave-ayland@ilande.co.uk
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Also implement the behaviour reported in Linux's ftdi_sio.c whereby if an invalid
data_bits value is provided then the hardware defaults to using 8.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20201027150456.24606-8-mark.cave-ayland@ilande.co.uk
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Some operating systems will generate a new device ID when a USB device is unplugged
and then replugged into the USB. If this is done whilst switching between multiple
applications over a virtual serial port, the change of device ID requires going
back into the OS/application to locate the new device accordingly.
Add a new always-plugged property that if specified will ensure that the device
always remains attached to the USB regardless of the state of the backend
chardev.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Message-id: 20201027150456.24606-7-mark.cave-ayland@ilande.co.uk
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The DeviceOutVendor and DeviceInVendor macros can be replaced with their
equivalent VendorDeviceOutRequest and VendorDeviceRequest macros from usb.h.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20201027150456.24606-6-mark.cave-ayland@ilande.co.uk
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Virtqueue has split and packed, so before setting inflight,
you need to inform the back-end virtqueue format.
Signed-off-by: Jin Yu <jin.yu@intel.com>
Acked-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Message-Id: <20201103123617.28256-1-jin.yu@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This reverts commit adb29c0273.
The commit broke -device vhost-user-blk-pci because the
vhost_dev_prepare_inflight() function it introduced segfaults in
vhost_dev_set_features() when attempting to access struct vhost_dev's
vdev pointer before it has been assigned.
To reproduce the segfault simply launch a vhost-user-blk device with the
contrib vhost-user-blk device backend:
$ build/contrib/vhost-user-blk/vhost-user-blk -s /tmp/vhost-user-blk.sock -r -b /var/tmp/foo.img
$ build/qemu-system-x86_64 \
-device vhost-user-blk-pci,id=drv0,chardev=char1,addr=4.0 \
-object memory-backend-memfd,id=mem,size=1G,share=on \
-M memory-backend=mem,accel=kvm \
-chardev socket,id=char1,path=/tmp/vhost-user-blk.sock
Segmentation fault (core dumped)
Cc: Jin Yu <jin.yu@intel.com>
Cc: Raphael Norwitz <raphael.norwitz@nutanix.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20201102165709.232180-1-stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
IOMMUs may declare memory regions spanning from 0 to UINT64_MAX. When
attempting to deal with such region, vfio_listener_region_del() passes a
size of 2^64 to int128_get64() which throws an assertion failure. Even
ignoring this, the VFIO_IOMMU_DMA_MAP ioctl cannot handle this size
since the size field is 64-bit. Split the request in two.
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Message-Id: <20201030180510.747225-11-jean-philippe@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
According to the loongson spec
(http://www.loongson.cn/uploadfile/cpu/3B1500/Loongson_3B1500_cpu_user_1.pdf)
and the macro definition(#define R_PERCORE_ISR(x) (0x40 + 0x8 * x)), we know
that the ISR size per CORE is 8, so here we need to divide
(addr - R_PERCORE_ISR(0)) by 8, not 4.
Reported-by: Euler Robot <euler.robot@huawei.com>
Signed-off-by: Alex Chen <alex.chen@huawei.com>
Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <5FA12391.8090400@huawei.com>
[PMD: Shortened subject]
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
There is no "version 2" of the "Lesser" General Public License.
It is either "GPL version 2.0" or "Lesser GPL version 2.1".
This patch replaces all occurrences of "Lesser GPL version 2" with
"Lesser GPL version 2.1" in comment section.
Signed-off-by: Chetan Pant <chetan4windows@gmail.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20201023122633.19466-1-chetan4windows@gmail.com>
[PMD: Added hw/mips/ prefix in subject]
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
There is no "version 2" of the "Lesser" General Public License.
It is either "GPL version 2.0" or "Lesser GPL version 2.1".
This patch replaces all occurrences of "Lesser GPL version 2" with
"Lesser GPL version 2.1" in comment section.
Signed-off-by: Chetan Pant <chetan4windows@gmail.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20201016143509.26692-1-chetan4windows@gmail.com>
[PMD: Split hw/ vs target/]
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
We deprecated the support for the 'r4k' machine for the 5.0 release
(commit d32dc61421), which means that our deprecation policy allows
us to drop it in release 5.2. Remove the code.
To repeat the rationale from the deprecation note:
- this virtual machine has no specification
- the Linux kernel dropped support for it 10 years ago
Users are recommended to use the Malta board instead.
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
ACKed-by: Peter Krempa <pkrempa@redhat.com>
Message-Id: <20201102201311.2220005-1-f4bug@amsat.org>
The latest SD card image [1] released by Microchip ships a Linux
kernel with built-in PolarFire SoC I2C driver support. The device
tree file includes the description for the I2C1 node hence kernel
tries to probe the I2C1 device during boot.
It is enough to create an unimplemented device for I2C1 to allow
the kernel to continue booting to the shell.
[1] ftp://ftpsoc.microsemi.com/outgoing/core-image-minimal-dev-icicle-kit-es-sd-20201009141623.rootfs.wic.gz
Signed-off-by: Bin Meng <bin.meng@windriver.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 1603863010-15807-11-git-send-email-bmeng.cn@gmail.com
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
When system memory is larger than 1 GiB (high memory), PolarFire SoC
maps it at address 0x10_0000_0000. Address 0xC000_0000 and above is
aliased to the same 1 GiB low memory with different cache attributes.
At present QEMU maps the system memory contiguously from 0x8000_0000.
This corrects the wrong QEMU logic. Note address 0x14_0000_0000 is
the alias to the high memory, and even physical memory is only 1 GiB,
the HSS codes still tries to probe the high memory alias address.
It seems there is no issue on the real hardware, so we will have to
take that into the consideration in our emulation. Due to this, we
we increase the default system memory size to 1537 MiB (the minimum
required high memory size by HSS) so that user gets notified an error
when less than 1537 MiB is specified.
Signed-off-by: Bin Meng <bin.meng@windriver.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20201101170538.3732-1-bmeng.cn@gmail.com
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Somehow HSS needs to access address 0 [1] for the DDR calibration data
which is in the chipset's reserved memory. Let's map it.
[1] See the config_copy() calls in various places in ddr_setup() in
the HSS source codes.
Signed-off-by: Bin Meng <bin.meng@windriver.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 1603863010-15807-9-git-send-email-bmeng.cn@gmail.com
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Previously SYSREG was created as an unimplemented device. Now that
we have a simple SYSREG module, connect it.
Signed-off-by: Bin Meng <bin.meng@windriver.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 1603863010-15807-8-git-send-email-bmeng.cn@gmail.com
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
This creates a minimum model for Microchip PolarFire SoC SYSREG
module. It only implements the ENVM_CR register to tell guest
software that eNVM is running at the configured divider rate.
Signed-off-by: Bin Meng <bin.meng@windriver.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 1603863010-15807-7-git-send-email-bmeng.cn@gmail.com
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Previously IOSCB_CFG was created as an unimplemented device. With
the new IOSCB model, its memory range is already covered by the
IOSCB hence remove the previous unimplemented device creation in
the SoC codes.
Signed-off-by: Bin Meng <bin.meng@windriver.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 1603863010-15807-6-git-send-email-bmeng.cn@gmail.com
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
This creates a model for PolarFire SoC IOSCB [1] module. It actually
contains lots of sub-modules like various PLLs to control different
peripherals. Only the mininum capabilities are emulated to make the
HSS DDR memory initialization codes happy. Lots of sub-modules are
created as an unimplemented devices.
[1] PF_SoC_RegMap_V1_1/MPFS250T/mpfs250t_ioscb_memmap_dri.htm in
https://www.microsemi.com/document-portal/doc_download/1244581-polarfire-soc-register-map
Signed-off-by: Bin Meng <bin.meng@windriver.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 1603863010-15807-5-git-send-email-bmeng.cn@gmail.com
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Connect DDR SGMII PHY module and CFG module to the PolarFire SoC.
Signed-off-by: Bin Meng <bin.meng@windriver.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 1603863010-15807-4-git-send-email-bmeng.cn@gmail.com
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
The PolarFire SoC DDR Memory Controller mainly includes 2 modules,
called SGMII PHY module and the CFG module, as documented in the
chipset datasheet.
This creates a single file that groups these 2 modules, providing
the minimum functionalities that make the HSS DDR initialization
codes happy.
Signed-off-by: Bin Meng <bin.meng@windriver.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 1603863010-15807-3-git-send-email-bmeng.cn@gmail.com
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
It is not easy to find out the memory map for a specific component
in the PolarFire SoC as the information is scattered in different
documents. Add some comments so that people can know where to get
such information from the Microchip website.
Signed-off-by: Bin Meng <bin.meng@windriver.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 1603863010-15807-2-git-send-email-bmeng.cn@gmail.com
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Add sifive_plic vmstate for supporting sifive_plic migration.
Current vmstate framework only supports one structure parameter
as num field to describe variable length arrays, so introduce
num_enables.
Signed-off-by: Yifei Jiang <jiangyifei@huawei.com>
Signed-off-by: Yipeng Yin <yinyipeng1@huawei.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20201026115530.304-7-jiangyifei@huawei.com
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Extend virt machine to allow passing custom DTB using "-dtb"
command-line parameter. This will help users pass modified DTB
to virt machine.
Signed-off-by: Anup Patel <anup.patel@wdc.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20201022053225.2596110-2-anup.patel@wdc.com
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Extend sifive_u machine to allow passing custom DTB using "-dtb"
command-line parameter. This will help users pass modified DTB
or Linux SiFive DTB to sifive_u machine.
Signed-off-by: Anup Patel <anup.patel@wdc.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20201022053225.2596110-1-anup.patel@wdc.com
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
libFuzzer triggered the following assertion:
cat << EOF | qemu-system-i386 -M pc-q35-5.0 \
-nographic -monitor none -serial none \
-qtest stdio -d guest_errors -trace pci\*
outl 0xcf8 0x8400f841
outl 0xcfc 0xebed205d
outl 0x5d02 0xedf82049
EOF
pci_cfg_write ICH9-LPC 31:0 @0x41 <- 0xebed205d
hw/pci/pci.c:268: int pci_bus_get_irq_level(PCIBus *, int): Assertion `irq_num < bus->nirq' failed.
This is because ich9_lpc_sci_irq() returns -1 for reserved
(illegal) values, but ich9_lpc_pmbase_sci_update() considers
it valid and store it in a 8-bit unsigned type. Then the 255
value is used as GSI IRQ, resulting in a PIRQ value of 247,
more than ICH9_LPC_NB_PIRQS (8).
Fix by simply ignoring the invalid access (and reporting it):
pci_cfg_write ICH9-LPC 31:0 @0x41 <- 0xebed205d
ICH9 LPC: SCI IRQ SEL #3 is reserved
pci_cfg_read mch 00:0 @0x0 -> 0x8086
pci_cfg_read mch 00:0 @0x0 -> 0x29c08086
...
Cc: qemu-stable@nongnu.org
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Fixes: 8f242cb724 ("ich9: implement SCI_IRQ_SEL register")
BugLink: https://bugs.launchpad.net/qemu/+bug/1878642
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20200717151705.18611-1-f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The virtio-iommu device can deal with arbitrary page sizes for virtual
endpoints, but for endpoints assigned with VFIO it must follow the page
granule used by the host IOMMU driver.
Implement the interface to set the vIOMMU page size mask, called by VFIO
for each endpoint. We assume that all host IOMMU drivers use the same
page granule (the host page granule). Override the page_size_mask field
in the virtio config space.
Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Message-Id: <20201030180510.747225-10-jean-philippe@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Set IOMMU supported page size mask same as host Linux supported page
size mask.
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Message-Id: <20201030180510.747225-9-jean-philippe@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add notify_flag_changed() to notice when memory listeners are added and
removed.
Acked-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Message-Id: <20201030180510.747225-7-jean-philippe@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Implement the replay callback to setup all mappings for a new memory
region.
Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Message-Id: <20201030180510.747225-6-jean-philippe@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Call the memory notifiers when attaching an endpoint to a domain, to
replay existing mappings, and when detaching the endpoint, to remove all
mappings.
Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Message-Id: <20201030180510.747225-5-jean-philippe@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Extend VIRTIO_IOMMU_T_MAP/UNMAP request to notify memory listeners. It
will call VFIO notifier to map/unmap regions in the physical IOMMU.
Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Message-Id: <20201030180510.747225-4-jean-philippe@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Store the memory region associated to each endpoint into the endpoint
structure, to allow efficient memory notification on map/unmap.
Acked-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Message-Id: <20201030180510.747225-3-jean-philippe@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Due to an invalid mask, virtio_iommu_mr() may return the wrong memory
region. It hasn't been too problematic so far because the function was
only used to test existence of an endpoint, but that is about to change.
Fixes: cfb42188b2 ("virtio-iommu: Implement attach/detach command")
Cc: QEMU Stable <qemu-stable@nongnu.org>
Acked-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Message-Id: <20201030180510.747225-2-jean-philippe@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Fix the following Coverity issue (RESOURCE_LEAK):
CID 1432879: Resource leak
Handle variable fd going out of scope leaks the handle.
Replace a close() call by qemu_close() since the handle is
opened with qemu_open().
Fixes: bb99f4772f ("hw/smbios: support loading OEM strings values from a file")
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20201030152742.1553968-1-philmd@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Fix uninitialized value issues reported by Coverity:
Field 'msg.reserved' is uninitialized when calling write().
While the 'struct vhost_msg' does not have a 'reserved' field,
we still initialize it to have the two parts of the function
consistent.
Reported-by: Coverity (CID 1432864: UNINIT)
Fixes: c471ad0e9b ("vhost_net: device IOTLB support")
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20201103063541.2463363-1-philmd@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Fix code style. Operator needs spaces both sides.
Signed-off-by: Xinhao Zhang <zhangxinhao1@huawei.com>
Signed-off-by: Kai Deng <dengkai1@huawei.com>
Message-Id: <20201103102634.273021-3-zhangxinhao1@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Fix code style. Space required before the open parenthesis '('.
Signed-off-by: Xinhao Zhang <zhangxinhao1@huawei.com>
Signed-off-by: Kai Deng <dengkai1@huawei.com>
Message-Id: <20201103102634.273021-2-zhangxinhao1@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Fix code style. Don't use '#' flag of printf format ('%#') in
format strings, use '0x' prefix instead
Signed-off-by: Xinhao Zhang <zhangxinhao1@huawei.com>
Signed-off-by: Kai Deng <dengkai1@huawei.com>
Message-Id: <20201103102634.273021-1-zhangxinhao1@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The block size determines the alignment requirements. Implement
get_min_alignment() of the TYPE_MEMORY_DEVICE interface.
This allows auto-assignment of a properly aligned address in guest
physical address space. For example, when specifying a 2GB block size
for a virtio-mem device with 10GB with a memory setup "-m 4G, 20G",
we'll no longer fail when realizing.
Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Wei Yang <richardw.yang@linux.intel.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20201008083029.9504-7-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add a callback that can be used to express additional alignment
requirements (exceeding the ones from the memory region).
Will be used by virtio-mem to express special alignment requirements due
to manually configured, big block sizes (e.g., 1GB with an ordinary
memory-backend-ram). This avoids failing later when realizing, because
auto-detection wasn't able to assign a properly aligned address.
Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Wei Yang <richardw.yang@linux.intel.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20201008083029.9504-6-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Let's warn instead of bailing out - the worst thing that can happen is
that we'll fail hot/coldplug later. The user got warned, and this should
be rare.
This will be necessary for memory devices with rather big (user-defined)
alignment requirements - say a virtio-mem device with a 2G block size -
which will become important, for example, when supporting vfio in the
future.
Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Wei Yang <richardw.yang@linux.intel.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20201008083029.9504-5-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Let's allow a minimum block size of 1 MiB in all configurations. Select
the default block size based on
- The page size of the memory backend.
- The THP size if the memory backend size corresponds to the real host
page size.
- The global minimum of 1 MiB.
and warn if something smaller is configured by the user.
VIRTIO_MEM only supports Linux (depends on LINUX), so we can probe the
THP size unconditionally.
For now we only support virtio-mem on x86-64 - there isn't a user-visible
change (x86-64 only supports 2 MiB THP on the PMD level) - the default
was, and will be 2 MiB.
If we ever have THP on the PUD level (e.g., 1 GiB THP on x86-64), we
expect it to be more transparent - e.g., to only optimize fully populated
ranges unless explicitly told /configured otherwise (in contrast to PMD
THP).
Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Wei Yang <richardw.yang@linux.intel.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20201008083029.9504-4-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The spec states:
"The device MUST set addr, region_size, usable_region_size, plugged_size,
requested_size to multiples of block_size."
With block sizes > 256MB, we currently wouldn't guarantee that for the
usable_region_size.
Note that we cannot exceed the region_size, as we already enforce the
alignment there properly.
Fixes: 910b25766b ("virtio-mem: Paravirtualized memory hot(un)plug")
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Wei Yang <richardw.yang@linux.intel.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20201008083029.9504-3-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The spec states:
"The device MUST set addr, region_size, usable_region_size, plugged_size,
requested_size to multiples of block_size."
In some cases, we currently don't guarantee that for "addr": For example,
when starting a VM with 4 GiB boot memory and a virtio-mem device with a
block size of 2 GiB, "memaddr"/"addr" will be auto-assigned to
0x140000000 (5 GiB).
We'll try to improve auto-assignment for memory devices next, to avoid
bailing out in case memory device code selects a bad address.
Note: The Linux driver doesn't support such big block sizes yet.
Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Fixes: 910b25766b ("virtio-mem: Paravirtualized memory hot(un)plug")
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Wei Yang <richardw.yang@linux.intel.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20201008083029.9504-2-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE28EdLTc7SjdV9QLsYlFWYQpPbMAFAl+gI74ACgkQYlFWYQpP
bMCBrA/9GXMZZGDfHFenXF+rS6J+ZKxtk29vq9Ly8KZ9YW7CzF9MP8qE/5iyFfmx
d1BknXGQerW2kAzpkOq2/MKDklOc+0BAhaTdUaFR/ao5ZKuv2LQ8uFnKVoTrhTx9
+HVkTVUTnez6ReCZVIrtN4+XVdyQTeQotJg6H2m5Q/BxQKcj6OMOlneuSGDn5vFN
EWgDvEmfFEkzbN8FMXtkT35bg3vA5TGmfQRMk1SMMREOPxF04CaTVTxYscCpS0WC
Cl+62mx4XLjscK7hwXuTNTrxeOLxZ2xLK5dhDd/qxBveio07mIM5X2psdKR0t5qX
HLtm437T9CAYmyo8jgvM4KL8f+rbJnLd579qyVwIMsue28Qisj9nuWCTcaEpjfck
4krhxJwxenRtqQ9wYrnbnQI5yQDIE6iUGf0toXwCNdJIr+FvyIcT7vJtTzZXtRI8
sxwK5wfJ/WSey9uNLZGFbQuv4vjOMV+Nk3mEi1gUV8ujogo+2U6WUAE3NhqFLKn1
YT6AJhDZvqL1f8gFrbiqR8xwvPrYmwK/tK38X1exSDOqiB7UNzR/apAb1oniul0e
rS5xWzIs9APvkdWQssCHvrVDdh6VISXQ5bnT8lkfmvYrCTn2gUGAFXDrxZjXIaL9
scCr8N9STkHmoYpc2ACRKIpfK3E1sDjGA8mAPemkxsLakNwBS4o=
=s4KC
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/nvme/tags/pull-nvme-20201102' into staging
nvme pull 2 Nov 2020
# gpg: Signature made Mon 02 Nov 2020 15:20:30 GMT
# gpg: using RSA key DBC11D2D373B4A3755F502EC625156610A4F6CC0
# gpg: Good signature from "Keith Busch <kbusch@kernel.org>" [unknown]
# gpg: aka "Keith Busch <keith.busch@gmail.com>" [unknown]
# gpg: aka "Keith Busch <keith.busch@intel.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: DBC1 1D2D 373B 4A37 55F5 02EC 6251 5661 0A4F 6CC0
* remotes/nvme/tags/pull-nvme-20201102: (30 commits)
hw/block/nvme: fix queue identifer validation
hw/block/nvme: fix create IO SQ/CQ status codes
hw/block/nvme: fix prp mapping status codes
hw/block/nvme: report actual LBA data shift in LBAF
hw/block/nvme: add trace event for requests with non-zero status code
hw/block/nvme: add nsid to get/setfeat trace events
hw/block/nvme: reject io commands if only admin command set selected
hw/block/nvme: support for admin-only command set
hw/block/nvme: validate command set selected
hw/block/nvme: support per-namespace smart log
hw/block/nvme: fix log page offset check
hw/block/nvme: remove pointless rw indirection
hw/block/nvme: update nsid when registered
hw/block/nvme: change controller pci id
pci: allocate pci id for nvme
hw/block/nvme: support multiple namespaces
hw/block/nvme: refactor identify active namespace id list
hw/block/nvme: add support for sgl bit bucket descriptor
hw/block/nvme: add support for scatter gather lists
hw/block/nvme: harden cmb access
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In gicv3_init_cpuif() we copy the ARMCPU gicv3_maintenance_interrupt
into the GICv3CPUState struct's maintenance_irq field. This will
only work if the board happens to have already wired up the CPU
maintenance IRQ before the GIC was realized. Unfortunately this is
not the case for the 'virt' board, and so the value that gets copied
is NULL (since a qemu_irq is really a pointer to an IRQState struct
under the hood). The effect is that the CPU interface code never
actually raises the maintenance interrupt line.
Instead, since the GICv3CPUState has a pointer to the CPUState, make
the dereference at the point where we want to raise the interrupt, to
avoid an implicit requirement on board code to wire things up in a
particular order.
Reported-by: Jose Martins <josemartins90@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20201009153904.28529-1-peter.maydell@linaro.org
Reviewed-by: Luc Michel <luc@lmichel.fr>
In exynos4210_fimd_update(), the pointer s is dereferinced before
being check if it is valid, which may lead to NULL pointer dereference.
So move the assignment to global_width after checking that the s is valid.
Reported-by: Euler Robot <euler.robot@huawei.com>
Signed-off-by: Alex Chen <alex.chen@huawei.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 5F9F8D88.9030102@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In omap_lcd_interrupts(), the pointer omap_lcd is dereferinced before
being check if it is valid, which may lead to NULL pointer dereference.
So move the assignment to surface after checking that the omap_lcd is valid
and move surface_bits_per_pixel(surface) to after the surface assignment.
Reported-by: Euler Robot <euler.robot@huawei.com>
Signed-off-by: AlexChen <alex.chen@huawei.com>
Message-id: 5F9CDB8A.9000001@huawei.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When booting a CPU with EL3 using the -kernel flag, set up CPTR_EL3 so
that SVE will not trap to EL3.
Signed-off-by: Rémi Denis-Courmont <remi.denis.courmont@huawei.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20201030151541.11976-1-remi@remlab.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Use the BIT_ULL() macro to ensure we use 64-bit arithmetic.
This fixes the following Coverity issue (OVERFLOW_BEFORE_WIDEN):
CID 1432363 (#1 of 1): Unintentional integer overflow:
overflow_before_widen:
Potentially overflowing expression 1 << scale with type int
(32 bits, signed) is evaluated using 32-bit arithmetic, and
then used in a context that expects an expression of type
hwaddr (64 bits, unsigned).
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20201030144617.1535064-1-philmd@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This is to allow IDE disks to be unplugged when adding to QEMU via:
-drive file=/root/disk_file,if=none,id=ide-disk0,format=raw
-device ide-hd,drive=ide-disk0,bus=ide.0,unit=0
as the current code only works for disk added with:
-drive file=/root/disk_file,if=ide,index=0,media=disk,format=raw
Since the code already have the IDE controller as `dev`, we don't need
to use the legacy DriveInfo to find all the drive we want to unplug.
We can simply use `blk` from the controller, as it kind of was already
assume to be the same, by setting it to NULL.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: John Snow <jsnow@redhat.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Message-Id: <20201027154058.495112-1-anthony.perard@citrix.com>
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
The type of input variable is unsigned int
while the printer type is int. So fix incorrect print type.
Signed-off-by: Zhengui li <lizhengui@huawei.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
We use the capability chains of the VFIO_DEVICE_GET_INFO ioctl to retrieve
the CLP information that the kernel exports.
To be compatible with previous kernel versions we fall back on previous
predefined values, same as the emulation values, when the ioctl is found
to not support capability chains. If individual CLP capabilities are not
found, we fall back on default values for only those capabilities missing
from the chain.
This patch is based on work previously done by Pierre Morel.
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
[aw: non-Linux build fixes]
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Now that VFIO_DEVICE_GET_INFO supports capability chains, add a helper
function to find specific capabilities in the chain.
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
We use a ClpRspQueryPci structure to hold the information related to a
zPCI Function.
This allows us to be ready to support different zPCI functions and to
retrieve the zPCI function information from the host.
Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Add a step to remove all stashed PCI groups to avoid stale data between
machine resets.
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
We use a S390PCIGroup structure to hold the information related to a
zPCI Function group.
This allows us to be ready to support multiple groups and to retrieve
the group information from the host.
Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
When an s390 guest is using lazy unmapping, it can result in a very
large number of oustanding DMA requests, far beyond the default
limit configured for vfio. Let's track DMA usage similar to vfio
in the host, and trigger the guest to flush their DMA mappings
before vfio runs out.
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
[aw: non-Linux build fixes]
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Create new files for separating out vfio-specific work for s390
pci. Add the first such routine, which issues VFIO_IOMMU_GET_INFO
ioctl to collect the current dma available count.
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
[aw: Fix non-Linux build with CONFIG_LINUX]
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
The underlying host may be limiting the number of outstanding DMA
requests for type 1 IOMMU. Add helper functions to check for the
DMA available capability and retrieve the current number of DMA
mappings allowed.
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
[aw: vfio_get_info_dma_avail moved inside CONFIG_LINUX]
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Rather than duplicating the same loop in multiple locations,
create a static function to do the work.
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Seems a more appropriate location for them.
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Added amount of bytes transferred to the VM at destination by all VFIO
devices
Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
If the device is not a failover primary device, call
vfio_migration_probe() and vfio_migration_finalize() to enable
migration support for those devices that support it respectively to
tear it down again.
Removed migration blocker from VFIO PCI device specific structure and use
migration blocker from generic structure of VFIO device.
Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com>
Reviewed-by: Neo Jia <cjia@nvidia.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
With vIOMMU, IO virtual address range can get unmapped while in pre-copy
phase of migration. In that case, unmap ioctl should return pages pinned
in that range and QEMU should find its correcponding guest physical
addresses and report those dirty.
Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com>
Reviewed-by: Neo Jia <cjia@nvidia.com>
[aw: fix error_report types, fix cpu_physical_memory_set_dirty_lebitmap() cast]
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
When vIOMMU is enabled, register MAP notifier from log_sync when all
devices in container are in stop and copy phase of migration. Call replay
and get dirty pages from notifier callback.
Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com>
Reviewed-by: Yan Zhao <yan.y.zhao@intel.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
vfio_listener_log_sync gets list of dirty pages from container using
VFIO_IOMMU_GET_DIRTY_BITMAP ioctl and mark those pages dirty when all
devices are stopped and saving state.
Return early for the RAM block section of mapped MMIO region.
Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com>
Reviewed-by: Neo Jia <cjia@nvidia.com>
[aw: fix error_report types, fix cpu_physical_memory_set_dirty_lebitmap() cast]
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Call VFIO_IOMMU_DIRTY_PAGES ioctl to start and stop dirty pages tracking
for VFIO devices.
Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Added helper functions to get IOMMU info capability chain.
Added function to get migration capability information from that
capability chain for IOMMU container.
Similar change was proposed earlier:
https://lists.gnu.org/archive/html/qemu-devel/2018-05/msg03759.html
Disable migration for devices if IOMMU module doesn't support migration
capability.
Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com>
Cc: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Cc: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Sequence during _RESUMING device state:
While data for this device is available, repeat below steps:
a. read data_offset from where user application should write data.
b. write data of data_size to migration region from data_offset.
c. write data_size which indicates vendor driver that data is written in
staging buffer.
For user, data is opaque. User should write data in the same order as
received.
Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com>
Reviewed-by: Neo Jia <cjia@nvidia.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Yan Zhao <yan.y.zhao@intel.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Added .save_live_pending, .save_live_iterate and .save_live_complete_precopy
functions. These functions handles pre-copy and stop-and-copy phase.
In _SAVING|_RUNNING device state or pre-copy phase:
- read pending_bytes. If pending_bytes > 0, go through below steps.
- read data_offset - indicates kernel driver to write data to staging
buffer.
- read data_size - amount of data in bytes written by vendor driver in
migration region.
- read data_size bytes of data from data_offset in the migration region.
- Write data packet to file stream as below:
{VFIO_MIG_FLAG_DEV_DATA_STATE, data_size, actual data,
VFIO_MIG_FLAG_END_OF_STATE }
In _SAVING device state or stop-and-copy phase
a. read config space of device and save to migration file stream. This
doesn't need to be from vendor driver. Any other special config state
from driver can be saved as data in following iteration.
b. read pending_bytes. If pending_bytes > 0, go through below steps.
c. read data_offset - indicates kernel driver to write data to staging
buffer.
d. read data_size - amount of data in bytes written by vendor driver in
migration region.
e. read data_size bytes of data from data_offset in the migration region.
f. Write data packet as below:
{VFIO_MIG_FLAG_DEV_DATA_STATE, data_size, actual data}
g. iterate through steps b to f while (pending_bytes > 0)
h. Write {VFIO_MIG_FLAG_END_OF_STATE}
When data region is mapped, its user's responsibility to read data from
data_offset of data_size before moving to next steps.
Added fix suggested by Artem Polyakov to reset pending_bytes in
vfio_save_iterate().
Added fix suggested by Zhi Wang to add 0 as data size in migration stream and
add END_OF_STATE delimiter to indicate phase complete.
Suggested-by: Artem Polyakov <artemp@nvidia.com>
Suggested-by: Zhi Wang <zhi.wang.linux@gmail.com>
Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com>
Reviewed-by: Neo Jia <cjia@nvidia.com>
Reviewed-by: Yan Zhao <yan.y.zhao@intel.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Define flags to be used as delimiter in migration stream for VFIO devices.
Added .save_setup and .save_cleanup functions. Map & unmap migration
region from these functions at source during saving or pre-copy phase.
Set VFIO device state depending on VM's state. During live migration, VM is
running when .save_setup is called, _SAVING | _RUNNING state is set for VFIO
device. During save-restore, VM is paused, _SAVING state is set for VFIO device.
Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com>
Reviewed-by: Neo Jia <cjia@nvidia.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Yan Zhao <yan.y.zhao@intel.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Added migration state change notifier to get notification on migration state
change. These states are translated to VFIO device state and conveyed to
vendor driver.
Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com>
Reviewed-by: Neo Jia <cjia@nvidia.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
VM state change handler is called on change in VM's state. Based on
VM state, VFIO device state should be changed.
Added read/write helper functions for migration region.
Added function to set device_state.
Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com>
Reviewed-by: Neo Jia <cjia@nvidia.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
[aw: lx -> HWADDR_PRIx, remove redundant parens]
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Whether the VFIO device supports migration or not is decided based of
migration region query. If migration region query is successful and migration
region initialization is successful then migration is supported else
migration is blocked.
Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com>
Reviewed-by: Neo Jia <cjia@nvidia.com>
Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Added functions to save and restore PCI device specific data,
specifically config space of PCI device.
Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com>
Reviewed-by: Neo Jia <cjia@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
This function will be used for migration region.
Migration region is mmaped when migration starts and will be unmapped when
migration is complete.
Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com>
Reviewed-by: Neo Jia <cjia@nvidia.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Just a bunch of bugfixes all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAl+cCq8PHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRpKWkH/0qq+u2z/q0KYzmodcdW2eFNsvKF2e+Dz3Af
zZGMREH93DsKAQ2k3t84sz2RAaAP45cCqrq+v8kSzmpaC7GqKYA/VceeLwy8e6Eu
YKvu5QixrOVTpNg2QV/w44ywgtA4NbWy5Fr9S4qhzPyyD/gtE609weZ1vQnSFT7B
Gg4vr1lcqskwYTH7sh+bpsDTUeANr7QaknWKnaomroz+IUO8m9ig6RKtegaXhQCj
xswI4458S3nklqnoGMa56j46VYwft8YHO1lBiR1WefTHylknyng9Tdvf9G5mnzVg
wyrMTuT36lMXIa5KcSZeECIt2ZUT6KSSjWzEKZNXL5lS3gfUo7o=
=powp
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pc,pci,vhost,virtio: misc fixes
Just a bunch of bugfixes all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Fri 30 Oct 2020 12:44:31 GMT
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream:
intel_iommu: Fix two misuse of "0x%u" prints
virtio: skip guest index check on device load
vhost-blk: set features before setting inflight feature
pci: Disallow improper BAR registration for type 1
pci: Change error_report to assert(3)
pci: advertise a page aligned ATS
pc: Implement -no-hpet as sugar for -machine hpet=on
vhost: Don't special case vq->used_phys in vhost_get_log_size()
pci: Assert irqnum is between 0 and bus->nirqs in pci_bus_change_irq_level
hw/pci: Extract pci_bus_change_irq_level() from pci_change_irq_level()
hw/virtio/vhost-vdpa: Fix Coverity CID 1432864
acpi/crs: Support ranges > 32b for hosts
acpi/crs: Prevent bad ranges for host bridges
vhost-vsock: set vhostfd to non-blocking mode
vhost-vdpa: negotiate VIRTIO_NET_F_STATUS with driver
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Dave magically found this. Fix them with "0x%x".
Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20201019173922.100270-1-peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
QEMU must be careful when loading device state off migration streams to
prevent a malicious source from exploiting the emulator. Overdoing these
checks has the side effect of allowing a guest to "pin itself" in cloud
environments by messing with state which is entirely in its control.
Similarly to what f3081539 achieved in usb_device_post_load(), this
commit removes such a check from virtio_load(). Worth noting, the result
of a load without this check is the same as if a guest enables a VQ with
invalid indexes to begin with. That is, the virtual device is set in a
broken state (by the datapath handler) and must be reset.
Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
Message-Id: <20201028134643.110698-1-felipe@nutanix.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Virtqueue has split and packed, so before setting inflight,
you need to inform the back-end virtqueue format.
Signed-off-by: Jin Yu <jin.yu@intel.com>
Message-Id: <20200910134851.7817-1-jin.yu@intel.com>
Acked-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Prevent future developers working on root complexes, root ports, or
bridges that also wish to implement a BAR for those, from shooting
themselves in the foot. PCI type 1 headers only support 2 base address
registers. It is incorrect and difficult to figure out what is wrong
with the device when this mistake is made. With this, it is immediate
and obvious what has gone wrong.
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Message-Id: <20201015181411.89104-2-ben.widawsky@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Asserts are used for developer bugs. As registering a bar of the wrong
size is not something that should be possible for a user to achieve,
this is a developer bug.
While here, use the more obvious helper function.
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Message-Id: <20201015181411.89104-1-ben.widawsky@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
After Linux kernel commit 61363c1474b1 ("iommu/vt-d: Enable ATS only
if the device uses page aligned address."), ATS will be only enabled
if device advertises a page aligned request.
Unfortunately, vhost-net is the only user and we don't advertise the
aligned request capability in the past since both vhost IOTLB and
address_space_get_iotlb_entry() can support non page aligned request.
Though it's not clear that if the above kernel commit makes
sense. Let's advertise a page aligned ATS here to make vhost device
IOTLB work with Intel IOMMU again.
Note that in the future we may extend pcie_ats_init() to accept
parameters like queue depth and page alignment.
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20200909081731.24688-1-jasowang@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Get rid of yet another global variable.
The default will be hpet=on only if CONFIG_HPET=y.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20201021144716.1536388-1-ehabkost@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The first loop in vhost_get_log_size() computes the size of the dirty log
bitmap so that it allows to track changes in the entire guest memory, in
terms of GPA.
When not using a vIOMMU, the address of the vring's used structure,
vq->used_phys, is a GPA. It is thus already covered by the first loop.
When using a vIOMMU, vq->used_phys is a GIOVA that will be translated
to an HVA when the vhost backend needs to update the used structure. It
will log the corresponding GPAs into the bitmap but it certainly won't
log the GIOVA.
So in any case, vq->used_phys shouldn't be explicitly used to size the
bitmap. Drop the second loop.
This fixes a crash of the source when migrating a guest using in-kernel
vhost-net and iommu_platform=on on POWER, because DMA regions are put
over 0x800000000000000ULL. The resulting insanely huge log size causes
g_malloc0() to abort.
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1879349
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <160208823418.29027.15172801181796272300.stgit@bahia.lan>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
These assertions similar to those in the adjacent pci_bus_get_irq_level()
function ensure that irqnum lies within the valid PCI bus IRQ range.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20201011082022.3016-1-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20201024203900.3619498-3-f4bug@amsat.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Extract pci_bus_change_irq_level() from pci_change_irq_level() to
make it clearer it operates on the bus.
Reported-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20201024203900.3619498-2-f4bug@amsat.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Fix uninitialized value issues reported by Coverity:
Field 'msg.reserved' is uninitialized when calling write().
Fixes: a5bd05800f ("vhost-vdpa: batch updating IOTLB mappings")
Reported-by: Coverity (CID 1432864: UNINIT)
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20201028154004.776760-1-philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
According to PCIe spec 5.0 Type 1 header space Base Address Registers
are defined by 7.5.1.2.1 Base Address Registers (same as Type 0). The
_CRS region should allow for the same range (up to 64b). Prior to this
change, any host bridge utilizing more than 32b for the BAR would have
the address truncated and likely lead to conflicts when the operating
systems reads the _CRS object.
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Message-Id: <20201026193924.985014-2-ben.widawsky@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Prevent _CRS resources being quietly chopped off and instead throw an
assertion. _CRS is used by host bridges to declare regions of io and/or
memory that they consume. On some (all?) platforms the host bridge
doesn't have PCI header space and so they need some way to convey the
information.
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Message-Id: <20201026193924.985014-1-ben.widawsky@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
vhost IOTLB API uses read()/write() to exchange iotlb messages with
the kernel module.
The QEMU implementation expects a non-blocking fd, indeed commit
c471ad0e9b ("vhost_net: device IOTLB support") set it for vhost-net.
Without this patch, if we enable iommu for the vhost-vsock device,
QEMU can hang when exchanging IOTLB messages.
As commit 894022e616 ("net: check if the file descriptor is valid
before using it") did for tap, let's use qemu_try_set_nonblock()
when fd is provided by the user.
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20201029144849.70958-1-sgarzare@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Here's the next pull request for ppc and spapr related patches, which
should be the last things for soft freeze. Includes:
* Numerous error handling cleanups from Greg Kurz
* Cleanups to cpu realization and hotplug handling from Greg Kurz
* A handful of other small fixes and cleanups
This does include a change to pc_dimm_plug() that isn't in my normal
areas of concern. That's there as a a prerequisite for ppc specific
changes, and has an ack from Igor.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEdfRlhq5hpmzETofcbDjKyiDZs5IFAl+YKwEACgkQbDjKyiDZ
s5LNmg/+J/nKfe494WPSDhOgpQNz4NJD1GZUOsADwuvDPQWmMdl4dY3MKd+ipN1X
HYmor7t+A5cvtm5zS0sHEZR/lg9qz3Oqc1nWpR+IJ7JmZMlmPbCKc6Iah4R1rQgA
N17C2LSdBXGN84cmi3B0IbMPxpogFM7SqiSG7T4QIy7nZ/gm/aYwejVs+lLA7vjl
hZmRnX76GE8fU0SzzmvIKTYwIqPbaZELP4LDmzKczdsWKvwOfeyUEp2loZ+xtyzc
s1ecJTaTRcgaiB5Ylu10Bv5/L07P4YxV2lMhd8WJqq6Ki/fdUIeVyP6lnZyTQVkj
YxtfrFsObVjNkHl8JtXG07QoQkGvkJLXGNyoPoCQy8ZEahYjjwP88UltAhENCllY
uDtR6fLj0YnpHu9gW1onGpkxjI93QaiWA0ePoF/z1gQ8E6G3vwSmI4h5FC4UHNv5
NNtHIJNgFj1icf8C8lyh5DBNqYNt8FFTS2iAmtl7eDbWcaDj1+fZWXQHqJS1CUI1
P8d2TNJXEyJ7fxfzk56LovdwvLs4/fLb4Lu5loi5CC4JPaYiwC7dEbhRPsxq8F++
l9gwfnnKo1WYkx8Y3PZEpXcoypCOGQToVAi519lx8YfRYeFTIXrR8vqOAfpeQpXY
9tktUFMgFZDoJl/Jc81enMjWy+mnmcIPfFR12vZ2LIR3X0tHE/g=
=igPS
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-5.2-20201028' into staging
ppc patch queue 2020-10-28
Here's the next pull request for ppc and spapr related patches, which
should be the last things for soft freeze. Includes:
* Numerous error handling cleanups from Greg Kurz
* Cleanups to cpu realization and hotplug handling from Greg Kurz
* A handful of other small fixes and cleanups
This does include a change to pc_dimm_plug() that isn't in my normal
areas of concern. That's there as a a prerequisite for ppc specific
changes, and has an ack from Igor.
# gpg: Signature made Tue 27 Oct 2020 14:13:21 GMT
# gpg: using RSA key 75F46586AE61A66CC44E87DC6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" [full]
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" [full]
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" [full]
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>" [unknown]
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-5.2-20201028:
ppc/: fix some comment spelling errors
spapr: Improve spapr_reallocate_hpt() error reporting
target/ppc: Fix kvmppc_load_htab_chunk() error reporting
spapr: Use error_append_hint() in spapr_reallocate_hpt()
spapr: Simplify error handling in spapr_memory_plug()
spapr: Pass &error_abort when getting some PC DIMM properties
spapr: Use appropriate getter for PC_DIMM_SLOT_PROP
spapr: Use appropriate getter for PC_DIMM_ADDR_PROP
pc-dimm: Drop @errp argument of pc_dimm_plug()
spapr: Simplify spapr_cpu_core_realize() and spapr_cpu_core_unrealize()
spapr: Make spapr_cpu_core_unrealize() idempotent
spapr: Drop spapr_delete_vcpu() unused argument
spapr: Unrealize vCPUs with qdev_unrealize()
spapr: Fix leak of CPU machine specific data
spapr: Move spapr_create_nvdimm_dr_connectors() to core machine code
hw/net: move allocation to the heap due to very large stack frame
ppc/spapr: re-assert IRQs during event-scan if there are pending
spapr: Clarify why DR connectors aren't user creatable
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
There is no actual code in the CONFIG_VIRGL=n case. So building is
(a) pointless and (b) makes macos ranlib complain.
Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20201026142851.28735-1-kraxel@redhat.com
Build virtio-gpu vga devices modular. Must be a separate module because
not all qemu softmmu variants come with VGA support.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20201023064618.21409-3-kraxel@redhat.com
Build virtio-gpu pci devices modular. Must be a separate module because
not all qemu softmmu variants come with PCI support.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20201023064618.21409-2-kraxel@redhat.com
We only need to zero-initialize 'val' once.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20201012170950.3491912-4-f4bug@amsat.org>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
The region is registered as 64KiB in sabre_init():
memory_region_init_io(&s->sabre_config, OBJECT(s), &sabre_config_ops, s,
"sabre-config", 0x10000);
Remove the superfluous check.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20201012170950.3491912-3-f4bug@amsat.org>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
The current link redirects to https://www.oracle.com/sun/
announcing "Oracle acquired Sun Microsystems in 2010, ..."
but does not give hint where to find the datasheet.
Use the archived PDF on the Wayback Machine, which works.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20201012170950.3491912-2-f4bug@amsat.org>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
The rework of the sabre IRQs in commit 6864fa3897 "sun4u: update PCI topology to
include simba PCI bridges" changed the IRQ routing so that both PCI and legacy
OBIO IRQs are routed through the sabre PCI host bridge to the CPU.
Unfortunately this commit failed to increase the number of PCI bus IRQs
accordingly meaning that access to the legacy IRQs OBIO (irqnum >= 0x20) would
overflow the PCI bus IRQ array causing strange failures running qemu-system-sparc64
in NetBSD.
Cc: qemu-stable@nongnu.org
Reported-by: Harold Gutch <logix@foobar.franken.de>
Fixes: https://bugs.launchpad.net/qemu/+bug/1838658
Fixes: 6864fa3897 ("sun4u: update PCI topology to include simba PCI bridges")
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20201011081347.2146-1-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
The S24/TCX datasheet is listed as "Unable to locate" on [1].
However the NetBSD revision 1.32 of the driver introduced
64-bit accesses to the stippler and blitter [2]. It is safe
to assume these memory regions are 64-bit accessible.
QEMU implementation is 32-bit, so fill the 'impl' fields.
Michael Lorenz (author of the NetBSD code [2]) provided us with more
information in [3]:
> IIRC the real hardware *requires* 64bit accesses for stipple and
> blitter operations to work. For stipples you write a 64bit word into
> STIP space, the address defines where in the framebuffer you want to
> draw, the data contain a 32bit bitmask, foreground colour and a ROP.
> BLIT space works similarly, the 64bit word contains an offset were to
> read pixels from, and how many you want to copy.
>
> One more thing since there seems to be some confusion - 64bit accesses
> on the framebuffer are fine as well. TCX/S24 is *not* an SBus device,
> even though its node says it is.
> S24 is a card that plugs into a special slot on the SS5 mainboard,
> which is shared with an SBus slot and looks a lot like a horizontal
> UPA slot. Both S24 and TCX are accessed through the Micro/TurboSPARC's
> AFX bus which is 64bit wide and intended for graphics.
> Early FFB docs even mentioned connecting to both AFX and UPA,
> no idea if that was ever realized in hardware though.
[1] http://web.archive.org/web/20111209011516/http://wikis.sun.com/display/FOSSdocs/Home
[2] http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/dev/sbus/tcx.c.diff?r1=1.31&r2=1.32
[3] https://www.mail-archive.com/qemu-devel@nongnu.org/msg734928.html
Cc: qemu-stable@nongnu.org
Reported-by: Andreas Gustafsson <gson@gson.org>
Buglink: https://bugs.launchpad.net/bugs/1892540
Fixes: 55d7bfe229 ("tcx: Implement hardware acceleration")
Tested-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Andreas Gustafsson <gson@gson.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20201024205100.3623006-1-f4bug@amsat.org>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
The device should not map itself but instead should be mapped to sysbus by the
sun4u machine.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20200926140216.7368-7-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Instead use qdev_set_nic_properties() to configure the on-board NIC at the
sun4m machine level.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20200926140216.7368-5-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Store the child object directly within the sparc32-espdma object rather than
using link properties.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20200926140216.7368-4-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Store the child object directly within the sparc32-ledma object rather than
using link properties.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20200926140216.7368-3-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Store the child objects directly within the sparc32-dma object rather than using
link properties.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20200926140216.7368-2-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
CI jobs results:
. https://cirrus-ci.com/build/4879251751043072
. https://gitlab.com/philmd/qemu/-/pipelines/207661784
. https://travis-ci.org/github/philmd/qemu/builds/738958191
. https://app.shippable.com/github/philmd/qemu/runs/891/summary/console
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAl+XR88ACgkQ4+MsLN6t
wN4Org//UPv8cTDvJwsS2vaw0CRHQmU7/t2hI6xN1mDMFaUDwj6AjtdO6W3FUOcV
KQbwFP16Po684h4xhDayUs15RcPz/mCSaiiz4WkS3sA5M0y0SWQE+1HgvGZwrEXn
o5j8Lilh8m4/WE97Q7hVPD2cesMv9W3EziWBMEqukaXCSnTfAURiUUWmXpu7jyQ8
tc439KhFXnLx6Gx/XyPoN2CLfC5q1OReGTGvAP2GqFDPxPnIAGyqHLaVOZBXlX3a
PsRl9HKvuPL86zrSXLRqsnUiiQ5vHg0Quw4jntd+ZF+V97Qlg01x0BuxI2+9GqO4
b/RVBbvrLdWEpmKffr6EsmNNLaREeTfjQWVowD2uy3IK6JxkG7oiljAj7sDySSzs
WYo8PCE/xpTtrtbLZNtRkkz3Ui/h0Qjdvi8oS/8k84/0/+fDufWekXz1pX6YZ4jI
2GK+aa/OrTZQh5uUYdzQPxf0ieviOVCSf0IgKvHkOlAqnZIQRkFmtIOdPyNlctQx
cGNMT9pgz+n4g1ge9LaBlyAMwxCGkU5nygbbPyYHGZyN8xiDNUKOAMRhXgLiKpXb
tbseixu4TDGr2iuc291G9dkeKbPkxeIE3NIbFwcB4UtTftxLko7pTZ5l3G0jvUt1
VCfK1ot6DKO8B+TzXyIgAWiZmWSV8KhHA4EkX4BihqLL3hIvlNk=
=inaF
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/philmd-gitlab/tags/led-api-20201026' into staging
API to model LED.
CI jobs results:
. https://cirrus-ci.com/build/4879251751043072
. https://gitlab.com/philmd/qemu/-/pipelines/207661784
. https://travis-ci.org/github/philmd/qemu/builds/738958191
. https://app.shippable.com/github/philmd/qemu/runs/891/summary/console
# gpg: Signature made Mon 26 Oct 2020 22:03:59 GMT
# gpg: using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [full]
# Primary key fingerprint: FAAB E75E 1291 7221 DCFD 6BB2 E3E3 2C2C DEAD C0DE
* remotes/philmd-gitlab/tags/led-api-20201026:
hw/arm/tosa: Replace fprintf() calls by LED devices
hw/misc/mps2-scc: Use the LED device
hw/misc/mps2-fpgaio: Use the LED device
hw/arm/aspeed: Add the 3 front LEDs drived by the PCA9552 #1
hw/misc/led: Emit a trace event when LED intensity has changed
hw/misc/led: Allow connecting from GPIO output
hw/misc/led: Add a LED device
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The SRST protocol states that after diagnostics are complete and the
status is posted, we should clear the SRST bit if it should so happen to
be set.
The reset method itself should handle this, but just in case -- make our
intention explicit here.
Signed-off-by: John Snow <jsnow@redhat.com>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-id: 20201020200242.1497705-4-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
We don't need to wait for the falling edge. We can set BSY as
soon as possible and begin immediately resetting the drive. Devices
don't appear to need to take any specific action on the falling edge.
Signed-off-by: John Snow <jsnow@redhat.com>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-id: 20201020200242.1497705-3-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
Software reset (SRST) should cause the diagnostic command to be run. Make an
explicit call to that routine.
Reported-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 20201020200242.1497705-2-jsnow@redhat.com
Fixes: 55adb3c456
Fixes: https://bugs.launchpad.net/bugs/1900155
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: John Snow <jsnow@redhat.com>
spapr_reallocate_hpt() has three users, two of which pass &error_fatal
and the third one, htab_load(), passes &local_err, uses it to detect
failures and simply propagates -EINVAL up to vmstate_load(), which will
cause QEMU to exit. It is thus confusing that spapr_reallocate_hpt()
doesn't return right away when an error is detected in some cases. Also,
the comment suggesting that the caller is welcome to try to carry on
seems like a remnant in this respect.
This can be improved:
- change spapr_reallocate_hpt() to always report a negative errno on
failure, either as reported by KVM or -ENOSPC if the HPT is smaller
than what was asked,
- use that to detect failures in htab_load() which is preferred over
checking &local_err,
- propagate this negative errno to vmstate_load() because it is more
accurate than propagating -EINVAL for all possible errors.
[dwg: Fix compile error due to omitted prelim patch]
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <160371605460.305923.5890143959901241157.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
If kvmppc_load_htab_chunk() fails, its return value is propagated up
to vmstate_load(). It should thus be a negative errno, not -1 (which
maps to EPERM and would lure the user into thinking that the problem
is necessarily related to a lack of privilege).
Return the error reported by KVM or ENOSPC in case of short write.
While here, propagate the error message through an @errp argument
and have the caller to print it with error_report_err() instead
of relying on fprintf().
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <160371604713.305923.5264900354159029580.stgit@bahia.lan>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Hints should be added with the dedicated error_append_hint() API
because we don't want to print them when using QMP. This requires
to insert ERRP_GUARD as explained in "qapi/error.h".
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <160371604030.305923.17464161378167312662.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
As recommended in "qapi/error.h", add a bool return value to
spapr_add_lmbs() and spapr_add_nvdimm(), and use them instead
of local_err in spapr_memory_plug().
This allows to get rid of the error propagation overhead.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <160309734178.2739814.3488437759887793902.stgit@bahia.lan>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Both PC_DIMM_SLOT_PROP and PC_DIMM_ADDR_PROP are defined in the
default property list of the PC DIMM device class:
DEFINE_PROP_UINT64(PC_DIMM_ADDR_PROP, PCDIMMDevice, addr, 0),
DEFINE_PROP_INT32(PC_DIMM_SLOT_PROP, PCDIMMDevice, slot,
PC_DIMM_UNASSIGNED_SLOT),
They should thus be always gettable for both PC DIMMs and NVDIMMs.
An error in getting them can only be the result of a programming
error. It doesn't make much sense to propagate the error in this
case. Abort instead.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <160309732180.2739814.7243774674998010907.stgit@bahia.lan>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The PC_DIMM_SLOT_PROP property is defined as:
DEFINE_PROP_INT32(PC_DIMM_SLOT_PROP, PCDIMMDevice, slot,
PC_DIMM_UNASSIGNED_SLOT),
Use object_property_get_int() instead of object_property_get_uint().
Since spapr_memory_plug() only gets called if pc_dimm_pre_plug()
succeeded, we expect to have a valid >= 0 slot number, either because
the user passed a valid slot number or because pc_dimm_get_free_slot()
picked one up for us.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <160309730758.2739814.15821922745424652642.stgit@bahia.lan>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The PC_DIMM_ADDR_PROP property is defined as:
DEFINE_PROP_UINT64(PC_DIMM_ADDR_PROP, PCDIMMDevice, addr, 0),
Use object_property_get_uint() instead of object_property_get_int().
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <160309729609.2739814.4996614957953215591.stgit@bahia.lan>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
pc_dimm_plug() doesn't use it. It only aborts on error.
Drop @errp and adapt the callers accordingly.
[dwg: Removed unused label to fix compile]
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <160309728447.2739814.12831204841251148202.stgit@bahia.lan>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Now that the error path of spapr_cpu_core_realize() is just to call
idempotent spapr_cpu_core_unrealize() for rollback, no need to create
and realize the vCPUs in two separate loops.
Merge them and do them same in spapr_cpu_core_unrealize() for symmetry.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <160279673321.1808373.2248221100790367912.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
spapr_cpu_core_realize() has a rollback path which partially duplicates
the code of spapr_cpu_core_unrealize().
Let's make spapr_cpu_core_unrealize() idempotent and call it instead. This
requires to:
- move the registration and unregistration of the reset handler around
but it is harmless,
- allocate the array of vCPUs with g_new0() to be able to filter out
unused slots,
- make sure to only unrealize vCPUs that have been already realized.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <160279672626.1808373.14142129300586424514.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The 'sc' argument is unused. Drop it.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <160279671929.1808373.10333672533575251075.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Since we introduced CPU hot-unplug in sPAPR, we don't unrealize the
vCPU objects explicitly. Instead, we let QOM handle that for us under
object_property_del_all() when the CPU core object is finalized. The
only thing we do is calling cpu_remove_sync() to tear the vCPU thread
down.
This happens to work but it is ugly because:
- we call qdev_realize() but the corresponding qdev_unrealize() is
buried deep in the QOM code
- we call cpu_remove_sync() to undo qemu_init_vcpu() called by
ppc_cpu_realize() in target/ppc/translate_init.c.inc
- the CPU init and teardown paths aren't really symmetrical
The latter didn't bite us so far but a future patch that greatly
simplifies the CPU core realize path needs it to avoid a crash
in QOM.
For all these reasons, have ppc_cpu_unrealize() to undo the changes
of ppc_cpu_realize() by calling cpu_remove_sync() at the right place,
and have the sPAPR CPU core code to call qdev_unrealize().
This requires to add a missing stub because translate_init.c.inc is
also compiled for user mode.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <160279671236.1808373.14732005038172874990.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When a CPU core is being removed, the machine specific data of each
CPU thread object is leaked.
Fix this by calling the dedicated helper we have for that instead of
simply unparenting the CPU object. Call it from a separate loop in
spapr_cpu_core_unrealize() for symmetry with spapr_cpu_core_realize().
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <160279670540.1808373.17319746576919615623.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The spapr_create_nvdimm_dr_connectors() function doesn't need to access
any internal details of the sPAPR NVDIMM implementation. Also, pretty
much like for the LMBs, only spapr_machine_init() is responsible for the
creation of DR connectors for NVDIMMs.
Make this clear by making this function static in hw/ppc/spapr.c.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <160249772183.757627.7396780936543977766.stgit@bahia.lan>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
[dwg] The stack frame itself probably isn't that big a deal, but
avoiding alloca() is generally recommended these days.
Signed-off-by: Elena Afanasova <eafanasova@gmail.com>
Message-Id: <8f07132478469b35fb50a4706691e2b56b10a67b.camel@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
If we hotplug a CPU during the first second of the kernel boot,
the IRQ can be sent to the kernel while the RTAS event handler
is not installed. The event is queued, but the kernel doesn't
collect it and ignores the new CPU.
As the code relies on edge-triggered IRQ, we can re-assert it
during the event-scan RTAS call if there are still pending
events (as it is already done in check-exception).
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20201015210318.117386-1-lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
DR connector is a device that emulates a firmware abstraction used by PAPR
compliant guests to manage hotplug/dynamic-reconfiguration of PHBs, PCI
devices, memory, and CPUs.
It is internally created by the spapr platform and requires to be owned by
either the machine (PHBs, CPUs, memory) or by a PHB (PCI devices).
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <160250199940.765467.6896806997161856576.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The armv7m systick timer is a 24-bit decrementing, wrap-on-zero,
clear-on-write counter. Our current implementation has various
bugs and dubious workarounds in it (for instance see
https://bugs.launchpad.net/qemu/+bug/1872237).
We have an implementation of a simple decrementing counter
and we put a lot of effort into making sure it handles the
interesting corner cases (like "spend a cycle at 0 before
reloading") -- ptimer.
Rewrite the systick timer to use a ptimer rather than
a raw QEMU timer.
Unfortunately this is a migration compatibility break,
which will affect all M-profile boards.
Among other bugs, this fixes
https://bugs.launchpad.net/qemu/+bug/1872237 :
now writes to SYST_CVR when the timer is enabled correctly
do nothing; when the timer is enabled via SYST_CSR.ENABLE,
the ptimer code will (because of POLICY_NO_IMMEDIATE_RELOAD)
arrange that after one timer tick the counter is reloaded
from SYST_RVR and then counts down from there, as the
architecture requires.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20201015151829.14656-3-peter.maydell@linaro.org
In ptimer_reload(), we call the callback function provided by the
timer device that is using the ptimer. This callback might disable
the ptimer. The code mostly handles this correctly, except that
we'll still print the warning about "Timer with delta zero,
disabling" if the now-disabled timer happened to be set such that it
would fire again immediately if it were enabled (eg because the
limit/reload value is zero).
Suppress the spurious warning message and the unnecessary
repeat-deletion of the underlying timer in this case.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20201015151829.14656-2-peter.maydell@linaro.org
Included the newly implemented SBSA generic watchdog device model into
SBSA platform
Signed-off-by: Shashi Mallela <shashi.mallela@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20201027015927.29495-3-shashi.mallela@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Generic watchdog device model implementation as per ARM SBSA v6.0
Signed-off-by: Shashi Mallela <shashi.mallela@linaro.org>
Message-id: 20201027015927.29495-2-shashi.mallela@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Connect the 'uart-out' clock from the CPRMAN to the PL011 instance.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Luc Michel <luc@lmichel.fr>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add a clock input to the PL011 UART so we can compute the current baud
rate and trace it. This is intended for developers who wish to use QEMU
to e.g. debug their firmware or to figure out the baud rate configured
by an unknown/closed source binary.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Luc Michel <luc@lmichel.fr>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Those reset values have been extracted from a Raspberry Pi 3 model B
v1.2, using the 2020-08-20 version of raspios. The dump was done using
the debugfs interface of the CPRMAN driver in Linux (under
'/sys/kernel/debug/clk'). Each exposed clock tree stage (PLLs, channels
and muxes) can be observed by reading the 'regdump' file (e.g.
'plla/regdump').
Those values are set by the Raspberry Pi firmware at boot time (Linux
expects them to be set when it boots up).
Some stages are not exposed by the Linux driver (e.g. the PLL B). For
those, the reset values are unknown and left to 0 which implies a
disabled output.
Once booted in QEMU, the final clock tree is very similar to the one
visible on real hardware. The differences come from some unimplemented
devices for which the driver simply disable the corresponding clock.
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Luc Michel <luc@lmichel.fr>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This simple mux sits between the PLL channels and the DSI0E and DSI0P
clock muxes. This mux selects between PLLA-DSI0 and PLLD-DSI0 channel
and outputs the selected signal to source number 4 of DSI0E/P clock
muxes. It is controlled by the cm_dsi0hsck register.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Luc Michel <luc@lmichel.fr>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
A clock mux can be configured to select one of its 10 sources through
the CM_CTL register. It also embeds yet another clock divider, composed
of an integer part and a fractional part. The number of bits of each
part is mux dependent.
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Luc Michel <luc@lmichel.fr>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The clock multiplexers are the last clock stage in the CPRMAN. Each mux
outputs one clock signal that goes out of the CPRMAN to the SoC
peripherals.
Each mux has at most 10 sources. The sources 0 to 3 are common to all
muxes. They are:
0. ground (no clock signal)
1. the main oscillator (xosc)
2. "test debug 0" clock
3. "test debug 1" clock
Test debug 0 and 1 are actual clock muxes that can be used as sources to
other muxes (for debug purpose).
Sources 4 to 9 are mux specific and can be unpopulated (grounded). Those
sources are fed by the PLL channels outputs.
One corner case exists for DSI0E and DSI0P muxes. They have their source
number 4 connected to an intermediate multiplexer that can select
between PLLA-DSI0 and PLLD-DSI0 channel. This multiplexer is called
DSI0HSCK and is not a clock mux as such. It is really a simple mux from
the hardware point of view (see https://elinux.org/The_Undocumented_Pi).
This mux is not implemented in this commit.
Note that there is some muxes for which sources are unknown (because of
a lack of documentation). For those cases all the sources are connected
to ground in this implementation.
Each clock mux output is exported by the CPRMAN at the qdev level,
adding the suffix '-out' to the mux name to form the output clock name.
(E.g. the 'uart' mux sees its output exported as 'uart-out' at the
CPRMAN level.)
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Luc Michel <luc@lmichel.fr>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
A PLL channel is able to further divide the generated PLL frequency.
The divider is given in the CTRL_A2W register. Some channels have an
additional fixed divider which is always applied to the signal.
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Luc Michel <luc@lmichel.fr>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
PLLs are composed of multiple channels. Each channel outputs one clock
signal. They are modeled as one device taking the PLL generated clock as
input, and outputting a new clock.
A channel shares the CM register with its parent PLL, and has its own
A2W_CTRL register. A write to the CM register will trigger an update of
the PLL and all its channels, while a write to an A2W_CTRL channel
register will update the required channel only.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Luc Michel <luc@lmichel.fr>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The CPRMAN PLLs generate a clock based on a prescaler, a multiplier and
a divider. The prescaler doubles the parent (xosc) frequency, then the
multiplier/divider are applied. The multiplier has an integer and a
fractional part.
This commit also implements the CPRMAN CM_LOCK register. This register
reports which PLL is currently locked. We consider a PLL has being
locked as soon as it is enabled (on real hardware, there is a delay
after turning a PLL on, for it to stabilize).
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Luc Michel <luc@lmichel.fr>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
There are 5 PLLs in the CPRMAN, namely PLL A, C, D, H and B. All of them
take the xosc clock as input and produce a new clock.
This commit adds a skeleton implementation for the PLLs as sub-devices
of the CPRMAN. The PLLs are instantiated and connected internally to the
main oscillator.
Each PLL has 6 registers : CM, A2W_CTRL, A2W_ANA[0,1,2,3], A2W_FRAC. A
write to any of them triggers a call to the (not yet implemented)
pll_update function.
If the main oscillator changes frequency, an update is also triggered.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Luc Michel <luc@lmichel.fr>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The BCM2835 CPRMAN is the clock manager of the SoC. It is composed of a
main oscillator, and several sub-components (PLLs, multiplexers, ...) to
generate the BCM2835 clock tree.
This commit adds a skeleton of the CPRMAN, with a dummy register
read/write implementation. It embeds the main oscillator (xosc) from
which all the clocks will be derived.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Luc Michel <luc@lmichel.fr>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The CPRMAN (clock controller) was mapped at the watchdog/power manager
address. It was also split into two unimplemented peripherals (CM and
A2W) but this is really the same one, as shown by this extract of the
Raspberry Pi 3 Linux device tree:
watchdog@7e100000 {
compatible = "brcm,bcm2835-pm\0brcm,bcm2835-pm-wdt";
[...]
reg = <0x7e100000 0x114 0x7e00a000 0x24>;
[...]
};
[...]
cprman@7e101000 {
compatible = "brcm,bcm2835-cprman";
[...]
reg = <0x7e101000 0x2000>;
[...]
};
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Luc Michel <luc@lmichel.fr>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The nanosecond unit greatly limits the dynamic range we can display in
clock value traces, for values in the order of 1GHz and more. The
internal representation can go way beyond this value and it is quite
common for today's clocks to be within those ranges.
For example, a frequency between 500MHz+ and 1GHz will be displayed as
1ns. Beyond 1GHz, it will show up as 0ns.
Replace nanosecond periods traces with frequencies in the Hz unit
to have more dynamic range in the trace output.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Damien Hedde <damien.hedde@greensocs.com>
Signed-off-by: Luc Michel <luc@lmichel.fr>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Use of 0x%d - make up our mind as 0x%x
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20201014193355.53074-1-dgilbert@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The Pi 3A+ is a stripped down version of the 3B:
- 512 MiB of RAM instead of 1 GiB
- no on-board ethernet chipset
Add it as it is a closer match to what we model.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20201024170127.3592182-10-f4bug@amsat.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Similarly to the Pi A, the Pi Zero uses a BCM2835 SoC (ARMv6Z core).
The only difference between the revision 1.2 and 1.3 is the latter
exposes a CSI camera connector. As we do not implement the Unicam
peripheral, there is no point in exposing a camera connector :)
Therefore we choose to model the 1.2 revision.
Example booting the machine using content from [*]:
$ qemu-system-arm -M raspi0 -serial stdio \
-kernel raspberrypi/firmware/boot/kernel.img \
-dtb raspberrypi/firmware/boot/bcm2708-rpi-zero.dtb \
-append 'printk.time=0 earlycon=pl011,0x20201000 console=ttyAMA0'
[ 0.000000] Booting Linux on physical CPU 0x0
[ 0.000000] Linux version 4.19.118+ (dom@buildbot) (gcc version 4.9.3 (crosstool-NG crosstool-ng-1.22.0-88-g8460611)) #1311 Mon Apr 27 14:16:15 BST 2020
[ 0.000000] CPU: ARMv6-compatible processor [410fb767] revision 7 (ARMv7), cr=00c5387d
[ 0.000000] CPU: VIPT aliasing data cache, unknown instruction cache
[ 0.000000] OF: fdt: Machine model: Raspberry Pi Zero
...
[*] http://archive.raspberrypi.org/debian/pool/main/r/raspberrypi-firmware/raspberrypi-kernel_1.20200512-2_armhf.deb
Reviewed-by: Luc Michel <luc.michel@greensocs.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20201024170127.3592182-9-f4bug@amsat.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The Pi A is almost the first machine released.
It uses a BCM2835 SoC which includes a ARMv6Z core.
Example booting the machine using content from [*]
(we use the device tree from the B model):
$ qemu-system-arm -M raspi1ap -serial stdio \
-kernel raspberrypi/firmware/boot/kernel.img \
-dtb raspberrypi/firmware/boot/bcm2708-rpi-b-plus.dtb \
-append 'earlycon=pl011,0x20201000 console=ttyAMA0'
[ 0.000000] Booting Linux on physical CPU 0x0
[ 0.000000] Linux version 4.19.118+ (dom@buildbot) (gcc version 4.9.3 (crosstool-NG crosstool-ng-1.22.0-88-g8460611)) #1311 Mon Apr 27 14:16:15 BST 2020
[ 0.000000] CPU: ARMv6-compatible processor [410fb767] revision 7 (ARMv7), cr=00c5387d
[ 0.000000] CPU: VIPT aliasing data cache, unknown instruction cache
[ 0.000000] OF: fdt: Machine model: Raspberry Pi Model B+
...
[*] http://archive.raspberrypi.org/debian/pool/main/r/raspberrypi-firmware/raspberrypi-kernel_1.20200512-2_armhf.deb
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20201024170127.3592182-8-f4bug@amsat.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Luc Michel <luc.michel@greensocs.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20201024170127.3592182-7-f4bug@amsat.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The realize() function is clearly composed of two parts,
each described by a comment:
void realize()
{
/* common peripherals from bcm2835 */
...
/* bcm2836 interrupt controller (and mailboxes, etc.) */
...
}
Split the two part, so we can reuse the common part with other
SoCs from this family.
Reviewed-by: Luc Michel <luc.michel@greensocs.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20201024170127.3592182-6-f4bug@amsat.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
It makes no sense to set enabled-cpus=0 on single core SoCs.
Reviewed-by: Luc Michel <luc.michel@greensocs.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20201024170127.3592182-5-f4bug@amsat.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The BCM2835 has only one core. Introduce the core_count field to
be able to use values different than BCM283X_NCPUS (4).
Reviewed-by: Luc Michel <luc.michel@greensocs.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20201024170127.3592182-4-f4bug@amsat.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Remove usage of TypeInfo::class_data. Instead fill the fields in
the corresponding class_init().
So far all children use the same values for almost all fields,
but we are going to add the BCM2711/BCM2838 SoC for the raspi4
machine which use different fields.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20201024170127.3592182-3-f4bug@amsat.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
No code out of bcm2836.c uses (or requires) the BCM283XInfo
declarations. Move it locally to the C source file.
Reviewed-by: Luc Michel <luc.michel@greensocs.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20201024170127.3592182-2-f4bug@amsat.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Ensure the vSMMUv3 will be restored before all PCIe devices so that DMA
translation can work properly during migration.
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Message-id: 20201019091508.197-1-yuzenghui@huawei.com
Acked-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The NPCM7xx chips have multiple GPIO controllers that are mostly
identical except for some minor differences like the reset values of
some registers. Each controller controls up to 32 pins.
Each individual pin is modeled as a pair of unnamed GPIOs -- one for
emitting the actual pin state, and one for driving the pin externally.
Like the nRF51 GPIO controller, a gpio level may be negative, which
means the pin is not driven, or floating.
Reviewed-by: Tyrone Ting <kfting@nuvoton.com>
Signed-off-by: Havard Skinnemoen <hskinnemoen@google.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The NPCM730 and NPCM750 chips have a single USB host port shared between
a USB 2.0 EHCI host controller and a USB 1.1 OHCI host controller. This
adds support for both of them.
Testing notes:
* With -device usb-kbd, qemu will automatically insert a full-speed
hub, and the keyboard becomes controlled by the OHCI controller.
* With -device usb-kbd,bus=usb-bus.0,port=1, the keyboard is directly
attached to the port without any hubs, and the device becomes
controlled by the EHCI controller since it's high speed capable.
* With -device usb-kbd,bus=usb-bus.0,port=1,usb_version=1, the
keyboard is directly attached to the port, but it only advertises
itself as full-speed capable, so it becomes controlled by the OHCI
controller.
In all cases, the keyboard device enumerates correctly.
Reviewed-by: Tyrone Ting <kfting@nuvoton.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Havard Skinnemoen <hskinnemoen@google.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The RNG module returns a byte of randomness when the Data Valid bit is
set.
This implementation ignores the prescaler setting, and loads a new value
into RNGD every time RNGCS is read while the RNG is enabled and random
data is available.
A qtest featuring some simple randomness tests is included.
Reviewed-by: Tyrone Ting <kfting@nuvoton.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Havard Skinnemoen <hskinnemoen@google.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The watchdog is part of NPCM7XX's timer module. Its behavior is
controlled by the WTCR register in the timer.
When enabled, the watchdog issues an interrupt signal after a pre-set
amount of cycles, and issues a reset signal shortly after that.
Reviewed-by: Tyrone Ting <kfting@nuvoton.com>
Signed-off-by: Hao Wu <wuhaotsh@google.com>
Signed-off-by: Havard Skinnemoen <hskinnemoen@google.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: deleted blank line at end of npcm_watchdog_timer-test.c]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This allows us to reuse npcm7xx_timer_pause for the watchdog timer.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Havard Skinnemoen <hskinnemoen@google.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This patch sets min_cpus field for xlnx-versal-virt platform,
because it always creates XLNX_VERSAL_NR_ACPUS cpus even with
-smp 1 command line option.
Signed-off-by: Pavel Dovgalyuk <pavel.dovgalyuk@ispras.ru>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 160343854912.8460.17915238517799132371.stgit@pasha-ThinkPad-X280
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When compiling with -Werror=implicit-fallthrough, gcc complains about
missing fallthrough annotations in this file. Looking at the code,
the fallthrough is very likely intended here, so add some comments
to silence the compiler warnings.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-id: 20201020105938.23209-1-thuth@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The nvme_check_{sq,cq} functions check if the given queue identifer is
valid *and* that the queue exists. Thus, the function return value
cannot simply be inverted to check if the identifer is valid and that
the queue does *not* exist.
Replace the call with an OR'ed version of the checks.
Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Replace the Invalid Field in Command with the Invalid PRP Offset status
code in the nvme_create_{cq,sq} functions. Also, allow PRP1 to be
address 0x0.
Also replace the Completion Queue Invalid status code returned in
nvme_create_cq when the the queue identifier is invalid with the Invalid
Queue Identifier. The Completion Queue Invalid status code is
exclusively for indicating that the completion queue identifer given
when creating a submission queue is invalid.
See NVM Express v1.3d, Section 5.3 ("Create I/O Completion Queue
command") and 5.4("Create I/O Submission Queue command").
Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Address 0 is not an invalid address. Remove those invalikd checks.
Unaligned PRP2 and PRP list entries should result in Invalid PRP Offset
status code and not Invalid Field. Fix that.
See NVMe Express v1.3d, Section 4.3 ("Physical Region Page Entry and
List").
Suggested-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Calculate the data shift value to report based on the set value of
logical_block_size device property.
In the process, use a local variable to calculate the LBA format
index instead of the hardcoded value 0. This makes the code more
readable and it will make it easier to add support for multiple LBA
formats in the future.
Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
If a command results in a non-zero status code, trace it.
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Include the namespace id in the pci_nvme_{get,set}feat trace events.
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
If the host sets CC.CSS to 111b, all commands submitted to I/O queues
should be completed with status Invalid Command Opcode.
Note that this is technically a v1.4 feature, but it does not hurt to
implement before we finally bump the reported version implemented.
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Fail to start the controller if the user requests a command set that the
controller does not support.
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Let the user specify a specific namespace if they want to get access
stats for a specific namespace.
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Return error if the requested offset starts after the size of the log
being returned. Also, move the check for earlier in the function so
we're not doing unnecessary calculations.
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed- by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
The code switches on the opcode to invoke a function specific to that
opcode. There's no point in consolidating back to a common function that
just switches on that same opcode without any actual common code.
Restore the opcode specific behavior without going back through another
level of switches.
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
If the user does not specify an nsid parameter on the nvme-ns device,
nvme_register_namespace will find the first free namespace id and assign
that.
This fix makes sure the assigned id is saved.
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
There are two reasons for changing this:
1. The nvme device currently uses an internal Intel device id.
2. Since commits "nvme: fix write zeroes offset and count" and "nvme:
support multiple namespaces" the controller device no longer has
the quirks that the Linux kernel think it has.
As the quirks are applied based on pci vendor and device id, change
them to get rid of the quirks.
To keep backward compatibility, add a new 'use-intel-id' parameter to
the nvme device to force use of the Intel vendor and device id. This is
off by default but add a compat property to set this for 5.1 machines
and older. If a 5.1 machine is booted (or the use-intel-id parameter is
explicitly set to true), the Linux kernel will just apply these
unnecessary quirks:
1. NVME_QUIRK_IDENTIFY_CNS which says that the device does not support
anything else than values 0x0 and 0x1 for CNS (Identify Namespace
and Identify Namespace). With multiple namespace support, this just
means that the kernel will "scan" namespaces instead of using
"Active Namespace ID list" (CNS 0x2).
2. NVME_QUIRK_DISABLE_WRITE_ZEROES. The nvme device started out with a
broken Write Zeroes implementation which has since been fixed in
commit 9d6459d21a ("nvme: fix write zeroes offset and count").
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>