Commit Graph

22263 Commits

Author SHA1 Message Date
Marc-André Lureau
43df70a9dd compat: replace PC_COMPAT_2_11 & HW_COMPAT_2_11 macros
Use static arrays instead.

Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
2019-01-07 16:18:41 +04:00
Marc-André Lureau
0d47310b03 compat: replace PC_COMPAT_2_12 & HW_COMPAT_2_12 macros
Use static arrays instead.

Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
2019-01-07 16:18:41 +04:00
Marc-André Lureau
ddb3235de1 compat: replace PC_COMPAT_3_0 & HW_COMPAT_3_0 macros
Use static arrays instead.

Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
2019-01-07 16:18:41 +04:00
Marc-André Lureau
abd93cc7df compat: replace PC_COMPAT_3_1 & HW_COMPAT_3_1 macros
Use static arrays instead.

Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
2019-01-07 16:18:41 +04:00
Marc-André Lureau
88cbe07374 machine: move compat properties out of globals
Move the compat arrays inside functions that use them.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
2019-01-07 16:18:41 +04:00
Marc-André Lureau
b66bbee39f hw: apply machine compat properties without touching globals
Similarly to accel properties, move compat properties out of globals
registration, and apply the machine compat properties during
device_post_init().

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
2019-01-07 16:18:41 +04:00
Marc-André Lureau
fa386d989d machines: replace COMPAT define with a static array
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
2019-01-07 16:18:41 +04:00
Marc-André Lureau
ea9ce8934c hw: apply accel compat properties without touching globals
Instead of registering compat properties as globals, let's keep them
in their own array, to avoid mixing with user globals.

Introduce object_apply_global_props() function, to apply compatibility
properties from a GPtrArray.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
2019-01-07 16:18:41 +04:00
Li Qiang
19bcc4bc32 fw_cfg: Make qemu_extra_params_fw locally
qemu_extra_params_fw[] has external linkage, but is used
only in fw_cfg_bootsplash(), it makes sense to make it
locally.

Signed-off-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <1542777026-2788-4-git-send-email-liq3ea@gmail.com>
[PMD: Removed qemu_extra_params_fw declaration in vl.c]
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-01-04 15:30:52 +01:00
Li Qiang
ee5d0f89de fw_cfg: Fix -boot reboot-timeout error checking
fw_cfg_reboot() gets option parameter "reboot-timeout" with
qemu_opt_get(), then converts it to an integer by hand. It neglects to
check that conversion for errors, and fails to reject negative values.
Positive values above the limit get reported and replaced by the limit.
This patch checks for conversion errors properly, and reject all values
outside 0...0xffff.

Signed-off-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <1542777026-2788-3-git-send-email-liq3ea@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-01-04 15:30:52 +01:00
Li Qiang
6912bb0b3d fw_cfg: Fix -boot bootsplash error checking
fw_cfg_bootsplash() gets option parameter "splash-time"
with qemu_opt_get(), then converts it to an integer by hand.
It neglects to check that conversion for errors. This is
needlessly complicated and error-prone. But as "splash-time
not specified" is not the same as "splash-time=T" for any T,
we need use qemu_opt_get() to check if splash time exists.
This patch also make the qemu exit when finding or loading
splash file failed.

Signed-off-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <1542777026-2788-2-git-send-email-liq3ea@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-01-04 15:30:52 +01:00
Li Qiang
bed6633677 fw_cfg: Improve error message when can't load splash file
read_splashfile() reports "failed to read splash file" without
further details. Get the details from g_file_get_contents(), and
include them in the error message. Also remove unnecessary 'res'
variable.

Signed-off-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <1541052148-28752-1-git-send-email-liq3ea@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-01-04 15:30:52 +01:00
Peter Maydell
20d6c7312f RISC-V Changes for 3.2, Part 1
This pull request contains the first set of RISC-V patches I'd like to
 target for the 3.2 development cycle.  It's really just a collection of
 bug fixes with one major new feature: PCIe can now be attached to RISC-V
 guests.
 
 This has passed my usual test of booting the latest Linux RC into a
 Fedora disk image on the virt machine.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEAM520YNJYN/OiG3470yhUCzLq0EFAlwdDlkTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRDvTKFQLMurQWJgEACdUZuZIYGu1b1QjOLmSzPSVa+EvP5V
 +AAXekfsk3T3dNuVDtQyb2IfcPsI/QZkr6WcO0gWsY4TWPXwm6fMKyvZFBFxq2hM
 RWYtgeHDSSJ1ZY0AGz8Lz//zC76rJfbQDl5TsPQEX0ARCdV8VI0Uh0paaWDRypHz
 5tXruzuHAp0dKk9czyBGC//LrWdNBMGhcti9QxN0ivyvR6FXJndEGvY9UL5WcF8t
 rPbX+r1n/lezaJTdKAybyy5SaEQoyGChhxyESA9MCj1foE3MKd5oXArOGEmU6dwP
 PdJznOn1T/4IozAMHYUpzSIlJ5ssoa/KdZbULE4MIWBmfh0+AeVYDnmrGEffdmWw
 d2MNJrn1yFSEaey+i19DCZIl2+4xbpjzq3GZVDllGGDznXNiG3ORiaiCOATLDubJ
 WYHxLETln/Ix1fBq3u6QbV7GeJ6EIZ+MobNwJEq1kvmyoU3tqrcFBOYMw7usvTda
 TcYDVNbhtWtdv0EhwxFpV+8otamcWfoE7OTl5Msy+9ZpV9JWABssvU/aXu68eNi/
 nHlCggrXUh4i4c+XoPeyckTj4GQ8QpoSt8PNx8SIbz+ElKC5BoChInXo8o1XKjhA
 wYLYyL7XH6NjdQAlerIvWIKA6tWKG8SqL8kvr9P05tZLmzc4UoQ1h5QlXf5BiAKO
 e4qNigdEd+VtYw==
 =BAOm
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/palmer/tags/riscv-for-master-3.2-part1' into staging

RISC-V Changes for 3.2, Part 1

This pull request contains the first set of RISC-V patches I'd like to
target for the 3.2 development cycle.  It's really just a collection of
bug fixes with one major new feature: PCIe can now be attached to RISC-V
guests.

This has passed my usual test of booting the latest Linux RC into a
Fedora disk image on the virt machine.

# gpg: Signature made Fri 21 Dec 2018 16:01:29 GMT
# gpg:                using RSA key EF4CA1502CCBAB41
# gpg: Good signature from "Palmer Dabbelt <palmer@dabbelt.com>"
# gpg:                 aka "Palmer Dabbelt <palmer@sifive.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 00CE 76D1 8349 60DF CE88  6DF8 EF4C A150 2CCB AB41

* remotes/palmer/tags/riscv-for-master-3.2-part1:
  MAINTAINERS: Mark RISC-V as Supported
  riscv/cpu: use device_class_set_parent_realize
  target/riscv/pmp.c: Fix pmp_decode_napot()
  sifive_uart: Implement interrupt pending register
  RISC-V: Enable second UART on sifive_e and sifive_u
  RISC-V: Fix PLIC pending bitfield reads
  RISC-V: Fix CLINT timecmp low 32-bit writes
  RISC-V: Add hartid and \n to interrupt logging
  sifive_u: Set 'clock-frequency' DT property for SiFive UART
  sifive_u: Add clock DT node for GEM ethernet
  riscv: Enable VGA and PCIE_VGA
  hw/riscv/virt: Connect the gpex PCIe
  hw/riscv/virt: Adjust memory layout spacing
  hw/riscv/virt: Increase the number of interrupts

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-01-03 13:26:30 +00:00
Prasad J Pandit
f1e2e38ee0 pvrdma: check return value from pvrdma_idx_ring_has_ routines
pvrdma_idx_ring_has_[data/space] routines also return invalid
index PVRDMA_INVALID_IDX[=-1], if ring has no data/space. Check
return value from these routines to avoid plausible infinite loops.

Reported-by: Li Qiang <liq3ea@163.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
2018-12-22 11:09:57 +02:00
Prasad J Pandit
7be3a21325 rdma: remove unused VENDOR_ERR_NO_SGE macro
With commit 4481985c (rdma: check num_sge does not exceed MAX_SGE)
macro VENDOR_ERR_NO_SGE is no longer in use - delete it.

Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
2018-12-22 11:09:57 +02:00
Prasad J Pandit
509f57c98e pvrdma: release ring object in case of an error
create_cq and create_qp routines allocate ring object, but it's
not released in case of an error, leading to memory leakage.

Reported-by: Li Qiang <liq3ea@163.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
2018-12-22 11:09:57 +02:00
Prasad J Pandit
2c858ce5da pvrdma: check number of pages when creating rings
When creating CQ/QP rings, an object can have up to
PVRDMA_MAX_FAST_REG_PAGES 8 pages. Check 'npages' parameter
to avoid excessive memory allocation or a null dereference.

Reported-by: Li Qiang <liq3ea@163.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
2018-12-22 11:09:57 +02:00
Prasad J Pandit
2aa86456fb pvrdma: add uar_read routine
Define skeleton 'uar_read' routine. Avoid NULL dereference.

Reported-by: Li Qiang <liq3ea@163.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
2018-12-22 11:09:57 +02:00
Prasad J Pandit
0e68373cc2 rdma: check num_sge does not exceed MAX_SGE
rdma back-end has scatter/gather array ibv_sge[MAX_SGE=4] set
to have 4 elements. A guest could send a 'PvrdmaSqWqe' ring element
with 'num_sge' set to > MAX_SGE, which may lead to OOB access issue.
Add check to avoid it.

Reported-by: Saar Amar <saaramar5@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
2018-12-22 11:09:57 +02:00
Prasad J Pandit
cce648613b pvrdma: release device resources in case of an error
If during pvrdma device initialisation an error occurs,
pvrdma_realize() does not release memory resources, leading
to memory leakage.

Reported-by: Li Qiang <liq3ea@163.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <20181212175817.815-1-ppandit@redhat.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
2018-12-22 11:09:57 +02:00
Yuval Shaia
305fd2ba06 hw/rdma: Do not call rdma_backend_del_gid on an empty gid
When device goes down the function fini_ports loops over all entries in
gid table regardless of the fact whether entry is valid or not. In case
that entry is not valid we'd like to skip from any further processing in
backend device.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
2018-12-22 11:09:57 +02:00
Yuval Shaia
9a3053d2e8 hw/rdma: Do not use bitmap_zero_extend to free bitmap
bitmap_zero_extend is designed to work for extending, not for
shrinking.
Using g_free instead.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
2018-12-22 11:09:57 +02:00
Yuval Shaia
ffa65d97fc hw/pvrdma: Clean device's resource when system is shutdown
In order to clean some external resources such as GIDs, QPs etc,
register to receive notification when VM is shutdown.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
2018-12-22 11:09:57 +02:00
Yuval Shaia
14c74f7207 hw/rdma: Remove unneeded code that handles more that one port
Device supports only one port, let's remove a dead code that handles
more than one port.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
2018-12-22 11:09:56 +02:00
Yuval Shaia
091782171f hw/pvrdma: Fill error code in command's response
Driver checks error code let's set it.
In addition, for code simplification purposes, set response's fields
ack, response and err outside of the scope of command handlers.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
2018-12-22 11:09:56 +02:00
Yuval Shaia
eaac01005d hw/pvrdma: Fill all CQE fields
Add ability to pass specific WC attributes to CQE such as GRH_BIT flag.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
2018-12-22 11:09:56 +02:00
Yuval Shaia
e976ebc87c hw/pvrdma: Make device state depend on Ethernet function state
User should be able to control the device by changing Ethernet function
state so if user runs 'ifconfig ens3 down' the PVRDMA function should be
down as well.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
2018-12-22 11:09:56 +02:00
Yuval Shaia
028c3f93d6 hw/rdma: Initialize node_guid from vmxnet3 mac address
node_guid should be set once device is load.
Make node_guid be GID format (32 bit) of PCI function 0 vmxnet3 device's
MAC.

A new function was added to do the conversion.
So for example the MAC 56:b6:44:e9:62:dc will be converted to GID
54b6:44ff:fee9:62dc.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
2018-12-22 11:09:56 +02:00
Yuval Shaia
d961ead16e hw/pvrdma: Make sure PCI function 0 is vmxnet3
Guest driver enforces it, we should also.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
2018-12-22 11:09:56 +02:00
Yuval Shaia
317639aafd vmxnet3: Move some definitions to header file
pvrdma setup requires vmxnet3 device on PCI function 0 and PVRDMA device
on PCI function 1.
pvrdma device needs to access vmxnet3 device object for several reasons:
1. Make sure PCI function 0 is vmxnet3.
2. To monitor vmxnet3 device state.
3. To configure node_guid accoring to vmxnet3 device's MAC address.

To be able to access vmxnet3 device the definition of VMXNET3State is
moved to a new header file.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Dmitry Fleytman <dmitry.fleytman@gmail.com>
Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
2018-12-22 11:09:56 +02:00
Yuval Shaia
2b05705dc8 hw/pvrdma: Add support to allow guest to configure GID table
The control over the RDMA device's GID table is done by updating the
device's Ethernet function addresses.
Usually the first GID entry is determined by the MAC address, the second
by the first IPv6 address and the third by the IPv4 address. Other
entries can be added by adding more IP addresses. The opposite is the
same, i.e. whenever an address is removed, the corresponding GID entry
is removed.

The process is done by the network and RDMA stacks. Whenever an address
is added the ib_core driver is notified and calls the device driver
add_gid function which in turn update the device.

To support this in pvrdma device we need to hook into the create_bind
and destroy_bind HW commands triggered by pvrdma driver in guest.
Whenever a change is made to the pvrdma port's GID table a special QMP
message is sent to be processed by libvirt to update the address of the
backend Ethernet device.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Marcel Apfelbaum<marcel.apfelbaum@gmail.com>
Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
2018-12-22 11:09:56 +02:00
Yuval Shaia
1625bb13da hw/pvrdma: Set the correct opcode for send completion
opcode for WC should be set by the device and not taken from work
element.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
2018-12-22 11:09:56 +02:00
Yuval Shaia
2bff59e699 hw/pvrdma: Set the correct opcode for recv completion
The function pvrdma_post_cqe populates CQE entry with opcode from the
given completion element. For receive operation value was not set. Fix
it by setting it to IBV_WC_RECV.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
2018-12-22 11:09:56 +02:00
Yuval Shaia
536692ca1d hw/pvrdma: Make default pkey 0xFFFF
Commit 6e7dba23af ("hw/pvrdma: Make default pkey 0xFFFF") exports
default pkey as external definition but omit the change from 0x7FFF to
0xFFFF.

Fixes: 6e7dba23af ("hw/pvrdma: Make default pkey 0xFFFF")

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
2018-12-22 11:09:56 +02:00
Yuval Shaia
f00c48caab hw/pvrdma: Make function reset_device return void
This function cannot fail - fix it to return void

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
2018-12-22 11:09:56 +02:00
Yuval Shaia
605ec1663b hw/rdma: Add support for MAD packets
MAD (Management Datagram) packets are widely used by various modules
both in kernel and in user space for example the rdma_* API which is
used to create and maintain "connection" layer on top of RDMA uses
several types of MAD packets.

For more information please refer to chapter 13.4 in Volume 1
Architecture Specification, Release 1.1 available here:
https://www.infinibandta.org/ibta-specifications-download/

To support MAD packets the device uses an external utility
(contrib/rdmacm-mux) to relay packets from and to the guest driver.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Marcel Apfelbaum<marcel.apfelbaum@gmail.com>
Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
2018-12-22 11:09:56 +02:00
Yuval Shaia
305bdd7a57 hw/rdma: Abort send-op if fail to create addr handler
Function create_ah might return NULL, let's exit with an error.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
2018-12-22 11:09:56 +02:00
Yuval Shaia
46462cb161 hw/rdma: Return qpn 1 if ibqp is NULL
Device is not supporting QP0, only QP1.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
2018-12-22 11:09:56 +02:00
Yuval Shaia
4082e533f6 hw/rdma: Add ability to force notification without re-arm
Upon completion of incoming packet the device pushes CQE to driver's RX
ring and notify the driver (msix).
While for data-path incoming packets the driver needs the ability to
control whether it wished to receive interrupts or not, for control-path
packets such as incoming MAD the driver needs to be notified anyway, it
even do not need to re-arm the notification bit.

Enhance the notification field to support this.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
2018-12-22 11:09:56 +02:00
Yuval Shaia
dee2e53c86 hw/pvrdma: Check the correct return value
Return value of 0 means ok, we want to free the memory only in case of
error.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Message-Id: <20181025061700.17050-1-yuval.shaia@oracle.com>
Reviewed-by: Marcel Apfelbaum<marcel.apfelbaum@gmail.com>
Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
2018-12-22 11:09:56 +02:00
Peter Maydell
891ff9f4a3 ppc patch queue 2018-12-21
This pull request supersedes the one from 2018-12-13.
 
 This is a revised first ppc pull request for qemu-4.0.  Highlights
 are:
 
  * Most of the code for the POWER9 "XIVE" interrupt controller
    (not complete yet, but we're getting there)
  * A number of g_new vs. g_malloc cleanups
  * Some IRQ wiring cleanups
  * A fix for how we advertise NUMA nodes to the guest for pseries
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEdfRlhq5hpmzETofcbDjKyiDZs5IFAlwce1QACgkQbDjKyiDZ
 s5JqlhAAhES3+UoNHa/eTf/o2OOgIgZ3A2FVP1kGQbb3Q415hxx1blpYkBDk6sUg
 POEgwbj8QMvJ8npOICb2NnHNsAhKRfgGxx/lqVgxLPTDdwGBq7Jr1lfhyX4D99WD
 C2oLtJWvQrA7yIsDzurMjJpFvw8SYSogppom4lqE5667pm6U0j7JggFJkwIo+VAj
 jzl706vvB6/EL3PHZ8eCzsxT2oRpxxMStE3lJ1JPpKc60mFb5gkXMk3hura1L8Ez
 t4NEN9I4ePivXlh6YYDp7Pv5l9JSzKV7Uu8xrYeMdz33e0jUWyoERrdhM51mI4s1
 WGoQm6eL6p0jngAUPYtAdIGC6ZGaCMT5rkoDZ4K+us94kVvdqzWjQyRNp84GpQq0
 Z/sxJaTSK2DZMnQL3LE19upk7XkB5uBgnjs5T5FcFia7bIDG3p8MY4VwIM4dRum9
 WuirEUJRKg28eTTnuK9NQX2+MEnrRWc/FSNaBLjxrijD4C4jHogXTpssNprYnkV7
 HgkQ2MaidcnNLftfOUeBr0aTx+rGqtUB56Xas1UK+WqykKVfRdZ6hnbRg30FJvQ/
 X4SIc/QZcLcA78C/SvCuXa1uqWqlrZMhN2e5r+eiEXaFYyriWgMYe9w+mXKbrTsb
 ZPkbax6xz1esqOZ15ytCTneQONyhMXy5iFDfb0khO4DO0uXq4dA=
 =IN1s
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-4.0-20181221' into staging

ppc patch queue 2018-12-21

This pull request supersedes the one from 2018-12-13.

This is a revised first ppc pull request for qemu-4.0.  Highlights
are:

 * Most of the code for the POWER9 "XIVE" interrupt controller
   (not complete yet, but we're getting there)
 * A number of g_new vs. g_malloc cleanups
 * Some IRQ wiring cleanups
 * A fix for how we advertise NUMA nodes to the guest for pseries

# gpg: Signature made Fri 21 Dec 2018 05:34:12 GMT
# gpg:                using RSA key 6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-4.0-20181221: (40 commits)
  MAINTAINERS: PPC: add a XIVE section
  spapr: change default CPU type to POWER9
  spapr: introduce an 'ic-mode' machine option
  spapr: add an extra OV5 field to the sPAPR IRQ backend
  spapr: add a 'reset' method to the sPAPR IRQ backend
  spapr: extend the sPAPR IRQ backend for XICS migration
  spapr: allocate the interrupt thread context under the CPU core
  spapr: add device tree support for the XIVE exploitation mode
  spapr: add hcalls support for the XIVE exploitation interrupt mode
  spapr: introduce a new machine IRQ backend for XIVE
  spapr-iommu: Always advertise the maximum possible DMA window size
  spapr/xive: use the VCPU id as a NVT identifier
  spapr/xive: introduce a XIVE interrupt controller
  ppc/xive: notify the CPU when the interrupt priority is more privileged
  ppc/xive: introduce a simplified XIVE presenter
  ppc/xive: introduce the XIVE interrupt thread context
  ppc/xive: add support for the END Event State Buffers
  Changes requirement for "vsubsbs" instruction
  spapr: export and rename the xics_max_server_number() routine
  spapr: introduce a spapr_irq_init() routine
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-12-21 15:49:59 +00:00
Peter Maydell
15763776bf pci, pc, virtio: fixes, features
VTD fixes
 IR and split irqchip are now the default for Q35
 ACPI refactoring
 hotplug refactoring
 new names for virtio devices
 multiple pcie link width/speeds
 PCI fixes
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJcG967AAoJECgfDbjSjVRpUlMH/1wy8WW7Br/4JxlWUPsfZTqZ
 0Lg2n59wuFzRVS+gLotp6bUaJGR9xn9fKjI1wfD59oVrDTKyauuw82v0OityEs33
 ZquFecuJvP6k5N40DkqA9YJjKSw7nUmLrsyrD0t2H43npikP2aD9f6yootrr3oVV
 nPwBvyT9ykIBQc7IzsHDiLw3EPdIf+9IR7+l+PLptzV0zK43Jfwi/nzHIQq00UMz
 eLM/ejQLIx4BZJnYGS0Cy6v3K7cS3k45LHDY0cGc0id2jHFN2vdQyHCF9I1ps72q
 pSlhMaLEwvQSYyre6iFTG5uuvyIPWv3LOkaBEwMSA5B/HXuEb2RKHThYzS9dc68=
 =OwW7
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging

pci, pc, virtio: fixes, features

VTD fixes
IR and split irqchip are now the default for Q35
ACPI refactoring
hotplug refactoring
new names for virtio devices
multiple pcie link width/speeds
PCI fixes

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Thu 20 Dec 2018 18:26:03 GMT
# gpg:                using RSA key 281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream: (44 commits)
  x86-iommu: turn on IR by default if proper
  x86-iommu: switch intr_supported to OnOffAuto type
  q35: set split kernel irqchip as default
  pci: Adjust PCI config limit based on bus topology
  spapr_pci: perform unplug via the hotplug handler
  pci/shpc: perform unplug via the hotplug handler
  pci: Reuse pci-bridge hotplug handler handlers for pcie-pci-bridge
  pci/pcie: perform unplug via the hotplug handler
  pci/pcihp: perform unplug via the hotplug handler
  pci/pcihp: overwrite hotplug handler recursively from the start
  pci/pcihp: perform check for bus capability in pre_plug handler
  s390x/pci: rename hotplug handler callbacks
  pci/shpc: rename hotplug handler callbacks
  pci/pcie: rename hotplug handler callbacks
  hw/i386: Remove deprecated machines pc-0.10 and pc-0.11
  hw: acpi: Remove AcpiRsdpDescriptor and fix tests
  hw: acpi: Export and share the ARM RSDP build
  hw: arm: Support both legacy and current RSDP build
  hw: arm: Convert the RSDP build to the buid_append_foo() API
  hw: arm: Carry RSDP specific data through AcpiRsdpData
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-12-21 14:06:01 +00:00
Cédric Le Goater
34a6b015a9 spapr: change default CPU type to POWER9
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-12-21 09:40:43 +11:00
Cédric Le Goater
3ba3d0bc33 spapr: introduce an 'ic-mode' machine option
This option is used to select the interrupt controller mode (XICS or
XIVE) with which the machine will operate. XICS being the default
mode for now.

When running a machine with the XIVE interrupt mode backend, the guest
OS is required to have support for the XIVE exploitation mode. In the
case of legacy OS, the mode selected by CAS should be XICS and the OS
should fail to boot. However, QEMU could possibly detect it, terminate
the boot process and reset to stop in the SLOF firmware. This is not
yet handled.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-12-21 09:40:43 +11:00
Cédric Le Goater
db592b5b16 spapr: add an extra OV5 field to the sPAPR IRQ backend
The interrupt modes supported by the hypervisor are advertised to the
guest with new bits definitions of the option vector 5 of property
"ibm,arch-vec-5-platform-support. The byte 23 bits 0-1 of the OV5 are
defined as follow :

  0b00   PAPR 2.7 and earlier (Legacy systems)
  0b01   XIVE Exploitation mode only
  0b10   Either available

If the client/guest selects the XIVE interrupt mode, it informs the
hypervisor by returning the value 0b01 in byte 23 bits 0-1. A 0b00
value indicates the use of the XICS interrupt mode (Legacy systems).

The sPAPR IRQ backend is extended with these definitions and the
values are directly used to populate the "ibm,arch-vec-5-platform-support"
property. The interrupt mode is advertised under TCG and under KVM.
Although a KVM XIVE device is not yet available, the machine can still
operate with kernel_irqchip=off. However, we apply a restriction on
the CPU which is required to be a POWER9 when a XIVE interrupt
controller is in use.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-12-21 09:40:43 +11:00
Cédric Le Goater
b2e2247716 spapr: add a 'reset' method to the sPAPR IRQ backend
For the time being, the XIVE reset handler updates the OS CAM line of
the vCPU as it is done under a real hypervisor when a vCPU is
scheduled to run on a HW thread. This will let the XIVE presenter
engine find a match among the NVTs dispatched on the HW threads.

This handler will become even more useful when we introduce the
machine supporting both interrupt modes, XIVE and XICS. In this
machine, the interrupt mode is chosen by the CAS negotiation process
and activated after a reset.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
[dwg: Fix style nits]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-12-21 09:40:35 +11:00
Cédric Le Goater
1c53b06c03 spapr: extend the sPAPR IRQ backend for XICS migration
Introduce a new sPAPR IRQ handler to handle resend after migration
when the machine is using a KVM XICS interrupt controller model.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-12-21 09:39:13 +11:00
Cédric Le Goater
1a937ad7e7 spapr: allocate the interrupt thread context under the CPU core
Each interrupt mode has its own specific interrupt presenter object,
that we store under the CPU object, one for XICS and one for XIVE.

Extend the sPAPR IRQ backend with a new handler to support them both.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-12-21 09:39:13 +11:00
Cédric Le Goater
6e21de4a50 spapr: add device tree support for the XIVE exploitation mode
The XIVE interface for the guest is described in the device tree under
the "interrupt-controller" node. A couple of new properties are
specific to XIVE :

 - "reg"

   contains the base address and size of the thread interrupt
   managnement areas (TIMA), for the User level and for the Guest OS
   level. Only the Guest OS level is taken into account today.

 - "ibm,xive-eq-sizes"

   the size of the event queues. One cell per size supported, contains
   log2 of size, in ascending order.

 - "ibm,xive-lisn-ranges"

   the IRQ interrupt number ranges assigned to the guest for the IPIs.

and also under the root node :

 - "ibm,plat-res-int-priorities"

   contains a list of priorities that the hypervisor has reserved for
   its own use. OPAL uses the priority 7 queue to automatically
   escalate interrupts for all other queues (DD2.X POWER9). So only
   priorities [0..6] are allowed for the guest.

Extend the sPAPR IRQ backend with a new handler to populate the DT
with the appropriate "interrupt-controller" node.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
[dwg: Fix style nits]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-12-21 09:39:07 +11:00
Cédric Le Goater
23bcd5eb9a spapr: add hcalls support for the XIVE exploitation interrupt mode
The different XIVE virtualization structures (sources and event queues)
are configured with a set of Hypervisor calls :

 - H_INT_GET_SOURCE_INFO

   used to obtain the address of the MMIO page of the Event State
   Buffer (ESB) entry associated with the source.

 - H_INT_SET_SOURCE_CONFIG

   assigns a source to a "target".

 - H_INT_GET_SOURCE_CONFIG

   determines which "target" and "priority" is assigned to a source

 - H_INT_GET_QUEUE_INFO

   returns the address of the notification management page associated
   with the specified "target" and "priority".

 - H_INT_SET_QUEUE_CONFIG

   sets or resets the event queue for a given "target" and "priority".
   It is also used to set the notification configuration associated
   with the queue, only unconditional notification is supported for
   the moment. Reset is performed with a queue size of 0 and queueing
   is disabled in that case.

 - H_INT_GET_QUEUE_CONFIG

   returns the queue settings for a given "target" and "priority".

 - H_INT_RESET

   resets all of the guest's internal interrupt structures to their
   initial state, losing all configuration set via the hcalls
   H_INT_SET_SOURCE_CONFIG and H_INT_SET_QUEUE_CONFIG.

 - H_INT_SYNC

   issue a synchronisation on a source to make sure all notifications
   have reached their queue.

Calls that still need to be addressed :

   H_INT_SET_OS_REPORTING_LINE
   H_INT_GET_OS_REPORTING_LINE

See the code for more documentation on each hcall.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
[dwg: Folded in fix for field accessors]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-12-21 09:37:38 +11:00