Commit Graph

102731 Commits

Author SHA1 Message Date
Igor Mammedov
22c8dd000f tests: acpi: add device with acpi-index on non-hotpluggble bus
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230302161543.286002-21-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:39:00 -05:00
Igor Mammedov
52ad9eb237 tests: acpi: whitelist DSDT before adding device with acpi-index to testcases
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230302161543.286002-20-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:39:00 -05:00
Igor Mammedov
bda649537c tests: acpi: update expected blobs
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230302161543.286002-19-imammedo@redhat.com>
2023-03-07 12:39:00 -05:00
Igor Mammedov
fe0d5f5319 acpi: pci: add EDSM method to DSDT
it's a helper method for acpi-index support on PCI buses
that do no support or have disabled ACPI PCI hotplug
or for non-hotpluggble endpoint devices.
(like non-hotpluggble NICs, integrated endpoints and
later for machines that do not support ACPI PCI hotplug)

no functional change, commit adds only EDSM method in DSDT
without any users. (the follow up patches will use it)

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230302161543.286002-18-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:39:00 -05:00
Igor Mammedov
d6cfe1d834 tests: acpi: whitelist DSDT before adding EDSM method
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230302161543.286002-17-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:39:00 -05:00
Igor Mammedov
e9ea452237 tests: acpi: update expected blobs
only following context change:
 -  Local1 = Zero
    If ((Arg0 != ToUUID ("e5c937d0-3553-4d7a-9117-ea4d19c3434d") /* Device Labeling Interface */))
    {
        Return (Local0)
 ...
        Return (Local0)
    }

 +  Local1 = Zero
    Local2 = AIDX (DerefOf (Arg4 [Zero]), DerefOf (Arg4 [One]

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230302161543.286002-16-imammedo@redhat.com>
2023-03-07 12:39:00 -05:00
Igor Mammedov
0a4584fca3 pcihp: move PCI _DSM function 0 prolog into separate function
it will be reused by follow up patches that will implement
static _DSM for non-hotpluggable devices.

no functional AML change, only context one, where 'cap' (Local1)
initialization is moved after UUID/revision checks.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230302161543.286002-15-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:39:00 -05:00
Igor Mammedov
bd95cd5323 tests: acpi: whitelist DSDT blobs before isolating PCI _DSM func 0 prolog
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230302161543.286002-14-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Igor Mammedov
ceefa0b746 pci: fix 'hotplugglable' property behavior
Currently the property may flip its state
during VM bring up or just doesn't work as
the name implies.

In particular with PCIE root port that has
'hotplug={on|off}' property, and when it's
turned off, one would expect
  'hotpluggable' == false
for any devices attached to it.
Which is not the case since qbus_is_hotpluggable()
used by the property just checks for presence
of any hotplug_handler set on bus.

The problem is that name BusState::hotplug_handler
from its inception is misnomer, as it handles
not only hotplug but also in many cases coldplug
as well (i.e. generic wiring interface), and
it's fine to have hotplug_handler set on bus
while it doesn't support hotplug (ex. pcie-slot
with hotplug=off).

Another case of root port flipping 'hotpluggable'
state when ACPI PCI hotplug is enabled in this
case root port with 'hotplug=off' starts as
hotpluggable and then later on, pcihp
hotplug_handler clears hotplug_handler
explicitly after checking root port's 'hotplug'
property.

So root-port hotpluggablity check sort of works
if pcihp is enabled but is broken if pcihp is
disabled.

One way to deal with the issue is to ask
hotplug_handler if bus it controls is hotpluggable
or not. To do that add is_hotpluggable_bus()
hook to HotplugHandler interface and use it in
'hotpluggable' property + teach pcie-slot to
actually look into 'hotplug' property state
before deciding if bus is hotpluggable.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230302161543.286002-13-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Igor Mammedov
f40e6a4cc1 pcihp: piix4: do not redirect hotplug controller to piix4 when ACPI hotplug is disabled
commit [1] added ability to disable ACPI PCI hotplug
on hostbridge but forgot to take into account that it
should disable all ACPI hotplug machinery in case both
hostbridge and bridge hotplug are disabled.

Commit [2] tried to fix that, however it forgot to
remove hotplug_handler override which hands hotplug
control over to piix4 hotplug controller
(uninitialized after [2]).

As result at the time bridge is plugged in, its default
(SHPC) hotplug handler is replaced by piix4 one in
  acpi_pcihp_device_plug_cb()
    ...
    if (!s->legacy_piix &&
       ...
       qbus_set_hotplug_handler(BUS(sec), OBJECT(hotplug_dev));

which is acting on uninitialized s->legacy_piix value
(0 by default) that was supposed to be initialized by
acpi_pcihp_init(), that is no longer called due to
following condition being false:

  piix4_acpi_system_hot_add_init()
    if (s->use_acpi_hotplug_bridge || s->use_acpi_root_pci_hotplug) {

and the bridge ends up with piix4 as hotplug handler
instead of shpc one.

Followup hotplug on that bridge as result yields
piix4 specific error:

  Error: Unsupported bus. Bus doesn't have property 'acpi-pcihp-bsel' set

1) 3d7e78aa77 (Introduce a new flag for i440fx to disable PCI hotplug on the root bus)
2) df4008c9c5 (piix4: don't reserve hw resources when hotplug is off globally)

Fixes: df4008c9c5 (piix4: don't reserve hw resources when hotplug is off globally)
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230302161543.286002-12-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Igor Mammedov
30216b3eaf tests: acpi: update expected blobs
BNUM numbering changes across DSDT due to addition of new bridges.

Fixed missing PCI tree brunch (q35/DSDT.multi-bridge case):

  //  -device pcie-root-port,id=rpnohp,chassis=8,addr=0xA.0,hotplug=off
  +            Device (S50)
  +            {
  +                Name (_ADR, 0x000A0000)  // _ADR: Address
  //  -device pcie-root-port,id=rp3,chassis=9,bus=rpnohp
  +                Device (S00)
  +                {
  +                    Name (_ADR, Zero)  // _ADR: Address
  +                    Name (BSEL, Zero)
  +                    Device (S00)
  +                    {
  +                        Name (_ADR, Zero)  // _ADR: Address
  +                        Name (ASUN, Zero)
  +                        Method (_DSM, 4, Serialized)  // _DSM: Device-Specific Method
  +                        {
  +                            Local0 = Package (0x02)
  +                                {
  +                                    BSEL,
  +                                    ASUN
  +                                }
  +                            Return (PDSM (Arg0, Arg1, Arg2, Arg3, Local0))
  +                        }
  +
  +                        Name (_SUN, Zero)  // _SUN: Slot User Number
  +                        Method (_EJ0, 1, NotSerialized)  // _EJx: Eject Device
  +                        {
  +                            PCEJ (BSEL, _SUN)
  +                        }
  +                    }
  +
  +                    Method (DVNT, 2, NotSerialized)
  +                    {
  +                        If ((Arg0 & One))
  +                        {
  +                            Notify (S00, Arg1)
  +                        }
  +                    }
  +                }
  +            }

Fixed hotplug notification for leaf root port (hotplug=on) attached to
intermediate root port (hotplug=off) (q35/DSDT.multi-bridge case)

  //  -device pcie-root-port,id=rpnohp,chassis=8,addr=0xA.0,hotplug=off
  +        Scope (S50)
  +        {
  //  -device pcie-root-port,id=rp3,chassis=9,bus=rpnohp
  +            Scope (S00)
  +            {
  +                Method (PCNT, 0, NotSerialized)
  +                {
  +                    BNUM = Zero
  +                    DVNT (PCIU, One)
  +                    DVNT (PCID, 0x03)
  +                }
  +            }
  +
  +            Method (PCNT, 0, NotSerialized)
  +            {
  +                ^S00.PCNT ()
  +            }
  +        }
  ...
           Method (PCNT, 0, NotSerialized)
           {
  +            ^S50.PCNT ()
               ^S13.PCNT ()

Populated slots being described on coldplugged bridges even if
ACPI bridge hotplug is disabled.
(pc/DSDT.hpbridge and pc/DSDT.hpbrroot)
  ...
               Device (S18)
               {
                   Name (_ADR, 0x00030000)  // _ADR: Address
  +                Device (S08)
  +                {
  +                    Name (_ADR, 0x00010000)  // _ADR: Address
  +                }
  +
  +                Device (S10)
  +                {
  +                    Name (_ADR, 0x00020000)  // _ADR: Address
  +                }
               }
  ...
               Device (S18)
               {
                   Name (_ADR, 0x00030000)  // _ADR: Address
  +                Device (S00)
  +                {
  +                    Name (_ADR, Zero)  // _ADR: Address
  +                }
               }

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230302161543.286002-11-imammedo@redhat.com>
2023-03-07 12:38:59 -05:00
Igor Mammedov
0e84fd3b98 x86: pcihp: fix missing bridge AML when intermediate root-port has 'hotplug=off' set
(I practice [1] hasn't broke anything since on hardware side we unset
hotplug_handler on such intermediate port => hotplug behind it has
never worked)

When deciding if bridge should be described, the original
condition was

  cold_plugged_bridge && pcihp_bridge_en

which was replaced [1] by

  bridge has ACPI_PCIHP_PROP_BSEL

the later however is not the same thing as the original
and flips to false if intermediate bridge has hotplug
turned off (root-port with 'hotplug=off' option).

Since we already in build_pci_bridge_aml(), the question
if it's bridge is answered. Use DeviceState::hotplugged
to make decision if bridge should describe its slots.

What's left out is pcihp_bridge_en, which tells us if
ACPI bridge hotplug is enabled.

With hotplug and non hotplug part now being mostly
separated, omitting this check will only lead to
colplugged bridges describe occupied slots in case
when ACPI bridge hotplug is disabled.
Which makes behavior consistent with occupied slots
on hostbridge.

Ex (pc/DSDT.hpbrroot diff):
  ...
               Device (S20)
               {
                   Name (_ADR, 0x00040000)  // _ADR: Address
  +                Device (S08)
  +                {
  +                    Name (_ADR, 0x00010000)  // _ADR: Address
  +                }
  +
  +                Device (S10)
  +                {
  +                    Name (_ADR, 0x00020000)  // _ADR: Address
  +                }
               }
  ...

PS:
testing shows that above doesn't affect adversely guest OS
behavior: i.e. if ACPI bridge hotplug is enabled it's
expected behaviour, and with ACPI bridge hotplug is disabled
(a.k. native hotplug), it doesn't break slot enumeration
nor native hotplug. (tested with RHEL9.0 and WS2022).

1)
Fixes: 6c36ec46b0 ("pcihp: make bridge describe itself using AcpiDevAmlIfClass:build_dev_aml")
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230302161543.286002-10-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Igor Mammedov
1c103f35d1 tests: acpi: whitelist pc/DSDT.hpbrroot and pc/DSDT.hpbridge tests
follow up fix for missing root-port AML will affect these tests
by adding non-hotpluggable Device descriptors of colplugged
bridges when bridge hotplug is disabled.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230302161543.286002-9-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Igor Mammedov
11215a349e x86: pcihp: fix missing PCNT callchain when intermediate root-port has 'hotplug=off' set
Beside BSEL numbers change (due to 2 extra root-ports in q35/miltibridge test),
following change is expected:

       Scope (\_SB.PCI0)
       {
  ...
  +        Scope (S50)
  +        {
  +            Scope (S00)
  +            {
  +                Method (PCNT, 0, NotSerialized)
  +                {
  +                    BNUM = Zero
  +                    DVNT (PCIU, One)
  +                    DVNT (PCID, 0x03)
  +                }
  +            }
  +
  +            Method (PCNT, 0, NotSerialized)
  +            {
  +                ^S00.PCNT
  +            }
  +        }
  ...
           Method (PCNT, 0, NotSerialized)
           {
  +            ^S50.PCNT ()
               ^S13.PCNT ()
               ^S12.PCNT ()
               ^S11.PCNT ()

I practice [1] hasn't broke anything since on hardware side we unset
hotplug_handler on such intermediate port => hotplug behind it has
not been properly wired and as result not worked.

1)
Fixes: ddab4d3fae ("pcihp: compose PCNT callchain right before its user _GPE._E01")
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230302161543.286002-8-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Igor Mammedov
0c3bf7c431 tests: acpi: extend multi-bridge case with case 'root-port,id=HOHP,hotplug=off root-port,bus=NOHP'
Following corner case wasn't covered:

  -device pcie-root-port,id=NO_HOTPLUG,hotplug=off
  -device pcie-root-port,bus=NO_HOTPLUG

when intermediate root-port has explicitly disabled hotplug,
all hierarchy below it is not described anymore (used to be
described in 7.2)

So as result we see only NO_HOTPLUG root-port described

  +            Device (S50)
  +            {
  +                Name (_ADR, 0x000A0000)  // _ADR: Address
  +            }

and no children nor notification chain for them are being composed.
Follow up patches will fix missing leaf root-port descriptor
and notification chain that should accompany it.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230302161543.286002-7-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Igor Mammedov
0ece4e3bc3 tests: acpi: whitelist q35/DSDT.multi-bridge before extending testcase
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230302161543.286002-6-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Igor Mammedov
6bf2d446d4 tests: acpi: update expected blobs
expected changes:
Basically adds devices present on root bus in form:
  Device (SXX)
  {
     Name (_ADR, 0xYYYYYYYY)  // _ADR: Address
  }

On top of that For q35.noacpihp, all ACPI PCI hotplug
AML is removed and _OSC get native hotplug enabled:

                       CreateDWordField (Arg3, 0x04, CDW2)
                       CreateDWordField (Arg3, 0x08, CDW3)
                       Local0 = CDW3 /* \_SB_.PCI0._OSC.CDW3 */
  -                    Local0 &= 0x1E
  +                    Local0 &= 0x1F
                       If ((Arg1 != One))
                       {
                           CDW1 |= 0x08
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230302161543.286002-5-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Igor Mammedov
b0b3b99e5f tests: acpi: add test_acpi_q35_tcg_no_acpi_hotplug test and extend test_acpi_piix4_no_acpi_pci_hotplug
test bridge AML generator with ACPI PCI hotplug disabled
(i.e. with native hotplug enabled/disabled per bridge/root port)

PS:
while at make sure that devices on pci-bridge are starting
from addr=1.0 as slot 0 is not available there and test
passes only because of a bug in ACPI hotplug that will be
fixed by follow up patch

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230302161543.286002-4-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Igor Mammedov
d3860a57c7 tests: acpi: whitelist new q35.noacpihp test and pc.hpbrroot
for q35.noacpihp use plain default Q35 DSDT table as a starting point.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230302161543.286002-3-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Igor Mammedov
7cb230788e Revert "tests/qtest: Check for devices in bios-tables-test"
This reverts commit c471eb4f40.

which broke acpi tables test and rebuild due to skipping some tests
even thought none of devices tests depend on weren't disabled.

As result it leads to some expected tables not being updated,
merge conflicts and tests failure.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230302161543.286002-2-imammedo@redhat.com>
Acked-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Eugenio Pérez
ab7337e3b2 vdpa: return VHOST_F_LOG_ALL in vhost-vdpa devices
vhost-vdpa devices can return this feature now that blockers have been
set in case some features are not met.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20230303172445.1089785-15-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Eugenio Pérez
609ab4c3ed vdpa net: allow VHOST_F_LOG_ALL
Since some actions move to the start function instead of init, the
device features may not be the parent vdpa device's, but the one
returned by vhost backend.  If transition to SVQ is supported, the vhost
backend will return _F_LOG_ALL to signal the device is migratable.

Add VHOST_F_LOG_ALL.  HW dirty page tracking can be added on top of this
change if the device supports it in the future.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20230303172445.1089785-14-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Eugenio Pérez
57ac831865 vdpa: block migration if SVQ does not admit a feature
Next patches enable devices to be migrated even if vdpa netdev has not
been started with x-svq. However, not all devices are migratable, so we
need to block migration if we detect that.

Block migration if we detect the device expose a feature SVQ does not
know how to work with.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230303172445.1089785-13-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Eugenio Pérez
5c1ebd4c43 vdpa: block migration if device has unsupported features
A vdpa net device must initialize with SVQ in order to be migratable at
this moment, and initialization code verifies some conditions.  If the
device is not initialized with the x-svq parameter, it will not expose
_F_LOG so the vhost subsystem will block VM migration from its
initialization.

Next patches change this, so we need to verify migration conditions
differently.

QEMU only supports a subset of net features in SVQ, and it cannot
migrate state that cannot track or restore in the destination.  Add a
migration blocker if the device offers an unsupported feature.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230303172445.1089785-12-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Eugenio Pérez
9c363cf6d5 vdpa net: block migration if the device has CVQ
Devices with CVQ need to migrate state beyond vq state.  Leaving this to
future series.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230303172445.1089785-11-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Eugenio Pérez
a230c4712b vdpa: disable RAM block discard only for the first device
Although it does not make a big difference, its more correct and
simplifies the cleanup path in subsequent patches.

Move ram_block_discard_disable(false) call to the top of
vhost_vdpa_cleanup because:
* We cannot use vhost_vdpa_first_dev after dev->opaque = NULL
  assignment.
* Improve the stack order in cleanup: since it is the last action taken
  in init, it should be the first at cleanup.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230303172445.1089785-10-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Eugenio Pérez
6949843046 vdpa: add vdpa net migration state notifier
This allows net to restart the device backend to configure SVQ on it.

Ideally, these changes should not be net specific and they could be done
in:
* vhost_vdpa_set_features (with VHOST_F_LOG_ALL)
* vhost_vdpa_set_vring_addr (with .enable_log)
* vhost_vdpa_set_log_base.

However, the vdpa net backend is the one with enough knowledge to
configure everything because of some reasons:
* Queues might need to be shadowed or not depending on its kind (control
  vs data).
* Queues need to share the same map translations (iova tree).

Also, there are other problems that may have solutions but complicates
the implementation at this stage:
* We're basically duplicating vhost_dev_start and vhost_dev_stop, and
  they could go out of sync.  If we want to reuse them, we need a way to
  skip some function calls to avoid recursiveness (either vhost_ops ->
  vhost_set_features, vhost_set_vring_addr, ...).
* We need to traverse all vhost_dev of a given net device twice: one to
  stop and get the vq state and another one after the reset to
  configure properties like address, fd, etc.

Because of that it is cleaner to restart the whole net backend and
configure again as expected, similar to how vhost-kernel moves between
userspace and passthrough.

If more kinds of devices need dynamic switching to SVQ we can:
* Create a callback struct like VhostOps and move most of the code
  there.  VhostOps cannot be reused since all vdpa backend share them,
  and to personalize just for networking would be too heavy.
* Add a parent struct or link all the vhost_vdpa or vhost_dev structs so
  we can traverse them.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230303172445.1089785-9-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Eugenio Pérez
c3716f260b vdpa: move vhost reset after get vring base
The function vhost.c:vhost_dev_stop calls vhost operation
vhost_dev_start(false). In the case of vdpa it totally reset and wipes
the device, making the fetching of the vring base (virtqueue state) totally
useless.

The kernel backend does not use vhost_dev_start vhost op callback, but
vhost-user do. A patch to make vhost_user_dev_start more similar to vdpa
is desirable, but it can be added on top.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230303172445.1089785-8-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Eugenio Pérez
0bb302a996 vdpa: add vhost_vdpa_suspend
The function vhost.c:vhost_dev_stop fetches the vring base so the vq
state can be migrated to other devices.  However, this is unreliable in
vdpa, since we didn't signal the device to suspend the queues, making
the value fetched useless.

Suspend the device if possible before fetching first and subsequent
vring bases.

Moreover, vdpa totally reset and wipes the device at the last device
before fetch its vrings base, making that operation useless in the last
device. This will be fixed in later patches of this series.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230303172445.1089785-7-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Eugenio Pérez
b6662cb7e5 vdpa: add vhost_vdpa->suspended parameter
This allows vhost_vdpa to track if it is safe to get the vring base from
the device or not.  If it is not, vhost can fall back to fetch idx from
the guest buffer again.

No functional change intended in this patch, later patches will use this
field.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230303172445.1089785-6-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Eugenio Pérez
4241e8bd72 vdpa: rewind at get_base, not set_base
At this moment it is only possible to migrate to a vdpa device running
with x-svq=on. As a protective measure, the rewind of the inflight
descriptors was done at the destination. That way if the source sent a
virtqueue with inuse descriptors they are always discarded.

Since this series allows to migrate also to passthrough devices with no
SVQ, the right thing to do is to rewind at the source so the base of
vrings are correct.

Support for inflight descriptors may be added in the future.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20230303172445.1089785-5-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Eugenio Pérez
d83b494580 vdpa: Negotiate _F_SUSPEND feature
This is needed for qemu to know it can suspend the device to retrieve
its status and enable SVQ with it, so all the process is transparent to
the guest.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20230303172445.1089785-4-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Eugenio Pérez
b276524386 vdpa: Remember last call fd set
As SVQ can be enabled dynamically at any time, it needs to store call fd
always.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230303172445.1089785-3-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Eugenio Pérez
00ef422e9f vdpa net: move iova tree creation from init to start
Only create iova_tree if and when it is needed.

The cleanup keeps being responsible for the last VQ but this change
allows it to merge both cleanup functions.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20230303172445.1089785-2-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
zhenwei pi
2133e07c4c MAINTAINERS: add myself as the maintainer for cryptodev
I developed the akcipher service, QoS setting, QMP/HMP commands and
statistics accounting for crypto device. Making myself as the
maintainer for QEMU's cryptodev.

Cc: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Message-Id: <20230301105847.253084-13-pizhenwei@bytedance.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
zhenwei pi
f2b901098e cryptodev: Support query-stats QMP command
Now we can use "query-stats" QMP command to query statistics of
crypto devices. (Originally this was designed to show statistics
by '{"execute": "query-cryptodev"}'. Daniel Berrangé suggested that
querying configuration info by "query-cryptodev", and querying
runtime performance info by "query-stats". This makes sense!)

Example:
~# virsh qemu-monitor-command vm '{"execute": "query-stats", \
   "arguments": {"target": "cryptodev"} }' | jq
{
  "return": [
    {
      "provider": "cryptodev",
      "stats": [
        {
          "name": "asym-verify-bytes",
          "value": 7680
        },
        ...
        {
          "name": "asym-decrypt-ops",
          "value": 32
        },
        {
          "name": "asym-encrypt-ops",
          "value": 48
        }
      ],
      "qom-path": "/objects/cryptodev0" # support asym only
    },
    {
      "provider": "cryptodev",
      "stats": [
        {
          "name": "asym-verify-bytes",
          "value": 0
        },
        ...
        {
          "name": "sym-decrypt-bytes",
          "value": 5376
        },
        ...
      ],
      "qom-path": "/objects/cryptodev1" # support asym/sym
    }
  ],
  "id": "libvirt-422"
}

Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Message-Id: <20230301105847.253084-12-pizhenwei@bytedance.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
zhenwei pi
2580b452ff cryptodev: support QoS
Add 'throttle-bps' and 'throttle-ops' limitation to set QoS. The
two arguments work with both QEMU command line and QMP command.

Example of QEMU command line:
-object cryptodev-backend-builtin,id=cryptodev1,throttle-bps=1600,\
throttle-ops=100

Example of QMP command:
virsh qemu-monitor-command buster --hmp qom-set /objects/cryptodev1 \
throttle-ops 100

or cancel limitation:
virsh qemu-monitor-command buster --hmp qom-set /objects/cryptodev1 \
throttle-ops 0

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Message-Id: <20230301105847.253084-11-pizhenwei@bytedance.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
zhenwei pi
e7a775fd9f cryptodev: Account statistics
Account OPS/BPS for crypto device, this will be used for 'query-stats'
QEMU monitor command and QoS in the next step.

Note that a crypto device may support symmetric mode, asymmetric mode,
both symmetric and asymmetric mode. So we use two structure to
describe the statistics of a crypto device.

Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Message-Id: <20230301105847.253084-10-pizhenwei@bytedance.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2023-03-07 12:38:59 -05:00
zhenwei pi
2cb0692768 cryptodev: Use CryptoDevBackendOpInfo for operation
Move queue_index, CryptoDevCompletionFunc and opaque into struct
CryptoDevBackendOpInfo, then cryptodev_backend_crypto_operation()
needs an argument CryptoDevBackendOpInfo *op_info only. And remove
VirtIOCryptoReq from cryptodev. It's also possible to hide
VirtIOCryptoReq into virtio-crypto.c in the next step. (In theory,
VirtIOCryptoReq is a private structure used by virtio-crypto only)

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Message-Id: <20230301105847.253084-9-pizhenwei@bytedance.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
zhenwei pi
ef52091aeb hmp: add cryptodev info command
Example of this command:
 # virsh qemu-monitor-command vm --hmp info cryptodev
cryptodev1: service=[akcipher|mac|hash|cipher]
    queue 0: type=builtin
cryptodev0: service=[akcipher]
    queue 0: type=lkcf

Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Message-Id: <20230301105847.253084-8-pizhenwei@bytedance.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
zhenwei pi
abca0fc329 cryptodev-builtin: Detect akcipher capability
Rather than exposing akcipher service/RSA algorithm to virtio crypto
device unconditionally, detect akcipher capability from akcipher
crypto framework. This avoids unsuccessful requests.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Message-Id: <20230301105847.253084-7-pizhenwei@bytedance.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
zhenwei pi
5dcb019810 cryptodev: Introduce 'query-cryptodev' QMP command
Now we have a QMP command to query crypto devices:
virsh qemu-monitor-command vm '{"execute": "query-cryptodev"}' | jq
{
  "return": [
    {
      "service": [
        "akcipher",
        "mac",
        "hash",
        "cipher"
      ],
      "id": "cryptodev1",
      "client": [
        {
          "queue": 0,
          "type": "builtin"
        }
      ]
    },
    {
      "service": [
        "akcipher"
      ],
      "id": "cryptodev0",
      "client": [
        {
          "queue": 0,
          "type": "lkcf"
        }
      ]
    }
  ],
  "id": "libvirt-417"
}

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Message-Id: <20230301105847.253084-6-pizhenwei@bytedance.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
zhenwei pi
bc304a6442 cryptodev: Introduce server type in QAPI
Introduce cryptodev service type in cryptodev.json, then apply this
to related codes. Now we can remove VIRTIO_CRYPTO_SERVICE_xxx
dependence from QEMU cryptodev.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Message-Id: <20230301105847.253084-5-pizhenwei@bytedance.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
zhenwei pi
999c789f00 cryptodev: Introduce cryptodev alg type in QAPI
Introduce cryptodev alg type in cryptodev.json, then apply this to
related codes, and drop 'enum CryptoDevBackendAlgType'.

There are two options:
1, { 'enum': 'QCryptodevBackendAlgType',
  'prefix': 'CRYPTODEV_BACKEND_ALG',
  'data': ['sym', 'asym']}
Then we can keep 'CRYPTODEV_BACKEND_ALG_SYM' and avoid lots of
changes.
2, changes in this patch(with prefix 'QCRYPTODEV_BACKEND_ALG').

To avoid breaking the rule of QAPI, use 2 here.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Message-Id: <20230301105847.253084-4-pizhenwei@bytedance.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
zhenwei pi
3f478371fd cryptodev: Remove 'name' & 'model' fields
We have already used qapi to generate crypto device types, this allows
to convert type to a string 'model', so the 'model' field is not
needed.

And the 'name' field is not used by any backend driver, drop it.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Message-Id: <20230301105847.253084-3-pizhenwei@bytedance.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
zhenwei pi
14c9fd1673 cryptodev: Introduce cryptodev.json
Introduce QCryptodevBackendType in cryptodev.json, also apply this to
related codes. Then we can drop 'enum CryptoDevBackendOptionsType'.

Note that `CRYPTODEV_BACKEND_TYPE_NONE` is *NOT* used by anywhere, so
drop it(no 'none' enum in QCryptodevBackendType).

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Message-Id: <20230301105847.253084-2-pizhenwei@bytedance.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Avihai Horon
333f988d19 docs/devel: Document VFIO device dirty page tracking
Adjust the VFIO dirty page tracking documentation and add a section to
describe device dirty page tracking.

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/r/20230307125450.62409-16-joao.m.martins@oracle.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-03-07 10:21:22 -07:00
Joao Martins
95b29658b6 vfio/migration: Query device dirty page tracking support
Now that everything has been set up for device dirty page tracking,
query the device for device dirty page tracking support.

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/r/20230307125450.62409-15-joao.m.martins@oracle.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-03-07 10:21:22 -07:00
Joao Martins
e46883204c vfio/migration: Block migration with vIOMMU
Migrating with vIOMMU will require either tracking maximum
IOMMU supported address space (e.g. 39/48 address width on Intel)
or range-track current mappings and dirty track the new ones
post starting dirty tracking. This will be done as a separate
series, so add a live migration blocker until that is fixed.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/r/20230307125450.62409-14-joao.m.martins@oracle.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-03-07 10:21:22 -07:00
Joao Martins
b153402a89 vfio/common: Add device dirty page bitmap sync
Add device dirty page bitmap sync functionality. This uses the device
DMA logging uAPI to sync dirty page bitmap from the device.

Device dirty page bitmap sync is used only if all devices within a
container support device dirty page tracking.

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/r/20230307125450.62409-13-joao.m.martins@oracle.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-03-07 10:21:22 -07:00