Commit Graph

979 Commits

Author SHA1 Message Date
Stefan Hajnoczi
9445e1e15e virtio-blk-pci: default num_queues to -smp N
Automatically size the number of virtio-blk-pci request virtqueues to
match the number of vCPUs.  Other transports continue to default to 1
request virtqueue.

A 1:1 virtqueue:vCPU mapping ensures that completion interrupts are
handled on the same vCPU that submitted the request.  No IPI is
necessary to complete an I/O request and performance is improved.  The
maximum number of MSI-X vectors and virtqueues limit are respected.

Performance improves from 78k to 104k IOPS on a 32 vCPU guest with 101
virtio-blk-pci devices (ioengine=libaio, iodepth=1, bs=4k, rw=randread
with NVMe storage).

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Message-Id: <20200818143348.310613-7-stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-08-27 08:29:13 -04:00
Stefan Hajnoczi
6a55882284 virtio-scsi-pci: default num_queues to -smp N
Automatically size the number of virtio-scsi-pci, vhost-scsi-pci, and
vhost-user-scsi-pci request virtqueues to match the number of vCPUs.
Other transports continue to default to 1 request virtqueue.

A 1:1 virtqueue:vCPU mapping ensures that completion interrupts are
handled on the same vCPU that submitted the request.  No IPI is
necessary to complete an I/O request and performance is improved.  The
maximum number of MSI-X vectors and virtqueues limit are respected.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20200818143348.310613-6-stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-08-27 08:29:13 -04:00
Stefan Hajnoczi
4e5163bd84 virtio-scsi: introduce a constant for fixed virtqueues
The event and control virtqueues are always present, regardless of the
multi-queue configuration.  Define a constant so that virtqueue number
calculations are easier to read.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Message-Id: <20200818143348.310613-5-stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-08-27 08:29:13 -04:00
Stefan Hajnoczi
1436f32a84 virtio-pci: add virtio_pci_optimal_num_queues() helper
Multi-queue devices achieve the best performance when each vCPU has a
dedicated queue. This ensures that virtqueue used notifications are
handled on the same vCPU that submitted virtqueue buffers.  When another
vCPU handles the the notification an IPI will be necessary to wake the
submission vCPU and this incurs a performance overhead.

Provide a helper function that virtio-pci devices will use in later
patches to automatically select the optimal number of queues.

The function handles guests with large numbers of CPUs by limiting the
number of queues to fit within the following constraints:
1. The maximum number of MSI-X vectors.
2. The maximum number of virtqueues.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20200818143348.310613-4-stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-08-27 08:29:13 -04:00
Marc-André Lureau
be786d2f4a meson: convert hw/virtio
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-08-21 06:30:26 -04:00
Paolo Bonzini
2becc36a3e meson: infrastructure for building emulators
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-08-21 06:30:17 -04:00
Paolo Bonzini
243af0225a trace: switch position of headers to what Meson requires
Meson doesn't enjoy the same flexibility we have with Make in choosing
the include path.  In particular the tracing headers are using
$(build_root)/$(<D).

In order to keep the include directives unchanged,
the simplest solution is to generate headers with patterns like
"trace/trace-audio.h" and place forwarding headers in the source tree
such that for example "audio/trace.h" includes "trace/trace-audio.h".

This patch is too ugly to be applied to the Makefiles now.  It's only
a way to separate the changes to the tracing header files from the
Meson rewrite of the tracing logic.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-08-21 06:18:24 -04:00
Bruce Rogers
facc68516a virtio-mem: Correct format specifier mismatch for RISC-V
This likely affects other, less popular host architectures as well.
Less common host architectures under linux get QEMU_VMALLOC_ALIGN (from
which VIRTIO_MEM_MIN_BLOCK_SIZE is derived) define to a variable of
type uintptr, which isn't compatible with the format specifier used to
print a user message. Since this particular usage of the underlying data
seems unique to this file, the simple fix is to just cast
QEMU_VMALLOC_ALIGN to uint32_t, which corresponds to the format specifier
used.

Signed-off-by: Bruce Rogers <brogers@suse.com>
Message-Id: <20200730130519.168475-1-brogers@suse.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
2020-08-04 11:48:17 -04:00
Yuri Benditovich
a48aaf882b virtio-pci: fix wrong index in virtio_pci_queue_enabled
We should use the index passed by the caller instead of the queue_sel
when checking the enablement of a specific virtqueue.

This is reported in https://bugzilla.redhat.com/show_bug.cgi?id=1702608

Fixes: f19bcdfedd ("virtio-pci: implement queue_enabled method")
Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2020-07-28 16:54:46 +08:00
Laurent Vivier
0c9753ebda virtio-pci: fix virtio_pci_queue_enabled()
In legacy mode, virtio_pci_queue_enabled() falls back to
virtio_queue_enabled() to know if the queue is enabled.

But virtio_queue_enabled() calls again virtio_pci_queue_enabled()
if k->queue_enabled is set. This ends in a crash after a stack
overflow.

The problem can be reproduced with
"-device virtio-net-pci,disable-legacy=off,disable-modern=true
 -net tap,vhost=on"

And a look to the backtrace is very explicit:

    ...
    #4  0x000000010029a438 in virtio_queue_enabled ()
    #5  0x0000000100497a9c in virtio_pci_queue_enabled ()
    ...
    #130902 0x000000010029a460 in virtio_queue_enabled ()
    #130903 0x0000000100497a9c in virtio_pci_queue_enabled ()
    #130904 0x000000010029a460 in virtio_queue_enabled ()
    #130905 0x0000000100454a20 in vhost_net_start ()
    ...

This patch fixes the problem by introducing a new function
for the legacy case and calls it from virtio_pci_queue_enabled().
It also calls it from virtio_queue_enabled() to avoid code duplication.

Fixes: f19bcdfedd ("virtio-pci: implement queue_enabled method")
Cc: Jason Wang <jasowang@redhat.com>
Cc: Cindy Lu <lulu@redhat.com>
CC: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20200727153319.43716-1-lvivier@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-07-27 11:34:50 -04:00
Cindy Lu
386494f29e vhost-vdpa :Fix Coverity CID 1430270 / CID 1420267
In the function vhost_vdpa_dma_map/unmap, The struct msg was not initialized all its fields.

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20200710064642.24505-1-lulu@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-07-27 10:28:28 -04:00
Raphael Norwitz
4fdecf0543 Fix vhost-user buffer over-read on ram hot-unplug
The VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS vhost-user protocol
feature introduced a shadow-table, used by the backend to dynamically
determine how a vdev's memory regions have changed since the last
vhost_user_set_mem_table() call. On hot-remove, a memmove() operation
is used to overwrite the removed shadow region descriptor(s). The size
parameter of this memmove was off by 1 such that if a VM with a backend
supporting the VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS filled it's
shadow-table (by performing the maximum number of supported hot-add
operatons) and attempted to remove the last region, Qemu would read an
out of bounds value and potentially crash.

This change fixes the memmove() bounds such that this erroneous read can
never happen.

Signed-off-by: Peter Turschmid <peter.turschm@nutanix.com>
Signed-off-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Message-Id: <1594799958-31356-1-git-send-email-raphael.norwitz@nutanix.com>
Fixes: f1aeb14b08 ("Transmit vhost-user memory regions individually")
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-07-27 10:28:28 -04:00
David Hildenbrand
c22aba1963 virtio-mem-pci: force virtio version 1
Trying to run simple virtio-mem-pci examples currently fails with
    qemu-system-x86_64: -device virtio-mem-pci,id=vm0,memdev=mem0,node=0,
    requested-size=300M: device is modern-only, use disable-legacy=on
due to the added safety checks in 9b3a35ec82 ("virtio: verify that legacy
support is not accidentally on").

As noted by Conny, we have to force virtio version 1. While at it, use
qdev_realize() to set the parent bus and realize - like most other
virtio-*-pci implementations.

Fixes: 0b9a2443a4 ("virtio-pci: Proxy for virtio-mem")
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20200727115905.129397-1-david@redhat.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-07-27 09:38:33 -04:00
Andrew Melnychenko
ccec7e9603 virtio-pci: Changed vdev to proxy for VirtIO PCI BAR callbacks.
There is an issue when callback may be called with invalid vdev.
It happens on unplug when vdev already deleted and VirtIOPciProxy is not.
So now, callbacks accept proxy device, and vdev retrieved from it.
Technically memio callbacks should be removed during the flatview update,
but memoryregions remain til PCI device(and it's address space) completely deleted.
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1716352

Signed-off-by: Andrew Melnychenko <andrew@daynix.com>
Message-Id: <20200706112123.971087-1-andrew@daynix.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-07-22 08:05:37 -04:00
Cornelia Huck
9b3a35ec82 virtio: verify that legacy support is not accidentally on
If a virtio device does not have legacy support, make sure that
it is actually off, and bail out if not.

For virtio-pci, this means that any device without legacy support
that has been specified to modern-only (or that has been forced
to it) will work.

For virtio-ccw, this duplicates the check that is currently done
prior to realization for any device that explicitly specified no
support for legacy.

This catches devices that have not been fenced properly.

Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20200707105446.677966-3-cohuck@redhat.com>
Cc: qemu-stable@nongnu.org
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-07-22 07:57:07 -04:00
Cornelia Huck
7c78bdd7a3 virtio: list legacy-capable devices
Several types of virtio devices had already been around before the
virtio standard was specified. These devices support virtio in legacy
(and transitional) mode.

Devices that have been added in the virtio standard are considered
non-transitional (i.e. with no support for legacy virtio).

Provide a helper function so virtio transports can figure that out
easily.

Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20200707105446.677966-2-cohuck@redhat.com>
Cc: qemu-stable@nongnu.org
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-07-22 07:57:07 -04:00
Alexander Duyck
3219b42f02 virtio-balloon: Replace free page hinting references to 'report' with 'hint'
Recently a feature named Free Page Reporting was added to the virtio
balloon. In order to avoid any confusion we should drop the use of the word
'report' when referring to Free Page Hinting. So what this patch does is go
through and replace all instances of 'report' with 'hint" when we are
referring to free page hinting.

Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Message-Id: <20200720175128.21935.93927.stgit@localhost.localdomain>
Cc: qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-07-22 07:57:07 -04:00
Alexander Duyck
1a83e0b9c4 virtio-balloon: Add locking to prevent possible race when starting hinting
There is already locking in place when we are stopping free page hinting
but there is not similar protections in place when we start. I can only
assume this was overlooked as in most cases the page hinting should not be
occurring when we are starting the hinting, however there is still a chance
we could be processing hints by the time we get back around to restarting
the hinting so we are better off making sure to protect the state with the
mutex lock rather than just updating the value with no protections.

Based on feedback from Peter Maydell this issue had also been spotted by
Coverity: CID 1430269

Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Message-Id: <20200720175122.21935.78013.stgit@localhost.localdomain>
Cc: qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-07-22 07:57:07 -04:00
Alexander Duyck
20a4da0f23 virtio-balloon: Prevent guest from starting a report when we didn't request one
Based on code review it appears possible for the driver to force the device
out of a stopped state when hinting by repeating the last ID it was
provided.

Prevent this by only allowing a transition to the start state when we are
in the requested state. This way the driver is only allowed to send one
descriptor that will transition the device into the start state. All others
will leave it in the stop state once it has finished.

Fixes: c13c4153f7 ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT")
Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Message-Id: <20200720175115.21935.99563.stgit@localhost.localdomain>
Cc: qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-07-22 07:57:07 -04:00
Markus Armbruster
cf4e3d000e virtio: Drop broken and superfluous object_property_set_link()
virtio_crypto_pci_realize() and copies the value of vcrypto->vdev's
property "cryptodev" to vcrypto's property:

    object_property_set_link(OBJECT(vrng), "rng", OBJECT(vrng->vdev.conf.rng),
                             NULL);

Since it does so only after realize, this always fails, but the error
is ignored.

It's actually superfluous: vcrypto's property is an alias of
vcrypto->vdev's property, created by virtio_instance_init_common().

Drop the call.

Same for virtio_ccw_crypto_realize(), virtio_rng_pci_realize(),
virtio_ccw_rng_realize().

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20200721121153.1128844-1-armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-07-22 07:57:07 -04:00
Markus Armbruster
7a309cc95b qom: Change object_get_canonical_path_component() not to malloc
object_get_canonical_path_component() returns a malloced copy of a
property name on success, null on failure.

19 of its 25 callers immediately free the returned copy.

Change object_get_canonical_path_component() to return the property
name directly.  Since modifying the name would be wrong, adjust the
return type to const char *.

Drop the free from the 19 callers become simpler, add the g_strdup()
to the other six.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20200714160202.3121879-4-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
2020-07-21 16:23:43 +02:00
Eric Auger
e95e05dab6 virtio-iommu: Fix coverity issue in virtio_iommu_handle_command()
Coverity points out (CID 1430180) that the new case is missing
break or a /* fallthrough */ comment. Break is the right thing to
do as in that case, tail is not used.

Fixes 1733eebb9e ("virtio-iommu: Implement RESV_MEM probe request")
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20200708160147.18426-1-eric.auger@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-07-13 14:36:08 +01:00
Peter Maydell
0fb7ea4515 virtio: bugfix
fixes vdpa on non-kvm platforms
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAl8FnUwPHG1zdEByZWRo
 YXQuY29tAAoJECgfDbjSjVRp/HAIAK85sNj6fAGwEESnUsiY9M8yy3fX8z9jyXxu
 RcwHSwot6+NK6tLOj/879TV/9cLlHTNaAkpfFSFqvQEiaaP5RjutMC7RB/HiDNrs
 Q0114x0blKYtmChWxTDiXONVY8m8lCm36dWs2hMFL2JPhubCCJJaJaiBaFCvM7H1
 zv2yFLri+To01gyfqPf8YYzgXzVRneJze0KfIQIVI7WxAm7HyhxhofZ4kmX2VHOQ
 PIaMMv/bUhb54ZkFFDOawkw+hfJMQUtmwpW1OewA1bQ8PX3Pv8LN5sCWz2oZmiA9
 3qVEwbI1jBwUgvniqki2KwFn6jDRzJWsrhfLHxuyDkn1nQENh0c=
 =vIBD
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging

virtio: bugfix

fixes vdpa on non-kvm platforms

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Wed 08 Jul 2020 11:17:48 BST
# gpg:                using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg:                issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream:
  vhost-vdpa: fix the compile issue without kvm

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-07-10 19:29:07 +01:00
Markus Armbruster
668f62ec62 error: Eliminate error_propagate() with Coccinelle, part 1
When all we do with an Error we receive into a local variable is
propagating to somewhere else, we can just as well receive it there
right away.  Convert

    if (!foo(..., &err)) {
        ...
        error_propagate(errp, err);
        ...
        return ...
    }

to

    if (!foo(..., errp)) {
        ...
        ...
        return ...
    }

where nothing else needs @err.  Coccinelle script:

    @rule1 forall@
    identifier fun, err, errp, lbl;
    expression list args, args2;
    binary operator op;
    constant c1, c2;
    symbol false;
    @@
         if (
    (
    -        fun(args, &err, args2)
    +        fun(args, errp, args2)
    |
    -        !fun(args, &err, args2)
    +        !fun(args, errp, args2)
    |
    -        fun(args, &err, args2) op c1
    +        fun(args, errp, args2) op c1
    )
            )
         {
             ... when != err
                 when != lbl:
                 when strict
    -        error_propagate(errp, err);
             ... when != err
    (
             return;
    |
             return c2;
    |
             return false;
    )
         }

    @rule2 forall@
    identifier fun, err, errp, lbl;
    expression list args, args2;
    expression var;
    binary operator op;
    constant c1, c2;
    symbol false;
    @@
    -    var = fun(args, &err, args2);
    +    var = fun(args, errp, args2);
         ... when != err
         if (
    (
             var
    |
             !var
    |
             var op c1
    )
            )
         {
             ... when != err
                 when != lbl:
                 when strict
    -        error_propagate(errp, err);
             ... when != err
    (
             return;
    |
             return c2;
    |
             return false;
    |
             return var;
    )
         }

    @depends on rule1 || rule2@
    identifier err;
    @@
    -    Error *err = NULL;
         ... when != err

Not exactly elegant, I'm afraid.

The "when != lbl:" is necessary to avoid transforming

         if (fun(args, &err)) {
             goto out
         }
         ...
     out:
         error_propagate(errp, err);

even though other paths to label out still need the error_propagate().
For an actual example, see sclp_realize().

Without the "when strict", Coccinelle transforms vfio_msix_setup(),
incorrectly.  I don't know what exactly "when strict" does, only that
it helps here.

The match of return is narrower than what I want, but I can't figure
out how to express "return where the operand doesn't use @err".  For
an example where it's too narrow, see vfio_intx_enable().

Silently fails to convert hw/arm/armsse.c, because Coccinelle gets
confused by ARMSSE being used both as typedef and function-like macro
there.  Converted manually.

Line breaks tidied up manually.  One nested declaration of @local_err
deleted manually.  Preexisting unwanted blank line dropped in
hw/riscv/sifive_e.c.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200707160613.848843-35-armbru@redhat.com>
2020-07-10 15:18:08 +02:00
Markus Armbruster
778a2dc592 qom: Use returned bool to check for failure, Coccinelle part
The previous commit enables conversion of

    foo(..., &err);
    if (err) {
        ...
    }

to

    if (!foo(..., errp)) {
        ...
    }

for QOM functions that now return true / false on success / error.
Coccinelle script:

    @@
    identifier fun = {
        object_apply_global_props, object_initialize_child_with_props,
        object_initialize_child_with_propsv, object_property_get,
        object_property_get_bool, object_property_parse, object_property_set,
        object_property_set_bool, object_property_set_int,
        object_property_set_link, object_property_set_qobject,
        object_property_set_str, object_property_set_uint, object_set_props,
        object_set_propv, user_creatable_add_dict,
        user_creatable_complete, user_creatable_del
    };
    expression list args, args2;
    typedef Error;
    Error *err;
    @@
    -    fun(args, &err, args2);
    -    if (err)
    +    if (!fun(args, &err, args2))
         {
             ...
         }

Fails to convert hw/arm/armsse.c, because Coccinelle gets confused by
ARMSSE being used both as typedef and function-like macro there.
Convert manually.

Line breaks tidied up manually.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200707160613.848843-29-armbru@redhat.com>
2020-07-10 15:18:08 +02:00
Markus Armbruster
5325cc34a2 qom: Put name parameter before value / visitor parameter
The object_property_set_FOO() setters take property name and value in
an unusual order:

    void object_property_set_FOO(Object *obj, FOO_TYPE value,
                                 const char *name, Error **errp)

Having to pass value before name feels grating.  Swap them.

Same for object_property_set(), object_property_get(), and
object_property_parse().

Convert callers with this Coccinelle script:

    @@
    identifier fun = {
        object_property_get, object_property_parse, object_property_set_str,
        object_property_set_link, object_property_set_bool,
        object_property_set_int, object_property_set_uint, object_property_set,
        object_property_set_qobject
    };
    expression obj, v, name, errp;
    @@
    -    fun(obj, v, name, errp)
    +    fun(obj, name, v, errp)

Chokes on hw/arm/musicpal.c's lcd_refresh() with the unhelpful error
message "no position information".  Convert that one manually.

Fails to convert hw/arm/armsse.c, because Coccinelle gets confused by
ARMSSE being used both as typedef and function-like macro there.
Convert manually.

Fails to convert hw/rx/rx-gdbsim.c, because Coccinelle gets confused
by RXCPU being used both as typedef and function-like macro there.
Convert manually.  The other files using RXCPU that way don't need
conversion.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200707160613.848843-27-armbru@redhat.com>
[Straightforwad conflict with commit 2336172d9b "audio: set default
value for pcspk.iobase property" resolved]
2020-07-10 15:18:08 +02:00
Markus Armbruster
62a35aaa31 qapi: Use returned bool to check for failure, Coccinelle part
The previous commit enables conversion of

    visit_foo(..., &err);
    if (err) {
        ...
    }

to

    if (!visit_foo(..., errp)) {
        ...
    }

for visitor functions that now return true / false on success / error.
Coccinelle script:

    @@
    identifier fun =~ "check_list|input_type_enum|lv_start_struct|lv_type_bool|lv_type_int64|lv_type_str|lv_type_uint64|output_type_enum|parse_type_bool|parse_type_int64|parse_type_null|parse_type_number|parse_type_size|parse_type_str|parse_type_uint64|print_type_bool|print_type_int64|print_type_null|print_type_number|print_type_size|print_type_str|print_type_uint64|qapi_clone_start_alternate|qapi_clone_start_list|qapi_clone_start_struct|qapi_clone_type_bool|qapi_clone_type_int64|qapi_clone_type_null|qapi_clone_type_number|qapi_clone_type_str|qapi_clone_type_uint64|qapi_dealloc_start_list|qapi_dealloc_start_struct|qapi_dealloc_type_anything|qapi_dealloc_type_bool|qapi_dealloc_type_int64|qapi_dealloc_type_null|qapi_dealloc_type_number|qapi_dealloc_type_str|qapi_dealloc_type_uint64|qobject_input_check_list|qobject_input_check_struct|qobject_input_start_alternate|qobject_input_start_list|qobject_input_start_struct|qobject_input_type_any|qobject_input_type_bool|qobject_input_type_bool_keyval|qobject_input_type_int64|qobject_input_type_int64_keyval|qobject_input_type_null|qobject_input_type_number|qobject_input_type_number_keyval|qobject_input_type_size_keyval|qobject_input_type_str|qobject_input_type_str_keyval|qobject_input_type_uint64|qobject_input_type_uint64_keyval|qobject_output_start_list|qobject_output_start_struct|qobject_output_type_any|qobject_output_type_bool|qobject_output_type_int64|qobject_output_type_null|qobject_output_type_number|qobject_output_type_str|qobject_output_type_uint64|start_list|visit_check_list|visit_check_struct|visit_start_alternate|visit_start_list|visit_start_struct|visit_type_.*";
    expression list args;
    typedef Error;
    Error *err;
    @@
    -    fun(args, &err);
    -    if (err)
    +    if (!fun(args, &err))
         {
             ...
         }

A few line breaks tidied up manually.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200707160613.848843-19-armbru@redhat.com>
2020-07-10 15:18:08 +02:00
Markus Armbruster
79c3e2bc6e virtio-crypto-pci: Tidy up virtio_crypto_pci_realize()
virtio_crypto_pci_realize() continues after realization of its
"virtio-crypto-device" fails.  Only an object_property_set_link()
follows; looks harmless to me.  Tidy up anyway: return after failure,
just like virtio_rng_pci_realize() does.

Cc: "Gonglei (Arei)" <arei.gonglei@huawei.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Gonglei < arei.gonglei@huawei.com>
Message-Id: <20200707160613.848843-7-armbru@redhat.com>
2020-07-10 15:01:06 +02:00
Markus Armbruster
118bfd76c9 qdev: Use returned bool to check for qdev_realize() etc. failure
Convert

    foo(..., &err);
    if (err) {
        ...
    }

to

    if (!foo(..., &err)) {
        ...
    }

for qdev_realize(), qdev_realize_and_unref(), qbus_realize() and their
wrappers isa_realize_and_unref(), pci_realize_and_unref(),
sysbus_realize(), sysbus_realize_and_unref(), usb_realize_and_unref().
Coccinelle script:

    @@
    identifier fun = {
        isa_realize_and_unref, pci_realize_and_unref, qbus_realize,
        qdev_realize, qdev_realize_and_unref, sysbus_realize,
        sysbus_realize_and_unref, usb_realize_and_unref
    };
    expression list args, args2;
    typedef Error;
    Error *err;
    @@
    -    fun(args, &err, args2);
    -    if (err)
    +    if (!fun(args, &err, args2))
         {
             ...
         }

Chokes on hw/arm/musicpal.c's lcd_refresh() with the unhelpful error
message "no position information".  Nothing to convert there; skipped.

Fails to convert hw/arm/armsse.c, because Coccinelle gets confused by
ARMSSE being used both as typedef and function-like macro there.
Converted manually.

A few line breaks tidied up manually.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <20200707160613.848843-5-armbru@redhat.com>
2020-07-10 15:01:06 +02:00
Cindy Lu
4dc5acc07e vhost-vdpa: fix the compile issue without kvm
Fix the compile issue in the system without the kvm support

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20200708084922.21904-1-lulu@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-07-08 06:16:48 -04:00
Peter Maydell
c8eaf81fd2 virtio,acpi: features, fixes, cleanups.
vdpa support
 virtio-mem support
 a handy script for disassembling acpi tables
 misc fixes and cleanups
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAl8EY+MPHG1zdEByZWRo
 YXQuY29tAAoJECgfDbjSjVRpOMkIAMMhfbzZXlwv1xiQ/pMTtEqXDnLeic7NK6xF
 RJkAFlMM+eEXBRZLYJXhPAFjneTA813vR0xlygHn2pYhCF3ozTfLqEABfQsG0w+d
 VDSYTnFHAx2GwGGQBNYltsIs+8lAADYhlo9VG/qC5nAsNaoVBeTJLuF96un1WGDz
 vWH0Cx/AG+yhiKvlSHA/CCSXMVGVTkRfUCjGF8Yq0mVtx23OZ9blQkJRkGfHWctB
 GxQlh/b+4YLaXhy+V1+/Iu2U45KgKN7qrsyKvHBMgKd2qazowr/D8Aexh4hN/eg5
 jibmxurDHXze+VUDCySy6qLBBySNkz++qLKfPOQ0iszDHLXQzOQ=
 =H/KL
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging

virtio,acpi: features, fixes, cleanups.

vdpa support
virtio-mem support
a handy script for disassembling acpi tables
misc fixes and cleanups

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Tue 07 Jul 2020 13:00:35 BST
# gpg:                using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg:                issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream: (41 commits)
  vhost-vdpa: introduce vhost-vdpa net client
  vhost-vdpa: introduce vhost-vdpa backend
  vhost_net: introduce set_config & get_config
  vhost: implement vhost_force_iommu method
  vhost: introduce new VhostOps vhost_force_iommu
  vhost: implement vhost_vq_get_addr method
  vhost: introduce new VhostOps vhost_vq_get_addr
  vhost: implement vhost_dev_start method
  vhost: introduce new VhostOps vhost_dev_start
  vhost: check the existence of vhost_set_iotlb_callback
  virtio-pci: implement queue_enabled method
  virtio-bus: introduce queue_enabled method
  vhost_net: use the function qemu_get_peer
  net: introduce qemu_get_peer
  MAINTAINERS: add VT-d entry
  docs: vhost-user: add Virtio status protocol feature
  tests/acpi: remove stale allowed tables
  numa: Auto-enable NUMA when any memory devices are possible
  virtio-mem: Exclude unplugged memory during migration
  virtio-mem: Add trace events
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

# Conflicts:
#	hw/arm/virt.c
#	hw/virtio/trace-events
2020-07-07 17:37:44 +01:00
Cindy Lu
108a64818e vhost-vdpa: introduce vhost-vdpa backend
Currently we have 2 types of vhost backends in QEMU: vhost kernel and
vhost-user. The above patch provides a generic device for vDPA purpose,
this vDPA device exposes to user space a non-vendor-specific configuration
interface for setting up a vhost HW accelerator, this patch set introduces
a third vhost backend called vhost-vdpa based on the vDPA interface.

Vhost-vdpa usage:

qemu-system-x86_64 -cpu host -enable-kvm \
    ......
    -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-id,id=vhost-vdpa0 \
    -device virtio-net-pci,netdev=vhost-vdpa0,page-per-vq=on \

Signed-off-by: Lingshan zhu <lingshan.zhu@intel.com>
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Signed-off-by: Cindy Lu <lulu@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20200701145538.22333-14-lulu@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2020-07-07 07:59:51 -04:00
Eric Auger
8077b8e549 virtio-iommu-pci: Add array of Interval properties
The machine may need to pass reserved regions to the
virtio-iommu-pci device (such as the MSI window on x86
or the MSI doorbells on ARM).

So let's add an array of Interval properties.

Note: if some reserved regions are already set by the
machine code - which should be the case in general -,
the length of the property array is already set and
prevents the end-user from modifying them. For example,
attempting to use:

-device virtio-iommu-pci,\
 len-reserved-regions=1,reserved-regions[0]=0xfee00000:0xfeefffff:1

would result in the following error message:

qemu-system-aarch64: -device virtio-iommu-pci,addr=0xa,
len-reserved-regions=1,reserved-regions[0]=0xfee00000:0xfeefffff:1:
array size property len-reserved-regions may not be set more than once

Otherwise, for example, adding two reserved regions is achieved
using the following options:

-device virtio-iommu-pci,addr=0xa,len-reserved-regions=2,\
 reserved-regions[0]=0xfee00000:0xfeefffff:1,\
 reserved-regions[1]=0x1000000:100ffff:1

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-id: 20200629070404.10969-5-eric.auger@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-07-03 16:59:42 +01:00
Eric Auger
0f5a3092ee virtio-iommu: Handle reserved regions in the translation process
When translating an address we need to check if it belongs to
a reserved virtual address range. If it does, there are 2 cases:

- it belongs to a RESERVED region: the guest should neither use
  this address in a MAP not instruct the end-point to DMA on
  them. We report an error

- It belongs to an MSI region: we bypass the translation.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 20200629070404.10969-4-eric.auger@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-07-03 16:59:42 +01:00
Eric Auger
1733eebb9e virtio-iommu: Implement RESV_MEM probe request
This patch implements the PROBE request. At the moment,
only THE RESV_MEM property is handled. The first goal is
to report iommu wide reserved regions such as the MSI regions
set by the machine code. On x86 this will be the IOAPIC MSI
region, [0xFEE00000 - 0xFEEFFFFF], on ARM this may be the ITS
doorbell.

In the future we may introduce per device reserved regions.
This will be useful when protecting host assigned devices
which may expose their own reserved regions

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 20200629070404.10969-3-eric.auger@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-07-03 16:59:42 +01:00
Cindy Lu
7a471694a1 vhost: implement vhost_force_iommu method
use the vhost_force_iommu callback to force enable feature bit VIRTIO_F_IOMMU_PLATFORM

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20200701145538.22333-12-lulu@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2020-07-03 07:57:04 -04:00
Cindy Lu
b4ab225c34 vhost: implement vhost_vq_get_addr method
use vhost_vq_get_addr callback to get the vq address from backend

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20200701145538.22333-10-lulu@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2020-07-03 07:57:04 -04:00
Cindy Lu
ca71db438b vhost: implement vhost_dev_start method
use the vhost_dev_start callback to send the status to backend

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20200701145538.22333-8-lulu@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2020-07-03 07:57:04 -04:00
Jason Wang
3f63b4c655 vhost: check the existence of vhost_set_iotlb_callback
Add the check of vhost_set_iotlb_callback
before calling

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20200701145538.22333-6-lulu@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2020-07-03 07:57:04 -04:00
Jason Wang
f19bcdfedd virtio-pci: implement queue_enabled method
With version 1, we can detect whether a queue is enabled via
queue_enabled.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20200701145538.22333-5-lulu@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2020-07-03 07:57:04 -04:00
Jason Wang
b2a5f62a22 virtio-bus: introduce queue_enabled method
This patch introduces queue_enabled() method which allows the
transport to implement its own way to report whether or not a queue is
enabled.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20200701145538.22333-4-lulu@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2020-07-03 07:57:04 -04:00
David Hildenbrand
0bc7806c5a virtio-mem: Exclude unplugged memory during migration
The content of unplugged memory is undefined and should not be migrated,
ever. Exclude all unplugged memory during precopy using the precopy notifier
infrastructure introduced for free page hinting in virtio-balloon.

Unplugged memory is marked as "not dirty", meaning it won't be
considered for migration.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20200626072248.78761-21-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-07-03 07:57:04 -04:00
David Hildenbrand
43e5495027 virtio-mem: Add trace events
Let's add some trace events that might come in handy later.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20200626072248.78761-20-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-07-03 07:57:04 -04:00
David Hildenbrand
383ee44555 virtio-mem: Migration sanity checks
We want to make sure that certain properties don't change during
migration, especially to catch user errors in a nice way. Let's migrate
a temporary structure and validate that the properties didn't change.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20200626072248.78761-19-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-07-03 07:57:04 -04:00
David Hildenbrand
722a3c783e virtio-pci: Send qapi events when the virtio-mem size changes
Let's register the notifier and trigger the qapi event with the right
device id.

MEMORY_DEVICE_SIZE_CHANGE is similar to BALLOON_CHANGE, however on a
memory device level.

Don't unregister the notifier (we neither have finalize() nor unrealize()
for VirtIOPCIProxy, so it's not that simple to do it) - both devices are
expected to vanish at the same time.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20200626072248.78761-18-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-07-03 07:57:04 -04:00
David Hildenbrand
c95b4437da virtio-mem: Allow notifiers for size changes
We want to send qapi events in case the size of a virtio-mem device
changes. This allows upper layers to always know how much memory is
actually currently consumed via a virtio-mem device.

Unfortuantely, we have to report the id of our proxy device. Let's provide
an easy way for our proxy device to register, so it can send the qapi
events. Piggy-backing on the notifier infrastructure (although we'll
only ever have one notifier registered) seems to be an easy way.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20200626072248.78761-17-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-07-03 07:57:04 -04:00
David Hildenbrand
0b9a2443a4 virtio-pci: Proxy for virtio-mem
Let's add a proxy for virtio-mem, make it a memory device, and
pass-through the properties.

Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20200626072248.78761-12-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-07-03 07:57:04 -04:00
David Hildenbrand
910b25766b virtio-mem: Paravirtualized memory hot(un)plug
This is the very basic/initial version of virtio-mem. An introduction to
virtio-mem can be found in the Linux kernel driver [1]. While it can be
used in the current state for hotplug of a smaller amount of memory, it
will heavily benefit from resizeable memory regions in the future.

Each virtio-mem device manages a memory region (provided via a memory
backend). After requested by the hypervisor ("requested-size"), the
guest can try to plug/unplug blocks of memory within that region, in order
to reach the requested size. Initially, and after a reboot, all memory is
unplugged (except in special cases - reboot during postcopy).

The guest may only try to plug/unplug blocks of memory within the usable
region size. The usable region size is a little bigger than the
requested size, to give the device driver some flexibility. The usable
region size will only grow, except on reboots or when all memory is
requested to get unplugged. The guest can never plug more memory than
requested. Unplugged memory will get zapped/discarded, similar to in a
balloon device.

The block size is variable, however, it is always chosen in a way such that
THP splits are avoided (e.g., 2MB). The state of each block
(plugged/unplugged) is tracked in a bitmap.

As virtio-mem devices (e.g., virtio-mem-pci) will be memory devices, we now
expose "VirtioMEMDeviceInfo" via "query-memory-devices".

--------------------------------------------------------------------------

There are two important follow-up items that are in the works:
1. Resizeable memory regions: Use resizeable allocations/RAM blocks to
   grow/shrink along with the usable region size. This avoids creating
   initially very big VMAs, RAM blocks, and KVM slots.
2. Protection of unplugged memory: Make sure the gust cannot actually
   make use of unplugged memory.

Other follow-up items that are in the works:
1. Exclude unplugged memory during migration (via precopy notifier).
2. Handle remapping of memory.
3. Support for other architectures.

--------------------------------------------------------------------------

Example usage (virtio-mem-pci is introduced in follow-up patches):

Start QEMU with two virtio-mem devices (one per NUMA node):
 $ qemu-system-x86_64 -m 4G,maxmem=20G \
  -smp sockets=2,cores=2 \
  -numa node,nodeid=0,cpus=0-1 -numa node,nodeid=1,cpus=2-3 \
  [...]
  -object memory-backend-ram,id=mem0,size=8G \
  -device virtio-mem-pci,id=vm0,memdev=mem0,node=0,requested-size=0M \
  -object memory-backend-ram,id=mem1,size=8G \
  -device virtio-mem-pci,id=vm1,memdev=mem1,node=1,requested-size=1G

Query the configuration:
 (qemu) info memory-devices
 Memory device [virtio-mem]: "vm0"
   memaddr: 0x140000000
   node: 0
   requested-size: 0
   size: 0
   max-size: 8589934592
   block-size: 2097152
   memdev: /objects/mem0
 Memory device [virtio-mem]: "vm1"
   memaddr: 0x340000000
   node: 1
   requested-size: 1073741824
   size: 1073741824
   max-size: 8589934592
   block-size: 2097152
   memdev: /objects/mem1

Add some memory to node 0:
 (qemu) qom-set vm0 requested-size 500M

Remove some memory from node 1:
 (qemu) qom-set vm1 requested-size 200M

Query the configuration again:
 (qemu) info memory-devices
 Memory device [virtio-mem]: "vm0"
   memaddr: 0x140000000
   node: 0
   requested-size: 524288000
   size: 524288000
   max-size: 8589934592
   block-size: 2097152
   memdev: /objects/mem0
 Memory device [virtio-mem]: "vm1"
   memaddr: 0x340000000
   node: 1
   requested-size: 209715200
   size: 209715200
   max-size: 8589934592
   block-size: 2097152
   memdev: /objects/mem1

[1] https://lkml.kernel.org/r/20200311171422.10484-1-david@redhat.com

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20200626072248.78761-11-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-07-03 07:57:04 -04:00
David Hildenbrand
06df2e692a virtio-balloon: Rip out qemu_balloon_inhibit()
The only remaining special case is postcopy. It cannot handle
concurrent discards yet, which would result in requesting already sent
pages from the source. Special-case it in virtio-balloon instead.

Introduce migration_in_incoming_postcopy(), to find out if incoming
postcopy is active.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Juan Quintela <quintela@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20200626072248.78761-7-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-07-02 05:54:59 -04:00
David Hildenbrand
dd8eeb9671 virtio-balloon: always indicate S_DONE when migration fails
If something goes wrong during precopy, before stopping the VM, we will
never send a S_DONE indication to the VM, resulting in the hinted pages
not getting released to be used by the guest OS (e.g., Linux).

Easy to reproduce:
1. Start migration (e.g., HMP "migrate -d 'exec:gzip -c > STATEFILE.gz'")
2. Cancel migration (e.g., HMP "migrate_cancel")
3. Oberve in the guest (e.g., cat /proc/meminfo) that there is basically
   no free memory left.

While at it, add similar locking to virtio_balloon_free_page_done() as
done in virtio_balloon_free_page_stop. Locking is still weird, but that
has to be sorted out separately.

There is nothing to do in the PRECOPY_NOTIFY_COMPLETE case. Add some
comments regarding S_DONE handling.

Fixes: c13c4153f7 ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT")
Reviewed-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Cc: Wei Wang <wei.w.wang@intel.com>
Cc: Alexander Duyck <alexander.duyck@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20200629080615.26022-1-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-07-02 05:54:59 -04:00