Commit Graph

232 Commits

Author SHA1 Message Date
Jason Wang
51a81a2118 virtio-net: calculating proper msix vectors on init
Currently, the default msix vectors for virtio-net-pci is 3 which is
obvious not suitable for multiqueue guest, so we depends on the user
or management tools to pass a correct vectors parameter. In fact, we
can simplifying this by calculating the number of vectors on realize.

Consider we have N queues, the number of vectors needed is 2*N + 2
(#queue pairs + plus one config interrupt and control vq). We didn't
check whether or not host support control vq because it was added
unconditionally by qemu to avoid breaking legacy guests such as Minix.

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2021-03-15 16:41:22 +08:00
Isaku Yamahata
6be8cf56bc acpi/core: always set SCI_EN when SMM isn't supported
If SMM is not supported, ACPI fixed hardware doesn't support
legacy-mode. ACPI-only platform. Where SCI_EN in PM1_CNT register is
always set.
The bit tells OS legacy mode(SCI_EN cleared) or ACPI mode(SCI_EN set).

With the next patch (setting fadt.smi_cmd = 0 when smm isn't enabled),
guest Linux tries to switch to ACPI mode, finds smi_cmd = 0, and then
fails to initialize acpi subsystem. This patch proactively fixes it.

This patch changes guest ABI. To keep compatibility, use
"smm-compat" introduced by earlier patch. If the property is true,
disable new behavior.

ACPI spec 4.8.10.1 PM1 Event Grouping
PM1 Eanble Registers
> For ACPI-only platforms (where SCI_EN is always set)

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Message-Id: <500f62081626997e46f96377393d3662211763a8.1613615732.git.isaku.yamahata@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-02-23 10:58:42 -05:00
David Gibson
9f88a7a3df confidential guest support: Alter virtio default properties for protected guests
The default behaviour for virtio devices is not to use the platforms normal
DMA paths, but instead to use the fact that it's running in a hypervisor
to directly access guest memory.  That doesn't work if the guest's memory
is protected from hypervisor access, such as with AMD's SEV or POWER's PEF.

So, if a confidential guest mechanism is enabled, then apply the
iommu_platform=on option so it will go through normal DMA mechanisms.
Those will presumably have some way of marking memory as shared with
the hypervisor or hardware so that DMA will work.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
2021-02-08 16:57:38 +11:00
David Gibson
e0292d7c62 confidential guest support: Rework the "memory-encryption" property
Currently the "memory-encryption" property is only looked at once we
get to kvm_init().  Although protection of guest memory from the
hypervisor isn't something that could really ever work with TCG, it's
not conceptually tied to the KVM accelerator.

In addition, the way the string property is resolved to an object is
almost identical to how a QOM link property is handled.

So, create a new "confidential-guest-support" link property which sets
this QOM interface link directly in the machine.  For compatibility we
keep the "memory-encryption" property, but now implemented in terms of
the new property.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
2021-02-08 16:57:38 +11:00
David Gibson
6e6a6ca7d9 confidential guest support: Move side effect out of machine_set_memory_encryption()
When the "memory-encryption" property is set, we also disable KSM
merging for the guest, since it won't accomplish anything.

We want that, but doing it in the property set function itself is
thereoretically incorrect, in the unlikely event of some configuration
environment that set the property then cleared it again before
constructing the guest.

More importantly, it makes some other cleanups we want more difficult.
So, instead move this logic to machine_run_board_init() conditional on
the final value of the property.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
2021-02-08 16:57:38 +11:00
Stefano Garzarella
c126b4c57e virtio: move 'use-disabled-flag' property to hw_compat_4_2
Commit 9d7bd0826f introduced a new 'use-disabled-flag' property
set to true by default.
To allow the migration, we set this property to false in the hw_compat,
but in the wrong place (hw_compat_4_1).

Since commit 9d7bd0826f was released with QEMU 5.0, we move
'use-disabled-flag' property to hw_compat_4_2, so 4.2 machine types
will have the pre-patch behavior and the migration can work.

The issue was discovered with vhost-vsock device and 4.2 machine
type without running any kernel in the VM:
    $ qemu-4.2 -M pc-q35-4.2,accel=kvm \
        -device vhost-vsock-pci,guest-cid=4 \
        -monitor stdio -incoming tcp:0:3333

    $ qemu-5.2 -M pc-q35-4.2,accel=kvm \
        -device vhost-vsock-pci,guest-cid=3 \
        -monitor stdio
    (qemu) migrate -d tcp:0:3333

    # qemu-4.2 output
    qemu-system-x86_64: Failed to load virtio-vhost_vsock:virtio
    qemu-system-x86_64: error while loading state for instance 0x0 of device '0000:00:03.0/virtio-vhost_vsock'
    qemu-system-x86_64: load of migration failed: No such file or directory

Reported-by: Jing Zhao <jinzhao@redhat.com>
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1907255
Fixes: 9d7bd0826f ("virtio-pci: disable vring processing when bus-mastering is disabled")
Cc: mdroth@linux.vnet.ibm.com
CC: qemu-stable@nongnu.org
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20210108171252.209502-1-sgarzare@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-02-05 08:52:58 -05:00
Eric Blake
54aa3de72e qapi: Use QAPI_LIST_PREPEND() where possible
Anywhere we create a list of just one item or by prepending items
(typically because order doesn't matter), we can use
QAPI_LIST_PREPEND().  But places where we must keep the list in order
by appending remain open-coded until later patches.

Note that as a side effect, this also performs a cleanup of two minor
issues in qga/commands-posix.c: the old code was performing
 new = g_malloc0(sizeof(*ret));
which 1) is confusing because you have to verify whether 'new' and
'ret' are variables with the same type, and 2) would conflict with C++
compilation (not an actual problem for this file, but makes
copy-and-paste harder).

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20201113011340.463563-5-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
[Straightforward conflicts due to commit a8aa94b5f8 "qga: update
schema for guest-get-disks 'dependents' field" and commit a10b453a52
"target/mips: Move mips_cpu_add_definition() from helper.c to cpu.c"
resolved.  Commit message tweaked.]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2020-12-19 10:20:14 +01:00
Peter Maydell
af3f37319c * New -action option and set-action QMP command (Alejandro)
* More vl.c cleanup (myself with help from Daniel and Igor)
 * Remove deprecated options (Philippe, Thomas)
 * Dirty bitmap fix (Zenghui)
 * icount caching speedup (Pavel)
 * SCSI race fix (Maxim)
 * Remove pre-GCC 4.8 code (Marc-André)
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl/Y+BQUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroOhPwf9GnBgYnsJcMpvsmKPZ0aU7qhCtGxY
 HrlxRXxGSK1kBlbRnIA2XEJe07isQRyT4q3dpLH+cJkA6gBAmBICvO0tHY8eW0OB
 6qbTPdYvrIhjBkeeEy+WGmgU+kw/YXCOCx6TdZFjYm6BqalAMXuTGw82jLlKyOGI
 2Ehc0PNnOTPZuRSmIgs7Kox5lDViJIX1ydMUeWw1lMksosCgeyZzecVdp9Ehmv9O
 SuSgq5ilmsUvzrV9DbYaT3/KinwLI+ZHPCiOd75WWB+cX546iji1nH9aaUHPnwy0
 EsAGza06A93uyiQNOxIPkF7Wd27e+u1VkfsCdOBNS8hy3sOeKhh/yZxkgA==
 =sFe1
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini-gitlab/tags/for-upstream' into staging

* New -action option and set-action QMP command (Alejandro)
* More vl.c cleanup (myself with help from Daniel and Igor)
* Remove deprecated options (Philippe, Thomas)
* Dirty bitmap fix (Zenghui)
* icount caching speedup (Pavel)
* SCSI race fix (Maxim)
* Remove pre-GCC 4.8 code (Marc-André)

# gpg: Signature made Tue 15 Dec 2020 17:53:24 GMT
# gpg:                using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg:                issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini-gitlab/tags/for-upstream: (45 commits)
  build: -no-pie is no functional linker flag
  scripts/git.orderfile: Keep files with .inc extension sorted
  compiler.h: remove QEMU_GNUC_PREREQ
  linux-user: remove GNUC check
  compiler: remove GNUC check
  xen: remove GNUC check
  poison: remove GNUC check
  compiler.h: explicit case for Clang printf attribute
  virtiofsd: replace _Static_assert with QEMU_BUILD_BUG_ON
  tests: remove GCC < 4 fallbacks
  qemu-plugin.h: remove GCC < 4
  compiler.h: remove GCC < 3 __builtin_expect fallback
  accel/tcg: Remove special case for GCC < 4.6
  qemu/atomic: Drop special case for unsupported compiler
  hw/core: Restrict 'fw-path-provider.c' to system mode emulation
  docs: set CONFDIR when running sphinx
  vl: rename local variable in configure_accelerators
  qemu-option: pass QemuOptsList to opts_accepts_any
  qemu-option: simplify search for end of key
  kvm: Take into account the unaligned section size when preparing bitmap
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

# Conflicts:
#	softmmu/vl.c
2020-12-15 21:24:31 +00:00
Paolo Bonzini
2f181fbd5a machine: introduce MachineInitPhase
Generalize the qdev_hotplug variable to the different phases of
machine initialization.  We would like to allow different
monitor commands depending on the phase.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-12-15 12:51:52 -05:00
Paolo Bonzini
5a1ee6077b chardev: do not use machine_init_done
machine_init_done is not the right flag to check when preconfig
is taken into account; for example "./qemu-system-x86_64 -serial
mon:stdio -preconfig" does not print the QEMU monitor header until after
exit_preconfig.  Add back a custom bool for mux character devices.  This
partially undoes commit c7278b4355 ("chardev: introduce chr_machine_done
hook", 2018-03-12), but it keeps the cleaner logic using a function
pointer in ChardevClass.

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-12-15 12:51:51 -05:00
Paolo Bonzini
f66dc8737c vl: move all generic initialization out of vl.c
qdev_machine_creation_done is only setting a flag now.  Extend it to
move more code out of vl.c.  Leave only consistency checks and gdbserver
processing in qemu_machine_creation_done.

gdbserver_start can be moved after qdev_machine_creation_done because
it only does listen on the socket and creates some internal data
structures; it does not send any data (e.g. guest state) over the socket.

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-12-15 12:51:50 -05:00
Paolo Bonzini
c5e3c9182d vl: extract softmmu/globals.c
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-12-15 12:51:50 -05:00
Paolo Bonzini
6b21670cfd vl: extract machine done notifiers
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-12-10 12:15:18 -05:00
Paolo Bonzini
a3ef9bfb88 vl: move CHECKPOINT_INIT after preconfig
Move CHECKPOINT_INIT right before the machine initialization is
completed.  Everything before is essentially an extension of
command line parsing.

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-12-10 12:15:13 -05:00
Paolo Bonzini
58c91595a7 vl: extract various command line validation snippets to a new function
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-12-10 12:15:12 -05:00
Paolo Bonzini
e0d17dfd22 vl: move various initialization routines out of qemu_init
Some very simple initialization routines can be nested in existing
subsystem-level functions, do that to simplify qemu_init.

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-12-10 12:15:11 -05:00
Paolo Bonzini
3df8c4f31a vl: extract validation of -smp to machine.c
Once smp_parse is done, the validation operates on the MachineState.
There is no reason for that code to be in vl.c.

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-12-10 12:15:11 -05:00
Cornelia Huck
576a00bdeb hw: add compat machines for 6.0
Add 6.0 machine types for arm/i440fx/q35/s390x/spapr.

Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20201109173928.1001764-1-cohuck@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-08 13:48:58 -05:00
Paolo Bonzini
b1b0393c3c pvpanic: Advertise the PVPANIC_CRASHLOADED event support
Advertise both types of events as supported when the guest OS
queries the pvpanic device.  Currently only PVPANIC_PANICKED is
exposed; PVPANIC_CRASHLOADED must also be advertised, but only on
new machine types.

Fixes: 7dc58deea7 ("pvpanic: implement crashloaded event handling")
Reported-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-11 03:59:05 -05:00
Klaus Jensen
6eb7a07129 hw/block/nvme: change controller pci id
There are two reasons for changing this:

  1. The nvme device currently uses an internal Intel device id.

  2. Since commits "nvme: fix write zeroes offset and count" and "nvme:
     support multiple namespaces" the controller device no longer has
     the quirks that the Linux kernel think it has.

     As the quirks are applied based on pci vendor and device id, change
     them to get rid of the quirks.

To keep backward compatibility, add a new 'use-intel-id' parameter to
the nvme device to force use of the Intel vendor and device id. This is
off by default but add a compat property to set this for 5.1 machines
and older. If a 5.1 machine is booted (or the use-intel-id parameter is
explicitly set to true), the Linux kernel will just apply these
unnecessary quirks:

  1. NVME_QUIRK_IDENTIFY_CNS which says that the device does not support
     anything else than values 0x0 and 0x1 for CNS (Identify Namespace
     and Identify Namespace). With multiple namespace support, this just
     means that the kernel will "scan" namespaces instead of using
     "Active Namespace ID list" (CNS 0x2).

  2. NVME_QUIRK_DISABLE_WRITE_ZEROES. The nvme device started out with a
     broken Write Zeroes implementation which has since been fixed in
     commit 9d6459d21a ("nvme: fix write zeroes offset and count").

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
2020-10-27 07:24:47 +01:00
Paolo Bonzini
8b0e484c8b machine: move SMP initialization from vl.c
Initialize the object's values from the class when the object is
created, no need to have vl.c do it for us.

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-10-26 07:08:40 -04:00
Paolo Bonzini
2c920e4577 machine: move UP defaults to class_base_init
Clean up vl.c, default min/max/default_cpus to uniprocessor
directly in the QOM class initialization code.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-10-26 07:08:40 -04:00
Paolo Bonzini
9f2931bc65 machine: remove deprecated -machine enforce-config-section option
Deprecated since 3.1 and complicates the initialization sequence,
remove it.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-10-26 07:08:39 -04:00
Robert Hoo
61ad65d0f0 cpu: Introduce CPU model deprecation API
Implement the ability of marking some versions deprecated. When
that CPU model is chosen, print a warning.  The warning message
can be customized, e.g. suggesting an alternative CPU model to be
used instead.

The deprecation message will be printed by x86_cpu_list_entry(),
e.g. '-cpu help'.

QMP command 'query-cpu-definitions' will return a bool value
indicating the deprecation status.

Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
Message-Id: <1600758855-80046-1-git-send-email-robert.hu@linux.intel.com>
[ehabkost: reword commit message]
[ehabkost: Handle NULL cpu_type]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2020-10-14 15:28:54 -04:00
Igor Mammedov
b21aa7e01e numa: drop support for '-numa node' (without memory specified)
it was deprecated since 4.1
commit 4bb4a2732e (numa: deprecate implict memory distribution between nodes)

Users of existing VMs, wishing to preserve the same RAM distribution,
should configure it explicitly using ``-numa node,memdev`` options.
Current RAM distribution can be retrieved using HMP command
`info numa` and if separate memory devices (pc|nv-dimm) are present
use `info memory-device` and subtract device memory from output of
`info numa`.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20200911084410.788171-2-imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-30 19:09:20 +02:00
Stefano Garzarella
d55f518248 virtio: skip legacy support check on machine types less than 5.1
Commit 9b3a35ec82 ("virtio: verify that legacy support is not accidentally
on") added a check that returns an error if legacy support is on, but the
device does not support legacy.

Unfortunately some devices were wrongly declared legacy capable even if
they were not (e.g vhost-vsock).

To avoid migration issues, we add a virtio-device property
(x-disable-legacy-check) to skip the legacy error, printing a warning
instead, for machine types < 5.1.

Cc: qemu-stable@nongnu.org
Fixes: 9b3a35ec82 ("virtio: verify that legacy support is not accidentally on")
Suggested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Suggested-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20200921122506.82515-2-sgarzare@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-09-29 02:15:24 -04:00
Eduardo Habkost
acd5b054e4 machine: Register "memory-backend" as class property
Class properties make QOM introspection simpler and easier, as
they don't require an object to be instantiated.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20200921221045.699690-21-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2020-09-22 16:48:29 -04:00
Igor Mammedov
c4332cd1dc smp: drop support for deprecated (invalid topologies)
it's was deprecated since 3.1

Support for invalid topologies is removed, the user must ensure
that topologies described with -smp include all possible cpus,
i.e. (sockets * cores * threads) == maxcpus or QEMU will
exit with error.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20200911133202.938754-1-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2020-09-22 16:45:16 -04:00
Stefan Hajnoczi
a4eef0711b vhost-user-blk-pci: default num_queues to -smp N
Automatically size the number of request virtqueues to match the number
of vCPUs.  This ensures that completion interrupts are handled on the
same vCPU that submitted the request.  No IPI is necessary to complete
an I/O request and performance is improved.  The maximum number of MSI-X
vectors and virtqueues limit are respected.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Message-Id: <20200818143348.310613-8-stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-08-27 08:29:13 -04:00
Stefan Hajnoczi
9445e1e15e virtio-blk-pci: default num_queues to -smp N
Automatically size the number of virtio-blk-pci request virtqueues to
match the number of vCPUs.  Other transports continue to default to 1
request virtqueue.

A 1:1 virtqueue:vCPU mapping ensures that completion interrupts are
handled on the same vCPU that submitted the request.  No IPI is
necessary to complete an I/O request and performance is improved.  The
maximum number of MSI-X vectors and virtqueues limit are respected.

Performance improves from 78k to 104k IOPS on a 32 vCPU guest with 101
virtio-blk-pci devices (ioengine=libaio, iodepth=1, bs=4k, rw=randread
with NVMe storage).

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Message-Id: <20200818143348.310613-7-stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-08-27 08:29:13 -04:00
Stefan Hajnoczi
6a55882284 virtio-scsi-pci: default num_queues to -smp N
Automatically size the number of virtio-scsi-pci, vhost-scsi-pci, and
vhost-user-scsi-pci request virtqueues to match the number of vCPUs.
Other transports continue to default to 1 request virtqueue.

A 1:1 virtqueue:vCPU mapping ensures that completion interrupts are
handled on the same vCPU that submitted the request.  No IPI is
necessary to complete an I/O request and performance is improved.  The
maximum number of MSI-X vectors and virtqueues limit are respected.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20200818143348.310613-6-stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-08-27 08:29:13 -04:00
Cornelia Huck
3ff3c5d317 hw: add compat machines for 5.2
Add 5.2 machine types for arm/i440fx/q35/s390x/spapr.

Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20200819144016.281156-1-cohuck@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2020-08-19 10:45:48 -04:00
Hogan Wang
2ebc21216f hw/pci-host: save/restore pci host config register
The pci host config register is used to save PCI address for
read/write config data. If guest writes a value to config register,
and then QEMU pauses the vcpu to migrate, after the migration, the guest
will continue to write pci config data, and the write data will be ignored
because of new qemu process losing the config register state.

To trigger the bug:
1. guest is booting in seabios.
2. guest enables the SMRAM in seabios:piix4_apmc_smm_setup, and then
   expects to disable the SMRAM by pci_config_writeb.
3. after guest writes the pci host config register, QEMU pauses vcpu
   to finish migration.
4. guest write of config data(0x0A) fails to disable the SMRAM because
   the config register state is lost.
5. guest continues to boot and crashes in ipxe option ROM due to SMRAM
   in enabled state.

Example Reproducer:

step 1. Make modifications to seabios and qemu for increase reproduction
efficiency, write 0xf0 to 0x402 port notify qemu to stop vcpu after
0x0cf8 port wrote i440 configure register. qemu stop vcpu when catch
0x402 port wrote 0xf0.

seabios:/src/hw/pci.c
@@ -52,6 +52,11 @@ void pci_config_writeb(u16 bdf, u32 addr, u8 val)
         writeb(mmconfig_addr(bdf, addr), val);
     } else {
         outl(ioconfig_cmd(bdf, addr), PORT_PCI_CMD);
+       if (bdf == 0 && addr == 0x72 && val == 0xa) {
+            dprintf(1, "stop vcpu\n");
+            outb(0xf0, 0x402); // notify qemu to stop vcpu
+            dprintf(1, "resume vcpu\n");
+        }
         outb(val, PORT_PCI_DATA + (addr & 3));
     }
 }

qemu:hw/char/debugcon.c
@@ -60,6 +61,9 @@ static void debugcon_ioport_write(void *opaque, hwaddr addr, uint64_t val,
     printf(" [debugcon: write addr=0x%04" HWADDR_PRIx " val=0x%02" PRIx64 "]\n", addr, val);
 #endif

+    if (ch == 0xf0) {
+        vm_stop(RUN_STATE_PAUSED);
+    }
     /* XXX this blocks entire thread. Rewrite to use
      * qemu_chr_fe_write and background I/O callbacks */
     qemu_chr_fe_write_all(&s->chr, &ch, 1);

step 2. start vm1 by the following command line, and then vm stopped.
$ qemu-system-x86_64 -machine pc-i440fx-5.0,accel=kvm\
 -netdev tap,ifname=tap-test,id=hostnet0,vhost=on,downscript=no,script=no\
 -device virtio-net-pci,netdev=hostnet0,id=net0,bus=pci.0,addr=0x13,bootindex=3\
 -device cirrus-vga,id=video0,vgamem_mb=16,bus=pci.0,addr=0x2\
 -chardev file,id=seabios,path=/var/log/test.seabios,append=on\
 -device isa-debugcon,iobase=0x402,chardev=seabios\
 -monitor stdio

step 3. start vm2 to accept vm1 state.
$ qemu-system-x86_64 -machine pc-i440fx-5.0,accel=kvm\
 -netdev tap,ifname=tap-test1,id=hostnet0,vhost=on,downscript=no,script=no\
 -device virtio-net-pci,netdev=hostnet0,id=net0,bus=pci.0,addr=0x13,bootindex=3\
 -device cirrus-vga,id=video0,vgamem_mb=16,bus=pci.0,addr=0x2\
 -chardev file,id=seabios,path=/var/log/test.seabios,append=on\
 -device isa-debugcon,iobase=0x402,chardev=seabios\
 -monitor stdio \
 -incoming tcp:127.0.0.1:8000

step 4. execute the following qmp command in vm1 to migrate.
(qemu) migrate tcp:127.0.0.1:8000

step 5. execute the following qmp command in vm2 to resume vcpu.
(qemu) cont
Before this patch, we get KVM "emulation failure" error on vm2.
This patch fixes it.

Cc: qemu-stable@nongnu.org
Signed-off-by: Hogan Wang <hogan.wang@huawei.com>
Message-Id: <20200727084621.3279-1-hogan.wang@huawei.com>
Reported-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-07-27 10:24:39 -04:00
Markus Armbruster
7a309cc95b qom: Change object_get_canonical_path_component() not to malloc
object_get_canonical_path_component() returns a malloced copy of a
property name on success, null on failure.

19 of its 25 callers immediately free the returned copy.

Change object_get_canonical_path_component() to return the property
name directly.  Since modifying the name would be wrong, adjust the
return type to const char *.

Drop the free from the 19 callers become simpler, add the g_strdup()
to the other six.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20200714160202.3121879-4-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
2020-07-21 16:23:43 +02:00
Markus Armbruster
668f62ec62 error: Eliminate error_propagate() with Coccinelle, part 1
When all we do with an Error we receive into a local variable is
propagating to somewhere else, we can just as well receive it there
right away.  Convert

    if (!foo(..., &err)) {
        ...
        error_propagate(errp, err);
        ...
        return ...
    }

to

    if (!foo(..., errp)) {
        ...
        ...
        return ...
    }

where nothing else needs @err.  Coccinelle script:

    @rule1 forall@
    identifier fun, err, errp, lbl;
    expression list args, args2;
    binary operator op;
    constant c1, c2;
    symbol false;
    @@
         if (
    (
    -        fun(args, &err, args2)
    +        fun(args, errp, args2)
    |
    -        !fun(args, &err, args2)
    +        !fun(args, errp, args2)
    |
    -        fun(args, &err, args2) op c1
    +        fun(args, errp, args2) op c1
    )
            )
         {
             ... when != err
                 when != lbl:
                 when strict
    -        error_propagate(errp, err);
             ... when != err
    (
             return;
    |
             return c2;
    |
             return false;
    )
         }

    @rule2 forall@
    identifier fun, err, errp, lbl;
    expression list args, args2;
    expression var;
    binary operator op;
    constant c1, c2;
    symbol false;
    @@
    -    var = fun(args, &err, args2);
    +    var = fun(args, errp, args2);
         ... when != err
         if (
    (
             var
    |
             !var
    |
             var op c1
    )
            )
         {
             ... when != err
                 when != lbl:
                 when strict
    -        error_propagate(errp, err);
             ... when != err
    (
             return;
    |
             return c2;
    |
             return false;
    |
             return var;
    )
         }

    @depends on rule1 || rule2@
    identifier err;
    @@
    -    Error *err = NULL;
         ... when != err

Not exactly elegant, I'm afraid.

The "when != lbl:" is necessary to avoid transforming

         if (fun(args, &err)) {
             goto out
         }
         ...
     out:
         error_propagate(errp, err);

even though other paths to label out still need the error_propagate().
For an actual example, see sclp_realize().

Without the "when strict", Coccinelle transforms vfio_msix_setup(),
incorrectly.  I don't know what exactly "when strict" does, only that
it helps here.

The match of return is narrower than what I want, but I can't figure
out how to express "return where the operand doesn't use @err".  For
an example where it's too narrow, see vfio_intx_enable().

Silently fails to convert hw/arm/armsse.c, because Coccinelle gets
confused by ARMSSE being used both as typedef and function-like macro
there.  Converted manually.

Line breaks tidied up manually.  One nested declaration of @local_err
deleted manually.  Preexisting unwanted blank line dropped in
hw/riscv/sifive_e.c.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200707160613.848843-35-armbru@redhat.com>
2020-07-10 15:18:08 +02:00
Markus Armbruster
62a35aaa31 qapi: Use returned bool to check for failure, Coccinelle part
The previous commit enables conversion of

    visit_foo(..., &err);
    if (err) {
        ...
    }

to

    if (!visit_foo(..., errp)) {
        ...
    }

for visitor functions that now return true / false on success / error.
Coccinelle script:

    @@
    identifier fun =~ "check_list|input_type_enum|lv_start_struct|lv_type_bool|lv_type_int64|lv_type_str|lv_type_uint64|output_type_enum|parse_type_bool|parse_type_int64|parse_type_null|parse_type_number|parse_type_size|parse_type_str|parse_type_uint64|print_type_bool|print_type_int64|print_type_null|print_type_number|print_type_size|print_type_str|print_type_uint64|qapi_clone_start_alternate|qapi_clone_start_list|qapi_clone_start_struct|qapi_clone_type_bool|qapi_clone_type_int64|qapi_clone_type_null|qapi_clone_type_number|qapi_clone_type_str|qapi_clone_type_uint64|qapi_dealloc_start_list|qapi_dealloc_start_struct|qapi_dealloc_type_anything|qapi_dealloc_type_bool|qapi_dealloc_type_int64|qapi_dealloc_type_null|qapi_dealloc_type_number|qapi_dealloc_type_str|qapi_dealloc_type_uint64|qobject_input_check_list|qobject_input_check_struct|qobject_input_start_alternate|qobject_input_start_list|qobject_input_start_struct|qobject_input_type_any|qobject_input_type_bool|qobject_input_type_bool_keyval|qobject_input_type_int64|qobject_input_type_int64_keyval|qobject_input_type_null|qobject_input_type_number|qobject_input_type_number_keyval|qobject_input_type_size_keyval|qobject_input_type_str|qobject_input_type_str_keyval|qobject_input_type_uint64|qobject_input_type_uint64_keyval|qobject_output_start_list|qobject_output_start_struct|qobject_output_type_any|qobject_output_type_bool|qobject_output_type_int64|qobject_output_type_null|qobject_output_type_number|qobject_output_type_str|qobject_output_type_uint64|start_list|visit_check_list|visit_check_struct|visit_start_alternate|visit_start_list|visit_start_struct|visit_type_.*";
    expression list args;
    typedef Error;
    Error *err;
    @@
    -    fun(args, &err);
    -    if (err)
    +    if (!fun(args, &err))
         {
             ...
         }

A few line breaks tidied up manually.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200707160613.848843-19-armbru@redhat.com>
2020-07-10 15:18:08 +02:00
Paolo Bonzini
f983ff95f4 vmport: move compat properties to hw_compat_5_0
The patches that introduced the properties were submitted when QEMU 5.0
had not been released yet, so they got merged under the wrong heading.
Move them to hw_compat_5_0 so that 5.0 machine types get the pre-patch
behavior.

Fixes: b889212973 ("hw/i386/vmport: Propagate IOPort read to vCPU EAX register")
Fixes: 0342ee761e ("hw/i386/vmport: Set EAX to -1 on failed and unsupported commands")
Fixes: f8bdc55037 ("hw/i386/vmport: Report vmware-vmx-type in CMD_GETVERSION")
Fixes: aaacf1c15a ("hw/i386/vmport: Add support for CMD_GETBIOSUUID")
Reported-by: Laurent Vivier <lvivier@redhat.com>
Cc: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-26 09:39:40 -04:00
Peter Maydell
7d3660e798 * Miscellaneous fixes and feature enablement (many)
* SEV refactoring (David)
 * Hyper-V initial support (Jon)
 * i386 TCG fixes (x87 and SSE, Joseph)
 * vmport cleanup and improvements (Philippe, Liran)
 * Use-after-free with vCPU hot-unplug (Nengyuan)
 * run-coverity-scan improvements (myself)
 * Record/replay fixes (Pavel)
 * -machine kernel_irqchip=split improvements for INTx (Peter)
 * Code cleanups (Philippe)
 * Crash and security fixes (PJP)
 * HVF cleanups (Roman)
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl7jpdAUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroMfjwf/X7+0euuE9dwKFKDDMmIi+4lRWnq7
 gSOyE1BYSfDIUXRIukf64konXe0VpiotNYlyEaYnnQjkMdGm5E9iXKF+LgEwXj/t
 NSGkfj5J3VeWRG4JJp642CSN/aZWO8uzkenld3myCnu6TicuN351tDJchiFwAk9f
 wsXtgLKd67zE8MLVt8AP0rNTbzMHttPXnPaOXDCuwjMHNvMEKnC93UeOeM0M4H5s
 3Dl2HvsNWZ2SzUG9mAbWp0bWWuoIb+Ep9//87HWANvb7Z8jratRws18i6tYt1sPx
 8zOnUS87sVnh1CQlXBDd9fEcqBUVgR9pAlqaaYavNhFp5eC31euvpDU8Iw==
 =F4sU
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* Miscellaneous fixes and feature enablement (many)
* SEV refactoring (David)
* Hyper-V initial support (Jon)
* i386 TCG fixes (x87 and SSE, Joseph)
* vmport cleanup and improvements (Philippe, Liran)
* Use-after-free with vCPU hot-unplug (Nengyuan)
* run-coverity-scan improvements (myself)
* Record/replay fixes (Pavel)
* -machine kernel_irqchip=split improvements for INTx (Peter)
* Code cleanups (Philippe)
* Crash and security fixes (PJP)
* HVF cleanups (Roman)

# gpg: Signature made Fri 12 Jun 2020 16:57:04 BST
# gpg:                using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg:                issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (116 commits)
  target/i386: Remove obsolete TODO file
  stubs: move Xen stubs to accel/
  replay: fix replay shutdown for console mode
  exec/cpu-common: Move MUSB specific typedefs to 'hw/usb/hcd-musb.h'
  hw/usb: Move device-specific declarations to new 'hcd-musb.h' header
  exec/memory: Remove unused MemoryRegionMmio type
  checkpatch: reversed logic with acpi test checks
  target/i386: sev: Unify SEVState and SevGuestState
  target/i386: sev: Remove redundant handle field
  target/i386: sev: Remove redundant policy field
  target/i386: sev: Remove redundant cbitpos and reduced_phys_bits fields
  target/i386: sev: Partial cleanup to sev_state global
  target/i386: sev: Embed SEVState in SevGuestState
  target/i386: sev: Rename QSevGuestInfo
  target/i386: sev: Move local structure definitions into .c file
  target/i386: sev: Remove unused QSevGuestInfoClass
  xen: fix build without pci passthrough
  i386: hvf: Drop HVFX86EmulatorState
  i386: hvf: Move mmio_buf into CPUX86State
  i386: hvf: Move lazy_flags into CPUX86State
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

# Conflicts:
#	hw/i386/acpi-build.c
2020-06-12 23:06:22 +01:00
Liran Alon
aaacf1c15a hw/i386/vmport: Add support for CMD_GETBIOSUUID
This is VMware documented functionallity that some guests rely on.
Returns the BIOS UUID of the current virtual machine.

Note that we also introduce a new compatability flag "x-cmds-v2" to
make sure to expose new VMPort commands only to new machine-types.
This flag will also be used by the following patches that will introduce
additional VMPort commands.

Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Message-Id: <20200312165431.82118-10-liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-10 12:09:47 -04:00
Liran Alon
f8bdc55037 hw/i386/vmport: Report vmware-vmx-type in CMD_GETVERSION
As can be seen from VmCheck_GetVersion() in open-vm-tools code,
CMD_GETVERSION should return vmware-vmx-type in ECX register.

Default is to fake host as VMware ESX server. But user can control
this value by "-global vmport.vmware-vmx-type=X".

Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Message-Id: <20200312165431.82118-7-liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-10 12:09:45 -04:00
Liran Alon
0342ee761e hw/i386/vmport: Set EAX to -1 on failed and unsupported commands
This is used as a signal for VMware Tools to know if a command it
attempted to invoke, failed or is unsupported. As a result, VMware Tools
will either report failure to user or fallback to another backdoor command
in attempt to perform some operation.

A few examples:
* open-vm-tools TimeSyncReadHost() function fallbacks to
CMD_GETTIMEFULL command when CMD_GETTIMEFULL_WITH_LAG
fails/unsupported.
* open-vm-tools Hostinfo_NestingSupported() function verifies
EAX != -1 to check for success.
* open-vm-tools Hostinfo_VCPUInfoBackdoor() functions checks
if reserved-bit is set to indicate command is unimplemented.

Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Message-Id: <20200312165431.82118-5-liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-10 12:09:44 -04:00
Liran Alon
b889212973 hw/i386/vmport: Propagate IOPort read to vCPU EAX register
vmport_ioport_read() returns the value that should propagate to vCPU EAX
register when guest reads VMPort IOPort (i.e. By x86 IN instruction).

However, because vmport_ioport_read() calls cpu_synchronize_state(), the
returned value gets overridden by the value in QEMU vCPU EAX register.
i.e. cpu->env.regs[R_EAX].

To fix this issue, change vmport_ioport_read() to explicitly override
cpu->env.regs[R_EAX] with the value it wish to propagate to vCPU EAX
register.

Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Message-Id: <20200312165431.82118-4-liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-10 12:09:43 -04:00
Alexander Duyck
7483cbbaf8 virtio-balloon: Implement support for page poison reporting feature
We need to make certain to advertise support for page poison reporting if
we want to actually get data on if the guest will be poisoning pages.

Add a value for reporting the poison value being used if page poisoning is
enabled in the guest. With this we can determine if we will need to skip
free page reporting when it is enabled in the future.

The value currently has no impact on existing balloon interfaces. In the
case of existing balloon interfaces the onus is on the guest driver to
reapply whatever poison is in place.

When we add free page reporting the poison value is used to determine if
we can perform in-place page reporting. The expectation is that a reported
page will already contain the value specified by the poison, and the
reporting of the page should not change that value.

Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Message-Id: <20200527041400.12700.33251.stgit@localhost.localdomain>
2020-06-09 14:18:04 -04:00
Markus Armbruster
d2623129a7 qom: Drop parameter @errp of object_property_add() & friends
The only way object_property_add() can fail is when a property with
the same name already exists.  Since our property names are all
hardcoded, failure is a programming error, and the appropriate way to
handle it is passing &error_abort.

Same for its variants, except for object_property_add_child(), which
additionally fails when the child already has a parent.  Parentage is
also under program control, so this is a programming error, too.

We have a bit over 500 callers.  Almost half of them pass
&error_abort, slightly fewer ignore errors, one test case handles
errors, and the remaining few callers pass them to their own callers.

The previous few commits demonstrated once again that ignoring
programming errors is a bad idea.

Of the few ones that pass on errors, several violate the Error API.
The Error ** argument must be NULL, &error_abort, &error_fatal, or a
pointer to a variable containing NULL.  Passing an argument of the
latter kind twice without clearing it in between is wrong: if the
first call sets an error, it no longer points to NULL for the second
call.  ich9_pm_add_properties(), sparc32_ledma_realize(),
sparc32_dma_realize(), xilinx_axidma_realize(), xilinx_enet_realize()
are wrong that way.

When the one appropriate choice of argument is &error_abort, letting
users pick the argument is a bad idea.

Drop parameter @errp and assert the preconditions instead.

There's one exception to "duplicate property name is a programming
error": the way object_property_add() implements the magic (and
undocumented) "automatic arrayification".  Don't drop @errp there.
Instead, rename object_property_add() to object_property_try_add(),
and add the obvious wrapper object_property_add().

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20200505152926.18877-15-armbru@redhat.com>
[Two semantic rebase conflicts resolved]
2020-05-15 07:07:58 +02:00
Markus Armbruster
7eecec7d12 qom: Drop object_property_set_description() parameter @errp
object_property_set_description() and
object_class_property_set_description() fail only when property @name
is not found.

There are 85 calls of object_property_set_description() and
object_class_property_set_description().  None of them can fail:

* 84 immediately follow the creation of the property.

* The one in spapr_rng_instance_init() refers to a property created in
  spapr_rng_class_init(), from spapr_rng_properties[].

Every one of them still gets to decide what to pass for @errp.

51 calls pass &error_abort, 32 calls pass NULL, one receives the error
and propagates it to &error_abort, and one propagates it to
&error_fatal.  I'm actually surprised none of them violates the Error
API.

What are we gaining by letting callers handle the "property not found"
error?  Use when the property is not known to exist is simpler: you
don't have to guard the call with a check.  We haven't found such a
use in 5+ years.  Until we do, let's make life a bit simpler and drop
the @errp parameter.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20200505152926.18877-8-armbru@redhat.com>
[One semantic rebase conflict resolved]
2020-05-15 07:06:49 +02:00
Cornelia Huck
541aaa1df8 hw: add compat machines for 5.1
Add 5.1 machine types for arm/i440fx/q35/s390x/spapr.

Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Message-id: 20200429144605.7262-1-cohuck@redhat.com
2020-05-06 10:12:16 -04:00
Shameer Kolothum
394f0f72fd fw_cfg: Migrate ACPI table mr sizes separately
Any sub-page size update to ACPI MRs will be lost during
migration, as we use aligned size in ram_load_precopy() ->
qemu_ram_resize() path. This will result in inconsistency in
FWCfgEntry sizes between source and destination. In order to avoid
this, save and restore them separately during migration.

Up until now, this problem may not be that relevant for x86 as both
ACPI table and Linker MRs gets padded and aligned. Also at present,
qemu_ram_resize() doesn't invoke callback to update FWCfgEntry for
unaligned size changes. But since we are going to fix the
qemu_ram_resize() in the subsequent patch, the issue may become
more serious especially for RSDP MR case.

Moreover, the issue will soon become prominent in arm/virt as well
where the MRs are not padded or aligned at all and eventually have
acpi table changes as part of future additions like NVDIMM hot-add
feature.

Suggested-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Acked-by: David Hildenbrand <david@redhat.com>
Message-Id: <20200403101827.30664-3-shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-04-13 06:55:54 -04:00
Cornelia Huck
02501fc393 compat: disable edid on correct virtio-gpu device
Commit bb15791166 ("compat: disable edid on virtio-gpu base
device") tried to disable 'edid' on the virtio-gpu base device.
However, that device is not 'virtio-gpu', but 'virtio-gpu-device'.
Fix it.

Fixes: bb15791166 ("compat: disable edid on virtio-gpu base device")
Reported-by: Lukáš Doktor <ldoktor@redhat.com>
Tested-by: Lukáš Doktor <ldoktor@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Message-id: 20200318093919.24942-1-cohuck@redhat.com
Cc: qemu-stable@nongnu.org
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2020-03-20 07:50:52 +01:00
Babu Moger
8cb30e3aec machine: Add SMP Sockets in CpuTopology
Store the  smp sockets in CpuTopology. The socket information required to
build the apic id in EPYC mode. Right now socket information is not passed
to down when decoding the apic id. Add the socket information here.

Signed-off-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <158396718647.58170.2278448323151215741.stgit@naples-babu.amd.com>
2020-03-17 19:48:10 -04:00
Dr. David Alan Gilbert
4ba59be1d6 machine/memory encryption: Disable mem merge
When a host is running with memory encryption, the memory isn't visible
to the host kernel; attempts to merge that memory are futile because
what it's really comparing is encrypted memory, usually encrypted
with different keys.

Automatically turn mem-merge off when memory encryption is specified.

https://bugzilla.redhat.com/show_bug.cgi?id=1796356

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20200130175046.85850-1-dgilbert@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2020-03-17 19:48:10 -04:00
Paolo Bonzini
ca6155c0f2 Merge tag 'patchew/20200219160953.13771-1-imammedo@redhat.com' of https://github.com/patchew-project/qemu into HEAD
This series removes ad hoc RAM allocation API (memory_region_allocate_system_memory)
and consolidates it around hostmem backend. It allows to

* resolve conflicts between global -mem-prealloc and hostmem's "policy" option,
  fixing premature allocation before binding policy is applied

* simplify complicated memory allocation routines which had to deal with 2 ways
  to allocate RAM.

* reuse hostmem backends of a choice for main RAM without adding extra CLI
  options to duplicate hostmem features.  A recent case was -mem-shared, to
  enable vhost-user on targets that don't support hostmem backends [1] (ex: s390)

* move RAM allocation from individual boards into generic machine code and
  provide them with prepared MemoryRegion.

* clean up deprecated NUMA features which were tied to the old API (see patches)
  - "numa: remove deprecated -mem-path fallback to anonymous RAM"
  - (POSTPONED, waiting on libvirt side) "forbid '-numa node,mem' for 5.0 and newer machine types"
  - (POSTPONED) "numa: remove deprecated implicit RAM distribution between nodes"

Introduce a new machine.memory-backend property and wrapper code that aliases
global -mem-path and -mem-alloc into automatically created hostmem backend
properties (provided memory-backend was not set explicitly given by user).
A bulk of trivial patches then follow to incrementally convert individual
boards to using machine.memory-backend provided MemoryRegion.

Board conversion typically involves:

* providing MachineClass::default_ram_size and MachineClass::default_ram_id
  so generic code could create default backend if user didn't explicitly provide
  memory-backend or -m options

* dropping memory_region_allocate_system_memory() call

* using convenience MachineState::ram MemoryRegion, which points to MemoryRegion
   allocated by ram-memdev

On top of that for some boards:

* missing ram_size checks are added (typically it were boards with fixed ram size)

* ram_size fixups are replaced by checks and hard errors, forcing user to
  provide correct "-m" values instead of ignoring it and continuing running.

After all boards are converted, the old API is removed and memory allocation
routines are cleaned up.
2020-02-25 09:19:00 +01:00
Denis Plotnikov
c9b7d9ec21 virtio: increase virtqueue size for virtio-scsi and virtio-blk
The goal is to reduce the amount of requests issued by a guest on
1M reads/writes. This rises the performance up to 4% on that kind of
disk access pattern.

The maximum chunk size to be used for the guest disk accessing is
limited with seg_max parameter, which represents the max amount of
pices in the scatter-geather list in one guest disk request.

Since seg_max is virqueue_size dependent, increasing the virtqueue
size increases seg_max, which, in turn, increases the maximum size
of data to be read/write from a guest disk.

More details in the original problem statment:
https://lists.gnu.org/archive/html/qemu-devel/2017-12/msg03721.html

Suggested-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
Message-id: 20200214074648.958-1-dplotnikov@virtuozzo.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-02-22 08:26:47 +00:00
Igor Mammedov
82b911aaff machine: introduce convenience MachineState::ram
the new field will be used by boards to get access to main
RAM memory region and will help to save boiler plate in
boards which often introduce a field or variable just for
this purpose.

Memory region will be equivalent to what currently used
memory_region_allocate_system_memory() is returning apart
from that it will come from hostmem backend.
Followup patches will incrementally switch boards to using
RAM from MachineState::ram.

Patch takes care of non-NUMA case and follow up patch will
initialize MachineState::ram for NUMA case.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20200219160953.13771-5-imammedo@redhat.com>
2020-02-19 16:49:53 +00:00
Igor Mammedov
aa8b183974 machine: introduce memory-backend property
Property will contain link to memory backend that will be
used for backing initial RAM.
Follow up commit will alias -mem-path and -mem-prealloc
CLI options into memory backend options to make memory
handling consistent (using only hostmem backend family
for guest RAM allocation).

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200219160953.13771-3-imammedo@redhat.com>
2020-02-19 16:49:53 +00:00
Gerd Hoffmann
ed71c09ffd qxl: introduce hardware revision 5
The only difference to hardware revision 4 is that the device doesn't
switch to VGA mode in case someone happens to touch a VGA register,
which should make things more robust in configurations with multiple
vga devices.

Swtiching back to VGA mode happens on reset, either full machine
reset or qxl device reset (QXL_IO_RESET ioport command).

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-id: 20200206074358.4274-1-kraxel@redhat.com
2020-02-13 08:31:40 +01:00
Philippe Mathieu-Daudé
11a18c84db hw/core: Allow setting 'virtio-blk-device.scsi' property on OSX host
Commit ed65fd1a27 ("virtio-blk: switch off scsi-passthrough by
default") changed the default value of the 'scsi' property of
virtio-blk, which is only available on Linux hosts. It also added
an unconditional compat entry for 2.4 or earlier machines.

Trying to set this property on a pre-2.5 machine on OSX, we get:

   Unexpected error in object_property_find() at qom/object.c:1201:
   qemu-system-x86_64: -device virtio-blk-pci,id=scsi0,drive=drive0: can't apply global virtio-blk-device.scsi=true: Property '.scsi' not found

Fix this error by marking the property optional.

Fixes: ed65fd1a27 ("virtio-blk: switch off scsi-passthrough by default")
Suggested-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 20200207001404.1739-1-philmd@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-02-07 16:49:39 +00:00
Yuri Benditovich
32187f3d90 usb-redir: remove 'remote wakeup' flag from configuration descriptor
If the redirected device has this capability, Windows guest may
place the device into D2 and expect it to wake when the device
becomes active, but this will never happen. For example, when
internal Bluetooth adapter is redirected, keyboards and mice
connected to it do not work. Current commit removes this
capability (starting from machine 5.0)
Set 'usb-redir.suppress-remote-wake' property to 'off' to keep
'remote wake' as is or to 'on' to remove 'remote wake' on
4.2 or earlier.

Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com>
Message-id: 20200108091044.18055-3-yuri.benditovich@daynix.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2020-01-13 09:17:31 +01:00
Yuri Benditovich
7bacaf5fea usb-host: remove 'remote wakeup' flag from configuration descriptor
If the redirected device has this capability, Windows guest may
place the device into D2 and expect it to wake when the device
becomes active, but this will never happen. For example, when
internal Bluetooth adapter is redirected, keyboards and mice
connected to it do not work. Current commit removes this
capability (starting from machine 5.0)
Set 'usb-host.suppress-remote-wake' property to 'off' to keep
'remote wake' as is or to 'on' to remove 'remote wake' on
4.2 or earlier.

Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com>
Message-id: 20200108091044.18055-2-yuri.benditovich@daynix.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2020-01-13 09:17:31 +01:00
Peter Maydell
973d306dd6 virtio, pci, pc: fixes, features
Bugfixes all over the place.
 HMAT support.
 New flags for vhost-user-blk utility.
 Auto-tuning of seg max for virtio storage.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAl4TaMEPHG1zdEByZWRo
 YXQuY29tAAoJECgfDbjSjVRpvzgH/2LyDAzCa9h93ikSJjmyUk5FUaqve38daEb3
 S3JYjwKxQx7u1ydooKhvBQnBCZ2i3S+k62gfYyKB+nBv8xvjs0Eg5D1YJ5E8hciy
 lf5OFGWWtX2iPDjZwQwT13kiJe0o3JRGxJJ6XqTEG+1EYOp7cky/FEv4PD030b9m
 I2wROZ/Am+onB9YJX8c0Vv1CG+AryuJNXnvwQzTXEjj4U7bEYUyJwVZaCRyAdWQ3
 uYXIZN9VwjVX6BFvy9ZAJbEsUVJvOM1/aQaDqcrLz+VlzRT7bRkKHi2G3vakrm1I
 r5OpgyLo84132awCncbSykKDH5o8WaxLaJBjGmuBfasMz9wPzAg=
 =uL1o
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging

virtio, pci, pc: fixes, features

Bugfixes all over the place.
HMAT support.
New flags for vhost-user-blk utility.
Auto-tuning of seg max for virtio storage.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Mon 06 Jan 2020 17:05:05 GMT
# gpg:                using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg:                issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream: (32 commits)
  intel_iommu: add present bit check for pasid table entries
  intel_iommu: a fix to vtd_find_as_from_bus_num()
  virtio-net: delete also control queue when TX/RX deleted
  virtio: reset region cache when on queue deletion
  virtio-mmio: update queue size on guest write
  tests: add virtio-scsi and virtio-blk seg_max_adjust test
  virtio: make seg_max virtqueue size dependent
  hw: fix using 4.2 compat in 5.0 machine types for i440fx/q35
  vhost-user-scsi: reset the device if supported
  vhost-user: add VHOST_USER_RESET_DEVICE to reset devices
  hw/pci/pci_host: Let pci_data_[read/write] use unsigned 'size' argument
  hw/pci/pci_host: Remove redundant PCI_DPRINTF()
  virtio-mmio: Clear v2 transport state on soft reset
  ACPI: add expected files for HMAT tests (acpihmat)
  tests/bios-tables-test: add test cases for ACPI HMAT
  tests/numa: Add case for QMP build HMAT
  hmat acpi: Build Memory Side Cache Information Structure(s)
  hmat acpi: Build System Locality Latency and Bandwidth Information Structure(s)
  hmat acpi: Build Memory Proximity Domain Attributes Structure(s)
  numa: Extend CLI to provide memory side cache information
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-01-07 16:25:00 +00:00
Denis Plotnikov
1bf8a989a5 virtio: make seg_max virtqueue size dependent
Before the patch, seg_max parameter was immutable and hardcoded
to 126 (128 - 2) without respect to queue size. This has two negative effects:

1. when queue size is < 128, we have Virtio 1.1 specfication violation:
   (2.6.5.3.1 Driver Requirements) seq_max must be <= queue_size.
   This violation affects the old Linux guests (ver < 4.14). These guests
   crash on these queue_size setups.

2. when queue_size > 128, as was pointed out by Denis Lunev <den@virtuozzo.com>,
   seg_max restrics guest's block request length which affects guests'
   performance making them issues more block request than needed.
   https://lists.gnu.org/archive/html/qemu-devel/2017-12/msg03721.html

To mitigate this two effects, the patch adds the property adjusting seg_max
to queue size automaticaly. Since seg_max is a guest visible parameter,
the property is machine type managable and allows to choose between
old (seg_max = 126 always) and new (seg_max = queue_size - 2) behaviors.

Not to change the behavior of the older VMs, prevent setting the default
seg_max_adjust value for older machine types.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
Message-Id: <20191220140905.1718-2-dplotnikov@virtuozzo.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-01-06 12:04:43 -05:00
Peter Maydell
6fb0dae9ef x86 and machine queue, 2019-12-20
Bug fix:
 * Resolve CPU models to v1 by default (Eduardo Habkost)
 
 Cleanup:
 * Remove incorrect numa_mem_supported checks (Igor Mammedov)
 -----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEEWjIv1avE09usz9GqKAeTb5hNxaYFAl39HqYUHGVoYWJrb3N0
 QHJlZGhhdC5jb20ACgkQKAeTb5hNxabHeBAAkybU8+JzzqXoG9e16MiQiUQ0vSy9
 MFkWIlsD5RCncdlI7s7yyuPUa7GEkJztRxzanvP2BcbMvHHpaM01EgOsZuZfld8Z
 R6lQaTZdAC4XQFPmD14ccIQ/r8cDUXRfUhasKXq3tNQdXORUw5/T9XHwyn3kvHUT
 /nEglWdUG0LmRQMQNRpbSgQ4B0jx+RwRg6KLGRm/mqlwiFV8nULLB8IYDMrxHSu3
 iY/PAFOMqQbRbbDQ7rK3l7u0TyRTB41FTx8s2eT9Is2V3HZU9P9lbWPQBMnxPwxm
 VYo/LVO6smZ9gZbyCcPZtOn95ay5gGk+fQ9Twg6/l1tHsK7vmNxn8Z3y+QWEvJ30
 BnOJ2Y0RaFNBDrhiIqJu12Lp0nJXMDi96tAS71hqwsJssjzLYSpD/faoKO0vDyR9
 RLoumrXLcrgeMopRKsft8ZkJIakHlXc+85AuIMZ9obhcz4liC7r/IbjOqOumKTPN
 8feLmzqdldAmh0jvJCfyu1n4qhH4KUPPrFxOvZfuzdWkvSUbcJSkQaPwYxxQaFvo
 9jRHwNNF4MTnImgQIw59ao/u6JXVM+4oY5dc+BjeGTefQKuwRRvT/54Z+v7jULwK
 ZKGlLnCRlYeD/U+67iBIeV2nrRM7pTkcsTWmhX+/u2pwyKpmiA4quG63KmR7dyDK
 6HqJez6jOKTEARU=
 =psTk
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/ehabkost/tags/x86-and-machine-pull-request' into staging

x86 and machine queue, 2019-12-20

Bug fix:
* Resolve CPU models to v1 by default (Eduardo Habkost)

Cleanup:
* Remove incorrect numa_mem_supported checks (Igor Mammedov)

# gpg: Signature made Fri 20 Dec 2019 19:19:02 GMT
# gpg:                using RSA key 5A322FD5ABC4D3DBACCFD1AA2807936F984DC5A6
# gpg:                issuer "ehabkost@redhat.com"
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>" [full]
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF  D1AA 2807 936F 984D C5A6

* remotes/ehabkost/tags/x86-and-machine-pull-request:
  numa: properly check if numa is supported
  numa: remove not needed check
  i386: Resolve CPU models to v1 by default

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-01-06 14:08:04 +00:00
Tao Xu
244b3f4485 numa: Extend CLI to provide initiator information for numa nodes
In ACPI 6.3 chapter 5.2.27 Heterogeneous Memory Attribute Table (HMAT),
The initiator represents processor which access to memory. And in 5.2.27.3
Memory Proximity Domain Attributes Structure, the attached initiator is
defined as where the memory controller responsible for a memory proximity
domain. With attached initiator information, the topology of heterogeneous
memory can be described. Add new machine property 'hmat' to enable all
HMAT specific options.

Extend CLI of "-numa node" option to indicate the initiator numa node-id.
In the linux kernel, the codes in drivers/acpi/hmat/hmat.c parse and report
the platform's HMAT tables. Before using initiator option, enable HMAT with
-machine hmat=on.

Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Jingqi Liu <jingqi.liu@intel.com>
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20191213011929.2520-2-tao3.xu@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-01-05 07:03:03 -05:00
Michael Roth
9d7bd0826f virtio-pci: disable vring processing when bus-mastering is disabled
Currently the SLOF firmware for pseries guests will disable/re-enable
a PCI device multiple times via IO/MEM/MASTER bits of PCI_COMMAND
register after the initial probe/feature negotiation, as it tends to
work with a single device at a time at various stages like probing
and running block/network bootloaders without doing a full reset
in-between.

In QEMU, when PCI_COMMAND_MASTER is disabled we disable the
corresponding IOMMU memory region, so DMA accesses (including to vring
fields like idx/flags) will no longer undergo the necessary
translation. Normally we wouldn't expect this to happen since it would
be misbehavior on the driver side to continue driving DMA requests.

However, in the case of pseries, with iommu_platform=on, we trigger the
following sequence when tearing down the virtio-blk dataplane ioeventfd
in response to the guest unsetting PCI_COMMAND_MASTER:

  #2  0x0000555555922651 in virtqueue_map_desc (vdev=vdev@entry=0x555556dbcfb0, p_num_sg=p_num_sg@entry=0x7fffe657e1a8, addr=addr@entry=0x7fffe657e240, iov=iov@entry=0x7fffe6580240, max_num_sg=max_num_sg@entry=1024, is_write=is_write@entry=false, pa=0, sz=0)
      at /home/mdroth/w/qemu.git/hw/virtio/virtio.c:757
  #3  0x0000555555922a89 in virtqueue_pop (vq=vq@entry=0x555556dc8660, sz=sz@entry=184)
      at /home/mdroth/w/qemu.git/hw/virtio/virtio.c:950
  #4  0x00005555558d3eca in virtio_blk_get_request (vq=0x555556dc8660, s=0x555556dbcfb0)
      at /home/mdroth/w/qemu.git/hw/block/virtio-blk.c:255
  #5  0x00005555558d3eca in virtio_blk_handle_vq (s=0x555556dbcfb0, vq=0x555556dc8660)
      at /home/mdroth/w/qemu.git/hw/block/virtio-blk.c:776
  #6  0x000055555591dd66 in virtio_queue_notify_aio_vq (vq=vq@entry=0x555556dc8660)
      at /home/mdroth/w/qemu.git/hw/virtio/virtio.c:1550
  #7  0x000055555591ecef in virtio_queue_notify_aio_vq (vq=0x555556dc8660)
      at /home/mdroth/w/qemu.git/hw/virtio/virtio.c:1546
  #8  0x000055555591ecef in virtio_queue_host_notifier_aio_poll (opaque=0x555556dc86c8)
      at /home/mdroth/w/qemu.git/hw/virtio/virtio.c:2527
  #9  0x0000555555d02164 in run_poll_handlers_once (ctx=ctx@entry=0x55555688bfc0, timeout=timeout@entry=0x7fffe65844a8)
      at /home/mdroth/w/qemu.git/util/aio-posix.c:520
  #10 0x0000555555d02d1b in try_poll_mode (timeout=0x7fffe65844a8, ctx=0x55555688bfc0)
      at /home/mdroth/w/qemu.git/util/aio-posix.c:607
  #11 0x0000555555d02d1b in aio_poll (ctx=ctx@entry=0x55555688bfc0, blocking=blocking@entry=true)
      at /home/mdroth/w/qemu.git/util/aio-posix.c:639
  #12 0x0000555555d0004d in aio_wait_bh_oneshot (ctx=0x55555688bfc0, cb=cb@entry=0x5555558d5130 <virtio_blk_data_plane_stop_bh>, opaque=opaque@entry=0x555556de86f0)
      at /home/mdroth/w/qemu.git/util/aio-wait.c:71
  #13 0x00005555558d59bf in virtio_blk_data_plane_stop (vdev=<optimized out>)
      at /home/mdroth/w/qemu.git/hw/block/dataplane/virtio-blk.c:288
  #14 0x0000555555b906a1 in virtio_bus_stop_ioeventfd (bus=bus@entry=0x555556dbcf38)
      at /home/mdroth/w/qemu.git/hw/virtio/virtio-bus.c:245
  #15 0x0000555555b90dbb in virtio_bus_stop_ioeventfd (bus=bus@entry=0x555556dbcf38)
      at /home/mdroth/w/qemu.git/hw/virtio/virtio-bus.c:237
  #16 0x0000555555b92a8e in virtio_pci_stop_ioeventfd (proxy=0x555556db4e40)
      at /home/mdroth/w/qemu.git/hw/virtio/virtio-pci.c:292
  #17 0x0000555555b92a8e in virtio_write_config (pci_dev=0x555556db4e40, address=<optimized out>, val=1048832, len=<optimized out>)
      at /home/mdroth/w/qemu.git/hw/virtio/virtio-pci.c:613

I.e. the calling code is only scheduling a one-shot BH for
virtio_blk_data_plane_stop_bh, but somehow we end up trying to process
an additional virtqueue entry before we get there. This is likely due
to the following check in virtio_queue_host_notifier_aio_poll:

  static bool virtio_queue_host_notifier_aio_poll(void *opaque)
  {
      EventNotifier *n = opaque;
      VirtQueue *vq = container_of(n, VirtQueue, host_notifier);
      bool progress;

      if (!vq->vring.desc || virtio_queue_empty(vq)) {
          return false;
      }

      progress = virtio_queue_notify_aio_vq(vq);

namely the call to virtio_queue_empty(). In this case, since no new
requests have actually been issued, shadow_avail_idx == last_avail_idx,
so we actually try to access the vring via vring_avail_idx() to get
the latest non-shadowed idx:

  int virtio_queue_empty(VirtQueue *vq)
  {
      bool empty;
      ...

      if (vq->shadow_avail_idx != vq->last_avail_idx) {
          return 0;
      }

      rcu_read_lock();
      empty = vring_avail_idx(vq) == vq->last_avail_idx;
      rcu_read_unlock();
      return empty;

but since the IOMMU region has been disabled we get a bogus value (0
usually), which causes virtio_queue_empty() to falsely report that
there are entries to be processed, which causes errors such as:

  "virtio: zero sized buffers are not allowed"

or

  "virtio-blk missing headers"

and puts the device in an error state.

This patch works around the issue by introducing virtio_set_disabled(),
which sets a 'disabled' flag to bypass checks like virtio_queue_empty()
when bus-mastering is disabled. Since we'd check this flag at all the
same sites as vdev->broken, we replace those checks with an inline
function which checks for either vdev->broken or vdev->disabled.

The 'disabled' flag is only migrated when set, which should be fairly
rare, but to maintain migration compatibility we disable it's use for
older machine types. Users requiring the use of the flag in conjunction
with older machine types can set it explicitly as a virtio-device
option.

NOTES:

 - This leaves some other oddities in play, like the fact that
   DRIVER_OK also gets unset in response to bus-mastering being
   disabled, but not restored (however the device seems to continue
   working)
 - Similarly, we disable the host notifier via
   virtio_bus_stop_ioeventfd(), which seems to move the handling out
   of virtio-blk dataplane and back into the main IO thread, and it
   ends up staying there till a reset (but otherwise continues working
   normally)

Cc: David Gibson <david@gibson.dropbear.id.au>,
Cc: Alexey Kardashevskiy <aik@ozlabs.ru>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Message-Id: <20191120005003.27035-1-mdroth@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-01-05 07:03:03 -05:00
Igor Mammedov
fcd3f2cc12 numa: properly check if numa is supported
Commit aa57020774, by mistake used MachineClass::numa_mem_supported
to check if NUMA is supported by machine and also as unrelated change
set it to true for sbsa-ref board.

Luckily change didn't break machines that support NUMA, as the field
is set to true for them.

But the field is not intended for checking if NUMA is supported and
will be flipped to false within this release for new machine types.

Fix it:
 - by using previously used condition
      !mc->cpu_index_to_instance_props || !mc->get_default_cpu_node_id
   the first time and then use MachineState::numa_state down the road
   to check if NUMA is supported
 - dropping stray sbsa-ref chunk

Fixes: aa57020774
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1576154936-178362-3-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-12-19 14:57:14 -03:00
Paolo Bonzini
11bc4a13d1 kvm: convert "-machine kernel_irqchip" to an accelerator property
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-12-17 19:32:46 +01:00
Paolo Bonzini
23b0898e44 kvm: convert "-machine kvm_shadow_mem" to an accelerator property
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-12-17 19:32:27 +01:00
Paolo Bonzini
46472d8232 xen: convert "-machine igd-passthru" to an accelerator property
The first machine property to fall is Xen's Intel integrated graphics
passthrough.  The "-machine igd-passthru" option does not set anymore
a property on the machine object, but desugars to a GlobalProperty on
accelerator objects.

The setter is very simple, since the value ends up in a
global variable, so this patch also provides an example before the more
complicated cases that follow it.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-12-17 19:32:27 +01:00
Paolo Bonzini
6f6e1698a6 vl: configure accelerators from -accel options
Drop the "accel" property from MachineState, and instead desugar
"-machine accel=" to a list of "-accel" options.

This has a semantic change due to removing merge_lists from -accel.
For example:

- "-accel kvm -accel tcg" all but ignored "-accel kvm".  This is a bugfix.

- "-accel kvm -accel thread=single" ignored "thread=single", since it
  applied the option to KVM.  Now it fails due to not specifying the
  accelerator on "-accel thread=single".

- "-accel tcg -accel thread=single" chose single-threaded TCG, while now
  it will fail due to not specifying the accelerator on "-accel
  thread=single".

Also, "-machine accel" and "-accel" become incompatible.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-12-17 19:32:26 +01:00
Evgeny Yakovlev
5f2585772f virtio-blk: advertise F_WCE (F_FLUSH) if F_CONFIG_WCE is advertised
Virtio spec 1.1 (and earlier), 5.2.5.2 Driver Requirements: Device
Initialization:

"Devices SHOULD always offer VIRTIO_BLK_F_FLUSH, and MUST offer it if
they offer VIRTIO_BLK_F_CONFIG_WCE"

Currently F_CONFIG_WCE and F_WCE are not connected to each other.
Qemu will advertise F_CONFIG_WCE if config-wce argument is
set for virtio-blk device. And F_WCE is advertised only if
underlying block backend actually has it's caching enabled.

Fix this by advertising F_WCE if F_CONFIG_WCE is also advertised.

To preserve backwards compatibility with newer machine types make this
behaviour governed by "x-enable-wce-if-config-wce" virtio-blk-device
property and introduce hw_compat_4_2 with new property being off by
default for all machine types <= 4.2 (but don't introduce 4.3
machine type itself yet).

Signed-off-by: Evgeny Yakovlev <wrfsh@yandex-team.ru>
Message-Id: <1572978137-189218-1-git-send-email-wrfsh@yandex-team.ru>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-12-13 11:22:06 +00:00
Peter Maydell
a8b5ad8e1f virtio,vhost: fixes, features, cleanups.
FLR support.
 Misc fixes, cleanups.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJdb6W/AAoJECgfDbjSjVRpRlEIAKvo9Sbq9bOtZ8nhbfJvLBWV
 nyOk5kgwv+XE+VhYGTsU7poYDPdRQn8uohBzXDb1zzCHd9corHriUXnUQ8TkDdz9
 V9v8buK7qRPZa4OddPRVHDPZEn7OBbvNanhbo/Nw8iRcE/XdW+Ezw33A/aR8rSY7
 KOxHYHeR2uBzVVDWKxp2yfBd+Zm9gbO27Y1thb9fyi4o7mHZ+gbrFl2p7z3wilNK
 KuGi0jCmS4I+4h2wmrZXnzSrozg9vJhXxkkdfI7QBze1XiVqC8w/bCcjXGVVGfhe
 SOvJH9A+yVyWpfjJpgmof4UISah+4zTi9G2SanZ4UERULD/NsiGfLQTVilUijAk=
 =K61t
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging

virtio,vhost: fixes, features, cleanups.

FLR support.
Misc fixes, cleanups.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Wed 04 Sep 2019 12:53:35 BST
# gpg:                using RSA key 281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream:
  libvhost-user: introduce and use vu_has_protocol_feature()
  libvhost-user: fix SLAVE_SEND_FD handling
  virtio-pci: Add Function Level Reset support
  virtio-rng: change default backend to rng-builtin
  virtio-rng: Keep the default backend out of VirtIORNGConf
  rng-builtin: add an RNG backend that uses qemu_guest_getrandom()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-09-04 17:22:34 +01:00
Julia Suvorova
eb1556c493 virtio-pci: Add Function Level Reset support
Using FLR becomes convenient in cases where resetting the bus is
impractical, for example, when debugging the behavior of individual
functions.

Signed-off-by: Julia Suvorova <jusual@redhat.com>
Message-Id: <20190820163005.1880-1-jusual@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-09-04 06:33:10 -04:00
Tao Xu
aa57020774 numa: move numa global variable nb_numa_nodes into MachineState
Add struct NumaState in MachineState and move existing numa global
nb_numa_nodes(renamed as "num_nodes") into NumaState. And add variable
numa_support into MachineClass to decide which submachines support NUMA.

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20190809065731.9097-3-tao3.xu@intel.com>
[ehabkost: include hw/boards.h again to fix build failures]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-09-03 11:26:55 -03:00
Cornelia Huck
9aec2e52ce hw: add compat machines for 4.2
Add 4.2 machine types for arm/i440fx/q35/s390x/spapr.

For i440fx and q35, unversioned cpu models are still translated
to -v1, as 0788a56bd1 ("i386: Make unversioned CPU models be
aliases") states this should only transition to the latest cpu
model version in 4.3 (or later).

Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20190724103524.20916-1-cohuck@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-21 11:32:11 +10:00
Cornelia Huck
bb15791166 compat: disable edid on virtio-gpu base device
'edid' is a property of the virtio-gpu base device, so turning
it off on virtio-gpu-pci is not enough (it misses -ccw). Turn
it off on the base device instead.

Fixes: 0a71966253 ("edid: flip the default to enabled")
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20190806115819.16026-1-cohuck@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-08-06 15:45:59 +01:00
Dr. David Alan Gilbert
c8557f1b48 pcie_root_port: Disable ACS on older machines
ACS got added in 4.0 unconditionally,  that broke older<->4.0 migration
where there was a PCIe root port.
Fix this by turning it off for 3.1 and older machines; note this
fixes compatibility for older QEMUs but breaks compatibility with 4.0
for older machine types.

    machine type    source qemu   dest qemu
       3.1             3.1           4.0        broken
       3.1             3.1           4.1rc2     broken
       3.1             3.1           4.1+this   OK ++
       3.1             4.0           4.1rc2     OK
       3.1             4.0           4.1+this   broken --
       4.0             4.0           4.1rc2     OK
       4.0             4.0           4.1+this   OK

So we gain and lose; the consensus seems to be treat this as a
fix for older machine types.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20190730093719.12958-3-dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-07-30 12:07:07 -04:00
Dr. David Alan Gilbert
dd56040d29 Revert "hw: report invalid disable-legacy|modern usage for virtio-1-only devs"
This reverts commit f2784eed30
since that accidentally removes the PCIe capabilities from virtio
devices because virtio_pci_dc_realize is called before the new 'mode'
flag is set.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20190729162903.4489-3-dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
2019-07-29 16:57:27 -04:00
Peter Maydell
032cfe6a79 pl031: Correctly migrate state when using -rtc clock=host
The PL031 RTC tracks the difference between the guest RTC
and the host RTC using a tick_offset field. For migration,
however, we currently always migrate the offset between
the guest and the vm_clock, even if the RTC clock is not
the same as the vm_clock; this was an attempt to retain
migration backwards compatibility.

Unfortunately this results in the RTC behaving oddly across
a VM state save and restore -- since the VM clock stands still
across save-then-restore, regardless of how much real world
time has elapsed, the guest RTC ends up out of sync with the
host RTC in the restored VM.

Fix this by migrating the raw tick_offset. To retain migration
compatibility as far as possible, we have a new property
migrate-tick-offset; by default this is 'true' and we will
migrate the true tick offset in a new subsection; if the
incoming data has no subsection we fall back to the old
vm_clock-based offset information, so old->new migration
compatibility is preserved. For complete new->old migration
compatibility, the property is set to 'false' for 4.0 and
earlier machine types (this will only affect 'virt-4.0'
and below, as none of the other pl031-using machines are
versioned).

Reported-by: Russell King <rmk@armlinux.org.uk>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-id: 20190709143912.28905-1-peter.maydell@linaro.org
2019-07-15 14:17:04 +01:00
Stefan Hajnoczi
2bbadb08ce virtio-balloon: fix QEMU 4.0 config size migration incompatibility
The virtio-balloon config size changed in QEMU 4.0 even for existing
machine types.  Migration from QEMU 3.1 to 4.0 can fail in some
circumstances with the following error:

  qemu-system-x86_64: get_pci_config_device: Bad config data: i=0x10 read: a1 device: 1 cmask: ff wmask: c0 w1cmask:0

This happens because the virtio-balloon config size affects the VIRTIO
Legacy I/O Memory PCI BAR size.

Introduce a qdev property called "qemu-4-0-config-size" and enable it
only for the QEMU 4.0 machine types.  This way <4.0 machine types use
the old size, 4.0 uses the larger size, and >4.0 machine types use the
appropriate size depending on enabled virtio-balloon features.

Live migration to and from old QEMUs to QEMU 4.1 works again as long as
a versioned machine type is specified (do not use just "pc"!).

Originally-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20190710141440.27635-1-stefanha@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Tested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Tested-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-07-12 10:56:26 -04:00
Like Xu
6f479566a8 machine: Refactor smp_parse() in vl.c as MachineClass::smp_parse()
To make smp_parse() more flexible and expansive, a smp_parse function
pointer is added to MachineClass that machine types could override.

The generic smp_parse() code in vl.c is moved to hw/core/machine.c, and
become the default implementation of MachineClass::smp_parse. A PC-specific
function called pc_smp_parse() has been added to hw/i386/pc.c, which in
this patch changes nothing against the default one .

Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190620054525.37188-3-like.xu@linux.intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-07-05 17:08:04 -03:00
Like Xu
176d2cda0d i386/cpu: Consolidate die-id validity in smp context
The field die_id (default as 0) and has_die_id are introduced to X86CPU.
Following the legacy smp check rules, the die_id validity is added to
the same contexts as leagcy smp variables such as hmp_hotpluggable_cpus(),
machine_set_cpu_numa_node(), cpu_slot_to_string() and pc_cpu_pre_plug().

Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Message-Id: <20190612084104.34984-4-like.xu@linux.intel.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-07-05 17:08:03 -03:00
Xie Yongji
e57f2c31b6 virtio: add "use-started" property
In order to avoid migration issues, we introduce a "use-started"
property to the base virtio device to indicate whether use
"started" flag or not. This property will be true by default and
set to false when machine type <= 4.0.

Suggested-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Xie Yongji <xieyongji@baidu.com>
Message-Id: <20190626023130.31315-2-xieyongji@baidu.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-07-04 17:00:32 -04:00
Greg Kurz
8e8cbed09a hw: Nuke hw_compat_4_0_1 and pc_compat_4_0_1
Commit c87759ce87 fixed a regression affecting pc-q35 machines by
introducing a new pc-q35-4.0.1 machine version to be used instead
of pc-q35-4.0. The only purpose was to revert the default behaviour
of not using split irqchip, but the change also introduced the usual
hw_compat and pc_compat bits, and wired them for pc-q35 only.

This raises questions when it comes to add new compat properties for
4.0* machine versions of any architecture. Where to add them ? In
4.0, 4.0.1 or both ? Error prone. Another possibility would be to teach
all other architectures about 4.0.1. This solution isn't satisfying,
especially since this is a pc-q35 specific issue.

It turns out that the split irqchip default is handled in the machine
option function and doesn't involve compat lists at all.

Drop all the 4.0.1 compat lists and use the 4.0 ones instead in the 4.0.1
machine option function.

Move the compat props that were added to the 4.0.1 since c87759ce87 to
4.0.

Even if only hw_compat_4_0_1 had an impact on other architectures,
drop pc_compat_4_0_1 as well for consistency.

Fixes: c87759ce87 "q35: Revert to kernel irqchip"
Suggested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <156051774276.244890.8660277280145466396.stgit@bahia.lan>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-06-21 13:25:29 +02:00
Gerd Hoffmann
0a71966253 edid: flip the default to enabled
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 20190607083444.32175-1-kraxel@redhat.com
2019-06-13 09:34:50 +02:00
Alex Williamson
c87759ce87 q35: Revert to kernel irqchip
Commit b2fc91db84 ("q35: set split kernel irqchip as default") changed
the default for the pc-q35-4.0 machine type to use split irqchip, which
turned out to have disasterous effects on vfio-pci INTx support.  KVM
resampling irqfds are registered for handling these interrupts, but
these are non-functional in split irqchip mode.  We can't simply test
for split irqchip in QEMU as userspace handling of this interrupt is a
significant performance regression versus KVM handling (GeForce GPUs
assigned to Windows VMs are non-functional without forcing MSI mode or
re-enabling kernel irqchip).

The resolution is to revert the change in default irqchip mode in the
pc-q35-4.1 machine and create a pc-q35-4.0.1 machine for the 4.0-stable
branch.  The qemu-q35-4.0 machine type should not be used in vfio-pci
configurations for devices requiring legacy INTx support without
explicitly modifying the VM configuration to use kernel irqchip.

Link: https://bugs.launchpad.net/qemu/+bug/1826422
Fixes: b2fc91db84 ("q35: set split kernel irqchip as default")
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <155786484688.13873.6037015630912983760.stgit@gimli.home>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-06-03 14:03:03 +02:00
Gerd Hoffmann
442bac16a6 usb-tablet: fix serial compat property
s/kbd/tablet/, fixes cut+paste bug.

Cc: qemu-stable@nongnu.org
Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190520081805.15019-1-kraxel@redhat.com
2019-05-29 07:10:02 +02:00
Daniel P. Berrangé
f2784eed30 hw: report invalid disable-legacy|modern usage for virtio-1-only devs
A number of virtio devices (gpu, crypto, mouse, keyboard, tablet) only
support the virtio-1 (aka modern) mode. Currently if the user launches
QEMU, setting those devices to enable legacy mode, QEMU will silently
create them in modern mode, ignoring the user's (mistaken) request.

This patch introduces proper data validation so that an attempt to
configure a virtio-1-only devices in legacy mode gets reported as an
error to the user.

Checking this required introduction of a new field to explicitly track
what operating model is to be used for a device, separately from the
disable_modern and disable_legacy fields that record the user's
requested configuration.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20190215103239.28640-2-berrange@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-05-20 18:40:02 -04:00
Cornelia Huck
9bf2650bc3 hw: add compat machines for 4.1
Add 4.1 machine types for arm/i440fx/q35/s390x/spapr.

Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20190411102025.22559-1-cohuck@redhat.com>
Acked-by: Greg Kurz <groug@kaod.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-04-25 14:16:41 -03:00
Eric Auger
f6a0d06ba7 machine: Move nvdimms state into struct MachineState
As NVDIMM support is looming for ARM and SPAPR, let's
move the acpi_nvdimm_state to the generic machine struct
instead of duplicating the same code in several machines.
It is also renamed into nvdimms_state and becomes a pointer.

nvdimm and nvdimm-persistence become generic machine options.
They become guarded by a nvdimm_supported machine class member.
We also add a description for those options.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20190308182053.5487-3-eric.auger@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-03-11 10:44:25 -03:00
Stefano Garzarella
5c81161f80 virtio-blk: add "discard" and "write-zeroes" properties
In order to avoid migration issues, we enable DISCARD and
WRITE_ZEROES features only for machine type >= 4.0

As discussed with Michael S. Tsirkin and Stefan Hajnoczi on the
list [1], DISCARD operation should not have security implications
(eg. page cache attacks), so we can enable it by default.

[1] https://lists.gnu.org/archive/html/qemu-devel/2019-02/msg00504.html

Suggested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20190221103314.58500-4-sgarzare@redhat.com
Message-Id: <20190221103314.58500-4-sgarzare@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-02-22 09:42:16 +00:00
Gerd Hoffmann
b63e10508b usb: assign unique serial numbers to hid devices
Windows guests have trouble dealing with usb devices having identical
serial numbers.  So, assign unique serial numbers to usb hid devices.
All other usb devices have this already.

In the past the fixed serial number has been used to indicate working
remote setup to linux guests.  Here is a bit of history:

 * First there was nothing.
 * Then I added a rule to udev checking for serial == 42.
   (this is in rhel-6).
 * Then systemd + udev merged.
 * Then I changed the rule to check for serial != 1 instead, so we can
   use any serial but "1" which is the one the old broken devices had
   (this is in rhel-7).  March 2014 in upstream systemd.
 * Then all usb power management rules where dropped from systemd (June
   2015).  Which I figured today (Sept 2018), after wondering that the
   rules are gone in fedora 28.

So, three years ago the serial number check was dropped upstream, yet I
hav't seen a single report about autosuspend issues (or cpu usage for
usb emulation going up, which is the typical symtom).

So I figured I can stop worring that changing the serial number will
break things and just do it.

And even if it turns out autosuspend is still an issue:  I think
meanwhile we can really stop worrying about guests running in old qemu
versions with broken usb suspend (fixed in 0.13 !).  If needed we can
enable autosuspend unconditionally in guests.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20190110125108.22834-1-kraxel@redhat.com
2019-01-30 06:47:52 +01:00
Marc-André Lureau
b6148757f9 tpm: add a "ppi" boolean property
The following patches implement the TPM Physical Presence Interface,
make use of a new memory region and a fw_cfg entry. Enable PPI by
default with >=4.0 machine type, to avoid migration issues.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-01-17 21:10:57 -05:00
Eduardo Habkost
53921bfdce virtio: Make disable-legacy/disable-modern compat properties optional
The disable-legacy and disable-modern properties apply only to
some virtio-pci devices.  Make those properties optional.

This fixes the crash introduced by commit f6e501a28e ("virtio: Provide
version-specific variants of virtio PCI devices"):

  $ qemu-system-x86_64 -machine pc-i440fx-2.6 \
    -device virtio-net-pci-non-transitional
  Unexpected error in object_property_find() at qom/object.c:1092:
  qemu-system-x86_64: -device virtio-net-pci-non-transitional: can't apply \
  global virtio-pci.disable-modern=on: Property '.disable-modern' not found
  Aborted (core dumped)

Reported-by: Thomas Huth <thuth@redhat.com>
Fixes: f6e501a28e ("virtio: Provide version-specific variants of virtio PCI devices")
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-01-17 21:10:57 -05:00
Peter Maydell
e53f7796fb Machine queue, 2019-01-10
* Simplify GlobalProperty array declarations (Eduardo Habkost)
 * Deprecate cpu-add commands (Kashyap Chamarthy)
 * range/memory-device cleanups (David Hildenbrand)
 * Fix -device scsi-hd,help regression (Marc-André Lureau)
 * Fix crash when -global generates multiple warnings (Eduardo
   Habkost)
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJcN1aHAAoJECgHk2+YTcWmi20QALYlChpE/YNh/8RZvpKRRgt4
 4aj8/qOlFq+TeoKQPf6KNCrTy6awvVGWOhOZOLdGcalqNZ/l69ZmKmECU0dqTrEw
 2zo0FxD8bjqVdsBJifLFa925GbUBHFtw/Am87Qv/LO+PV0bqMMIVUyNo5hd0iMej
 AhwlaIKAJxIKlsUiZOXpc2n1siFBjciJWdNapsam+ia8b8Y5jZ5GlkfTCJ3Nve2m
 uHMTPTt6iQao3YmPZ11E2elQENV7F51wM21Ti93Eo90FRpmADzsGw+lnh4wp50Mr
 q2QvaDByxcNEfdhKWOTdjRV6pkVBUh6GZps9SJvFtSGc2vfxqPgpi+HvN6f79uIO
 QV3rbkM0dyR+xveKrBVHA7MjCNEdVdpz5GRsYHDC3kjZC/dpEN/nV5g4TKSMzd6A
 hHQi6sRBi1qiKytsRB8sTElDDJxmoE8t5X1QeW+Wws0hBKVnrlr/BY0u09ip2j4x
 qoR4QG1nrmAXHIKVwsIdn8tiPqX2iq+OPu/Tp8FHlRI/90+3jazSbderT6I/9oCd
 AqnxSGXRoPhK0l0n/57mcPeY478ENtRqdeUkUVDVFSoTG9ys8aYqYzS19CdrRExo
 QsJ1M+NGWqzCal4VxNrdbUHAQoG4b1tVnQJMiFBc9LxS/88Da7huoiXRwAQkwOCd
 Um3orC2vlv78QXTwTho6
 =LNk6
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/ehabkost/tags/machine-next-pull-request' into staging

Machine queue, 2019-01-10

* Simplify GlobalProperty array declarations (Eduardo Habkost)
* Deprecate cpu-add commands (Kashyap Chamarthy)
* range/memory-device cleanups (David Hildenbrand)
* Fix -device scsi-hd,help regression (Marc-André Lureau)
* Fix crash when -global generates multiple warnings (Eduardo
  Habkost)

# gpg: Signature made Thu 10 Jan 2019 14:28:23 GMT
# gpg:                using RSA key 2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF  D1AA 2807 936F 984D C5A6

* remotes/ehabkost/tags/machine-next-pull-request:
  qom: Don't keep error value between object_property_parse() calls
  qdev: fix -device scsi-hd,help regression
  machine: Use shorter format for GlobalProperty arrays
  machine: Eliminate unnecessary stringify() usage
  spapr: Eliminate SPAPR_PCI_2_7_MMIO_WIN_SIZE macro
  memory-device: rewrite address assignment using ranges
  range: add some more functions
  Mention that QMP 'cpu-add' will be deprecated
  Update that HMP 'cpu-add' is deprecated in 4.0
  qemu-deprecated.texi: Rename the HMP section

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-01-11 13:35:48 +00:00
Eduardo Habkost
6c36bddf53 machine: Use shorter format for GlobalProperty arrays
Instead of verbose arrays with 4 lines for each entry, make each
entry take only one line.  This makes long arrays that couldn't
fit in the screen become short and readable.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190107193020.21744-4-ehabkost@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-01-09 22:10:00 -02:00
Eduardo Habkost
e0985450e1 machine: Eliminate unnecessary stringify() usage
stringify() is useful when we need to use macros in compat_props
(like when we set virtio-baloon-pci.class=PCI_CLASS_MEMORY_RAM at
pc_i440fx_1_0_machine_options()), but it is pointless when we are
already providing a number literal.

Replace stringify() with string literals when appropriate.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190107193020.21744-3-ehabkost@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-01-09 22:10:00 -02:00
Wainer dos Santos Moschetta
097f699ce9 hw/core: fix whitespace in a sentence
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20181224154922.3338-1-wainersm@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2019-01-09 11:09:31 +01:00
Marc-André Lureau
fa0cb34d22 hostmem: use object id for memory region name with >= 4.0
hostmem-file and hostmem-memfd use the whole object path for the
memory region name, and hostname-ram uses only the path component (the
object id, or canonical path basename):

qemu -m 1024 -object memory-backend-file,id=mem,size=1G,mem-path=/tmp/foo -numa node,memdev=mem -monitor stdio
(qemu) info ramblock
              Block Name    PSize              Offset               Used              Total
            /objects/mem    4 KiB  0x0000000000000000 0x0000000040000000 0x0000000040000000

qemu -m 1024 -object memory-backend-memfd,id=mem,size=1G -numa node,memdev=mem -monitor stdio
(qemu) info ramblock
              Block Name    PSize              Offset               Used              Total
            /objects/mem    4 KiB  0x0000000000000000 0x0000000040000000 0x0000000040000000

qemu -m 1024 -object memory-backend-ram,id=mem,size=1G -numa node,memdev=mem -monitor stdio
(qemu) info ramblock
              Block Name    PSize              Offset               Used              Total
                     mem    4 KiB  0x0000000000000000 0x0000000040000000 0x0000000040000000

For consistency, change to use object id for -file and -memfd as well
with >= 4.0.

Having a consistent naming allows to migrate to different hostmem
backends.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
2019-01-07 16:18:42 +04:00
Marc-André Lureau
c4fc5695b7 compat: replace PC_COMPAT_2_1 & HW_COMPAT_2_1 macros
Use static arrays instead.  I decided to rename the conflicting
pc_compat_2_1() function with pc_compat_2_1_fn().

Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
2019-01-07 16:18:42 +04:00
Marc-André Lureau
1c30044e1a compat: replace PC_COMPAT_2_2 & HW_COMPAT_2_2 macros
Use static arrays instead.  I decided to rename the conflicting
pc_compat_2_2() function with pc_compat_2_2_fn().

Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
2019-01-07 16:18:42 +04:00
Marc-André Lureau
8995dd9009 compat: replace PC_COMPAT_2_3 & HW_COMPAT_2_3 macros
Use static arrays instead.  I decided to rename the conflicting
pc_compat_2_3() function with pc_compat_2_3_fn().

Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
2019-01-07 16:18:42 +04:00