Commit Graph

1024 Commits

Author SHA1 Message Date
Stefano Garzarella
1793ad0247 iothread: add aio-max-batch parameter
The `aio-max-batch` parameter will be propagated to AIO engines
and it will be used to control the maximum number of queued requests.

When there are in queue a number of requests equal to `aio-max-batch`,
the engine invokes the system call to forward the requests to the kernel.

This parameter allows us to control the maximum batch size to reduce
the latency that requests might accumulate while queued in the AIO
engine queue.

If `aio-max-batch` is equal to 0 (default value), the AIO engine will
use its default maximum batch size value.

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20210721094211.69853-3-sgarzare@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-07-21 13:47:50 +01:00
Paolo Bonzini
24b36e9813 block: add max_hw_transfer to BlockLimits
For block host devices, I/O can happen through either the kernel file
descriptor I/O system calls (preadv/pwritev, io_submit, io_uring)
or the SCSI passthrough ioctl SG_IO.

In the latter case, the size of each transfer can be limited by the
HBA, while for file descriptor I/O the kernel is able to split and
merge I/O in smaller pieces as needed.  Applying the HBA limits to
file descriptor I/O results in more system calls and suboptimal
performance, so this patch splits the max_transfer limit in two:
max_transfer remains valid and is used in general, while max_hw_transfer
is limited to the maximum hardware size.  max_hw_transfer can then be
included by the scsi-generic driver in the block limits page, to ensure
that the stricter hardware limit is used.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-25 10:54:13 +02:00
Peter Maydell
b6d73e9cb1 * avoid deprecation warnings for SASL on macOS 10.11 or newer
* fix -readconfig when config blocks have an id (like [chardev "qmp"])
 * Error* initialization fixes
 * Improvements to ESP emulation (Mark)
 * Allow creating noreserve memory backends (David)
 * Improvements to query-memdev (David)
 * Bump compiler to C11 (Richard)
 * First round of SVM fixes from GSoC project (Lara)
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmDKGs0UHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNE4Qf+PUGkCzq5EupfW9mQXuYJ+xAkcX6+
 fsnahw3QFUNXWtaXkkDnWXtXDxt0muofb5z5axa0kpRdjmpey+Q7jBGSC5jXU043
 AJWdquCSIMWzlGnnR65R+shLY8/aRyRLS2q2uz5f60nwxe6J07mfNZNpKqHpV0rf
 D+VkjmHXMO5wbdmuoaoDGeeOc5aPjG/zFvirXdVvl5xbT7Yx1ZaBvXf+lXUhB6Jq
 6mzafwXZ7D6ZIRMCv8dJvoJ8tHtTrFNsLsYsiNJPHvvI9e4nImenFAy0kZC0ZEjf
 iowEZUnVd+IhHWhFlycceXi2clkIav6ZoJoz8R2RyN/OSTPSNLCVvaVsUg==
 =XAO1
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini-gitlab/tags/for-upstream' into staging

* avoid deprecation warnings for SASL on macOS 10.11 or newer
* fix -readconfig when config blocks have an id (like [chardev "qmp"])
* Error* initialization fixes
* Improvements to ESP emulation (Mark)
* Allow creating noreserve memory backends (David)
* Improvements to query-memdev (David)
* Bump compiler to C11 (Richard)
* First round of SVM fixes from GSoC project (Lara)

# gpg: Signature made Wed 16 Jun 2021 16:37:49 BST
# gpg:                using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg:                issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini-gitlab/tags/for-upstream: (45 commits)
  configure: Remove probe for _Static_assert
  qemu/compiler: Remove QEMU_GENERIC
  include/qemu/lockable: Use _Generic instead of QEMU_GENERIC
  util: Use unique type for QemuRecMutex in thread-posix.h
  util: Pass file+line to qemu_rec_mutex_unlock_impl
  util: Use real functions for thread-posix QemuRecMutex
  softfloat: Use _Generic instead of QEMU_GENERIC
  configure: Use -std=gnu11
  target/i386: Added Intercept CR0 writes check
  target/i386: Added consistency checks for CR0
  target/i386: Added consistency checks for VMRUN intercept and ASID
  target/i386: Refactored intercept checks into cpu_svm_has_intercept
  configure: map x32 to cpu_family x86_64 for meson
  hmp: Print "reserve" property of memory backends with "info memdev"
  qmp: Include "reserve" property of memory backends
  hmp: Print "share" property of memory backends with "info memdev"
  qmp: Include "share" property of memory backends
  qmp: Clarify memory backend properties returned via query-memdev
  hostmem: Wire up RAM_NORESERVE via "reserve" property
  util/mmap-alloc: Support RAM_NORESERVE via MAP_NORESERVE under Linux
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-06-17 15:43:26 +01:00
David Hildenbrand
9181fb7043 hostmem: Wire up RAM_NORESERVE via "reserve" property
Let's provide a way to control the use of RAM_NORESERVE via memory
backends using the "reserve" property which defaults to true (old
behavior).

Only Linux currently supports clearing the flag (and support is checked at
runtime, depending on the setting of "/proc/sys/vm/overcommit_memory").
Windows and other POSIX systems will bail out with "reserve=false".

The target use case is virtio-mem, which dynamically exposes memory
inside a large, sparse memory area to the VM. This essentially allows
avoiding to set "/proc/sys/vm/overcommit_memory == 0") when using
virtio-mem and also supporting hugetlbfs in the future.

As really only Linux implements RAM_NORESERVE right now, let's expose
the property only with CONFIG_LINUX. Setting the property to "false"
will then only fail in corner cases -- for example on very old kernels
or when memory overcommit was completely disabled by the admin.

Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com> for memory backend and machine core
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210510114328.21835-11-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-15 20:27:38 +02:00
Stefan Berger
e542b71805 sysemu: Make TPM structures inaccessible if CONFIG_TPM is not defined
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210614191335.1968807-5-stefanb@linux.ibm.com>
[PMD: Remove tpm_init() / tpm_cleanup() stubs]
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2021-06-15 10:55:12 -04:00
Richard Henderson
fa79cde6ed accel/tcg: Merge tcg_exec_init into tcg_init_machine
There is only one caller, and shortly we will need access
to the MachineState, which tcg_init_machine already has.

Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-11 09:26:28 -07:00
Alexander Graf
b533450e74 hvf: Introduce hvf vcpu struct
We will need more than a single field for hvf going forward. To keep
the global vcpu struct uncluttered, let's allocate a special hvf vcpu
struct, similar to how hax does it.

Signed-off-by: Alexander Graf <agraf@csgraf.de>
Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
Tested-by: Roman Bolshakov <r.bolshakov@yadro.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Sergio Lopez <slp@redhat.com>
Message-id: 20210519202253.76782-12-agraf@csgraf.de
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-06-03 16:43:27 +01:00
Alexander Graf
d662ede2b1 hvf: Remove hvf-accel-ops.h
We can move the definition of hvf_vcpu_exec() into our internal
hvf header, obsoleting the need for hvf-accel-ops.h.

Signed-off-by: Alexander Graf <agraf@csgraf.de>
Reviewed-by: Sergio Lopez <slp@redhat.com>
Message-id: 20210519202253.76782-11-agraf@csgraf.de
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-06-03 16:43:27 +01:00
Alexander Graf
cfe58455f3 hvf: Split out common code on vcpu init and destroy
Until now, Hypervisor.framework has only been available on x86_64 systems.
With Apple Silicon shipping now, it extends its reach to aarch64. To
prepare for support for multiple architectures, let's start moving common
code out into its own accel directory.

This patch splits the vcpu init and destroy functions into a generic and
an architecture specific portion. This also allows us to move the generic
functions into the generic hvf code, removing exported functions.

Signed-off-by: Alexander Graf <agraf@csgraf.de>
Reviewed-by: Sergio Lopez <slp@redhat.com>
Message-id: 20210519202253.76782-8-agraf@csgraf.de
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-06-03 16:43:27 +01:00
Alexander Graf
3f965ef4e0 hvf: Make hvf_set_phys_mem() static
The hvf_set_phys_mem() function is only called within the same file.
Make it static.

Signed-off-by: Alexander Graf <agraf@csgraf.de>
Reviewed-by: Sergio Lopez <slp@redhat.com>
Message-id: 20210519202253.76782-6-agraf@csgraf.de
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-06-03 16:43:27 +01:00
Alexander Graf
861457ce73 hvf: Move hvf internal definitions into common header
Until now, Hypervisor.framework has only been available on x86_64 systems.
With Apple Silicon shipping now, it extends its reach to aarch64. To
prepare for support for multiple architectures, let's start moving common
code out into its own accel directory.

This patch moves a few internal struct and constant defines over.

Signed-off-by: Alexander Graf <agraf@csgraf.de>
Reviewed-by: Sergio Lopez <slp@redhat.com>
Message-id: 20210519202253.76782-5-agraf@csgraf.de
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-06-03 16:43:26 +01:00
Alexander Graf
358e7505b2 hvf: Move cpu functions into common directory
Until now, Hypervisor.framework has only been available on x86_64 systems.
With Apple Silicon shipping now, it extends its reach to aarch64. To
prepare for support for multiple architectures, let's start moving common
code out into its own accel directory.

This patch moves CPU and memory operations over. While at it, make sure
the code is consumable on non-i386 systems.

Signed-off-by: Alexander Graf <agraf@csgraf.de>
Reviewed-by: Sergio Lopez <slp@redhat.com>
Message-id: 20210519202253.76782-4-agraf@csgraf.de
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-06-03 16:43:26 +01:00
Alexander Graf
d57bc3c109 hvf: Move assert_hvf_ok() into common directory
Until now, Hypervisor.framework has only been available on x86_64 systems.
With Apple Silicon shipping now, it extends its reach to aarch64. To
prepare for support for multiple architectures, let's start moving common
code out into its own accel directory.

This patch moves assert_hvf_ok() and introduces generic build infrastructure.

Signed-off-by: Alexander Graf <agraf@csgraf.de>
Reviewed-by: Sergio Lopez <slp@redhat.com>
Message-id: 20210519202253.76782-2-agraf@csgraf.de
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-06-03 16:43:26 +01:00
Sergio Lopez
095cc4d0f6 block-backend: add drained_poll
Allow block backends to poll their devices/users to check if they have
been quiesced when entering a drained section.

This will be used in the next patch to wait for the NBD server to be
completely quiesced.

Suggested-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Sergio Lopez <slp@redhat.com>
Message-Id: <20210602060552.17433-2-slp@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-06-02 14:23:20 +02:00
Peter Xu
563d32ba9b KVM: Cache kvm slot dirty bitmap size
Cache it too because we'll reference it more frequently in the future.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20210506160549.130416-8-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-26 14:49:45 +02:00
Peter Xu
2c20b27eed KVM: Provide helper to sync dirty bitmap from slot to ramblock
kvm_physical_sync_dirty_bitmap() calculates the ramblock offset in an
awkward way from the MemoryRegionSection that passed in from the
caller.  The truth is for each KVMSlot the ramblock offset never
change for the lifecycle.  Cache the ramblock offset for each KVMSlot
into the structure when the KVMSlot is created.

With that, we can further simplify kvm_physical_sync_dirty_bitmap()
with a helper to sync KVMSlot dirty bitmap to the ramblock dirty
bitmap of a specific KVMSlot.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20210506160549.130416-6-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-26 14:49:45 +02:00
Peter Xu
e65e5f50db KVM: Provide helper to get kvm dirty log
Provide a helper kvm_slot_get_dirty_log() to make the function
kvm_physical_sync_dirty_bitmap() clearer.  We can even cache the as_id
into KVMSlot when it is created, so that we don't even need to pass it
down every time.

Since at it, remove return value of kvm_physical_sync_dirty_bitmap()
because it should never fail.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20210506160549.130416-5-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-26 14:49:45 +02:00
Peter Xu
a2f77862ff KVM: Use a big lock to replace per-kml slots_lock
Per-kml slots_lock will bring some trouble if we want to take all slots_lock of
all the KMLs, especially when we're in a context that we could have taken some
of the KML slots_lock, then we even need to figure out what we've taken and
what we need to take.

Make this simple by merging all KML slots_lock into a single slots lock.

Per-kml slots_lock isn't anything that helpful anyway - so far only x86 has two
address spaces (so, two slots_locks).  All the rest archs will be having one
address space always, which means there's actually one slots_lock so it will be
the same as before.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20210506160549.130416-3-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-26 14:49:45 +02:00
Thomas Huth
13b48fb00e include/sysemu: Poison all accelerator CONFIG switches in common code
We are already poisoning CONFIG_KVM since this switch is not working
in common code. Do the same with the other accelerator switches, too
(except for CONFIG_TCG, which is special, since it is also defined in
config-host.h).

Message-Id: <20210414112004.943383-2-thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2021-05-14 12:31:44 +02:00
Markus Armbruster
4369223902 Drop the deprecated unicore32 target
Target unicore32 was deprecated in commit 8e4ff4a8d2, v5.2.0.  See
there for rationale.

Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20210503084034.3804963-3-armbru@redhat.com>
Acked-by: Thomas Huth <thuth@redhat.com>
2021-05-12 18:20:52 +02:00
Markus Armbruster
9d49bcf699 Drop the deprecated lm32 target
Target lm32 was deprecated in commit d849800512, v5.2.0.  See there
for rationale.

Some of its code lives on in device models derived from milkymist
ones: hw/char/digic-uart.c and hw/display/bcm2835_fb.c.

Cc: Michael Walle <michael@walle.cc>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20210503084034.3804963-2-armbru@redhat.com>
Acked-by: Michael Walle <michael@walle.cc>
[Trivial conflicts resolved, reST markup fixed]
2021-05-12 18:20:25 +02:00
Thomas Huth
875bb7e35b Remove the deprecated moxie target
There are no known users of this CPU anymore, and there are no
binaries available online which could be used for regression tests,
so the code has likely completely bit-rotten already. It's been
marked as deprecated since two releases now and nobody spoke up
that there is still a need to keep it, thus let's remove it now.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210430160355.698194-1-thuth@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
[Commit message typos fixed, trivial conflicts resolved]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2021-05-12 17:42:23 +02:00
Peter Maydell
415a9fb880 osdep: Make os-win32.h and os-posix.h handle 'extern "C"' themselves
Both os-win32.h and os-posix.h include system header files. Instead
of having osdep.h include them inside its 'extern "C"' block, make
these headers handle that themselves, so that we don't include the
system headers inside 'extern "C"'.

This doesn't fix any current problems, but it's conceptually the
right way to handle system headers.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2021-05-10 17:21:54 +01:00
Reinoud Zandijk
b9bc6169de Add NVMM accelerator: acceleration enlightenments
Signed-off-by: Kamil Rytarowski <kamil@NetBSD.org>
Signed-off-by: Reinoud Zandijk <reinoud@NetBSD.org>

Message-Id: <20210402202535.11550-4-reinoud@NetBSD.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-04 14:15:34 +02:00
Laurent Vivier
4c5806a56b m68k: add the virtio devices aliases
Similarly to 5f629d943c ("s390x: fix s390 virtio aliases"),
define the virtio aliases.

This allows to start machines with virtio devices without
knowledge of the implementation type.

For instance, we can use "-device virtio-scsi" on
m68k, s390x or PC, and the device will be respectively
"virtio-scsi-device", "virtio-scsi-ccw" or "virtio-scsi-pci".

This already exists for s390x and -ccw interfaces, add them
for m68k and MMIO (-device) interfaces.

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20210319202335.2397060-3-laurent@vivier.eu>
Message-Id: <20210323165308.15244-18-alex.bennee@linaro.org>
2021-03-24 14:25:48 +00:00
Laurent Vivier
203adb43fc qdev: define list of archs with virtio-pci or virtio-ccw
This is used to define virtio-*-pci and virtio-*-ccw aliases
rather than substracting the CCW architecture from all the others.

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20210319202335.2397060-2-laurent@vivier.eu>
Message-Id: <20210323165308.15244-17-alex.bennee@linaro.org>
2021-03-24 14:25:48 +00:00
Markus Armbruster
fe9f70a1c3 blockdev: Drop deprecated bogus -drive interface type
Drop the crap deprecated in commit a1b40bda08 "blockdev: Deprecate
-drive with bogus interface type" (v5.1.0).

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20210309161214.1402527-5-armbru@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
2021-03-19 15:18:43 +01:00
David Hildenbrand
25459eb762 exec: Get rid of phys_mem_set_alloc()
As the last user is gone, we can get rid of phys_mem_set_alloc() and
simplify.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Thomas Huth <thuth@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20210303130916.22553-3-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2021-03-15 11:01:23 +01:00
Eric Auger
f14fb6c2db dma: Introduce dma_aligned_pow2_mask()
Currently get_naturally_aligned_size() is used by the intel iommu
to compute the maximum invalidation range based on @size which is
a power of 2 while being aligned with the @start address and less
than the maximum range defined by @gaw.

This helper is also useful for other iommu devices (virtio-iommu,
SMMUv3) to make sure IOMMU UNMAP notifiers only are called with
power of 2 range sizes.

Let's move this latter into dma-helpers.c and rename it into
dma_aligned_pow2_mask(). Also rewrite the helper so that it
accomodates UINT64_MAX values for the size mask and max mask.
It now returns a mask instead of a size. Change the caller.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-id: 20210309102742.30442-3-eric.auger@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-03-12 12:40:10 +00:00
Peter Maydell
6f34661b6c Pull request
-----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEzS913cjjpNwuT1Fz8ww4vT8vvjwFAmBJQHkSHGxhdXJlbnRA
 dml2aWVyLmV1AAoJEPMMOL0/L748EdsP/2U2CGTM95tjDunTs9uZV/7zM6PWt85M
 vAPItNVU2jYPfzmaJN8twrzlj0PEDhvB9Q+OJjE4HEGxEbPcdblLg/R6Zs/EaWuY
 N6oKHPXnOnHb+e80UUJdiAq+Y5RUnJbb5L3ArycnVzBgws+Oj3DtqjB2VDccY4C/
 Gkt23tZ7ikU4958e5VBqW2NUUrr+BQO0mqsW+sbbeE3WPj75NQc6srvS3TWvsg7W
 OYEyVYwm52/q2W/1a3Knfv/YO6UU9NGMpGyDLD2kwQwKbgUWYLW2BiWVwOAUldo9
 De3nfKbKnFezLCZAZro20lfCa/aKwNGCOXWzlrKxqUQCmGYUx7gM1+3ahrSd5N0v
 zUgLdZm7O428ZHL6GujWGLA1UwwzpM9X3P3yo4c0S1J6fHypbI6a9jtewrUFvFgP
 TuQ7dp6cn2DTBYUcsrWilPHbTZMADYQNRD/xUtKqalYBEWy3FX5W75+OYBJKKh+X
 Qip68m6JBzgkszXhCcu6xlLb8ynZJr2VsHvtvIgf4NnLqNOIEgVLcMtoMZT8DPrp
 rIoRc5oUFz8zj5lHnJuLADBUvlCMqoCCoU3h2aqHwH8a7RGb180f+82BW9aBcb2u
 Jk+WgAhBUjWBBC97ReFgrINUD/qZRXVoOq8LthTuQSSyr/i1zq+oLM1F0EDXcMDm
 ssATku2IxL24
 =moUF
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/vivier2/tags/trivial-branch-for-6.0-pull-request' into staging

Pull request

# gpg: Signature made Wed 10 Mar 2021 21:56:09 GMT
# gpg:                using RSA key CD2F75DDC8E3A4DC2E4F5173F30C38BD3F2FBE3C
# gpg:                issuer "laurent@vivier.eu"
# gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>" [full]
# gpg:                 aka "Laurent Vivier <laurent@vivier.eu>" [full]
# gpg:                 aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>" [full]
# Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F  5173 F30C 38BD 3F2F BE3C

* remotes/vivier2/tags/trivial-branch-for-6.0-pull-request: (22 commits)
  sysemu: Let VMChangeStateHandler take boolean 'running' argument
  sysemu/runstate: Let runstate_is_running() return bool
  hw/lm32/Kconfig: Have MILKYMIST select LM32_DEVICES
  hw/lm32/Kconfig: Rename CONFIG_LM32 -> CONFIG_LM32_DEVICES
  hw/lm32/Kconfig: Introduce CONFIG_LM32_EVR for lm32-evr/uclinux boards
  qemu-common.h: Update copyright string to 2021
  tests/fp/fp-test: Replace the word 'blacklist'
  qemu-options: Replace the word 'blacklist'
  seccomp: Replace the word 'blacklist'
  scripts/tracetool: Replace the word 'whitelist'
  ui: Replace the word 'whitelist'
  virtio-gpu: Adjust code space style
  exec/memory: Use struct Object typedef
  fuzz-test: remove unneccessary debugging flags
  net: Use id_generate() in the network subsystem, too
  MAINTAINERS: Fix the location of tools manuals
  vhost_user_gpu: Drop dead check for g_malloc() failure
  backends/dbus-vmstate: Fix short read error handling
  target/hexagon/gen_tcg_funcs: Fix a typo
  hw/elf_ops: Fix a typo
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-03-11 18:55:27 +00:00
Alex Bennée
78da6a1bca device_tree: add qemu_fdt_setprop_string_array helper
A string array in device tree is simply a series of \0 terminated
strings next to each other. As libfdt doesn't support that directly
we need to build it ourselves.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20210303173642.3805-4-alex.bennee@linaro.org>
2021-03-10 15:34:11 +00:00
Philippe Mathieu-Daudé
538f049704 sysemu: Let VMChangeStateHandler take boolean 'running' argument
The 'running' argument from VMChangeStateHandler does not require
other value than 0 / 1. Make it a plain boolean.

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20210111152020.1422021-3-philmd@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2021-03-09 23:13:57 +01:00
Philippe Mathieu-Daudé
0a38950931 sysemu/runstate: Let runstate_is_running() return bool
runstate_check() returns a boolean. runstate_is_running()
returns what runstate_check() returns, also a boolean.

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210111152020.1422021-2-philmd@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2021-03-09 23:12:58 +01:00
Pavel Dovgalyuk
366a85e4bb replay: fix icount request when replaying clock access
Record/replay provides REPLAY_CLOCK_LOCKED macro to access
the clock when vm_clock_seqlock is locked. This macro is
needed because replay internals operate icount. In locked case
replay use icount_get_raw_locked for icount request, which prevents
excess locking which leads to deadlock. But previously only
record code used *_locked function and replay did not.
Therefore sometimes clock access lead to deadlocks.
This patch fixes clock access for replay too and uses *_locked
icount access function.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>
Message-Id: <161347990483.1313189.8371838968343494161.stgit@pasha-ThinkPad-X280>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-02-16 17:15:39 +01:00
Tom Lendacky
92a5199b29 sev/i386: Don't allow a system reset under an SEV-ES guest
An SEV-ES guest does not allow register state to be altered once it has
been measured. When an SEV-ES guest issues a reboot command, Qemu will
reset the vCPU state and resume the guest. This will cause failures under
SEV-ES. Prevent that from occuring by introducing an arch-specific
callback that returns a boolean indicating whether vCPUs are resettable.

Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: Aleksandar Rikalo <aleksandar.rikalo@syrmia.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: David Hildenbrand <david@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>
Message-Id: <1ac39c441b9a3e970e9556e1cc29d0a0814de6fd.1611682609.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-02-16 17:15:39 +01:00
Paolo Bonzini
b2f73a0784 sev/i386: Allow AP booting under SEV-ES
When SEV-ES is enabled, it is not possible modify the guests register
state after it has been initially created, encrypted and measured.

Normally, an INIT-SIPI-SIPI request is used to boot the AP. However, the
hypervisor cannot emulate this because it cannot update the AP register
state. For the very first boot by an AP, the reset vector CS segment
value and the EIP value must be programmed before the register has been
encrypted and measured. Search the guest firmware for the guest for a
specific GUID that tells Qemu the value of the reset vector to use.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <22db2bfb4d6551aed661a9ae95b4fdbef613ca21.1611682609.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-02-16 17:15:39 +01:00
James Bottomley
9617cddb72 pc: add parser for OVMF reset block
OVMF is developing a mechanism for depositing a GUIDed table just
below the known location of the reset vector.  The table goes
backwards in memory so all entries are of the form

<data>|len|<GUID>

Where <data> is arbtrary size and type, <len> is a uint16_t and
describes the entire length of the entry from the beginning of the
data to the end of the guid.

The foot of the table is of this form and <len> for this case
describes the entire size of the table.  The table foot GUID is
defined by OVMF as 96b582de-1fb2-45f7-baea-a366c55a082d and if the
table is present this GUID is just below the reset vector, 48 bytes
before the end of the firmware file.

Add a parser for the ovmf reset block which takes a copy of the block,
if the table foot guid is found, minus the footer and a function for
later traversal to return the data area of any specified GUIDs.

Signed-off-by: James Bottomley <jejb@linux.ibm.com>

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20210204193939.16617-2-jejb@linux.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-02-16 17:15:39 +01:00
Elena Ufimtseva
ad22c3088b multi-process: define MPQemuMsg format and transmission functions
Defines MPQemuMsg, which is the message that is sent to the remote
process. This message is sent over QIOChannel and is used to
command the remote process to perform various tasks.
Define transmission functions used by proxy and by remote.

Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 56ca8bcf95195b2b195b08f6b9565b6d7410bce5.1611938319.git.jag.raman@oracle.com

[Replace struct iovec send[2] = {0} with {} to make clang happy as
suggested by Peter Maydell <peter.maydell@linaro.org>.
--Stefan]

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-02-10 09:23:28 +00:00
David Gibson
c9f5aaa6bc sev: Add Error ** to sev_kvm_init()
This allows failures to be reported richly and idiomatically.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
2021-02-08 16:57:38 +11:00
David Gibson
e0292d7c62 confidential guest support: Rework the "memory-encryption" property
Currently the "memory-encryption" property is only looked at once we
get to kvm_init().  Although protection of guest memory from the
hypervisor isn't something that could really ever work with TCG, it's
not conceptually tied to the KVM accelerator.

In addition, the way the string property is resolved to an object is
almost identical to how a QOM link property is handled.

So, create a new "confidential-guest-support" link property which sets
this QOM interface link directly in the machine.  For compatibility we
keep the "memory-encryption" property, but now implemented in terms of
the new property.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
2021-02-08 16:57:38 +11:00
David Gibson
aacdb84413 sev: Remove false abstraction of flash encryption
When AMD's SEV memory encryption is in use, flash memory banks (which are
initialed by pc_system_flash_map()) need to be encrypted with the guest's
key, so that the guest can read them.

That's abstracted via the kvm_memcrypt_encrypt_data() callback in the KVM
state.. except, that it doesn't really abstract much at all.

For starters, the only call site is in code specific to the 'pc'
family of machine types, so it's obviously specific to those and to
x86 to begin with.  But it makes a bunch of further assumptions that
need not be true about an arbitrary confidential guest system based on
memory encryption, let alone one based on other mechanisms:

 * it assumes that the flash memory is defined to be encrypted with the
   guest key, rather than being shared with hypervisor
 * it assumes that that hypervisor has some mechanism to encrypt data into
   the guest, even though it can't decrypt it out, since that's the whole
   point
 * the interface assumes that this encrypt can be done in place, which
   implies that the hypervisor can write into a confidential guests's
   memory, even if what it writes isn't meaningful

So really, this "abstraction" is actually pretty specific to the way SEV
works.  So, this patch removes it and instead has the PC flash
initialization code call into a SEV specific callback.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
2021-02-08 16:57:38 +11:00
Claudio Fontana
b86f59c715 accel: replace struct CpusAccel with AccelOpsClass
This will allow us to centralize the registration of
the cpus.c module accelerator operations (in accel/accel-softmmu.c),
and trigger it automatically using object hierarchy lookup from the
new accel_init_interfaces() initialization step, depending just on
which accelerators are available in the code.

Rename all tcg-cpus.c, kvm-cpus.c, etc to tcg-accel-ops.c,
kvm-accel-ops.c, etc, matching the object type names.

Signed-off-by: Claudio Fontana <cfontana@suse.de>
Message-Id: <20210204163931.7358-18-cfontana@suse.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-02-05 10:24:15 -10:00
Claudio Fontana
940e43aa30 accel: extend AccelState and AccelClass to user-mode
Signed-off-by: Claudio Fontana <cfontana@suse.de>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

[claudio: rebased on Richard's splitwx work]

Signed-off-by: Claudio Fontana <cfontana@suse.de>
Message-Id: <20210204163931.7358-17-cfontana@suse.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-02-05 10:24:15 -10:00
Kevin Wolf
86b1cf3227 block: Separate blk_is_writable() and blk_supports_write_perm()
Currently, blk_is_read_only() tells whether a given BlockBackend can
only be used in read-only mode because its root node is read-only. Some
callers actually try to answer a slightly different question: Is the
BlockBackend configured to be writable, by taking write permissions on
the root node?

This can differ, for example, for CD-ROM devices which don't take write
permissions, but may be backed by a writable image file. scsi-cd allows
write requests to the drive if blk_is_read_only() returns false.
However, the write request will immediately run into an assertion
failure because the write permission is missing.

This patch introduces separate functions for both questions.
blk_supports_write_perm() answers the question whether the block
node/image file can support writable devices, whereas blk_is_writable()
tells whether the BlockBackend is currently configured to be writable.

All calls of blk_is_read_only() are converted to one of the two new
functions.

Fixes: https://bugs.launchpad.net/bugs/1906693
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210118123448.307825-2-kwolf@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-01-27 20:45:20 +01:00
Paolo Bonzini
84f4ef17ae whpx: move internal definitions to whpx-internal.h
Only leave the external interface in sysemu/whpx.h.  whpx_apic_in_platform
is moved to a .c file because it needs whpx_state.

Reported-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20201219090637.1700900-3-pbonzini@redhat.com>
2021-01-12 12:38:03 +01:00
Richard Henderson
a35b3e1415 tcg: Add --accel tcg,split-wx property
Plumb the value through to alloc_code_gen_buffer.  This is not
supported by any os or tcg backend, so for now enabling it will
result in an error.

Reviewed-by: Joelle van Dyne <j@getutm.app>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-01-07 05:09:41 -10:00
Alejandro Jimenez
c753e8e725 vl: Add option to avoid stopping VM upon guest panic
The current default action of pausing a guest after a panic event
is received leaves the responsibility to resume guest execution to the
management layer. The reasons for this behavior are discussed here:
https://lore.kernel.org/qemu-devel/52148F88.5000509@redhat.com/

However, in instances like the case of older guests (Linux and
Windows) using a pvpanic device but missing support for the
PVPANIC_CRASHLOADED event, and Windows guests using the hv-crash
enlightenment, it is desirable to allow the guests to continue
running after sending a PVPANIC_PANICKED event. This allows such
guests to proceed to capture a crash dump and automatically reboot
without intervention of a management layer.

Add an option to avoid stopping a VM after a panic event is received,
by passing:

-action panic=none

in the command line arguments, or during runtime by using an upcoming
QMP command.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Message-Id: <1607705564-26264-3-git-send-email-alejandro.j.jimenez@oracle.com>
[Do not fix panic action in the variable, instead modify -no-shutdown. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-12-15 12:51:58 -05:00
Alejandro Jimenez
e6dba04813 qmp: generalize watchdog-set-action to -no-reboot/-no-shutdown
Add a QMP command to allow for the behaviors specified by the
-no-reboot and -no-shutdown command line option to be set at runtime.
The new command is named set-action and takes optional arguments, named
after an event, that provide a corresponding action to take.

Example:

-> { "execute": "set-action",
     "arguments": {
	"reboot": "none",
	"shutdown": "poweroff",
	"watchdog": "debug" } }
<- { "return": {} }

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Message-Id: <1607705564-26264-4-git-send-email-alejandro.j.jimenez@oracle.com>
[Split the series differently, with -action based on the QMP command. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-12-15 12:51:57 -05:00
Paolo Bonzini
f2ce39b4f0 vl: make qemu_get_machine_opts static
Machine options can be retrieved as properties of the machine object.
Encourage that by removing the "easy" accessor to machine options.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-12-15 12:51:55 -05:00
Paolo Bonzini
5a1ee6077b chardev: do not use machine_init_done
machine_init_done is not the right flag to check when preconfig
is taken into account; for example "./qemu-system-x86_64 -serial
mon:stdio -preconfig" does not print the QEMU monitor header until after
exit_preconfig.  Add back a custom bool for mux character devices.  This
partially undoes commit c7278b4355 ("chardev: introduce chr_machine_done
hook", 2018-03-12), but it keeps the cleaner logic using a function
pointer in ChardevClass.

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-12-15 12:51:51 -05:00