According to hw/ide/internal's file comment, only files in hw/ide/ are
supposed to include it. Drag reality slightly closer to supposition.
Three includes outside hw/ide remain: hw/arm/sbsa-ref.c,
include/hw/ide/pci.h, and include/hw/misc/macio/macio.h. Turns out
board code needs ide-internal.h to wire up IDE stuff. More cleanup is
needed. Left for another day.
Cc: John Snow <jsnow@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190812052359.30071-11-armbru@redhat.com>
In my "build everything" tree, changing migration/qemu-file-types.h
triggers a recompile of some 2600 out of 6600 objects (not counting
tests and objects that don't depend on qemu/osdep.h).
The culprit is again hw/hw.h, which supposedly includes it for
convenience.
Include migration/qemu-file-types.h only where it's needed. Touching
it now recompiles less than 200 objects.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190812052359.30071-10-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
In my "build everything" tree, changing sysemu/reset.h triggers a
recompile of some 2600 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h).
The main culprit is hw/hw.h, which supposedly includes it for
convenience.
Include sysemu/reset.h only where it's needed. Touching it now
recompiles less than 200 objects.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190812052359.30071-9-armbru@redhat.com>
When commit 5f7d05ecfd added QLIST_INSERT_HEAD_RCU() to qemu/queue.h,
it had to include qemu/atomic.h. Commit 341774fe6c removed
QLIST_INSERT_HEAD_RCU() again, but neglected to remove the #include.
Do that now.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20190812052359.30071-6-armbru@redhat.com>
TYPE_IOMMU_MEMORY_REGION is a direct subtype of TYPE_MEMORY_REGION.
Its instance struct is IOMMUMemoryRegion, and its first member is a
MemoryRegion. Correct. Its class struct is IOMMUMemoryRegionClass,
and its first member is a DeviceClass. Wrong. Messed up when commit
1221a47467 introduced the QOM type. It even included hw/qdev-core.h
just for that.
TYPE_MEMORY_REGION doesn't bother to define a class struct. This is
fine, it simply defaults to its super-type TYPE_OBJECT's class struct
ObjectClass. Changing IOMMUMemoryRegionClass's first member's type to
ObjectClass would be a minimal fix, if a bit brittle: if
TYPE_MEMORY_REGION ever acquired own class struct, we'd have to update
IOMMUMemoryRegionClass to use it.
Fix it the clean and robust way instead: give TYPE_MEMORY_REGION its
own class struct MemoryRegionClass now, and use it for
IOMMUMemoryRegionClass's first member.
Revert the include of hw/qdev-core.h, and fix the few files that have
come to rely on it.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20190812052359.30071-5-armbru@redhat.com>
In my "build everything" tree, changing a type in qapi/common.json
triggers a recompile of some 3600 out of 6600 objects (not counting
tests and objects that don't depend on qemu/osdep.h).
One common dependency is QapiErrorClass: it's used only in in
qapi/error.h, which uses nothing else, and is widely included.
Move QapiErrorClass from common.json to new error.json. Touching
common.json now recompiles only some 2900 objects.
Cc: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190812052359.30071-4-armbru@redhat.com>
Some of the generated qapi-types-MODULE.h are included all over the
place. Changing a QAPI type can trigger massive recompiling. Top
scorers recompile more than 1000 out of some 6600 objects (not
counting tests and objects that don't depend on qemu/osdep.h):
6300 qapi/qapi-builtin-types.h
5700 qapi/qapi-types-run-state.h
3900 qapi/qapi-types-common.h
3300 qapi/qapi-types-sockets.h
3000 qapi/qapi-types-misc.h
3000 qapi/qapi-types-crypto.h
3000 qapi/qapi-types-job.h
3000 qapi/qapi-types-block-core.h
2800 qapi/qapi-types-block.h
1300 qapi/qapi-types-net.h
Clean up headers to include generated QAPI headers only where needed.
Impact is negligible except for hw/qdev-properties.h.
This header includes qapi/qapi-types-block.h and
qapi/qapi-types-misc.h. They are used only in expansions of property
definition macros such as DEFINE_PROP_BLOCKDEV_ON_ERROR() and
DEFINE_PROP_OFF_AUTO(). Moving their inclusion from
hw/qdev-properties.h to the users of these macros avoids pointless
recompiles. This is how other property definition macros, such as
DEFINE_PROP_NETDEV(), already work.
Improves things for some of the top scorers:
3600 qapi/qapi-types-common.h
2800 qapi/qapi-types-sockets.h
900 qapi/qapi-types-misc.h
2200 qapi/qapi-types-crypto.h
2100 qapi/qapi-types-job.h
2100 qapi/qapi-types-block-core.h
270 qapi/qapi-types-block.h
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190812052359.30071-3-armbru@redhat.com>
Back in 2016, we discussed[1] rules for headers, and these were
generally liked:
1. Have a carefully curated header that's included everywhere first. We
got that already thanks to Peter: osdep.h.
2. Headers should normally include everything they need beyond osdep.h.
If exceptions are needed for some reason, they must be documented in
the header. If all that's needed from a header is typedefs, put
those into qemu/typedefs.h instead of including the header.
3. Cyclic inclusion is forbidden.
This patch gets include/ closer to obeying 2.
It's actually extracted from my "[RFC] Baby steps towards saner
headers" series[2], which demonstrates a possible path towards
checking 2 automatically. It passes the RFC test there.
[1] Message-ID: <87h9g8j57d.fsf@blackfin.pond.sub.org>
https://lists.nongnu.org/archive/html/qemu-devel/2016-03/msg03345.html
[2] Message-Id: <20190711122827.18970-1-armbru@redhat.com>
https://lists.nongnu.org/archive/html/qemu-devel/2019-07/msg02715.html
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190812052359.30071-2-armbru@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
This fixes devices like IDE that can still start new requests from I/O
handlers in the CPU thread while the block backend is drained.
The basic assumption is that in a drain section, no new requests should
be allowed through a BlockBackend (blk_drained_begin/end don't exist,
we get drain sections only on the node level). However, there are two
special cases where requests should not be queued:
1. Block jobs: We already make sure that block jobs are paused in a
drain section, so they won't start new requests. However, if the
drain_begin is called on the job's BlockBackend first, it can happen
that we deadlock because the job stays busy until it reaches a pause
point - which it can't if its requests aren't processed any more.
The proper solution here would be to make all requests through the
job's filter node instead of using a BlockBackend. For now, just
disabling request queuing on the job BlockBackend is simpler.
2. In test cases where making requests through bdrv_* would be
cumbersome because we'd need a BdrvChild. As we already got the
functionality to disable request queuing from 1., use it in tests,
too, for convenience.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
The functionality offered by blk_pread_unthrottled() goes back to commit
498e386c58. Then, we couldn't perform I/O throttling with synchronous
requests because timers wouldn't be executed in polling loops. So the
commit automatically disabled I/O throttling as soon as a synchronous
request was issued.
However, for geometry detection during disk initialisation, we always
used (and still use) synchronous requests even if guest requests use AIO
later. Geometry detection was not wanted to disable I/O throttling, so
bdrv_pread_unthrottled() was introduced which disabled throttling only
temporarily.
All of this isn't necessary any more because we do run timers in polling
loop and even synchronous requests are now using coroutine
infrastructure internally. For this reason, commit 90c78624f already
removed the automatic disabling of I/O throttling.
It's time to get rid of the workaround for the removed code, and its
abuse of blk_root_drained_begin()/end(), as well.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
No reason to use blocking channel for negotiation and we'll benefit in
further reconnect feature, as qio_channel reads and writes will do
qemu_coroutine_yield while waiting for io completion.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20190618114328.55249-3-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Do effective copy-on-read request when we don't need data actually. It
will be used for block-stream and NBD_CMD_CACHE.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20190725100550.33801-2-vsementsov@virtuozzo.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
[eblake: comment grammar fix]
Signed-off-by: Eric Blake <eblake@redhat.com>
Both the qtest client, libqtest.c, and server, qtest.c, used the same
name for initialization functions which can cause confusion.
Signed-off-by: Alexander Oleinik <alxndr@bu.edu>
Message-Id: <20190805031240.6024-1-alxndr@bu.edu>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Move migration helpers for strings under include/, so they can be used
outside of migration/
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20190808150325.21939-2-marcandre.lureau@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
The migration sequence of a guest using the XIVE exploitation mode
relies on the fact that the states of all devices are restored before
the machine is. This is not true for hot-plug devices such as CPUs
which state come after the machine. This breaks migration because the
thread interrupt context registers are not correctly set.
Fix migration of hotplugged CPUs by restoring their context in the
'post_load' handler of the XiveTCTX model.
Fixes: 277dd3d771 ("spapr/xive: add migration support for KVM")
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190813064853.29310-1-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
ACS was added in 4.0 unconditionally, this breaks migration
compatibility.
Allow ACS to be disabled by adding a property that's
checked by pcie_root_port.
Unfortunately pcie-root-port doesn't have any instance data,
so there's no where for that flag to live, so stuff it into
PCIESlot.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20190730093719.12958-2-dgilbert@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This reverts commit 8fa70dbd8b.
Because we're about to revert it's neighbour and thus uses an optional
again.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20190729162903.4489-2-dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
The VMSTATE_STRUCT_VARRAY_UINT32 macro is intended to handle
migrating a field which is an array of structs, but where instead of
migrating the entire array we only migrate a variable number of
elements of it.
The VMSTATE_STRUCT_VARRAY_POINTER_UINT32 macro is intended to handle
migrating a field which is of pointer type, and points to a
dynamically allocated array of structs of variable size.
We weren't actually checking that the field passed to
VMSTATE_STRUCT_VARRAY_UINT32 really is an array, with the result that
accidentally using it where the _POINTER_ macro was intended would
compile but silently corrupt memory on migration.
Add type-checking that enforces that the field passed in is
really of the right array type. This applies to all the VMSTATE
macros which use flags including VMS_VARRAY_* but not VMS_POINTER.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Damien Hedde <damien.hedde@greensocs.com>
Tested-by: Damien Hedde <damien.hedde@greensocs.com>
Message-id: 20190725163710.11703-3-peter.maydell@linaro.org
We still have multiple issues in the current code
- The PBP is not freed during unrealize()
- The PBP is not reset on device resets: After a reset, the PBP is stale.
- We are not indicating VIRTIO_BALLOON_F_MUST_TELL_HOST, therefore
guests (esp. legacy guests) will reuse pages without deflating,
turning the PBP stale. Adding that would require compat handling.
Instead, let's use the PBP only temporarily, when processing one bulk of
inflation requests. This will keep guest_page_size > 4k working (with
Linux guests). There is nothing to do for deflation requests anymore.
The pbp is only used for a limited amount of time.
Fixes: ed48c59875 ("virtio-balloon: Safely handle BALLOON_PAGE_SIZE < host page size")
Cc: qemu-stable@nongnu.org #v4.0.0
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190722134108.22151-7-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
bdrv_set_aio_context_ignore() can only work in the main loop:
bdrv_drained_begin() only works in the main loop and the node's (old)
AioContext; and bdrv_drained_end() really only works in the main loop
and the node's (new) AioContext (contrary to its current comment, which
is just wrong).
Consequentially, bdrv_set_aio_context_ignore() must be called from the
main loop. Luckily, assuming that we can make block graph changes only
from the main loop as well, all its callers do that already.
Note that changing a node's context in a sense is an operation that
changes the block graph, so it actually makes sense to require this
function to be called from the main loop.
Also, fix bdrv_drained_end()'s description. You can only use it from
the main loop or the node's AioContext, and in the latter case, the
whole subtree must be in the same context.
Fixes: e037c09c78
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20190722133054.21781-3-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
The i.MX6UL always has a single Cortex-A7 CPU (we set FSL_IMX6UL_NUM_CPUS
to 1 in line with this). This means that all the code in fsl-imx6ul.c to
handle multiple CPUs is dead code, and Coverity is now complaining that
it is unreachable (CID 1403008, 1403011).
Remove the unreachable code and the only-executes-once loops,
and replace the single-entry cpu[] array in the FSLIMX6ULState
with a simple cpu member.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190712115030.26895-1-peter.maydell@linaro.org
mtree" that has been lingering for too long.
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl0yOgoUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroMmdwf/WM4Su7hXnyp34Z6lcT4Wq25qb397
Bmv3GSqA94Ex3mUAFPx8+PyF1KVxRGsFuobxZ9KartPt7VwLFONApN6D+Ul1GXMn
aSZ/eR9K7GCdrjVCKMSEtIX2KSgyrAhNIKVF61DjWCGXXYVXllqbtaaCHAkl012g
JR5nlCqRTYqODgwhkynoqNtq13gkRokiAO0BMsk3xwzJ9UO6aOIu71TtFy3jsUn5
ff0Mm4G6SEP9IIAC3L9lbwZvEArnWbJlL7X1j5C1tbid+Gx5b/W5CWDWO84idZZh
FctkRgCPoVHucQYZh+OdAveWuN24tBLfA1a4zu4vSKNkTKS/SHb5YpSXAA==
=nIGk
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
Mostly bugfixes, plus a patch to mark accelerator MemoryRegions in "info
mtree" that has been lingering for too long.
# gpg: Signature made Fri 19 Jul 2019 22:45:46 BST
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini/tags/for-upstream:
target/i386: sev: fix failed message typos
i386: indicate that 'pconfig' feature was removed intentionally
build-sys: do no support modules on Windows
qmp: don't emit the RESET event on wakeup
hmp: Print if memory section is registered with an accelerator
test-bitmap: add test for bitmap_set
scsi-generic: Check sense key before request snooping and patching
vhost-user-scsi: Call virtio_scsi_common_unrealize() when device realize failed
vhost-scsi: Call virtio_scsi_common_unrealize() when device realize failed
virtio-scsi: remove unused argument to virtio_scsi_common_realize
target/i386: skip KVM_GET/SET_NESTED_STATE if VMX disabled, or for SVM
target/i386: kvm: Demand nested migration kernel capabilities only when vCPU may have enabled VMX
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This adds an accelerator name to the "into mtree -f" to tell the user if
a particular memory section is registered with the accelerator;
the primary user for this is KVM and such information is useful
for debugging purposes.
This adds a has_memory() callback to the accelerator class allowing any
accelerator to have a label in that memory tree dump.
Since memory sections are passed to memory listeners and get registered
in accelerators (rather than memory regions), this only prints new labels
for flatviews attached to the system address space.
An example:
Root memory region: system
0000000000000000-0000002fffffffff (prio 0, ram): /objects/mem0 kvm
0000003000000000-0000005fffffffff (prio 0, ram): /objects/mem1 kvm
0000200000000020-000020000000003f (prio 1, i/o): virtio-pci
0000200080000000-000020008000003f (prio 0, i/o): capabilities
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20190614015237.82463-1-aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The argument is not used and passing it clutters error propagation in the
callers. So, get rid of it.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
- block: Fix forbidden use of polling in drained_end
- block: Don't wait for I/O throttling while exiting QEMU
- iotests: Use read-zeroes for the null driver to be Valgrind-friendly
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJdMcXmAAoJEH8JsnLIjy/WD4MP/iQuIpHIxLy5YzOiOXEo+Ofj
Zw9qVndHn8/6J8gEZUbjblS94OKmlNWdCsSMApg8GWmgaGwZyHX59jDhqe+ZcFnq
l6J8Qu+MW4b+UEk8JaAH5Y0Q1ZMwEmJOoUha587cgMB6aatGEUfeuN5mE3DGMHAF
BAYDPl4W0Gd/na/cEcLTX8wio7+nQZooUW0htPuOOwdtU1aUMo8KFTXjMfnAKbDI
3Vje+mtt1ica1PCyEdud7mwkQcvOpsKNdDPKwMhpZY4BQ8xnAK5xhmd2XOyK8YoN
Pkc0inyltqWwv9OhRp4CBxWorZJtULhKhwdYLZtCuAdujSACseZQDr+3ZxASCjkU
auKQAwViBLB4e2bVsm/4xYNmAQug8RYixGjyiAxjb8YUfikQbsJnNYG64Yvs6uKJ
miMKkjtNtcpkugKTY7gJI0uPO4Hyv9sSv/VRkxJIq5zq0avmUClA23MgbtteWHf6
K1TW1pfVrBNOIsrTjtjH9YdZemJcEl2ecP/cYGZTnDW4ZQ8q1bbex4vuL1ssNhZX
T3+yq0fIqtXvNHODmUeSkPRMValGiJA7+R31z4im0qefrB7zRJzuqz+RztiYBZn7
bJKOFDySCkqmPHxwM4ZNn8+qXK5S6DCzpRKmc5b48yNRTf5ge4d12kNjhwOttgiJ
KxgWkLWxjRwpRnVawXGZ
=4ELS
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches:
- block: Fix forbidden use of polling in drained_end
- block: Don't wait for I/O throttling while exiting QEMU
- iotests: Use read-zeroes for the null driver to be Valgrind-friendly
# gpg: Signature made Fri 19 Jul 2019 14:30:14 BST
# gpg: using RSA key 7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream:
iotests: Test quitting with job on throttled node
vl: Drain before (block) job cancel when quitting
iotests: Test commit with a filter on the chain
iotests: Add @has_quit to vm.shutdown()
block: Loop unsafely in bdrv*drained_end()
tests: Extend commit by drained_end test
block: Do not poll in bdrv_do_drained_end()
tests: Lock AioContexts in test-block-iothread
block: Make bdrv_parent_drained_[^_]*() static
block: Add @drained_end_counter
tests: Add job commit by drained_end test
block: Introduce BdrvChild.parent_quiesce_counter
iotests: Set read-zeroes on in null block driver for Valgrind
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
A collection of patches I have fixing crypto code and other pieces
without an assigned maintainer
* Fixes crypto function signatures to be compatible with
both old and new versions of nettle
* Fixes deprecation warnings on new nettle
* Fixes GPL license header typos
* Documents security implications of monitor usage
* Optimize linking of capstone to avoid it in tools
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE2vOm/bJrYpEtDo4/vobrtBUQT98FAl0xxJUACgkQvobrtBUQ
T98e/g/+MG0T2JUWs3RiG32frJQSogLle112Nyyoonbbf1nmJ2gBUfURR8b33eCH
R+iS6NUoeeo7MxZ7DZDWJhAzqhJS/HwMtfBBf0JmpWZkRS9iitUAoPPMYOb+gYe/
xEPW13AKAR2PgDBzpnIdN36x3WWo87vuA5x56qtdK7NYHWzS2GHjI4hzqI3+t9xB
E7f3KWIHpNAXdnHlaweMu/qZ9Md+Zu4GbJQzsVGRR7PGSCKq50GEf3ssk6RFd3Hh
dtz/oZiRlfDXwlJTaI8pW/JsvUyJtFG/iQjOiRYkvKnKxdskJki9v+fB/cv8220o
ytGQKdFfc+E2+qFWd+OBZbExeRO2SYVKF9aLQiNeESKEL/UpVpJs1MO7FDN5CoNy
6+PU7gVVjV40XVOLHdVza5wRwencR2fUmewE4INcFfiMqad8rbOZoCcodkfIxFpI
UAj6js2/DfSMbrXL+7X9L2URUO1NS6YuB2OdP9E04IDTkGLdIs+3G56nZUM+E8eu
Fhw1BMG6d4ytxF6QOtPCTJ7gwHMZWE3A64pwwMFZaDyCF++aIeVgpIvGNNSgOIVG
tQIQ7WihDCuIgHFvd2tUSrLCVq6pDOWeiYde56AfrFJg34+GrsUlnrT38Us+aVaB
rKC/bSO88y4Swz58PH2QvX0Sd1+yvcrk0GZEcNIRWXJUfhx/GTk=
=iE4/
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/berrange/tags/misc-next-pull-request' into staging
Merge misc fixes
A collection of patches I have fixing crypto code and other pieces
without an assigned maintainer
* Fixes crypto function signatures to be compatible with
both old and new versions of nettle
* Fixes deprecation warnings on new nettle
* Fixes GPL license header typos
* Documents security implications of monitor usage
* Optimize linking of capstone to avoid it in tools
# gpg: Signature made Fri 19 Jul 2019 14:24:37 BST
# gpg: using RSA key DAF3A6FDB26B62912D0E8E3FBE86EBB415104FDF
# gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>" [full]
# gpg: aka "Daniel P. Berrange <berrange@redhat.com>" [full]
# Primary key fingerprint: DAF3 A6FD B26B 6291 2D0E 8E3F BE86 EBB4 1510 4FDF
* remotes/berrange/tags/misc-next-pull-request:
crypto: Fix LGPL information in the file headers
doc: document that the monitor console is a privileged control interface
configure: only link capstone to emulation targets
crypto: fix function signatures for nettle 2.7 vs 3
crypto: switch to modern nettle AES APIs
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
It's either "GNU *Library* General Public License version 2" or "GNU
Lesser General Public License version *2.1*", but there was no "version
2.0" of the "Lesser" license. So assume that version 2.1 is meant here.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
We should never poll anywhere in bdrv_do_drained_end() (including its
recursive callees like bdrv_drain_invoke()), because it does not cope
well with graph changes. In fact, it has been written based on the
postulation that no graph changes will happen in it.
Instead, the callers that want to poll must poll, i.e. all currently
globally available wrappers: bdrv_drained_end(),
bdrv_subtree_drained_end(), bdrv_unapply_subtree_drain(), and
bdrv_drain_all_end(). Graph changes there do not matter.
They can poll simply by passing a pointer to a drained_end_counter and
wait until it reaches 0.
This patch also adds a non-polling global wrapper for
bdrv_do_drained_end() that takes a drained_end_counter pointer. We need
such a variant because now no function called anywhere from
bdrv_do_drained_end() must poll. This includes
BdrvChildRole.drained_end(), which already must not poll according to
its interface documentation, but bdrv_child_cb_drained_end() just
violates that by invoking bdrv_drained_end() (which does poll).
Therefore, BdrvChildRole.drained_end() must take a *drained_end_counter
parameter, which bdrv_child_cb_drained_end() can pass on to the new
bdrv_drained_end_no_poll() function.
Note that we now have a pattern of all drained_end-related functions
either polling or receiving a *drained_end_counter to let the caller
poll based on that.
A problem with a single poll loop is that when the drained section in
bdrv_set_aio_context_ignore() ends, some nodes in the subgraph may be in
the old contexts, while others are in the new context already. To let
the collective poll in bdrv_drained_end() work correctly, we must not
hold a lock to the old context, so that the old context can make
progress in case it is different from the current context.
(In the process, remove the comment saying that the current context is
always the old context, because it is wrong.)
In all other places, all nodes in a subtree must be in the same context,
so we can just poll that. The exception of course is
bdrv_drain_all_end(), but that always runs in the main context, so we
can just poll NULL (like bdrv_drain_all_begin() does).
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
These functions are not used outside of block/io.c, there is no reason
why they should be globally available.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Commit 5cb2737e92 laid out why
bdrv_do_drained_end() must decrement the quiesce_counter after
bdrv_drain_invoke(). It did not give a very good reason why it has to
happen after bdrv_parent_drained_end(), instead only claiming symmetry
to bdrv_do_drained_begin().
It turns out that delaying it for so long is wrong.
Situation: We have an active commit job (i.e. a mirror job) from top to
base for the following graph:
filter
|
[file]
|
v
top --[backing]--> base
Now the VM is closed, which results in the job being cancelled and a
bdrv_drain_all() happening pretty much simultaneously.
Beginning the drain means the job is paused once whenever one of its
nodes is quiesced. This is reversed when the drain ends.
With how the code currently is, after base's drain ends (which means
that it will have unpaused the job once), its quiesce_counter remains at
1 while it goes to undrain its parents (bdrv_parent_drained_end()). For
some reason or another, undraining filter causes the job to be kicked
and enter mirror_exit_common(), where it proceeds to invoke
block_job_remove_all_bdrv().
Now base will be detached from the job. Because its quiesce_counter is
still 1, it will unpause the job once more. So in total, undraining
base will unpause the job twice. Eventually, this will lead to the
job's pause_count going negative -- well, it would, were there not an
assertion against this, which crashes qemu.
The general problem is that if in bdrv_parent_drained_end() we undrain
parent A, and then undrain parent B, which then leads to A detaching the
child, bdrv_replace_child_noperm() will undrain A as if we had not done
so yet; that is, one time too many.
It follows that we cannot decrement the quiesce_counter after invoking
bdrv_parent_drained_end().
Unfortunately, decrementing it before bdrv_parent_drained_end() would be
wrong, too. Imagine the above situation in reverse: Undraining A leads
to B detaching the child. If we had already decremented the
quiesce_counter by that point, bdrv_replace_child_noperm() would undrain
B one time too little; because it expects bdrv_parent_drained_end() to
issue this undrain. But bdrv_parent_drained_end() won't do that,
because B is no longer a parent.
Therefore, we have to do something else. This patch opts for
introducing a second quiesce_counter that counts how many times a
child's parent has been quiesced (though c->role->drained_*). With
that, bdrv_replace_child_noperm() just has to undrain the parent exactly
that many times when removing a child, and it will always be right.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This contains a pair of patches that add OpenSBI support to QEMU on
RISC-V targets. The patches have been floating around for a bit, but
everything seems solid now. These pass my standard test of booting
OpenEmbedded, and also works when I swap around the various command-line
arguments to use the new boot method.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEAM520YNJYN/OiG3470yhUCzLq0EFAl0xBrMTHHBhbG1lckBk
YWJiZWx0LmNvbQAKCRDvTKFQLMurQY+wD/9ewP2nQqPaaLCkJJKKb4iYLkm7yqxu
IY1xnCZvbRi5dB/5WJvZCKILB/6eaJfpH1GIx12jUS97lSgxleopSyAE9zSvcKXY
guhbrblnsfA/Ogov7ivAWxqaLJD9QPFq7MaTMpserV+RAFaX7RHH7DIZX3Hizera
IVkUxqNsrvWFPnEV0JiRvyYiQA7kz2Bd/Qro5S/UtalP/SQGQAxi4T+j78MI4df2
L00VIs352wJ75wu+Rv3TFRjMWevVkqzCiM67ZemXDKLh25poyZr0ccWdWFq1aXRq
w90ti0WHdd92EbWLH+0tajIaAQMn+jyXcD1VIQJbEsOeSKbd4zRInTlhXlR0gloq
bUOH6F3tHucZ/l/O4D/Q1mgxeLPVc+Whlq/kVvpTVq3AyaOAGSccwvz/H5Pl/LDI
U+YxMndeUn6wh56z/VFxNNc3OMo0HJIn7sLckBiqWUrvx2N+jr/nLUz3XZ3zaBme
G0Xdepai2Dj+BDn7CP1Bzro37pVtkT7qtDIMi5n1r3K2W+kXtuET5f0nvP4wZi44
ksFtPEuLZON0Jcu+PHCo/4haTV2guWc162iEC80vhfp4WjozKo42bt/m8wYyzI4u
ZjZGJ9VqrHu7Zre8pydJXaik/xyWpdHoG4Xr859oYQJCD49Q0xBSz0eeBa9nmYyo
rn4f5TTD99hEbA==
=+N8r
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/palmer/tags/riscv-for-master-4.1-rc2' into staging
RISC-V Patches for 4.2-rc2
This contains a pair of patches that add OpenSBI support to QEMU on
RISC-V targets. The patches have been floating around for a bit, but
everything seems solid now. These pass my standard test of booting
OpenEmbedded, and also works when I swap around the various command-line
arguments to use the new boot method.
# gpg: Signature made Fri 19 Jul 2019 00:54:27 BST
# gpg: using RSA key 00CE76D1834960DFCE886DF8EF4CA1502CCBAB41
# gpg: issuer "palmer@dabbelt.com"
# gpg: Good signature from "Palmer Dabbelt <palmer@dabbelt.com>" [unknown]
# gpg: aka "Palmer Dabbelt <palmer@sifive.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 00CE 76D1 8349 60DF CE88 6DF8 EF4C A150 2CCB AB41
* remotes/palmer/tags/riscv-for-master-4.1-rc2:
hw/riscv: Load OpenSBI as the default firmware
roms: Add OpenSBI version 0.4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
If the user hasn't specified a firmware to load (with -bios) or
specified no bios (with -bios none) then load OpenSBI by default. This
allows users to boot a RISC-V kernel with just -kernel.
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Tested-by: Bin Meng <bmeng.cn@gmail.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Fix a crash with LTP testsuite and aarch64:
tst_test.c:1015: INFO: Timeout per run is 0h 05m 00s
qemu-aarch64: .../qemu/accel/tcg/translate-all.c:2522: page_check_range: Assertion `start < ((target_ulong)1 << L1_MAP_ADDR_SPACE_BITS)' failed.
qemu:handle_cpu_signal received signal outside vCPU context @ pc=0x60001554
page_check_range() should never be called with address outside the guest
address space. This patch adds a guest_addr_valid() check in access_ok()
to only call page_check_range() with a valid address.
Fixes: f6768aa1b4 ("target/arm: fix AArch64 virtual address space size")
Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20190704084115.24713-1-lvivier@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
- Fixes for the NVMe block driver, the gluster block driver, and for
running multiple block jobs concurrently on a single chain
-----BEGIN PGP SIGNATURE-----
iQFGBAABCAAwFiEEkb62CjDbPohX0Rgp9AfbAGHVz0AFAl0shO8SHG1yZWl0ekBy
ZWRoYXQuY29tAAoJEPQH2wBh1c9AO3kIAKu5HCVcA3XnDlWGAzhtLxkaMb+KR+7V
qiN/YvZ5ECAVsh4gwpZkUlt2VEZEzKV8YCSZ6mFaE8zlFN4ujAq2S03ok7xxR6xP
JGf3cwhNgnIpJ3juYARe2l8o+zREt6Y6ImFcydMEkuFZBoYsvSWOYEIy+ACxlLKc
m54+w6590P1wpk3mlEGKm2hB9/1kSCDv/6LS1ig2Cv2T2U/VigZ9eGkKfVE5bWb4
4fc8Qxnde+ZDnzmhcd+hN/7epyASptMQzywJuFraspxSHhsQvOX3joIMSY3A3QXl
rmttuWctOGT7pgRoC6j+JsdFuNFn6AwRaewL6DtE9c6gKk2n4bcWs6s=
=+FyQ
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/maxreitz/tags/pull-block-2019-07-15' into staging
Block patches for 4.1-rc1:
- Fixes for the NVMe block driver, the gluster block driver, and for
running multiple block jobs concurrently on a single chain
# gpg: Signature made Mon 15 Jul 2019 14:51:43 BST
# gpg: using RSA key 91BEB60A30DB3E8857D11829F407DB0061D5CF40
# gpg: issuer "mreitz@redhat.com"
# gpg: Good signature from "Max Reitz <mreitz@redhat.com>" [full]
# Primary key fingerprint: 91BE B60A 30DB 3E88 57D1 1829 F407 DB00 61D5 CF40
* remotes/maxreitz/tags/pull-block-2019-07-15:
gluster: fix .bdrv_reopen_prepare when backing file is a JSON object
iotests: Add read-only test case to 030
iotests: Add new case to 030
iotests: Add @use_log to VM.run_job()
iotests: Compare error messages in 030
iotests: Fix throttling in 030
block: Deep-clear inherits_from
block/stream: Swap backing file change order
block/stream: Fix error path
block: Add BDS.never_freeze
nvme: Set number of queues later in nvme_init()
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The commit and the mirror block job must be able to drop their filter
node at any point. However, this will not be possible if any of the
BdrvChild links to them is frozen. Therefore, we need to prevent them
from ever becoming frozen.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190703172813.6868-2-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Currently we are doing log_clear() right after log_sync() which mostly
keeps the old behavior when log_clear() was still part of log_sync().
This patch tries to further optimize the migration log_clear() code
path to split huge log_clear()s into smaller chunks.
We do this by spliting the whole guest memory region into memory
chunks, whose size is decided by MigrationState.clear_bitmap_shift (an
example will be given below). With that, we don't do the dirty bitmap
clear operation on the remote node (e.g., KVM) when we fetch the dirty
bitmap, instead we explicitly clear the dirty bitmap for the memory
chunk for each of the first time we send a page in that chunk.
Here comes an example.
Assuming the guest has 64G memory, then before this patch the KVM
ioctl KVM_CLEAR_DIRTY_LOG will be a single one covering 64G memory.
If after the patch, let's assume when the clear bitmap shift is 18,
then the memory chunk size on x86_64 will be 1UL<<18 * 4K = 1GB. Then
instead of sending a big 64G ioctl, we'll send 64 small ioctls, each
of the ioctl will cover 1G of the guest memory. For each of the 64
small ioctls, we'll only send if any of the page in that small chunk
was going to be sent right away.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20190603065056.25211-12-peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Introduce KVMMemoryListener.slots_lock to protect the slots inside the
kvm memory listener. Currently it is close to useless because all the
KVM code path now is always protected by the BQL. But it'll start to
make sense in follow up patches where we might do remote dirty bitmap
clear and also we'll update the per-slot cached dirty bitmap even
without the BQL. So let's prepare for it.
We can also use per-slot lock for above reason but it seems to be an
overkill. Let's just use this bigger one (which covers all the slots
of a single address space) but anyway this lock is still much smaller
than the BQL.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20190603065056.25211-10-peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
When synchronizing dirty bitmap from kernel KVM we do it in a
per-kvmslot fashion and we allocate the userspace bitmap for each of
the ioctl. This patch instead make the bitmap cache be persistent
then we don't need to g_malloc0() every time.
More importantly, the cached per-kvmslot dirty bitmap will be further
used when we want to add support for the KVM_CLEAR_DIRTY_LOG and this
cached bitmap will be used to guarantee we won't clear any unknown
dirty bits otherwise that can be a severe data loss issue for
migration code.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20190603065056.25211-9-peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Introduce a new memory region listener hook log_clear() to allow the
listeners to hook onto the points where the dirty bitmap is cleared by
the bitmap users.
Previously log_sync() contains two operations:
- dirty bitmap collection, and,
- dirty bitmap clear on remote site.
Let's take KVM as example - log_sync() for KVM will first copy the
kernel dirty bitmap to userspace, and at the same time we'll clear the
dirty bitmap there along with re-protecting all the guest pages again.
We add this new log_clear() interface only to split the old log_sync()
into two separated procedures:
- use log_sync() to collect the collection only, and,
- use log_clear() to clear the remote dirty bitmap.
With the new interface, the memory listener users will still be able
to decide how to implement the log synchronization procedure, e.g.,
they can still only provide log_sync() method only and put all the two
procedures within log_sync() (that's how the old KVM works before
KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 is introduced). However with this
new interface the memory listener users will start to have a chance to
postpone the log clear operation explicitly if the module supports.
That can really benefit users like KVM at least for host kernels that
support KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2.
There are three places that can clear dirty bits in any one of the
dirty bitmap in the ram_list.dirty_memory[3] array:
cpu_physical_memory_snapshot_and_clear_dirty
cpu_physical_memory_test_and_clear_dirty
cpu_physical_memory_sync_dirty_bitmap
Currently we hook directly into each of the functions to notify about
the log_clear().
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20190603065056.25211-7-peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Also we change the 2nd parameter of it to be the relative offset
within the memory region. This is to be used in follow up patches.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20190603065056.25211-6-peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
These helpers copy the source bitmap to destination bitmap with a
shift either on the src or dst bitmap.
Meanwhile, we never have bitmap tests but we should.
This patch also introduces the initial test cases for utils/bitmap.c
but it only tests the newly introduced functions.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20190603065056.25211-5-peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
Bitmap test used sizeof(unsigned long) instead of BITS_PER_LONG.
Similar to 9460dee4b2 ("memory: do not touch code dirty bitmap unless
TCG is enabled", 2015-06-05) but for the migration bitmap - we can
skip the MIGRATION bitmap update if migration not enabled.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20190603065056.25211-4-peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
cpu_physical_memory_sync_dirty_bitmap() has one RAMBlock* as
parameter, which means that it must be with RCU read lock held
already. Taking it again inside seems redundant. Removing it.
Instead comment on the functions about the RCU read lock.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20190603065056.25211-2-peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
The PL031 RTC tracks the difference between the guest RTC
and the host RTC using a tick_offset field. For migration,
however, we currently always migrate the offset between
the guest and the vm_clock, even if the RTC clock is not
the same as the vm_clock; this was an attempt to retain
migration backwards compatibility.
Unfortunately this results in the RTC behaving oddly across
a VM state save and restore -- since the VM clock stands still
across save-then-restore, regardless of how much real world
time has elapsed, the guest RTC ends up out of sync with the
host RTC in the restored VM.
Fix this by migrating the raw tick_offset. To retain migration
compatibility as far as possible, we have a new property
migrate-tick-offset; by default this is 'true' and we will
migrate the true tick offset in a new subsection; if the
incoming data has no subsection we fall back to the old
vm_clock-based offset information, so old->new migration
compatibility is preserved. For complete new->old migration
compatibility, the property is set to 'false' for 4.0 and
earlier machine types (this will only affect 'virt-4.0'
and below, as none of the other pl031-using machines are
versioned).
Reported-by: Russell King <rmk@armlinux.org.uk>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-id: 20190709143912.28905-1-peter.maydell@linaro.org
It's not really possible to fit all sense codes into errno codes,
especially in such a way that sense codes can be properly categorized as
either guest-recoverable or host-handled. Create a new function that
checks for guest recoverable sense, then scsi_sense_buf_to_errno only
needs to be called for host handled sense codes.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Turn helper_retaddr into a multi-state flag that may now also
indicate when we're performing a read on behalf of the translator.
In this case, release the mmap_lock before the longjmp back to
the main cpu loop, and thereby avoid a failing assert therein.
Fixes: https://bugs.launchpad.net/qemu/+bug/1832353
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This code block is already surrounded by #ifndef CODE_ACCESS.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
These functions are not used, and are not usable in the
context of code generation, because we never have a helper
return address to pass in to them.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
At present we have a potential error in that helper_retaddr contains
data for handle_cpu_signal, but we have not ensured that those stores
will be scheduled properly before the operation that may fault.
It might be that these races are not in practice observable, due to
our use of -fno-strict-aliasing, but better safe than sorry.
Adjust all of the setters of helper_retaddr.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We have some potential race conditions vs our user-exec signal
handler that will be solved with this barrier.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The virtio-balloon config size changed in QEMU 4.0 even for existing
machine types. Migration from QEMU 3.1 to 4.0 can fail in some
circumstances with the following error:
qemu-system-x86_64: get_pci_config_device: Bad config data: i=0x10 read: a1 device: 1 cmask: ff wmask: c0 w1cmask:0
This happens because the virtio-balloon config size affects the VIRTIO
Legacy I/O Memory PCI BAR size.
Introduce a qdev property called "qemu-4-0-config-size" and enable it
only for the QEMU 4.0 machine types. This way <4.0 machine types use
the old size, 4.0 uses the larger size, and >4.0 machine types use the
appropriate size depending on enabled virtio-balloon features.
Live migration to and from old QEMUs to QEMU 4.1 works again as long as
a versioned machine type is specified (do not use just "pc"!).
Originally-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20190710141440.27635-1-stefanha@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Tested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Tested-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The function declarations for pci_cap_slot_get and
pci_cap_slot_write_config call the argument "slot_ctl", but the function
definitions and all the call sites drop the 'o' and call it "slt_ctl".
Let's be consistent.
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Children sometimes depend on their parent's vm change state handler
having completed. Add a vm change state handler API for devices that
guarantees tree depth ordering.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add an API for registering vm change state handlers with a well-defined
ordering. This is necessary when handlers depend on each other.
Small coding style fixes are included to make checkpatch.pl happy.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This will make unversioned CPU models behavior depend on the
machine type:
* "pc-*-4.0" and older will not report them as aliases.
This is done to keep compatibility with older QEMU versions
after management software starts translating aliases.
* "pc-*-4.1" will translate unversioned CPU models to -v1.
This is done to keep compatibility with existing management
software, that still relies on CPU model runnability promises.
* "none" will translate unversioned CPU models to their latest
version. This is planned become the default in future machine
types (probably in pc-*-4.3).
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190628002844.24894-8-ehabkost@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
To make smp_parse() more flexible and expansive, a smp_parse function
pointer is added to MachineClass that machine types could override.
The generic smp_parse() code in vl.c is moved to hw/core/machine.c, and
become the default implementation of MachineClass::smp_parse. A PC-specific
function called pc_smp_parse() has been added to hw/i386/pc.c, which in
this patch changes nothing against the default one .
Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190620054525.37188-3-like.xu@linux.intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Legacy '-numa node,mem' option has a number of issues and mgmt often
defaults to it. Unfortunately it's no possible to replace it with
an alternative '-numa memdev' without breaking migration compatibility.
What's possible though is to deprecate it, keeping option working with
old machine types only.
In order to help users to find out if being deprecated CLI option
'-numa node,mem' is still supported by particular machine type, add new
"numa-mem-supported" property to output of query-machines.
"numa-mem-supported" is set to 'true' for machines that currently support
NUMA, but it will be flipped to 'false' later on, once deprecation period
expires and kept 'true' only for old machine types that used to support
the legacy option so it won't break existing configuration that are using
it.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1560172207-378962-1-git-send-email-imammedo@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
In new sockets/dies/cores/threads model, the apicid of logical cpu could
imply die level info of guest cpu topology thus x86_apicid_from_cpu_idx()
need to be refactored with #dies value, so does apicid_*_offset().
To keep semantic compatibility, the legacy pkg_offset which helps to
generate CPUIDs such as 0x3 for L3 cache should be mapping to die_offset.
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Message-Id: <20190612084104.34984-5-like.xu@linux.intel.com>
[ehabkost: squash unit test patch]
Message-Id: <20190612084104.34984-6-like.xu@linux.intel.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The field die_id (default as 0) and has_die_id are introduced to X86CPU.
Following the legacy smp check rules, the die_id validity is added to
the same contexts as leagcy smp variables such as hmp_hotpluggable_cpus(),
machine_set_cpu_numa_node(), cpu_slot_to_string() and pc_cpu_pre_plug().
Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Message-Id: <20190612084104.34984-4-like.xu@linux.intel.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The die-level as the first PC-specific cpu topology is added to the leagcy
cpu topology model, which has one die per package implicitly and only the
numbers of sockets/cores/threads are configurable.
In the new model with die-level support, the total number of logical
processors (including offline) on board will be calculated as:
#cpus = #sockets * #dies * #cores * #threads
and considering compatibility, the default value for #dies would be
initialized to one in x86_cpu_initfn() and pc_machine_initfn().
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Message-Id: <20190612084104.34984-2-like.xu@linux.intel.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
To get rid of the global smp_* variables we're currently using, it's recommended
to pass MachineState in the list of incoming parameters for functions that use
global smp variables, thus some redundant parameters are dropped. It's applied
for legacy smbios_*(), *_machine_reset(), hot_add_cpu() and mips *_create_cpu().
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190518205428.90532-3-like.xu@linux.intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The cpu topology property CpuTopology is added to the MachineState
and its members are initialized with the leagcy global smp variables.
From this commit, the code in the system emulation mode is supposed to
use cpu topology variables from MachineState instead of the global ones
defined in vl.c and there is no semantic change.
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190518205428.90532-2-like.xu@linux.intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
virtio-pmem support.
libvhost user mq support.
A bunch of fixes all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQEbBAABAgAGBQJdHmkBAAoJECgfDbjSjVRpEAIH+Kmy8n5Et9NzsnmNqHAiC/pg
3V5wGyp9M4ZJVPXC0z/Q1sYJ3YYP6dBd4tjj2/7LzYZSlqlQIs83UlQCo0XTiliH
/jZD/IaAZABnfB7vAeZW67WNT2a20xG2Jr83083lSaDUI/pfIdvbMelIbBLmo/kd
tWdAAWT0kcGYjyz4xQQgtAH6zAQUleKE7ECUJ2TpJQbSMLxdI/YTaoYqek471YdP
ju5OLBO3WbNkSE9JYz4MJqTudYK0sKu568UqBVF8JdpFd5Cv+X/OI+bCsc4QK8KN
DTtFVVvbm1KGPSceqc9rwsDjO4Wd8ThvuZxrB029AahD6vT82F13IHpi/S29Fw==
=WAFb
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
virtio, pc, pci: features, fixes, cleanups
virtio-pmem support.
libvhost user mq support.
A bunch of fixes all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Thu 04 Jul 2019 22:00:49 BST
# gpg: using RSA key 281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream: (22 commits)
docs: avoid vhost-user-net specifics in multiqueue section
libvhost-user: implement VHOST_USER_PROTOCOL_F_MQ
libvhost-user: support many virtqueues
libvhost-user: add vmsg_set_reply_u64() helper
pc: Move compat_apic_id_mode variable to PCMachineClass
virtio: Don't change "started" flag on virtio_vmstate_change()
virtio: Make sure we get correct state of device on handle_aio_output()
virtio: Set "start_on_kick" on virtio_set_features()
virtio: Set "start_on_kick" for legacy devices
virtio: add "use-started" property
virtio-pci: fix missing device properties
pc: Support for virtio-pmem-pci
numa: Handle virtio-pmem in NUMA stats
hmp: Handle virtio-pmem when printing memory device infos
virtio-pci: Proxy for virtio-pmem
virtio-pmem: sync linux headers
virtio-pci: Allow to specify additional interfaces for the base type
virtio-pmem: add virtio device
pcie: minor cleanups for slot control/status
pcie: work around for racy guest init
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Replace the static variable with a PCMachineClass field. This
will help us eventually get rid of the pc_compat_*() init
functions.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190628200227.1053-1-ehabkost@redhat.com>
Reviewed-by: Philippe Mathieu-Daude <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Besides virtio 1.0 transitional devices, we should also
set "start_on_kick" flag for legacy devices (virtio 0.9).
Signed-off-by: Xie Yongji <xieyongji@baidu.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <20190626023130.31315-3-xieyongji@baidu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In order to avoid migration issues, we introduce a "use-started"
property to the base virtio device to indicate whether use
"started" flag or not. This property will be true by default and
set to false when machine type <= 4.0.
Suggested-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Xie Yongji <xieyongji@baidu.com>
Message-Id: <20190626023130.31315-2-xieyongji@baidu.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We need a proxy device for virtio-pmem, and this device has to be the
actual memory device so we can cleanly hotplug it.
Forward memory device class functions either to the actual device or use
properties of the virtio-pmem device to implement these in the proxy.
virtio-pmem will only be compiled for selected, supported architectures
(that can deal with virtio/pci devices being memory devices). An
architecture that is prepared for that can simply enable
CONFIG_VIRTIO_PMEM to make it work.
As not all architectures support memory devices (and CONFIG_VIRTIO_PMEM
will be enabled per supported architecture), we have to move the PCI proxy
to a separate file.
Signed-off-by: Pankaj Gupta <pagupta@redhat.com>
[ split up patches, memory-device changes, move pci proxy]
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190619094907.10131-5-pagupta@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add linux headers for virtio pmem. These are not yet upstream - include
them temporarily as merge window in which this is supposed to be is
coming up shortly. If virtio-pmem ends up not being merged
then this will be reverted and accordingly virtio-pmem dropped.
Signed-off-by: Pankaj Gupta <pagupta@redhat.com>
Message-Id: <20190619094907.10131-4-pagupta@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This pull request contains a handful of patches that I'd like to target
for the 4.1 soft freeze. There are a handful of new features:
* Support for the 1.11.0, the latest privileged specification.
* Support for reading and writing the PRCI registers.
* Better control over the ISA of the target machine.
* Support for the cpu-topology device tree node.
Additionally, there are a handful of bug fixes including:
* Load reservations are now broken by both store conditional and by
scheduling, which fixes issues with parallel applications.
* Various fixes to the PMP implementation.
* Fixes to the 32-bit linux-user syscall ABI.
* Various fixes for instruction decodeing.
* A fix to the PCI device tree "bus-range" property.
This boots 32-bit and 64-bit OpenEmbedded.
Changes since v2 [riscv-for-master-4.1-sf1-v2]:
* Dropped OpenSBI.
Changes since v1 [riscv-for-master-4.1-sf1]:
* Contains a fix to the sifive_u OpenSBI integration.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEAM520YNJYN/OiG3470yhUCzLq0EFAl0caa0THHBhbG1lckBk
YWJiZWx0LmNvbQAKCRDvTKFQLMurQRTNEACwLeajFWIdeltz9PpzlmbABoY9p8rT
amMVjEcisMmIzV7UjLcmqYAyPUFb2KvNy5EA15op0nQKGv7Mm7c4P7V5/eQyz9wE
9/81s3OoOuRa79mwgF3aekA6iQXb0ID+kMaIFSrbTTyECOULb2kb4/xxOH+N13wh
8VfYBdgxsvC37DDShXuzGOdtWpu1wZDgYwYOHAIns/r90FYA1w2vJhjn72AiDfnY
QKwm6FHiplraMfbURRxbgU5oUIXsCelYNB3Fu9dFOuVwgmwrwFHJIG1Mi5pvMkbo
vRnSDX5vKKpeyFj6U23RnCJpxoORe0Bp6GnD5MbByUr9oroBbD3dkNryXU0da5jN
e5Jje3F8v/g3KdRUm6AQJb6JBBOZu4uSUn9jUfqlNLFDvUDLTq74kfJHLzsWr4Ds
k/rP3fmyked27KnEwCOzAEEDI4Z3exbwnoPiP5ik92ZkSL7PVwLpWvB/8tn8HQab
ldq8w/piUkbC9ug2p07TFg6aST8YjzuGHtOW4M8TpX5cuAiHPbjS54dj88+EL5i7
shBWFKEuKV0ESVgIBwJu9qlX5QvQWJA4LSo35PyrGJb6xV+lQ3j0pdcOojL6ZSkk
JDbQtC1nsxDf/Qt4vHc4VbOZQChO+FYs2Smin8Wl5gN+hReY6lB/ZtHjYl7uw2bC
9Ijy2OkCjtz4mw==
=lliO
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/palmer/tags/riscv-for-master-4.1-sf1-v3' into staging
RISC-V Patches for the 4.1 Soft Freeze, Part 2 v3
This pull request contains a handful of patches that I'd like to target
for the 4.1 soft freeze. There are a handful of new features:
* Support for the 1.11.0, the latest privileged specification.
* Support for reading and writing the PRCI registers.
* Better control over the ISA of the target machine.
* Support for the cpu-topology device tree node.
Additionally, there are a handful of bug fixes including:
* Load reservations are now broken by both store conditional and by
scheduling, which fixes issues with parallel applications.
* Various fixes to the PMP implementation.
* Fixes to the 32-bit linux-user syscall ABI.
* Various fixes for instruction decodeing.
* A fix to the PCI device tree "bus-range" property.
This boots 32-bit and 64-bit OpenEmbedded.
Changes since v2 [riscv-for-master-4.1-sf1-v2]:
* Dropped OpenSBI.
Changes since v1 [riscv-for-master-4.1-sf1]:
* Contains a fix to the sifive_u OpenSBI integration.
# gpg: Signature made Wed 03 Jul 2019 09:39:09 BST
# gpg: using RSA key 00CE76D1834960DFCE886DF8EF4CA1502CCBAB41
# gpg: issuer "palmer@dabbelt.com"
# gpg: Good signature from "Palmer Dabbelt <palmer@dabbelt.com>" [unknown]
# gpg: aka "Palmer Dabbelt <palmer@sifive.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 00CE 76D1 8349 60DF CE88 6DF8 EF4C A150 2CCB AB41
* remotes/palmer/tags/riscv-for-master-4.1-sf1-v3: (32 commits)
hw/riscv: Extend the kernel loading support
hw/riscv: Add support for loading a firmware
hw/riscv: Split out the boot functions
riscv: sifive_u: Update the plic hart config to support multicore
riscv: sifive_u: Do not create hard-coded phandles in DT
disas/riscv: Fix `rdinstreth` constraint
disas/riscv: Disassemble reserved compressed encodings as illegal
riscv: virt: Add cpu-topology DT node.
RISC-V: Update syscall list for 32-bit support.
RISC-V: Clear load reservations on context switch and SC
RISC-V: Add support for the Zicsr extension
RISC-V: Add support for the Zifencei extension
target/riscv: Add support for disabling/enabling Counters
target/riscv: Remove user version information
target/riscv: Require either I or E base extension
qemu-deprecated.texi: Deprecate the RISC-V privledge spec 1.09.1
target/riscv: Set privledge spec 1.11.0 as default
target/riscv: Add the mcountinhibit CSR
target/riscv: Add the privledge spec version 1.11.0
target/riscv: Restructure deprecatd CPUs
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Currently the bitbang_i2c_init() function allocates a
bitbang_i2c_interface struct which it returns. This is unfortunate
because it means that if the function is used from a DeviceState
init method then the memory will be leaked by an "init then delete"
cycle, as used by the qmp/hmp commands that list device properties.
Since three out of four of the uses of this function are in
device init methods, switch the function to do an in-place
initialization of a struct that can be embedded in the
device state struct of the caller.
This fixes LeakSanitizer leak warnings that have appeared in the
patchew configuration (which only tries to run the sanitizers
for the x86_64-softmmu target) now that we use the bitbang-i2c
code in an x86-64 config.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu>
Tested-by: BALATON Zoltan <balaton@eik.bme.hu>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190702163844.20458-1-peter.maydell@linaro.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
This is a left-over from "f4ec5e26ed vfio: Add host side DMA window
capabilities", which added support to more than one DMA window.
Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Here's my next pull request for qemu-4.1. I'm not sure if this will
squeak in just before the soft freeze, or just after. I don't think
it really matters - most of this is bugfixes anyway. There's some
cleanups which aren't stictly bugfixes, but which I think are safe
enough improvements to go in the soft freeze. There's no true feature
work.
Unfortunately, I wasn't able to complete a few of my standard battery
of pre-pull tests, due to some failures that appear to also be in
master. I'm hoping that hasn't missed anything important in here.
Highlights are:
* A number of fixe and cleanups for the XIVE implementation
* Cleanups to the XICS interrupt controller to fit better with the new
XIVE code
* Numerous fixes and improvements to TCG handling of ppc vector
instructions
* Remove a number of unnnecessary #ifdef CONFIG_KVM guards
* Fix some errors in the PCI hotplug paths
* Assorted other fixes
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEdfRlhq5hpmzETofcbDjKyiDZs5IFAl0a9JMACgkQbDjKyiDZ
s5ItkQ//bpkDkztJfRbOB7cgFVQCbXIJ5mpG7PBnBJDohXRtEsjCunNwL+GelRMl
FizPJO3sGpR2f+MgH+7MJ+Y6ESSwDhI6u8TbH4MjGTc9kWsqV1YUy6nB3grxwqG7
k9AXN0z6e1MZLaZuseGBrZmPzZcvNwnPKFqEU06ZXqIWscNgXWXteyO5JTZW4O9M
+Ttiser/f6dRCHKrKnlJp3D1blBaJVUXzZTJVqmH6AiJy/xfHq7Ak6LQKrVrt8Vc
I2hGMEqyDE+ppr8cuGku4KR8GWUen9m0F0bTVGjPsG1io+spAznxNZL/Z+KJPzrI
cCFaKoyNknIicx/0/iil5TEuu4rz985erNZBcglarK/w9w0RyW2LlcDbvzV+gO6c
Ln/1WLZZh4WufR4s4195zUJwZPwGp0E4xFdfk20ulzVzV4wVCMbNJHZpchHYFMi3
fW4Yzhpq5zaOTIaew5+tWST+8RuduacZ/Rm+f9LNui42uA52/EMoD8Vo34n8CIro
9DPOS64Jk9BjIr9bMstFOBCyTVt64IFzskDOMCSCznUl51Hm0ytfAJH3Gty7YazQ
ZxncazzlC9E6OzCTYRDNSPnTKGFvccGmuir/SXPWf3bn8oBC9p3P1mPK3cgk//as
CvWW8Y/QAJOrxEls5QZzpIBjxqAcMoMVjir6l1OT2/gvBTJto1Q=
=QAyU
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-4.1-20190702' into staging
ppc patch queue 2019-07-2
Here's my next pull request for qemu-4.1. I'm not sure if this will
squeak in just before the soft freeze, or just after. I don't think
it really matters - most of this is bugfixes anyway. There's some
cleanups which aren't stictly bugfixes, but which I think are safe
enough improvements to go in the soft freeze. There's no true feature
work.
Unfortunately, I wasn't able to complete a few of my standard battery
of pre-pull tests, due to some failures that appear to also be in
master. I'm hoping that hasn't missed anything important in here.
Highlights are:
* A number of fixe and cleanups for the XIVE implementation
* Cleanups to the XICS interrupt controller to fit better with the new
XIVE code
* Numerous fixes and improvements to TCG handling of ppc vector
instructions
* Remove a number of unnnecessary #ifdef CONFIG_KVM guards
* Fix some errors in the PCI hotplug paths
* Assorted other fixes
# gpg: Signature made Tue 02 Jul 2019 07:07:15 BST
# gpg: using RSA key 75F46586AE61A66CC44E87DC6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" [full]
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" [full]
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" [full]
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>" [unknown]
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-4.1-20190702: (49 commits)
spapr/xive: Add proper rollback to kvmppc_xive_connect()
ppc/xive: Fix TM_PULL_POOL_CTX special operation
ppc/pnv: Rework cache watch model of PnvXIVE
ppc/xive: Make the PIPR register readonly
ppc/xive: Force the Physical CAM line value to group mode
spapr/xive: simplify spapr_irq_init_device() to remove the emulated init
spapr/xive: rework the mapping the KVM memory regions
spapr_pci: Unregister listeners before destroying the IOMMU address space
target/ppc: improve VSX_FMADD with new GEN_VSX_HELPER_VSX_MADD macro
target/ppc: decode target register in VSX_EXTRACT_INSERT at translation time
target/ppc: decode target register in VSX_VECTOR_LOAD_STORE_LENGTH at translation time
target/ppc: introduce GEN_VSX_HELPER_R2_AB macro to fpu_helper.c
target/ppc: introduce GEN_VSX_HELPER_R2 macro to fpu_helper.c
target/ppc: introduce GEN_VSX_HELPER_R3 macro to fpu_helper.c
target/ppc: introduce GEN_VSX_HELPER_X1 macro to fpu_helper.c
target/ppc: introduce GEN_VSX_HELPER_X2_AB macro to fpu_helper.c
target/ppc: introduce GEN_VSX_HELPER_X2 macro to fpu_helper.c
target/ppc: introduce separate generator and helper for xscvqpdp
target/ppc: introduce GEN_VSX_HELPER_X3 macro to fpu_helper.c
target/ppc: introduce separate VSX_CMP macro for xvcmp* instructions
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This is the implementation of virtio-pmem device. Support will require
machine changes for the architectures that will support it, so it will
not yet be compiled. It can be unlocked with VIRTIO_PMEM_SUPPORTED per
machine and disabled globally via VIRTIO_PMEM.
We cannot use the "addr" property as that is already used e.g. for
virtio-pci/pci devices. And we will have e.g. virtio-pmem-pci as a proxy.
So we have to choose a different one (unfortunately). "memaddr" it is.
That name should ideally be used by all other virtio-* based memory
devices in the future.
-device virtio-pmem-pci,id=p0,bus=bux0,addr=0x01,memaddr=0x1000000...
Acked-by: Markus Armbruster <armbru@redhat.com>
[ QAPI bits ]
Signed-off-by: Pankaj Gupta <pagupta@redhat.com>
[ MemoryDevice/MemoryRegion changes, cleanups, addr property "memaddr",
split up patches, unplug handler ]
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190619094907.10131-2-pagupta@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
- The stream job no longer relies on a fixed base node
- The rbd block driver can now accomodate growing formats like qcow2
-----BEGIN PGP SIGNATURE-----
iQFGBAABCAAwFiEEkb62CjDbPohX0Rgp9AfbAGHVz0AFAl0aubYSHG1yZWl0ekBy
ZWRoYXQuY29tAAoJEPQH2wBh1c9AglcH/1EoMnmibLg4BumWh+kPMKIXthoMSN0Z
cVK8/OhpR66tAaT0OatX+CKYtfefAdX5Km6Oipa4ygALF21aON6+N2lss+K//ieK
cg71CxncmgYlAS358wdJBgzCCzE06+fCY2JYLa/vRgVoKFJjXwVvlmYymuOA3RN/
htJB64tXZheyEX89B7fqBdlDAf2SEomXVVPgKp9QMPKaP+JchGdZ26VxHFXEEJLo
NZuMZpUUVNSkiDkpEh6ED/6jDFFpTjLzdoNdVnRJmHpeKNEPrLQVxqyjD1bagyt/
InjEtMGQ9i1v4B1IODdYhyLas/bQsituG7Dj149le02307qcvZbLi7s=
=RY+3
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/maxreitz/tags/pull-block-2019-07-02' into staging
Block patches for 4.1-rc0:
- The stream job no longer relies on a fixed base node
- The rbd block driver can now accomodate growing formats like qcow2
# gpg: Signature made Tue 02 Jul 2019 02:56:06 BST
# gpg: using RSA key 91BEB60A30DB3E8857D11829F407DB0061D5CF40
# gpg: issuer "mreitz@redhat.com"
# gpg: Good signature from "Max Reitz <mreitz@redhat.com>" [full]
# Primary key fingerprint: 91BE B60A 30DB 3E88 57D1 1829 F407 DB00 61D5 CF40
* remotes/maxreitz/tags/pull-block-2019-07-02:
block/stream: introduce a bottom node
block/stream: refactor stream_run: drop goto
block: include base when checking image chain for block allocation
block/rbd: increase dynamically the image size
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Version: GnuPG v1
iQEcBAABAgAGBQJdGr/CAAoJEO8Ells5jWIRJY4H/10pMuRX11WDSQqc/ejQMywj
3v+CBFTaNnCvbPqEFKH6p/1Y1KV/G1+ODzuat71QVjHqzlX/tCcLtNVrzIsozal+
R42PADVpBuOGYC9CBBKZxKpORm4YEOApwERKAmjoTDKJUbho1TOOt60Vz7ctlybb
nlEDdNogSKZ7ZxSj/Nn5NvaZnQPpS3ZpFcqMwr1ofQtCuUsYi9DNREd+NLwaknZS
O3ib6nAFxBH4dpA0bSKWJHZi8QU1GcIdvs0iDeSnzvlCgwjbv3pM6eYkse90jXq+
Sq0ZxSMb+97CCu54b5wu8EqOhzBaasVcKApR3NRQc5vHBI/1WMNB5eFWmfSpzWM=
=3+Zi
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging
# gpg: Signature made Tue 02 Jul 2019 03:21:54 BST
# gpg: using RSA key EF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211
* remotes/jasowang/tags/net-pull-request:
migration/colo.c: Add missed filter notify for Xen COLO.
COLO-compare: Add colo-compare remote notify support
COLO-compare: Make the compare_chr_send() can send notification message.
COLO-compare: Add remote notification chardev handler frame
COLO-compare: Add new parameter to communicate with remote colo-frame
net/announce: Expand test for stopping self announce
net/announce: Add HMP optional ID
net/announce: Add optional ID
net/announce: Add HMP optional interface list
net/announce: Allow optional list of interfaces
net: remove unused get_str_sep() function
net: use g_strsplit() for parsing host address and port
net: avoid using variable length array in net_client_init()
net: fix assertion failure when ipv6-prefixlen is not a number
ftgmac100: do not link to netdev
qemu-bridge-helper: Document known shortcomings
MAINTAINERS: Add qemu-bridge-helper.c to "Network device backends"
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Move commands dump-guest-memory, query-dump,
query-dump-guest-memory-capability with their types from misc.json to
new dump.json. Add dump.json to MAINTAINERS section "Dump".
Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190619201050.19040-15-armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Move the HMP handlers related to qapi/machine.json to
hw/core/machine-hmp-cmds.c, where they are covered by MAINTAINERS
section "Machine core", just like qapi/machine.json.
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190619201050.19040-12-armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
The handlers for qapi/machine.json's QMP commands are spread over
cpus.c, hw/core/numa.c, monitor/misc.c, monitor/qmp-cmds.c, and vl.c.
Move them all to new hw/core/machine-qmp-cmds.c, where they are
covered by MAINTAINERS section "Machine core", just like
qapi/machine.json.
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190619201050.19040-11-armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Move commands cpu-add, query-cpus, query-cpus-fast,
query-current-machine, query-hotpluggable-cpus, query-machines,
query-memdev, and set-numa-node with their types from misc.json to new
machine.json. Also move types X86CPURegister32 and
X86CPUFeatureWordInfo. Add machine.json to MAINTAINERS section
"Machine core".
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190619201050.19040-9-armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Move the HMP command handlers related to QOM handlers from
monitor/hmp-cmds.c and qdev-monitor.c to new qom/qom-hmp-cmds.c, where
they are covered by MAINTAINERS section QOM.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Daniel P. Berrange" <berrange@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190619201050.19040-7-armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
[Also move hmp_info_qom_tree(), tweak commit message accordingly]
- fix for a tcg test case
- halt/clear support for vfio-ccw, and use a new helper
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEEw9DWbcNiT/aowBjO3s9rk8bwL68FAl0Z6bkSHGNvaHVja0By
ZWRoYXQuY29tAAoJEN7Pa5PG8C+vwugQAJPAlCvafepuaZHGvM/hYJVg+goVvbd/
EQhDBVFNWcMXCR6LSt67NAgAjDu21KFPWy+CC/JtzqiXWnGZdwSHd9ydfmsW+Z4t
/aGrNv6MQ9WuEZ7hujsWEyAbMxIY1LnE9noulQBOvw28BcxRVNeQlFt9c5bcEdkb
2/Mvs8X3+VyPes4lnNre6AYSQdCwTqrhcExwONHeQloy7fK6GU118oMjjn8QUabl
NiNIseij2WkuFaqZGf/CPgtwAl6d7OZFPXaqgRVoGT+wKsE2u8dkc4VAYusS8Mrj
lZ8XEkoYRUZVI6mkUcHPY66cvRUvlOjc+pXVw2p0GqzLTiXrr9ZOQIm0tiwTgFs1
EwZv2B5nW7T/jmqh4MZRgs+Qrmr04NHdsoA5y4vDx1cmq7C80Itn5CIwlntXlwMQ
N+949bNj8ZL38CE4ksaYfp5+mfB1ay7FUEL/jEgOXFsWYXMmPlC1r2lZ/1ARBjGw
iCNuCq/9fjBTbHPG5ulXP+z74eHf3Qw6wxT7SXaK3RiY1gN6ZKSTTnWVJws0ai65
OK76Ls3WiCYKYZpT06snIhOkRSt9+gd4DUtF8oLfgWYV7RDkCVC7Fr2hwS570dUh
Sz7nGM2vlAcfXYMeRR4wQZV/hQDVgX5bxlI0cvvrsefVZ1DxuaT0GXBmktls7Pns
N6j6sK8DwTSB
=0dnG
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20190701' into staging
- cleanup/refactoring in the cpu feature code
- fix for a tcg test case
- halt/clear support for vfio-ccw, and use a new helper
# gpg: Signature made Mon 01 Jul 2019 12:08:41 BST
# gpg: using RSA key C3D0D66DC3624FF6A8C018CEDECF6B93C6F02FAF
# gpg: issuer "cohuck@redhat.com"
# gpg: Good signature from "Cornelia Huck <conny@cornelia-huck.de>" [unknown]
# gpg: aka "Cornelia Huck <huckc@linux.vnet.ibm.com>" [full]
# gpg: aka "Cornelia Huck <cornelia.huck@de.ibm.com>" [full]
# gpg: aka "Cornelia Huck <cohuck@kernel.org>" [unknown]
# gpg: aka "Cornelia Huck <cohuck@redhat.com>" [unknown]
# Primary key fingerprint: C3D0 D66D C362 4FF6 A8C0 18CE DECF 6B93 C6F0 2FAF
* remotes/cohuck/tags/s390x-20190701:
s390x: add cpu feature/model files to KVM section
vfio-ccw: support async command subregion
vfio-ccw: use vfio_set_irq_signaling
s390x/cpumodel: Prepend KDSA features with "KDSA"
s390x/cpumodel: Rework CPU feature definition
tests/tcg/s390x: Fix alignment of csst parameter list
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190619201050.19040-4-armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Previously there was a single instance of the timer used by
monitor triggered announces, that's OK, but when combined with the
previous change that lets you have announces for subsets of interfaces
it's a bit restrictive if you want to do different things to different
interfaces.
Add an 'id' field to the announce, and maintain a list of the
timers based on id.
This allows you to for example:
a) Start an announce going on interface eth0 for a long time
b) Start an announce going on interface eth1 for a long time
c) Kill the announce on eth0 while leaving eth1 going.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Allow the caller to restrict the set of interfaces that announces are
sent on. The default is still to send on all interfaces.
e.g.
{ "execute": "announce-self", "arguments": { "initial": 50, "max": 550, "rounds": 5, "step": 50, "interfaces": ["vn2", "vn1"] } }
This doesn't affect the behaviour of migraiton announcments.
Note: There's still only one timer for the qmp command, so that
performing an 'announce-self' on one list of interfaces followed
by another 'announce-self' on another list will stop the announces
on the existing set.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
This patch is used in the 'block/stream: introduce a bottom node'
that is following. Instead of the base node, the caller may pass
the node that has the base as its backing image to the function
bdrv_is_allocated_above() with a new parameter include_base = true
and get rid of the dependency on the base that may change during
commit/stream parallel jobs. Now, if the specified base is not
found in the backing image chain, the QEMU will abort.
Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 1559152576-281803-2-git-send-email-andrey.shinkevich@virtuozzo.com
[mreitz: Squashed in the following as a rebase on conflicting patches:]
Message-id: e3cf99ae-62e9-8b6e-5a06-d3c8b9363b85@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
The init_emu() handles are now empty. Remove them and rename
spapr_irq_init_device() to spapr_irq_init_kvm().
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190614165920.12670-3-clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Today, the interrupt device is fully initialized at reset when the CAS
negotiation process has completed. Depending on the KVM capabilities,
the SpaprXive memory regions (ESB, TIMA) are initialized with a host
MMIO backend or a QEMU emulated backend. This results in a complex
initialization sequence partially done at realize and later at reset,
and some memory region leaks.
To simplify this sequence and to remove of the late initialization of
the emulated device which is required to be done only once, we
introduce new memory regions specific for KVM. These regions are
mapped as overlaps on top of the emulated device to make use of the
host MMIOs. Also provide proper cleanups of these regions when the
XIVE KVM device is destroyed to fix the leaks.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190614165920.12670-2-clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Hot-unplugging a PHB with a VFIO device connected to it crashes QEMU:
-device spapr-pci-host-bridge,index=1,id=phb1 \
-device vfio-pci,host=0034:01:00.3,id=vfio0
(qemu) device_del phb1
[ 357.207183] iommu: Removing device 0001:00:00.0 from group 1
[ 360.375523] rpadlpar_io: slot PHB 1 removed
qemu-system-ppc64: memory.c:2742:
do_address_space_destroy: Assertion `QTAILQ_EMPTY(&as->listeners)' failed.
'as' is the IOMMU address space, which indeed has a listener registered
to by vfio_connect_container() when the VFIO device is realized. This
listener is supposed to be unregistered by vfio_disconnect_container()
when the VFIO device is finalized. Unfortunately, the VFIO device hasn't
reached finalize yet at the time the PHB unrealize function is called,
and address_space_destroy() gets called with the VFIO listener still
being registered.
All regions have just been unmapped from the address space. Listeners
aren't needed anymore at this point. Remove them before destroying the
address space.
The VFIO code will try to remove them _again_ at device finalize,
but it is okay since memory_listener_unregister() is idempotent.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <156110925375.92514.11649846071216864570.stgit@bahia.lan>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[dwg: Correct spelling error pointed out by aik]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This allows errors happening there to be propagated up to spapr_irq,
just like XIVE already does.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <156077921763.433243.4614327010172954196.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Switch to using the connect/disconnect terminology like we already do for
XIVE.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <156077920102.433243.6605099291134598170.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Commit 9fb6eb7ca50c added the declaration of xics_spapr_connect(), which
has no implementation and no users.
This is a leftover from a previous iteration of this patch. Drop it.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <156077919546.433243.8748677531446035746.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Older KVMs on POWER9 don't support destroying/recreating a KVM XICS
device, which is required by 'dual' interrupt controller mode. This
causes QEMU to emit a warning when the guest is rebooted and to fall
back on XICS emulation:
qemu-system-ppc64: warning: kernel_irqchip allowed but unavailable:
Error on KVM_CREATE_DEVICE for XICS: File exists
If kernel irqchip is required, QEMU will thus exit when the guest is
first rebooted. Failing QEMU this late may be a painful experience
for the user.
Detect that and exit at machine init instead.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <156044430517.125694.6207865998817342638.stgit@bahia.lab.toulouse-stg.fr.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
QEMU may crash when running a spapr machine in 'dual' interrupt controller
mode on some older (but not that old, eg. ubuntu 18.04.2) KVMs with partial
XIVE support:
qemu-system-ppc64: hw/ppc/spapr_rtas.c:411: spapr_rtas_register:
Assertion `!name || !rtas_table[token].name' failed.
XICS is controlled by the guest thanks to a set of RTAS calls. Depending
on whether KVM XICS is used or not, the RTAS calls are handled by KVM or
QEMU. In both cases, QEMU needs to expose the RTAS calls to the guest
through the "rtas" node of the device tree.
The spapr_rtas_register() helper takes care of all of that: it adds the
RTAS call token to the "rtas" node and registers a QEMU callback to be
invoked when the guest issues the RTAS call. In the KVM XICS case, QEMU
registers a dummy callback that just prints an error since it isn't
supposed to be invoked, ever.
Historically, the XICS controller was setup during machine init and
released during final teardown. This changed when the 'dual' interrupt
controller mode was added to the spapr machine: in this case we need
to tear the XICS down and set it up again during machine reset. The
crash happens because we indeed have an incompatibility with older
KVMs that forces QEMU to fallback on emulated XICS, which tries to
re-registers the same RTAS calls.
This could be fixed by adding proper rollback that would unregister
RTAS calls on error. But since the emulated RTAS calls in QEMU can
now detect when they are mistakenly called while KVM XICS is in
use, it seems simpler to register them once and for all at machine
init. This fixes the crash and allows to remove some now useless
lines of code.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <156044429963.125694.13710679451927268758.stgit@bahia.lab.toulouse-stg.fr.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
It has now became useless with the previous patch.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190612174345.9799-3-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The PNV_XSCOM_BASE and PNV_XSCOM_SIZE macros are specific to POWER8
and they are used when the device tree is populated and the MMIO
region created, even for POWER9 chips. This is not too much of a
problem today because we don't have important devices on the second
chip, but we might have oneday (PHBs).
Fix by using the appropriate macros in case of P9.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190612174345.9799-2-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The ast2500 uses the watchdog to reset the SDRAM controller. This
operation is usually performed by u-boot's memory training procedure,
and it is enabled by setting a bit in the SCU and then causing the
watchdog to expire. Therefore, we need the watchdog to be able to
access the SCU's register space.
This causes the watchdog to not perform a system reset when the bit is
set. In the future it could perform a reset of the SDMC model.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190621065242.32535-1-joel@jms.id.au
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The XDMA engine embedded in the Aspeed SOCs performs PCI DMA operations
between the SOC (acting as a BMC) and a host processor in a server.
The XDMA engine exists on the AST2400, AST2500, and AST2600 SOCs, so
enable it for all of those. Add trace events on the important register
writes in the XDMA engine.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-id: 20190618165311.27066-21-clg@kaod.org
[clg: - changed title ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The DRAM address of a DMA transaction depends on the DRAM base address
of the SoC. Inform the SMC controller model with this value.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190618165311.27066-15-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The Aspeed SoCs have two MACs. Extend the Aspeed model to support a
second NIC.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Message-id: 20190618165311.27066-7-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The current models of the Aspeed SoCs only have one CPU but future
ones will support SMP. Introduce a new num_cpus field at the SoC class
level to define the number of available CPUs per SoC and also
introduce a 'num-cpus' property to activate the CPUs configured for
the machine.
The max_cpus limit of the machine should depend on the SoC definition
but, unfortunately, these values are not available when the machine
class is initialized. This is the reason why we add a check on
num_cpus in the AspeedSoC realize handler.
SMP support will be activated when models for such SoCs are implemented.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Message-id: 20190618165311.27066-6-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
All systems have an RTC.
The IRQ is hooked up but the model does not use it at this stage. There
is no guest code that uses it, so this limitation is acceptable.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20190618165311.27066-5-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The RTC is modeled to provide time and date functionality. It is
initialised at zero to match the hardware.
There is no modelling of the alarm functionality, which includes the IRQ
line. As there is no guest code to exercise this function that is
acceptable for now.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20190618165311.27066-4-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This will simplify the definition of new SoCs, like the AST2600 which
should use a slightly different address space and have a different set
of controllers.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Message-id: 20190618165311.27066-3-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This will simplify the definition of new SoCs, like the AST2600 which
should use a different CPU and a different IRQ number layout.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Message-id: 20190618165311.27066-2-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Datasheet for i.MX7 is incorrect and i.MX7's PCI IRQ mapping matches
that of i.MX6:
* INTD/MSI 122
* INTC 123
* INTB 124
* INTA 125
Fix all of the relevant code to reflect that fact. Needed by latest
Linux kernels.
(Reference: Linux kernel commit 538d6e9d597584e80 from an
NXP employee confirming that the datasheet is incorrect and
with a report of a test against hardware.)
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: added ref to kernel commit confirming the datasheet error]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add no-op/unimplemented PCIE PHY IP block. Needed by new kernels to
use PCIE.
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Rename function arguments to make intent clearer.
Better documentation for slot control logic.
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
During boot, linux would sometimes overwrites control of a powered off
slot before powering it on. Unfortunately QEMU interprets that as a
power off request and ejects the device.
For example:
/x86_64-softmmu/qemu-system-x86_64 -enable-kvm -S -machine q35 \
-device pcie-root-port,id=pcie_root_port_0,slot=2,chassis=2,addr=0x2,bus=pcie.0 \
-monitor stdio disk.qcow2
(qemu)device_add virtio-balloon-pci,id=balloon,bus=pcie_root_port_0
(qemu)cont
Balloon is deleted during guest boot.
To fix, save control beforehand and check that power
or led state actually change before ejecting.
Note: this is more a hack than a solution, ideally we'd
find a better way to detect ejects, or move away
from ejects completely and instead monitor whether
it's safe to delete device due to e.g. its power state.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Igor Mammedov <imammedo@redhat.com>
The bitbang i2c implementation is also useful for other device models
such as DDC in display controllers. Move the header to include/hw/i2c/
to allow it to be used from other device models and adjust users of
this include. This also reverts commit 2b4c1125ac which is no longer
needed.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Corey Minyard <cminyard@mvista.com>
Message-id: 5d1fe4db846ab9be4b77ddb0d43cc74cd200a003.1561028123.git.balaton@eik.bme.hu
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Add support for loading a firmware file for the virt machine and the
SiFive U. This can be run with the following command:
qemu-system-riscv64 -machine virt -bios fw_jump.bin -kernel vmlinux
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Tested-by: Bin Meng <bmeng.cn@gmail.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Split the common RISC-V boot functions into a seperate file. This allows
us to share the common code.
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Tested-by: Bin Meng <bmeng.cn@gmail.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
A vfio-ccw device may provide an async command subregion for
issuing halt/clear subchannel requests. If it is present, use
it for sending halt/clear request to the device; if not, fall
back to emulation (as done today).
Reviewed-by: Farhan Ali <alifm@linux.ibm.com>
Message-Id: <20190613092542.2834-1-cohuck@redhat.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
A Xen public header have been imported into QEMU (by
f65eadb639 "xen: import ring.h from xen"), but there are other header
that depends on ring.h which come from the system when building QEMU.
This patch resolves the issue of having headers from the system
importing a different copie of ring.h.
This patch is prompt by the build issue described in the previous
patch: 'Revert xen/io/ring.h of "Clean up a few header guard symbols"'
ring.h and the new imported headers are moved to
"include/hw/xen/interface" as those describe interfaces with a guest.
The imported headers are cleaned up a bit while importing them: some
part of the file that QEMU doesn't use are removed (description
of how to make hypercall in grant_table.h have been removed).
Other cleanup:
- xen-mapcache.c and xen-legacy-backend.c don't need grant_table.h.
- xenfb.c doesn't need event_channel.h.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Message-Id: <20190621105441.3025-3-anthony.perard@citrix.com>
This reverts changes to include/hw/xen/io/ring.h from commit
37677d7db3.
Following 37677d7db3 "Clean up a few header guard symbols", QEMU start
to fail to build:
In file included from ~/xen/tools/../tools/include/xen/io/blkif.h:31:0,
from ~/xen/tools/qemu-xen-dir/hw/block/xen_blkif.h:5,
from ~/xen/tools/qemu-xen-dir/hw/block/xen-block.c:22:
~/xen/tools/../tools/include/xen/io/ring.h:68:0: error: "__CONST_RING_SIZE" redefined [-Werror]
#define __CONST_RING_SIZE(_s, _sz) \
In file included from ~/xen/tools/qemu-xen-dir/hw/block/xen_blkif.h:4:0,
from ~/xen/tools/qemu-xen-dir/hw/block/xen-block.c:22:
~/xen/tools/qemu-xen-dir/include/hw/xen/io/ring.h:66:0: note: this is the location of the previous definition
#define __CONST_RING_SIZE(_s, _sz) \
The issue is that some public xen headers have been imported (by
f65eadb639 "xen: import ring.h from xen") but not all. With the change
in the guards symbole, the ring.h header start to be imported twice.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Message-Id: <20190621105441.3025-2-anthony.perard@citrix.com>
This patch introduces a poll callback for event channel fd-s and uses
this to invoke the channel callback function.
To properly support polling, it is necessary for the event channel callback
function to return a boolean saying whether it has done any useful work or
not. Thus xen_block_dataplane_event() is modified to directly invoke
xen_block_handle_requests() and the latter only returns true if it actually
processes any requests. This also means that the call to qemu_bh_schedule()
is moved into xen_block_complete_aio(), which is more intuitive since the
only reason for doing a deferred poll of the shared ring should be because
there were previously insufficient resources to fully complete a previous
poll.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
Message-Id: <20190408151617.13025-4-paul.durrant@citrix.com>
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
This patch adds an AioContext parameter to xen_device_bind_event_channel()
and then uses aio_set_fd_handler() to set the callback rather than
qemu_set_fd_handler().
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
Message-Id: <20190408151617.13025-3-paul.durrant@citrix.com>
[Call aio_set_fd_handler() with is_external=true]
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
To better support use of IOThread-s it will be necessary to be able to set
the AioContext for each XenEventChannel and hence it is necessary to open a
separate handle to libxenevtchan for each channel.
This patch stops using NotifierList for event channel callbacks, replacing
that construct by a list of complete XenEventChannel structures. Each of
these now has a xenevtchn_handle pointer in place of the single pointer
previously held in the XenDevice structure. The individual handles are
opened/closed in xen_device_bind/unbind_event_channel(), replacing the
single open/close in xen_device_realize/unrealize().
NOTE: This patch does not add an AioContext parameter to
xen_device_bind_event_channel(). That will be done in a subsequent
patch.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
Message-Id: <20190408151617.13025-2-paul.durrant@citrix.com>
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Coverity pointed out a memory leak in riscv_sifive_e_soc_realize(),
where a pair of recently added MemoryRegion instances would not be freed
if there were errors elsewhere in the function. The fix here is to
simply not use dynamic allocation for these instances: there's always
one of each in SiFiveESoCState, so instead we just include them within
the struct.
Fixes: 30efbf330a ("SiFive RISC-V GPIO Device")
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Writes to the SiFive PRCI registers are preserved while leaving the
ready bits set for the HFX/HFR oscillators and the lock bit set for the
PLL.
Signed-off-by: Nathaniel Graff <nathaniel.graff@sifive.com>
Reviewed-by: Michael Clark <mjc@sifive.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Commit c87759ce87 fixed a regression affecting pc-q35 machines by
introducing a new pc-q35-4.0.1 machine version to be used instead
of pc-q35-4.0. The only purpose was to revert the default behaviour
of not using split irqchip, but the change also introduced the usual
hw_compat and pc_compat bits, and wired them for pc-q35 only.
This raises questions when it comes to add new compat properties for
4.0* machine versions of any architecture. Where to add them ? In
4.0, 4.0.1 or both ? Error prone. Another possibility would be to teach
all other architectures about 4.0.1. This solution isn't satisfying,
especially since this is a pc-q35 specific issue.
It turns out that the split irqchip default is handled in the machine
option function and doesn't involve compat lists at all.
Drop all the 4.0.1 compat lists and use the 4.0 ones instead in the 4.0.1
machine option function.
Move the compat props that were added to the 4.0.1 since c87759ce87 to
4.0.
Even if only hw_compat_4_0_1 had an impact on other architectures,
drop pc_compat_4_0_1 as well for consistency.
Fixes: c87759ce87 "q35: Revert to kernel irqchip"
Suggested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <156051774276.244890.8660277280145466396.stgit@bahia.lan>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Kernel commit 8fcc4b5923af ("kvm: nVMX: Introduce KVM_CAP_NESTED_STATE")
introduced new IOCTLs to extract and restore vCPU state related to
Intel VMX & AMD SVM.
Utilize these IOCTLs to add support for migration of VMs which are
running nested hypervisors.
Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
Reviewed-by: Maran Wilson <maran.wilson@oracle.com>
Tested-by: Maran Wilson <maran.wilson@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Message-Id: <20190619162140.133674-9-liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
Reviewed-by: Maran Wilson <maran.wilson@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20190619162140.133674-8-liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Simiar to how kvm_init_vcpu() calls kvm_arch_init_vcpu() to perform
arch-dependent initialisation, introduce kvm_arch_destroy_vcpu()
to be called from kvm_destroy_vcpu() to perform arch-dependent
destruction.
This was added because some architectures (Such as i386)
currently do not free memory that it have allocated in
kvm_arch_init_vcpu().
Suggested-by: Maran Wilson <maran.wilson@oracle.com>
Reviewed-by: Maran Wilson <maran.wilson@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Message-Id: <20190619162140.133674-3-liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If a block node uses bdrv_child_try_set_perm() to change the permission
it takes on its child, the result may be very short-lived. If anything
makes the block layer recalculate the permissions internally, it will
invoke the node driver's .bdrv_child_perm() implementation. The
permission/shared permissions masks that returns will then override the
values previously passed to bdrv_child_try_set_perm().
If drivers want a child edge to have specific values for the
permissions/shared permissions mask, it must return them in
.bdrv_child_perm(). Consequentially, there is no need for them to pass
the same values to bdrv_child_try_set_perm() then: It is better to have
a function that invokes .bdrv_child_perm() and calls
bdrv_child_try_set_perm() with the result. This patch adds such a
function under the name of bdrv_child_refresh_perms().
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Drop remaining users of bs->job:
1. assertions actually duplicated by assert(!bs->refcnt)
2. trace-point seems not enough reason to change stream_start to return
BlockJob pointer
3. Restricting creation of two jobs based on same bs is bad idea, as
3.1 Some jobs creates filters to be their main node, so, this check
don't actually prevent creating second job on same real node (which
will create another filter node) (but I hope it is restricted by
other mechanisms)
3.2 Even without bs->job we have two systems of permissions:
op-blockers and BLK_PERM
3.3 We may want to run several jobs on one node one day
And finally, drop bs->job pointer itself. Hurrah!
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We are going to remove bs->job pointer. Drop it's usage in
blockdev_mark_auto_del: instead of looking at bs->job let's check all
jobs for references to bs.
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We are going to remove bs->job pointer. Drop it's usage in replication
code. Additionally we have to return job pointer from some mirror APIs.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Most callers know which monitor type they want to have. Instead of
calling monitor_init() with flags that can describe both types of
monitors, make monitor_init_{hmp,qmp}() public interfaces that take
specific bools instead of flags and call these functions directly.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20190613153405.24769-15-kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Move the monitor core infrastructure from monitor/misc.c to
monitor/monitor.c. This is code that can be shared for all targets, so
compile it only once.
What remains in monitor/misc.c after this patch is mostly monitor
command implementations (which could move to hmp-cmds.c or qmp-cmds.c
later) and code that requires a system emulator or is even
target-dependent (including HMP command completion code).
The amount of function and particularly extern variables in
monitor_int.h is probably a bit larger than it needs to be, but this way
no non-trivial code modifications are needed. The interfaces between all
monitor parts can be cleaned up later.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20190613153405.24769-13-kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Superfluous #include dropped]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
The ReadLineState in Monitor is only used for HMP monitors. Create
MonitorHMP and move it there.
Can't use container_of() in hmp_change(). Cast instead, and mark
FIXME. Will be cleaned up shortly.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20190613153405.24769-5-kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Superfluous variable in monitor_data_destroy() eliminated, whitespace
tweaked in hmp_change(), commit message improved]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
monitor_fdset_dup_fd_find_remove() and monitor_fdset_dup_fd_find()
return mon_fdset->id which is int64_t. Downcasting from int64_t to int
leads to a bug with removing fd from fdset with id >= 2^32.
So, fix return types for these function.
Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190523094433.30297-1-yury-kotov@yandex-team.ru>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
The SSE-200 hardware has configurable integration settings which
determine whether its two CPUs have the FPU and DSP:
* CPU0_FPU (default 0)
* CPU0_DSP (default 0)
* CPU1_FPU (default 1)
* CPU1_DSP (default 1)
Similarly, the IoTKit has settings for its single CPU:
* CPU0_FPU (default 1)
* CPU0_DSP (default 1)
Of our four boards that use either the IoTKit or the SSE-200:
* mps2-an505, mps2-an521 and musca-a use the default settings
* musca-b1 enables FPU and DSP on both CPUs
Currently QEMU models all these boards using CPUs with
both FPU and DSP enabled. This means that we are incorrect
for mps2-an521 and musca-a, which should not have FPU or DSP
on CPU0.
Create QOM properties on the ARMSSE devices corresponding to the
default h/w integration settings, and make the Musca-B1 board
enable FPU and DSP on both CPUs. This fixes the mps2-an521
and musca-a behaviour, and leaves the musca-b1 and mps2-an505
behaviour unchanged.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20190517174046.11146-5-peter.maydell@linaro.org
Create "vfp" and "dsp" properties on the armv7m container object
which will be forwarded to its CPU object, so that SoCs can
configure whether the CPU has these features.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20190517174046.11146-4-peter.maydell@linaro.org
arm and i386 has almost the same function acpi_add_rom_blob(), except
giving different FWCfgCallback function.
This patch moves acpi_add_rom_blob() to utils.c by passing
FWCfgCallback to it.
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
v7:
* rebase on top of current master because of conflict
v6:
* change author from Igor to Michael
v5:
* remove unnecessary header glib/gprintf.h
* rearrange include header to make it more suitable
v4:
* extract -> moves
* adjust comment in source to make checkpatch happy
v3:
* put acpi_add_rom_blob() to hw/acpi/utils.c
v2:
* remove unused header in original source file
Message-Id: <20190610011830.28398-1-richardw.yang@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The code used to assign an interrupt index/subindex to an
eventfd is duplicated many times. Let's introduce an helper that
allows to set/unset the signaling for an ACTION_TRIGGER,
ACTION_MASK or ACTION_UNMASK action.
In the error message, we now use errno in case of any
VFIO_DEVICE_SET_IRQS ioctl failure.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Add new properties to allow setting the maximum display resolution.
Resolutions larger than that will not be included in the mode list.
In linux guests xrandr can be used to list modes.
Note: The existing xres and yres properties set the preferred display
resolution, i.e. the mode should be first in the mode list and guests
should use it by default.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20190607083429.31943-1-kraxel@redhat.com
This is ostensibly to avoid the weirdness of len looking like it might
come from a guest and sometimes being used. While we are at it fix up
the error checking for the arm-linux-user implementation of the API
which got flagged up by Coverity (CID 1401700).
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Next pull request against qemu-4.1. The big thing here is adding
support for hot plug of P2P bridges, and PCI devices under P2P bridges
on the "pseries" machine (which doesn't use SHPC). Other than that
there's just a handful of fixes and small enhancements.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEdfRlhq5hpmzETofcbDjKyiDZs5IFAl0AkgwACgkQbDjKyiDZ
s5Jyug//cwxP+t1t2CNHtffKwiXFzuEKx9YSNE1V0wog6aB40EbPKU72FzCq6FfA
lev+pZWV9AwVMzFYe4VM/7Lqh7WFMYDT3DOXaZwfANs4471vYtgvPi21L2TBj80d
hMszlyLWMLY9ByOzCxIq3xnbivGpA94G2q9rKbwXdK4T/5i62Pe3SIfgG+gXiiwW
+YlHWCPX0I1cJz2bBs9ElXdl7ONWnn+7uDf7gNfWkTKuiUq6Ps7mxzy3GhJ1T7nz
OFKmQ5dKzLJsgOULSSun8kWpXBmnPffkM3+fCE07edrWZVor09fMCk4HvtfaRy2K
FFa2Kvzn/V/70TL+44dsSX4QcwdcHQztiaMO7UGPq9CMswx5L7gsNmfX6zvK1Nrb
1t7ORZKNJ72hMyvDPSMiGU2DpVjO3ZbBlSL4/xG8Qeal4An0kgkN5NcFlB/XEfnz
dsKu9XzuGSeD1bWz1Mgcf1x7lPDBoHIKLcX6notZ8epP/otu4ywNFvAkPu4fk8s0
4jQGajIT7328SmzpjXClsmiEskpKsEr7hQjPRhu0hFGrhVc+i9PjkmbDl0TYRAf6
N6k6gJQAi+StJde2rcua1iS7Ra+Tka6QRKy+EctLqfqOKPb2VmkZ6fswQ3nfRRlT
LgcTHt2iJcLeud2klVXs1e4pKXzXchkVyFL4ucvmyYG5VeimMzU=
=ERgu
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-4.1-20190612' into staging
ppc patch queue 2019-06-12
Next pull request against qemu-4.1. The big thing here is adding
support for hot plug of P2P bridges, and PCI devices under P2P bridges
on the "pseries" machine (which doesn't use SHPC). Other than that
there's just a handful of fixes and small enhancements.
# gpg: Signature made Wed 12 Jun 2019 06:47:56 BST
# gpg: using RSA key 75F46586AE61A66CC44E87DC6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" [full]
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" [full]
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" [full]
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>" [unknown]
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-4.1-20190612:
ppc/xive: Make XIVE generate the proper interrupt types
ppc/pnv: activate the "dumpdtb" option on the powernv machine
target/ppc: Use tcg_gen_gvec_bitsel
spapr: Allow hot plug/unplug of PCI bridges and devices under PCI bridges
spapr: Direct all PCI hotplug to host bridge, rather than P2P bridge
spapr: Don't use bus number for building DRC ids
spapr: Clean up DRC index construction
spapr: Clean up spapr_drc_populate_dt()
spapr: Clean up dt creation for PCI buses
spapr: Clean up device tree construction for PCI devices
spapr: Clean up device node name generation for PCI devices
target/ppc: Fix lxvw4x, lxvh8x and lxvb16x
spapr_pci: Improve error message
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Commit 58ea30f514 "Clean up header guards that don't match their file
name" messed up contrib/elf2dmp/qemu_elf.h and
tests/migration/migration-test.h.
It missed target/cris/opcode-cris.h and
tests/uefi-test-tools/UefiTestToolsPkg/Include/Guid/BiosTablesTest.h
due to the scripts/clean-header-guards.pl bug fixed in the previous
commit.
Commit a8b991b52d "Clean up ill-advised or unusual header guards"
missed include/hw/xen/io/ring.h for the same reason.
Commit 3979fca4b6 "disas: Rename include/disas/bfd.h back to
include/disas/dis-asm.h" neglected to update the guard symbol for the
rename.
Commit a331c6d774 "semihosting: implement a semihosting console"
created include/hw/semihosting/console.h with an ill-advised guard
symbol.
Clean them up.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190604181618.19980-4-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
This is the common header guard idiom:
/*
* File comment
*/
#ifndef GUARD_SYMBOL_H
#define GUARD_SYMBOL_H
... actual contents ...
#endif
A few of our headers have some #include before the guard.
target/tilegx/spr_def_64.h has #ifndef __DOXYGEN__ outside the guard.
A few more have the #define elsewhere.
Change them to match the common idiom. For spr_def_64.h, that means
dropping #ifndef __DOXYGEN__. While there, rename guard symbols to
make scripts/clean-header-guards.pl happy.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190604181618.19980-2-armbru@redhat.com>
[Rebased with conflicts resolved automatically]
No header includes qemu-common.h after this commit, as prescribed by
qemu-common.h's file comment.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190523143508.25387-5-armbru@redhat.com>
[Rebased with conflicts resolved automatically, except for
include/hw/arm/xlnx-zynqmp.h hw/arm/nrf51_soc.c hw/arm/msf2-soc.c
block/qcow2-refcount.c block/qcow2-cluster.c block/qcow2-cache.c
target/arm/cpu.h target/lm32/cpu.h target/m68k/cpu.h target/mips/cpu.h
target/moxie/cpu.h target/nios2/cpu.h target/openrisc/cpu.h
target/riscv/cpu.h target/tilegx/cpu.h target/tricore/cpu.h
target/unicore32/cpu.h target/xtensa/cpu.h; bsd-user/main.c and
net/tap-bsd.c fixed up]
It should be generic Hypervisor Virtualization interrupts for HV
directed rings and traditional External Interrupts for the OS directed
ring.
Don't generate anything for the user ring as it isn't actually
supported.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190606174409.12502-1-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This makes some minor cleanups to spapr_drc_populate_dt(), renaming it to
the shorter and more idiomatic spapr_dt_drc() along the way.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Device nodes for PCI bridges (both host and P2P) describe both the bridge
device itself and the bus hanging off it, handling of this is a bit of a
mess.
spapr_dt_pci_device() has a few things it only adds for non-bridges, but
always adds #address-cells and #size-cells which should only appear for
bridges. But the walking down the subordinate PCI bus is done in one of
its callers spapr_populate_pci_devices_dt(). The PHB dt creation in
spapr_populate_pci_dt() open codes some similar logic to the bridge case.
This patch consolidates things in a bunch of ways:
* Bus specific dt info is now created in spapr_dt_pci_bus() used for both
P2P bridges and the host bridge. This includes walking subordinate
devices
* spapr_dt_pci_device() now calls spapr_dt_pci_bus() when called on a
P2P bridge
* We do detection of bridges with the is_bridge field of the device class,
rather than checking PCI config space directly, for consistency with
qemu's core PCI code.
* Several things are renamed for brevity and clarity
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190523143508.25387-3-armbru@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Other accelerators have their own headers: sysemu/hax.h, sysemu/hvf.h,
sysemu/kvm.h, sysemu/whpx.h. Only tcg_enabled() & friends sit in
qemu-common.h. This necessitates inclusion of qemu-common.h into
headers, which is against the rules spelled out in qemu-common.h's
file comment.
Move tcg_enabled() & friends into their own header sysemu/tcg.h, and
adjust #include directives.
Cc: Richard Henderson <rth@twiddle.net>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190523143508.25387-2-armbru@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
[Rebased with conflicts resolved automatically, except for
accel/tcg/tcg-all.c]
This macro is now always empty, so remove it. This leaves the
entire contents of CPUArchState under the control of the guest
architecture.
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We have for some time had code within the tcg backends to
handle large positive offsets from env. This move makes
sure that need not happen. Indeed, we are able to assert
at build time that simple offsets suffice for all hosts.
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Amusingly, we had already ignored the comment to keep this value
at the end of CPUState. This restores the minimum negative offset
from TCG_AREG0 for code generation.
For the couple of uses within qom/cpu.c, without NEED_CPU_H, add
a pointer from the CPUState object to the IcountDecr object within
CPUNegativeOffsetState.
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Nothing in there so far, but all of the plumbing done
within the target ArchCPU state.
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Consolidate some boilerplate from foo_cpu_initfn.
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Now that we have ArchCPU, we can define this generically,
in the one place that needs it.
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This will replace foo_env_get_cpu with a generic definition.
No changes to the target specific code so far.
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Now that we have both ArchCPU and CPUArchState, we can define
this generically instead of via macro in each target's cpu.h.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Move all softmmu tlb data into this structure. Arrange the
members so that we are able to place mask+table together and
at a smaller absolute offset from ENV.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
For all targets, into this new file move TARGET_LONG_BITS,
TARGET_PAGE_BITS, TARGET_PHYS_ADDR_SPACE_BITS,
TARGET_VIRT_ADDR_SPACE_BITS, and NB_MMU_MODES.
Include this new file from exec/cpu-defs.h.
This now removes the somewhat odd requirement that target/arch/cpu.h
defines TARGET_LONG_BITS before including exec/cpu-defs.h, so push the
bulk of the includes within target/arch/cpu.h to the top.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Both structures are allocated once per mmu_idx.
There is no reason for them to be separate.
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The AccelType type was converted to AccelClass QOM
object on b14a0b7469, and the original data type had
a field to store the option name which in turn was
used to search an accelerator. The lookup method
(accel_find) changed too, making the option field
unnecessary but it became AccelClass::opt_name despite
that. Therefore, and given that none accelerator
implementation sets AccelClass::opt_name, let's
remove this attribute.
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Message-Id: <20190531165334.20403-2-wainersm@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Let's add all HWCAPs that we can support under TCG right now, when the
respective CPU facilities are enabled.
Cc: Riku Voipio <riku.voipio@iki.fi>
Cc: Laurent Vivier <laurent@vivier.eu>
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: Laurent Vivier <laurent@vivier.eu>
Cc: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Add modifier parameter to egl_get_fd_for_texture(), to return the used
modifier on dmabuf exports.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20190529072144.26737-4-kraxel@redhat.com
dmabufs can have a format modifier (DRM_FORMAT_MOD_*) which is used for
tiled layouts for example. Add a field to QemuDmaBuf so we can carry
around that information.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20190529072144.26737-2-kraxel@redhat.com
stricter rules for acpi tables: we now fail
on any difference that isn't whitelisted.
vhost-scsi migration.
some cleanups all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJc+B4YAAoJECgfDbjSjVRpq1EIAJR7tCxcpu9GggVlinmUA8G4
tmSAe06IryH7+nF3RsnINuGu7ius9qC2/E2y0uJUHhTqiU/RWOfWZ7PPM0EcYZaA
TLPaCe2NUF6/8afeqmvE9Usk7VspI5TDZRms+bonmZz2xP1lHIMN0qW4s7HHLWr8
sZKDtCJ+9cYII93VQwtlR0qiHgv5f0kzcuZeJaZHsAHH6XZGqRuQjI6txcFa4o53
lkdLCEwTnRuwu2wyL84eL5p+E8SzOgR/x1QI+nffrJfsvnmiT7lnOrkjnQlWAp5G
xqwqsUrUxUCuQ+zitwJqmv+H6nx79MwAM7fTHAETCWX703N5o9tZxAnHHqLoa8I=
=cQNg
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
virtio, pci, pc: cleanups, features
stricter rules for acpi tables: we now fail
on any difference that isn't whitelisted.
vhost-scsi migration.
some cleanups all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Wed 05 Jun 2019 20:55:04 BST
# gpg: using RSA key 281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream:
bios-tables-test: ignore identical binaries
tests: acpi: add simple arm/virt testcase
tests: add expected ACPI tables for arm/virt board
bios-tables-test: list all tables that differ
vhost-scsi: Allow user to enable migration
vhost-scsi: Add VMState descriptor
vhost-scsi: The vhost backend should be stopped when the VM is not running
bios-tables-test: add diff allowed list
vhost: fix memory leak in vhost_user_scsi_realize
vhost: fix incorrect print type
vhost: remove the dead code
docs: smbios: remove family=x from type2 entry description
pci: Fold pci_get_bus_devfn() into its sole caller
pci: Make is_bridge a bool
pcie: Simplify pci_adjust_config_limit()
acpi: pci: use build_append_foo() API to construct MCFG
hw/acpi: Consolidate build_mcfg to pci.c
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This fixes at least one overflow in qcow2_process_discards, which
passes 64bit region length to bdrv_pdiscard where bytes (or sectors in
the past) parameter is int since its introduction in 0b919fae.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
All callers of bdrv_set_aio_context() are eliminated now, they have
moved to bdrv_try_set_aio_context() and related safe functions. Remove
bdrv_set_aio_context().
With this, we can now know that the .set_aio_ctx callback must be
present in bdrv_set_aio_context_ignore() because
bdrv_can_set_aio_context() would have returned false previously, so
instead of checking the condition, we can assert it.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
So far, we only made sure that updating the AioContext of a node
affected the whole subtree. However, if a node is newly attached to a
new parent, we also need to make sure that both the subtree of the node
and the parent are in the same AioContext. This tries to move the new
child node to the parent AioContext and returns an error if this isn't
possible.
BlockBackends now actually apply their AioContext to their root node.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This makes use of qdev_prop_drive_iothread for scsi-disk so that the
disk can be attached to a node that is already in the target AioContext.
We need to check that the HBA actually supports iothreads, otherwise
scsi-disk must make sure that the node is already in the main
AioContext.
This changes the error message for conflicting iothread settings.
Previously, virtio-scsi produced the error message, now it comes from
blk_set_aio_context(). Update a test case accordingly.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Some qdev block devices have support for iothreads and take care of the
AioContext they are running in, but most devices don't know about any of
this. For the latter category, the qdev drive property must make sure
that their BlockBackend is in the main AioContext.
Unfortunately, while the current code just does the same thing for
devices that do support iothreads, this is not correct and it would show
as soon as we actually try to keep a consistent AioContext assignment
across all nodes and users of a block graph subtree: If a node is
already in a non-default AioContext because of one of its users,
attaching a new device should still be possible if that device can work
in the same AioContext. Switching the node back to the main context
first and only then into the device AioContext causes failure (because
the existing user wouldn't allow the switch to the main context).
So devices that support iothreads need a different kind of drive
property that leaves the node in its current AioContext, but by using
this type, the device promises to check later that it can work with this
context.
This patch adds the qdev infrastructure that allows devices to signal
that they handle iothreads and qdev should leave the AioContext alone.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This adds a new parameter to blk_new() which requires its callers to
declare from which AioContext this BlockBackend is going to be used (or
the locks of which AioContext need to be taken anyway).
The given context is only stored and kept up to date when changing
AioContexts. Actually applying the stored AioContext to the root node
is saved for another commit.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add an Error parameter to blk_set_aio_context() and use
bdrv_child_try_set_aio_context() internally to check whether all
involved nodes can actually support the AioContext switch.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kenneth Heitke <kenneth.heitke@intel.com>
Reviewed-by: Klaus Birkelund Jensen <klaus.jensen@cnexlabs.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Callback-based laio_submit() and laio_cancel() were left after
rewriting Linux AIO backend to coroutines in hope that they would be
used in other code that could bypass coroutines. They can be safely
removed because they have not been used since that time.
Signed-off-by: Julia Suvorova <jusual@mail.ru>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
drv_co_block_status digs bs->file for additional, more accurate search
for hole inside region, reported as DATA by bs since 5daa74a6eb.
This accuracy is not free: assume we have qcow2 disk. Actually, qcow2
knows, where are holes and where is data. But every block_status
request calls lseek additionally. Assume a big disk, full of
data, in any iterative copying block job (or img convert) we'll call
lseek(HOLE) on every iteration, and each of these lseeks will have to
iterate through all metadata up to the end of file. It's obviously
ineffective behavior. And for many scenarios we don't need this lseek
at all.
However, lseek is needed when we have metadata-preallocated image.
So, let's detect metadata-preallocation case and don't dig qcow2's
protocol file in other cases.
The idea is to compare allocation size in POV of filesystem with
allocations size in POV of Qcow2 (by refcounts). If allocation in fs is
significantly lower, consider it as metadata-preallocation case.
102 iotest changed, as our detector can't detect shrinked file as
metadata-preallocation, which don't seem to be wrong, as with metadata
preallocation we always have valid file length.
Two other iotests have a slight change in their QMP output sequence:
Active 'block-commit' returns earlier because the job coroutine yields
earlier on a blocking operation. This operation is loading the refcount
blocks in qcow2_detect_metadata_preallocation().
Suggested-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Commit b2fc91db84 ("q35: set split kernel irqchip as default") changed
the default for the pc-q35-4.0 machine type to use split irqchip, which
turned out to have disasterous effects on vfio-pci INTx support. KVM
resampling irqfds are registered for handling these interrupts, but
these are non-functional in split irqchip mode. We can't simply test
for split irqchip in QEMU as userspace handling of this interrupt is a
significant performance regression versus KVM handling (GeForce GPUs
assigned to Windows VMs are non-functional without forcing MSI mode or
re-enabling kernel irqchip).
The resolution is to revert the change in default irqchip mode in the
pc-q35-4.1 machine and create a pc-q35-4.0.1 machine for the 4.0-stable
branch. The qemu-q35-4.0 machine type should not be used in vfio-pci
configurations for devices requiring legacy INTx support without
explicitly modifying the VM configuration to use kernel irqchip.
Link: https://bugs.launchpad.net/qemu/+bug/1826422
Fixes: b2fc91db84 ("q35: set split kernel irqchip as default")
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <155786484688.13873.6037015630912983760.stgit@gimli.home>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It's never used anywhere.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20190520030839.6795-5-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In order to perform a valid migration of a vhost-scsi device,
the following requirements must be met:
(1) The virtio-scsi device state needs to be saved & loaded.
(2) The vhost backend must be stopped before virtio-scsi device state
is saved:
(2.1) Sync vhost backend state to virtio-scsi device state.
(2.2) No further I/O requests are made by vhost backend to target
SCSI device.
(2.3) No further guest memory access takes place after VM is stopped.
(3) Requests in-flight to target SCSI device are completed before
migration handover.
(4) Target SCSI device state needs to be saved & loaded into the
destination host target SCSI device.
Previous commit ("vhost-scsi: Add VMState descriptor")
add support to save & load the device state using VMState.
This meets requirement (1).
When VM is stopped by migration thread (On Pre-Copy complete), the
following code path is executed:
migration_completion() -> vm_stop_force_state() -> vm_stop() ->
do_vm_stop().
do_vm_stop() calls first pause_all_vcpus() which pause all guest
vCPUs and then call vm_state_notify().
In case of vhost-scsi device, this will lead to the following code path
to be executed:
vm_state_notify() -> virtio_vmstate_change() ->
virtio_set_status() -> vhost_scsi_set_status() -> vhost_scsi_stop().
vhost_scsi_stop() then calls vhost_scsi_clear_endpoint() and
vhost_scsi_common_stop().
vhost_scsi_clear_endpoint() sends VHOST_SCSI_CLEAR_ENDPOINT ioctl to
vhost backend which will reach kernel's vhost_scsi_clear_endpoint()
which process all pending I/O requests and wait for them to complete
(vhost_scsi_flush()). This meets requirement (3).
vhost_scsi_common_stop() will stop the vhost backend.
As part of this stop, dirty-bitmap is synced and vhost backend state is
synced with virtio-scsi device state. As at this point guest vCPUs are
already paused, this meets requirement (2).
At this point we are left with requirement (4) which is target SCSI
device specific and therefore cannot be done by QEMU. Which is the main
reason why vhost-scsi adds a migration blocker.
However, as this can be handled either by an external orchestrator or
by using shared-storage (i.e. iSCSI), there is no reason to limit the
orchestrator from being able to explictly specify it wish to enable
migration even when VM have a vhost-scsi device.
Considering all the above, this commit allows orchestrator to explictly
specify that it is responsbile for taking care of requirement (4) and
therefore vhost-scsi should not add a migration blocker.
Reviewed-by: Nir Weiner <nir.weiner@oracle.com>
Reviewed-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Message-Id: <20190416125912.44001-4-liran.alon@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Next pull request against qemu-4.1. Highlights:
* KVM accelerated support for the XIVE interrupt controller in PAPR
guests
* A number of TCG vector fixes
* Fixes for the PReP / 40p machine
* Improvements to make check-tcg test coverage
Other than that it's just a bunch of assorted fixes, cleanups and
minor improvements.
This supersedes both the pull request dated 2019-05-21 and the one
dated 2019-05-22. I've dropped one hunk which I think may have caused
the check-tcg failure that Peter saw (by enabling the ppc64abi32
build, which I think has been broken for ages). I'm not entirely
certain, since I haven't reproduced exactly the same failure.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEdfRlhq5hpmzETofcbDjKyiDZs5IFAlzuK2AACgkQbDjKyiDZ
s5LFfxAAuvBI2d5gYDSDiniJPMrEzM8ANynf8fYMGSySRNpeKz5PnMhNQieoxaEt
pS9qJnnaCDrpV09jJo6QWStUaqFqnLPOYdWvRsnb7mx0yXe5eWUyYPp0TRAqKj8S
Ainv9ma8WfhVphsH3E01KR6evdC6BDC0F2afDToFGMKcDKXafmnSOEV9ZtFAzFXO
xqh/Az+Y2ATwDmt92uSq7JBS5YRUvhYQORoKslxnrJswKkN+Uwi5+a2FzOHk3Jwe
BlV6soEAVqb9ItFtgwcArclryCMMVxrqzs2VTWOYbhznFX0X1xUNeSQ8H+7F+IVy
Xu1e2fnwufvilvWSsjtYvdYnnCbNvwgWjYfZNMrQ2hmSDtCQnRKyVIYwiU08Qj2y
LmVlQzWN3WYHIRBTACLMDf5VHa9P01QZeJEoVIV6i4m4PCxbSmlzI62eRKNhW917
2d3h8dGIxSDm9/WpXefKMMrt2P7fAqkiz5ZUZIjkspcHaPPmk7qQp0ngFjeEuyFk
tJMd87hgemm9gg+mcF9XQ8yZGkR3oTq7nwDGwZHrp8S0GyRvNwhTbT2iKzAG2cxe
kfWRFswxn1zYPShqkcj3rwNsg8LnC3b22Og/obHYVjQ8ONx4ZB0q8xJSkUpvsQf5
HEUHLHtstBmrInFMf+2KbViUIpobmn4woojjNsqZ32W7OZv6Yk4=
=2q3B
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-4.1-20190529' into staging
ppc patch queue 2019-05-29
Next pull request against qemu-4.1. Highlights:
* KVM accelerated support for the XIVE interrupt controller in PAPR
guests
* A number of TCG vector fixes
* Fixes for the PReP / 40p machine
* Improvements to make check-tcg test coverage
Other than that it's just a bunch of assorted fixes, cleanups and
minor improvements.
This supersedes both the pull request dated 2019-05-21 and the one
dated 2019-05-22. I've dropped one hunk which I think may have caused
the check-tcg failure that Peter saw (by enabling the ppc64abi32
build, which I think has been broken for ages). I'm not entirely
certain, since I haven't reproduced exactly the same failure.
# gpg: Signature made Wed 29 May 2019 07:49:04 BST
# gpg: using RSA key 75F46586AE61A66CC44E87DC6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" [full]
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" [full]
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" [full]
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>" [unknown]
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-4.1-20190529: (44 commits)
ppc/pnv: add dummy XSCOM registers for PRD initialization
ppc/pnv: introduce new skiboot platform properties
spapr: Don't migrate the hpt_maxpagesize cap to older machine types
spapr: change default interrupt mode to 'dual'
spapr/xive: fix multiple resets when using the 'dual' interrupt mode
docs: provide documentation on the POWER9 XIVE interrupt controller
spapr/irq: add KVM support to the 'dual' machine
ppc/xics: fix irq priority in ics_set_irq_type()
spapr/irq: initialize the IRQ device only once
spapr/irq: introduce a spapr_irq_init_device() helper
spapr: check for the activation of the KVM IRQ device
spapr: introduce routines to delete the KVM IRQ device
sysbus: add a sysbus_mmio_unmap() helper
spapr/xive: activate KVM support
spapr/xive: add migration support for KVM
spapr/xive: introduce a VM state change handler
spapr/xive: add state synchronization with KVM
spapr/xive: add hcall support when under KVM
spapr/xive: add KVM support
spapr: Print out extra hints when CAS negotiation of interrupt mode fails
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The is_bridge field in PCIDevice acts as a bool, but is declared as an int.
Declare it as a bool for clarity, and change everything that writes it to
use true/false instead of 0/1 to match.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <20190513061939.3464-5-david@gibson.dropbear.id.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Since c2077e2c "pci: Adjust PCI config limit based on bus topology",
pci_adjust_config_limit() has been used in the config space read and write
paths to only permit access to extended config space on buses which permit
it. Specifically it prevents access on devices below a vanilla-PCI bus via
some combination of bridges, even if both the host bridge and the device
itself are PCI-E.
It accomplishes this with a somewhat complex call up the chain of bridges
to see if any of them prohibit extended config space access. This is
overly complex, since we can always know if the bus will support such
access at the point it is constructed.
This patch simplifies the test by using a flag in the PCIBus instance
indicating whether extended configuration space is accessible. It is
false for vanilla PCI buses. For PCI-E buses, it is true for root
buses and equal to the parent bus's's capability otherwise.
For the special case of sPAPR's paravirtualized PCI root bus, which
acts mostly like vanilla PCI, but does allow extended config space
access, we override the default value of the flag from the host bridge
code.
This should cause no behavioural change.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <20190513061939.3464-4-david@gibson.dropbear.id.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
build_append_foo() API doesn't need explicit endianness conversions
which eliminates a source of errors and it makes build_mcfg() look like
declarative definition of MCFG table in ACPI spec, which makes it easy
to review.
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
v3:
* add some comment on the Configuration Space base address allocation
structure
v2:
* miss the reserved[8] of MCFG in last version, add it back
* drop SOBs and make sure bios-tables-test all OK
Message-Id: <20190521062836.6541-3-richardw.yang@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Now we have two identical build_mcfg functions.
Consolidate them in acpi/pci.c.
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
v4:
* ACPI_PCI depends on both ACPI and PCI
* rebase on latest master, adjust arm Kconfig
v3:
* adjust changelog based on Igor's suggestion
Message-Id: <20190521062836.6541-2-richardw.yang@linux.intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add new virtio-gpu devices with a "vhost-user" property. The
associated vhost-user backend is used to handle the virtio rings and
provide rendering results thanks to the vhost-user-gpu protocol.
Example usage:
-object vhost-user-backend,id=vug,cmd="./vhost-user-gpu"
-device vhost-user-vga,vhost-user=vug
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20190524130946.31736-10-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Add a base class that is common to virtio-gpu and vhost-user-gpu
devices.
The VirtIOGPUBase base class provides common functionalities necessary
for both virtio-gpu and vhost-user-gpu:
- common configuration (max-outputs, initial resolution, flags)
- virtio device initialization, including queue setup
- device pre-conditions checks (iommu)
- migration blocker
- virtio device callbacks
- hooking up to qemu display subsystem
- a few common helper functions to reset the device, retrieve display
informations
- a class callback to unblock the rendering (for GL updates)
What is left to the virtio-gpu subdevice to take care of, in short,
are all the virtio queues handling, command processing and migration.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20190524130946.31736-8-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
This will allow to share the format conversion function with
vhost-user-gpu.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20190524130946.31736-4-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The helper functions are useful to build the vhost-user-gpu backend.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20190524130946.31736-3-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>