object_property_set_description() and
object_class_property_set_description() fail only when property @name
is not found.
There are 85 calls of object_property_set_description() and
object_class_property_set_description(). None of them can fail:
* 84 immediately follow the creation of the property.
* The one in spapr_rng_instance_init() refers to a property created in
spapr_rng_class_init(), from spapr_rng_properties[].
Every one of them still gets to decide what to pass for @errp.
51 calls pass &error_abort, 32 calls pass NULL, one receives the error
and propagates it to &error_abort, and one propagates it to
&error_fatal. I'm actually surprised none of them violates the Error
API.
What are we gaining by letting callers handle the "property not found"
error? Use when the property is not known to exist is simpler: you
don't have to guard the call with a check. We haven't found such a
use in 5+ years. Until we do, let's make life a bit simpler and drop
the @errp parameter.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20200505152926.18877-8-armbru@redhat.com>
[One semantic rebase conflict resolved]
qom/object.c provides object_property_get_TYPE() and
object_property_set_TYPE() for a number of common types. These are
all convenience wrappers around object_property_get_qobject() and
object_property_set_qobject().
Except for object_property_get_uint16List(), which is unusual in two ways:
* It bypasses object_property_get_qobject(). Fixable; the previous
commit did it for object_property_get_enum())
* It stores the value through a parameter. Its contract claims it
returns the value, like the other functions do. Also fixable.
Fixing is not worthwhile, though: object_property_get_uint16List() has
seen exactly one user in six years.
Convert the lone user to do its job with the generic
object_property_get_qobject(), and drop object_property_get_uint16List().
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20200505152926.18877-6-armbru@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
[Commit message typo fixed]
Some stream clients stream an endless stream of data while
other clients stream data in packets. Stream interfaces
usually have a way to signal the end of a packet or the
last beat of a transfer.
This adds an end-of-packet flag to the push interface.
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-Id: <20200506082513.18751-6-edgar.iglesias@gmail.com>
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEuBi5yt+QicLVzsZrda1lgCoLQhEFAl6yxrEACgkQda1lgCoL
QhELMQf8CNGtB1xmiVL9GY3RJRkcWGTmQ5/wPJ+Fuf6A2fWP9bYWBemJQcqVN+xR
1fBXT2jcZvOy0N0CpioY2tin2oSWXCvVGcoaZbBpiDwGqzLrXzNheOKW9A3530zN
LAc8plTGpL90b2P/lkYSdUlBS3XK30wVdaxDtpFrhT43miJlRL16fTRDOjWPmNHi
+tbI/hGybI2wSjzotzB+g3cP54SD1eZJmXR0498vAJiO5OtpVdC/NGr4Ma+BT+If
Obf46XmgfCRLsqDJQbLNM6vVCN+MuE12BGVDSA2OX7oD6SBGQsb53asuqJ8TutWR
CsBs50WDMQFPgnOvMhHepFP/HzldaQ==
=VnhL
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/stefanberger/tags/pull-tpm-2020-05-06-1' into staging
Merge tpm 2020/05/06 v1
# gpg: Signature made Wed 06 May 2020 15:16:17 BST
# gpg: using RSA key B818B9CADF9089C2D5CEC66B75AD65802A0B4211
# gpg: Good signature from "Stefan Berger <stefanb@linux.vnet.ibm.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B818 B9CA DF90 89C2 D5CE C66B 75AD 6580 2A0B 4211
* remotes/stefanberger/tags/pull-tpm-2020-05-06-1:
hw: add compat machines for 5.1
hw/arm/virt: Remove the compat forcing tpm-tis-device PPI to off
tpm: tpm-tis-device: set PPI to false by default
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
There was no support for 8 bits block registers. Changed
register_init_block32 to be generic and static, adding register
size in bits as parameter. Created one helper for each size.
Signed-off-by: Joaquin de Andres <me@xcancerberox.com.ar>
Message-Id: <20200402162839.76636-1-me@xcancerberox.com.ar>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Introduce a function and macro helpers to setup several clocks
in a device from a static array description.
An element of the array describes the clock (name and direction) as
well as the related callback and an optional offset to store the
created object pointer in the device state structure.
The array must be terminated by a special element QDEV_CLOCK_END.
This is based on the original work of Frederic Konrad.
Signed-off-by: Damien Hedde <damien.hedde@greensocs.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 20200406135251.157596-5-damien.hedde@greensocs.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add functions to easily handle clocks with devices.
Clock inputs and outputs should be used to handle clock propagation
between devices.
The API is very similar the GPIO API.
This is based on the original work of Frederic Konrad.
Signed-off-by: Damien Hedde <damien.hedde@greensocs.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20200406135251.157596-4-damien.hedde@greensocs.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Damien Hedde <damien.hedde@greensocs.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 20200406135251.157596-3-damien.hedde@greensocs.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This object may be used to represent a clock inside a clock tree.
A clock may be connected to another clock so that it receives update,
through a callback, whenever the source/parent clock is updated.
Although only the root clock of a clock tree controls the values
(represented as periods) of all clocks in tree, each clock holds
a local state containing the current value so that it can be fetched
independently. It will allows us to fullfill migration requirements
by migrating each clock independently of others.
This is based on the original work of Frederic Konrad.
Signed-off-by: Damien Hedde <damien.hedde@greensocs.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 20200406135251.157596-2-damien.hedde@greensocs.com
[PMM: Use uint64_t rather than unsigned long long in trace events;
the dtrace backend can't handle the latter]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The previous few commits have made this more obvious, and removed the
one exception. Time to clarify the documentation, and drop dead error
checking.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200424084338.26803-13-armbru@redhat.com>
Any sub-page size update to ACPI MRs will be lost during
migration, as we use aligned size in ram_load_precopy() ->
qemu_ram_resize() path. This will result in inconsistency in
FWCfgEntry sizes between source and destination. In order to avoid
this, save and restore them separately during migration.
Up until now, this problem may not be that relevant for x86 as both
ACPI table and Linker MRs gets padded and aligned. Also at present,
qemu_ram_resize() doesn't invoke callback to update FWCfgEntry for
unaligned size changes. But since we are going to fix the
qemu_ram_resize() in the subsequent patch, the issue may become
more serious especially for RSDP MR case.
Moreover, the issue will soon become prominent in arm/virt as well
where the MRs are not padded or aligned at all and eventually have
acpi table changes as part of future additions like NVDIMM hot-add
feature.
Suggested-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Acked-by: David Hildenbrand <david@redhat.com>
Message-Id: <20200403101827.30664-3-shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The https://makecode.microbit.org/#editor generates slightly weird
.hex files which work fine on a real microbit but causes QEMU to
choke. The reason is extraneous data after the EOF record which causes
the loader to attempt to write a bigger file than it should to the
"rom". According to the HEX file spec an EOF really should be the last
thing we process so lets do that.
Reported-by: Ursula Bennée <alex.bennee@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200403191150.863-12-alex.bennee@linaro.org>
Commit bb15791166 ("compat: disable edid on virtio-gpu base
device") tried to disable 'edid' on the virtio-gpu base device.
However, that device is not 'virtio-gpu', but 'virtio-gpu-device'.
Fix it.
Fixes: bb15791166 ("compat: disable edid on virtio-gpu base device")
Reported-by: Lukáš Doktor <ldoktor@redhat.com>
Tested-by: Lukáš Doktor <ldoktor@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Message-id: 20200318093919.24942-1-cohuck@redhat.com
Cc: qemu-stable@nongnu.org
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Bug fixes:
* memory encryption: Disable mem merge
(Dr. David Alan Gilbert)
Features:
* New EPYC CPU definitions (Babu Moger)
* Denventon-v2 CPU model (Tao Xu)
* New 'note' field on versioned CPU models (Tao Xu)
Cleanups:
* x86 CPU topology cleanups (Babu Moger)
* cpu: Use DeviceClass reset instead of a special CPUClass reset
(Peter Maydell)
-----BEGIN PGP SIGNATURE-----
iQJIBAABCAAyFiEEWjIv1avE09usz9GqKAeTb5hNxaYFAl5xdnsUHGVoYWJrb3N0
QHJlZGhhdC5jb20ACgkQKAeTb5hNxaYkGA/9Fn1tCdW/74CEREPbcKNOf8twmCr2
L4qykix7mFcZXstFhEQuoNJQMz8mEPJngOfUSQY1c9w4psf0AXE6q3wbdNcxxdj1
1/+cPbaRuoF8EKw63MgR3AaReuWtAV+sGS4+eKBMJTMUbl03pOYARE+irCWJU6rd
YdP0t6CX0NWF4afv+2wMeeZVr+IcKEo81jCCCSjmM0YLkwvu0Vs5ng3jE7vtFKPj
MQHMyqD/lz0FwyksBiOLwjOCbnmIydWc/8VV68UH5ulxka96jk8CwmI0+A9v2UMQ
4PjQ84UeQclJTbec+h/Qy8DoCP3qiqijFMRau2wo1UWCsAjMcaRIJjIe5CSOJFRu
3FrP2FEJCZiWjh11b/x3jIyjK6MDjv3Y1oky1j5VkCnFUNLHbXUA2KY3jaZ/pf+1
BDqa6lNDYJBN+FQQt0yXDWAdGLUxxP87S9jmU9RULzwAwCic0FxVR/a5zk9EUDi0
mA+WL0ekfhIEVACdHYuCTxujGq8QnGiCppr1Wgx3t+GgveR8AjXdd/KclcKskYiw
ozbujtBPQUImuq3xi6FTkRHXuEW+zc+IFbhZ3Zq5OhmJmpdgmSHryFcKAdvNJH/z
VllKAsLg1hffm+PjlpuZLBucC4PBrvHbS7htHhMaemEiJHO9V5EfGDWQdELNRM8p
sKymFNs5XjzQcGE=
=9fEL
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/ehabkost/tags/x86-and-machine-pull-request' into staging
x86 and machine queue for 5.0 soft freeze
Bug fixes:
* memory encryption: Disable mem merge
(Dr. David Alan Gilbert)
Features:
* New EPYC CPU definitions (Babu Moger)
* Denventon-v2 CPU model (Tao Xu)
* New 'note' field on versioned CPU models (Tao Xu)
Cleanups:
* x86 CPU topology cleanups (Babu Moger)
* cpu: Use DeviceClass reset instead of a special CPUClass reset
(Peter Maydell)
# gpg: Signature made Wed 18 Mar 2020 01:16:43 GMT
# gpg: using RSA key 5A322FD5ABC4D3DBACCFD1AA2807936F984DC5A6
# gpg: issuer "ehabkost@redhat.com"
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>" [full]
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6
* remotes/ehabkost/tags/x86-and-machine-pull-request:
hw/i386: Rename apicid_from_topo_ids to x86_apicid_from_topo_ids
hw/i386: Update structures to save the number of nodes per package
hw/i386: Remove unnecessary initialization in x86_cpu_new
machine: Add SMP Sockets in CpuTopology
hw/i386: Consolidate topology functions
hw/i386: Introduce X86CPUTopoInfo to contain topology info
cpu: Use DeviceClass reset instead of a special CPUClass reset
machine/memory encryption: Disable mem merge
hw/i386: Rename X86CPUTopoInfo structure to X86CPUTopoIDs
i386: Add 2nd Generation AMD EPYC processors
i386: Add missing cpu feature bits in EPYC model
target/i386: Add new property note to versioned CPU models
target/i386: Add Denverton-v2 (no MPX) CPU model
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
- docker updates for VirGL
- re-factor gdbstub for static GDBState
- re-factor gdbstub for dynamic arrays
- add SVE support to arm gdbstub
- add some guest debug tests to check-tcg
- add aarch64 userspace register tests
- remove packet size limit to gdbstub
- simplify gdbstub monitor code
- report vContSupported in gdbstub to use proper single-step
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAl5xDUIACgkQ+9DbCVqe
KkQwCwf/YtmUsNxxO+CgNctq2u3jV4FoOdQP3bejvmT2+cigKJhQuBlWPg1/YsqF
RDNkmBQx2JaVVMuVmpnwVK1UD+kmYZqrtlOkPNcVrjPmLCq3BVI1LHe6Rjoerx8F
QoZyH0IMNHbBgDo1I46lSFOWcxmOvo+Ow7NX5bPKwlRzf0dyEqSJahRaZLAgUscR
taTtGfk9uQsnxoRsvH/efiQ4bZtUvrEQuhEX3WW/yVE1jTpcb2llwX4xONJb2It3
/0WREGEEIT8PpnWw2S3FH4THY/BjWgz/FPDwNNZYCKBMWDjuG/8KHryd738T9rzo
lkGP9YcXmiyxMMyFFwS8RD3SHr8LvQ==
=Wm+a
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/stsquad/tags/pull-testing-and-gdbstub-170320-1' into staging
Testing and gdbstub updates:
- docker updates for VirGL
- re-factor gdbstub for static GDBState
- re-factor gdbstub for dynamic arrays
- add SVE support to arm gdbstub
- add some guest debug tests to check-tcg
- add aarch64 userspace register tests
- remove packet size limit to gdbstub
- simplify gdbstub monitor code
- report vContSupported in gdbstub to use proper single-step
# gpg: Signature made Tue 17 Mar 2020 17:47:46 GMT
# gpg: using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [full]
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44
* remotes/stsquad/tags/pull-testing-and-gdbstub-170320-1: (28 commits)
gdbstub: Fix single-step issue by confirming 'vContSupported+' feature to gdb
gdbstub: do not split gdb_monitor_write payload
gdbstub: change GDBState.last_packet to GByteArray
tests/tcg/aarch64: add test-sve-ioctl guest-debug test
tests/tcg/aarch64: add SVE iotcl test
tests/tcg/aarch64: add a gdbstub testcase for SVE registers
tests/guest-debug: add a simple test runner
configure: allow user to specify what gdb to use
tests/tcg/aarch64: userspace system register test
target/arm: don't bother with id_aa64pfr0_read for USER_ONLY
target/arm: generate xml description of our SVE registers
target/arm: default SVE length to 64 bytes for linux-user
target/arm: explicitly encode regnum in our XML
target/arm: prepare for multiple dynamic XMLs
gdbstub: extend GByteArray to read register helpers
target/i386: use gdb_get_reg helpers
target/m68k: use gdb_get_reg helpers
target/arm: use gdb_get_reg helpers
gdbstub: add helper for 128 bit registers
gdbstub: move mem_buf to GDBState and use GByteArray
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Store the smp sockets in CpuTopology. The socket information required to
build the apic id in EPYC mode. Right now socket information is not passed
to down when decoding the apic id. Add the socket information here.
Signed-off-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <158396718647.58170.2278448323151215741.stgit@naples-babu.amd.com>
The CPUClass has a 'reset' method. This is a legacy from when
TYPE_CPU used not to inherit from TYPE_DEVICE. We don't need it any
more, as we can simply use the TYPE_DEVICE reset. The 'cpu_reset()'
function is kept as the API which most places use to reset a CPU; it
is now a wrapper which calls device_cold_reset() and then the
tracepoint function.
This change should not cause CPU objects to be reset more often
than they are at the moment, because:
* nobody is directly calling device_cold_reset() or
qdev_reset_all() on CPU objects
* no CPU object is on a qbus, so they will not be reset either
by somebody calling qbus_reset_all()/bus_cold_reset(), or
by the main "reset sysbus and everything in the qbus tree"
reset that most devices are reset by
Note that this does not change the need for each machine or whatever
to use qemu_register_reset() to arrange to call cpu_reset() -- that
is necessary because CPU objects are not on any qbus, so they don't
get reset when the qbus tree rooted at the sysbus bus is reset, and
this isn't being changed here.
All the changes to the files under target/ were made using the
included Coccinelle script, except:
(1) the deletion of the now-inaccurate and not terribly useful
"CPUClass::reset" comments was done with a perl one-liner afterwards:
perl -n -i -e '/ CPUClass::reset/ or print' target/*/*.c
(2) this bit of the s390 change was done by hand, because the
Coccinelle script is not sophisticated enough to handle the
parent_reset call being inside another function:
| @@ -96,8 +96,9 @@ static void s390_cpu_reset(CPUState *s, cpu_reset_type type)
| S390CPU *cpu = S390_CPU(s);
| S390CPUClass *scc = S390_CPU_GET_CLASS(cpu);
| CPUS390XState *env = &cpu->env;
|+ DeviceState *dev = DEVICE(s);
|
|- scc->parent_reset(s);
|+ scc->parent_reset(dev);
| cpu->env.sigp_order = 0;
| s390_cpu_set_state(S390_CPU_STATE_STOPPED, cpu);
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20200303100511.5498-1-peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
When a host is running with memory encryption, the memory isn't visible
to the host kernel; attempts to merge that memory are futile because
what it's really comparing is encrypted memory, usually encrypted
with different keys.
Automatically turn mem-merge off when memory encryption is specified.
https://bugzilla.redhat.com/show_bug.cgi?id=1796356
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20200130175046.85850-1-dgilbert@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Instead of passing a pointer to memory now just extend the GByteArray
to all the read register helpers. They can then safely append their
data through the normal way. We don't bother with this abstraction for
write registers as we have already ensured the buffer being copied
from is the correct size.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Damien Hedde <damien.hedde@greensocs.com>
Message-Id: <20200316172155.971-15-alex.bennee@linaro.org>
Avoid orphan memory regions being added in the /unattached QOM
container.
This commit was produced with the Coccinelle script
scripts/coccinelle/memory-region-housekeeping.cocci.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Commit 355477f8c7 skips rom reset when we're an incoming migration
so as not to overwrite shared ram in the ignore-shared migration
optimisation.
However, it's got an unexpected side effect that because it skips
freeing the ROM data, when rom_reset gets called later on, after
migration (e.g. during a reboot), the ROM does get reset to the original
file contents. Because of seabios/x86's weird reboot process
this confuses a reboot into hanging after a migration.
Fixes: 355477f8c7 ("migration: do not rom_reset() during incoming migration")
https://bugzilla.redhat.com/show_bug.cgi?id=1809380
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
This will store the compression method to use. We start with none.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
Rename multifd-method to multifd-compression
This series removes ad hoc RAM allocation API (memory_region_allocate_system_memory)
and consolidates it around hostmem backend. It allows to
* resolve conflicts between global -mem-prealloc and hostmem's "policy" option,
fixing premature allocation before binding policy is applied
* simplify complicated memory allocation routines which had to deal with 2 ways
to allocate RAM.
* reuse hostmem backends of a choice for main RAM without adding extra CLI
options to duplicate hostmem features. A recent case was -mem-shared, to
enable vhost-user on targets that don't support hostmem backends [1] (ex: s390)
* move RAM allocation from individual boards into generic machine code and
provide them with prepared MemoryRegion.
* clean up deprecated NUMA features which were tied to the old API (see patches)
- "numa: remove deprecated -mem-path fallback to anonymous RAM"
- (POSTPONED, waiting on libvirt side) "forbid '-numa node,mem' for 5.0 and newer machine types"
- (POSTPONED) "numa: remove deprecated implicit RAM distribution between nodes"
Introduce a new machine.memory-backend property and wrapper code that aliases
global -mem-path and -mem-alloc into automatically created hostmem backend
properties (provided memory-backend was not set explicitly given by user).
A bulk of trivial patches then follow to incrementally convert individual
boards to using machine.memory-backend provided MemoryRegion.
Board conversion typically involves:
* providing MachineClass::default_ram_size and MachineClass::default_ram_id
so generic code could create default backend if user didn't explicitly provide
memory-backend or -m options
* dropping memory_region_allocate_system_memory() call
* using convenience MachineState::ram MemoryRegion, which points to MemoryRegion
allocated by ram-memdev
On top of that for some boards:
* missing ram_size checks are added (typically it were boards with fixed ram size)
* ram_size fixups are replaced by checks and hard errors, forcing user to
provide correct "-m" values instead of ignoring it and continuing running.
After all boards are converted, the old API is removed and memory allocation
routines are cleaned up.
The goal is to reduce the amount of requests issued by a guest on
1M reads/writes. This rises the performance up to 4% on that kind of
disk access pattern.
The maximum chunk size to be used for the guest disk accessing is
limited with seg_max parameter, which represents the max amount of
pices in the scatter-geather list in one guest disk request.
Since seg_max is virqueue_size dependent, increasing the virtqueue
size increases seg_max, which, in turn, increases the maximum size
of data to be read/write from a guest disk.
More details in the original problem statment:
https://lists.gnu.org/archive/html/qemu-devel/2017-12/msg03721.html
Suggested-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
Message-id: 20200214074648.958-1-dplotnikov@virtuozzo.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
all boards were switched to using memdev backend for main RAM,
so we can drop no longer used memory_region_allocate_system_memory()
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200219160953.13771-73-imammedo@redhat.com>
memory_region_allocate_system_memory() API is going away, so
replace it with memdev allocated MemoryRegion. The later is
initialized by generic code, so board only needs to opt in
to memdev scheme by providing
MachineClass::default_ram_id
and using MachineState::ram instead of manually initializing
RAM memory region.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20200219160953.13771-40-imammedo@redhat.com>
In case of NUMA there are 2 cases to consider:
1. '-numa node,memdev', the only one that will be available
for 5.0 and newer machine types.
In this case reuse current behavior, with only difference
memdevs are put into MachineState::ram container +
a temporary glue to keep memory_region_allocate_system_memory()
working until all boards converted.
2. fake NUMA ("-numa node mem" and default RAM splitting)
the later has been deprecated and will be removed but the former
is going to stay available for compat reasons for 5.0 and
older machine types
it takes allocate_system_memory_nonnuma() path, like non-NUMA
case and falls under conversion to memdev. So extend non-NUMA
MachineState::ram initialization introduced in previous patch
to take care of fake NUMA case.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20200219160953.13771-6-imammedo@redhat.com>
the new field will be used by boards to get access to main
RAM memory region and will help to save boiler plate in
boards which often introduce a field or variable just for
this purpose.
Memory region will be equivalent to what currently used
memory_region_allocate_system_memory() is returning apart
from that it will come from hostmem backend.
Followup patches will incrementally switch boards to using
RAM from MachineState::ram.
Patch takes care of non-NUMA case and follow up patch will
initialize MachineState::ram for NUMA case.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20200219160953.13771-5-imammedo@redhat.com>
Property will contain link to memory backend that will be
used for backing initial RAM.
Follow up commit will alias -mem-path and -mem-prealloc
CLI options into memory backend options to make memory
handling consistent (using only hostmem backend family
for guest RAM allocation).
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200219160953.13771-3-imammedo@redhat.com>
it has been deprecated since 4.0 by commit
cb79224b7 (deprecate -mem-path fallback to anonymous RAM)
Deprecation period ran out and it's time to remove it
so it won't get in a way of switching to using hostmem
backend for RAM.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20200219160953.13771-2-imammedo@redhat.com>
The only difference to hardware revision 4 is that the device doesn't
switch to VGA mode in case someone happens to touch a VGA register,
which should make things more robust in configurations with multiple
vga devices.
Swtiching back to VGA mode happens on reset, either full machine
reset or qxl device reset (QXL_IO_RESET ioport command).
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-id: 20200206074358.4274-1-kraxel@redhat.com
Commit ed65fd1a27 ("virtio-blk: switch off scsi-passthrough by
default") changed the default value of the 'scsi' property of
virtio-blk, which is only available on Linux hosts. It also added
an unconditional compat entry for 2.4 or earlier machines.
Trying to set this property on a pre-2.5 machine on OSX, we get:
Unexpected error in object_property_find() at qom/object.c:1201:
qemu-system-x86_64: -device virtio-blk-pci,id=scsi0,drive=drive0: can't apply global virtio-blk-device.scsi=true: Property '.scsi' not found
Fix this error by marking the property optional.
Fixes: ed65fd1a27 ("virtio-blk: switch off scsi-passthrough by default")
Suggested-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 20200207001404.1739-1-philmd@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
We have many files that apparently do not depend on the target CPU
configuration, i.e. which can be put into common-obj-y instead of
obj-y. This way, the code can be shared for example between
qemu-system-arm and qemu-system-aarch64, or the various big and
little endian variants like qemu-system-sh4 and qemu-system-sh4eb,
so that we do not have to compile the code multiple times anymore.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20200130133841.10779-1-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Deprecate device_legacy_reset(), qdev_reset_all() and
qbus_reset_all() to be replaced by new functions
device_cold_reset() and bus_cold_reset() which uses resettable API.
Also introduce resettable_cold_reset_fn() which may be used as a
replacement for qdev_reset_all_fn and qbus_reset_all_fn().
Following patches will be needed to look at legacy reset call sites
and switch to resettable api. The legacy functions will be removed
when unused.
Signed-off-by: Damien Hedde <damien.hedde@greensocs.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20200123132823.1117486-9-damien.hedde@greensocs.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This commit make use of the resettable API to reset the device being
hotplugged when it is realized. Also it ensures it is put in a reset
state coherent with the parent it is plugged into.
Note that there is a difference in the reset. Instead of resetting
only the hotplugged device, we reset also its subtree (switch to
resettable API). This is not expected to be a problem because
sub-buses are just realized too. If a hotplugged device has any
sub-buses it is logical to reset them too at this point.
The recently added should_be_hidden and PCI's partially_hotplugged
mechanisms do not interfere with realize operation:
+ In the should_be_hidden use case, device creation is
delayed.
+ The partially_hotplugged mechanism prevents a device to be
unplugged and unrealized from qdev POV and unrealized.
Signed-off-by: Damien Hedde <damien.hedde@greensocs.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20200123132823.1117486-8-damien.hedde@greensocs.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In qdev_set_parent_bus(), when changing the parent bus of a
realized device, if the source and destination buses are not in the
same reset state, some adaptations are required. This patch adds
needed call to resettable_change_parent() to make sure a device reset
state stays coherent with its parent bus.
The addition is a no-op if:
1. the device being parented is not realized.
2. the device is realized, but both buses are not under reset.
Case 2 means that as long as qdev_set_parent_bus() is called
during the machine realization procedure (which is before the
machine reset so nothing is in reset), it is a no op.
There are 52 call sites of qdev_set_parent_bus(). All but one fall
into the no-op case:
+ 29 trivial calls related to virtio (in hw/{s390x,display,virtio}/
{vhost,virtio}-xxx.c) to set a vdev(or vgpu) composing device
parent bus just before realizing the same vdev(vgpu).
+ hw/core/qdev.c: when creating a device in qdev_try_create()
+ hw/core/sysbus.c: when initializing a device in the sysbus
+ hw/i386/amd_iommu.c: before realizing AMDVIState/pci
+ hw/isa/piix4.c: before realizing PIIX4State/rtc
+ hw/misc/auxbus.c: when creating an AUXBus
+ hw/misc/auxbus.c: when creating an AUXBus child
+ hw/misc/macio/macio.c: when initializing a MACIOState child
+ hw/misc/macio/macio.c: before realizing NewWorldMacIOState/pmu
+ hw/misc/macio/macio.c: before realizing NewWorldMacIOState/cuda
+ hw/net/virtio-net.c: Used for migration when using the failover
mechanism to migration a vfio-pci/net. It is
a no-op because at this point the device is
already on the bus.
+ hw/pci-host/designware.c: before realizing DesignwarePCIEHost/root
+ hw/pci-host/gpex.c: before realizing GPEXHost/root
+ hw/pci-host/prep.c: when initialiazing PREPPCIState/pci_dev
+ hw/pci-host/q35.c: before realizing Q35PCIHost/mch
+ hw/pci-host/versatile.c: when initializing PCIVPBState/pci_dev
+ hw/pci-host/xilinx-pcie.c: before realizing XilinxPCIEHost/root
+ hw/s390x/event-facility.c: when creating SCLPEventFacility/
TYPE_SCLP_QUIESCE
+ hw/s390x/event-facility.c: ditto with SCLPEventFacility/
TYPE_SCLP_CPU_HOTPLUG
+ hw/s390x/sclp.c: Not trivial because it is called on a SLCPDevice
just after realizing it. Ok because at this point the destination
bus (sysbus) is not in reset; the realize step is before the
machine reset.
+ hw/sd/core.c: Not OK. Used in sdbus_reparent_card(). See below.
+ hw/ssi/ssi.c: Used to put spi slave on spi bus and connect the cs
line in ssi_auto_connect_slave(). Ok because this function is only
used in realize step in hw/ssi/aspeed_smc.ci, hw/ssi/imx_spi.c,
hw/ssi/mss-spi.c, hw/ssi/xilinx_spi.c and hw/ssi/xilinx_spips.c.
+ hw/xen/xen-legacy-backend.c: when creating a XenLegacyDevice device
+ qdev-monitor.c: in device hotplug creation procedure before realize
Note that this commit alone will have no effect, right now there is no
use of resettable API to reset anything. So a bus will never be tagged
as in-reset by this same API.
The one place where side-effect will occurs is in hw/sd/core.c in
sdbus_reparent_card(). This function is only used in the raspi machines,
including during the sysbus reset procedure. This case will be
carrefully handled when doing the multiple phase reset transition.
Signed-off-by: Damien Hedde <damien.hedde@greensocs.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20200123132823.1117486-7-damien.hedde@greensocs.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add a function resettable_change_parent() to do the required
plumbing when changing the parent a of Resettable object.
We need to make sure that the reset state of the object remains
coherent with the reset state of the new parent.
We make the 2 following hypothesis:
+ when an object is put in a parent under reset, the object goes in
reset.
+ when an object is removed from a parent under reset, the object
leaves reset.
The added function avoids any glitch if both old and new parent are
already in reset.
Signed-off-by: Damien Hedde <damien.hedde@greensocs.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20200123132823.1117486-6-damien.hedde@greensocs.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This commit adds support of Resettable interface to buses and devices:
+ ResettableState structure is added in the Bus/Device state
+ Resettable methods are implemented.
+ device/bus_is_in_reset function defined
This commit allows to transition the objects to the new
multi-phase interface without changing the reset behavior at all.
Object single reset method can be split into the 3 different phases
but the 3 phases are still executed in a row for a given object.
From the qdev/qbus reset api point of view, nothing is changed.
qdev_reset_all() and qbus_reset_all() are not modified as well as
device_legacy_reset().
Transition of an object must be done from parent class to child class.
Care has been taken to allow the transition of a parent class
without requiring the child classes to be transitioned at the same
time. Note that SysBus and SysBusDevice class do not need any transition
because they do not override the legacy reset method.
Signed-off-by: Damien Hedde <damien.hedde@greensocs.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20200123132823.1117486-5-damien.hedde@greensocs.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This commit defines an interface allowing multi-phase reset. This aims
to solve a problem of the actual single-phase reset (built in
DeviceClass and BusClass): reset behavior is dependent on the order
in which reset handlers are called. In particular doing external
side-effect (like setting an qemu_irq) is problematic because receiving
object may not be reset yet.
The Resettable interface divides the reset in 3 well defined phases.
To reset an object tree, all 1st phases are executed then all 2nd then
all 3rd. See the comments in include/hw/resettable.h for a more complete
description. The interface defines 3 phases to let the future
possibility of holding an object into reset for some time.
The qdev/qbus reset in DeviceClass and BusClass will be modified in
following commits to use this interface. A mechanism is provided
to allow executing a transitional reset handler in place of the 2nd
phase which is executed in children-then-parent order inside a tree.
This will allow to transition devices and buses smoothly while
keeping the exact current qdev/qbus reset behavior for now.
Documentation will be added in a following commit.
Signed-off-by: Damien Hedde <damien.hedde@greensocs.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20200123132823.1117486-4-damien.hedde@greensocs.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Adds trace events to reset procedure and when updating the parent
bus of a device.
Signed-off-by: Damien Hedde <damien.hedde@greensocs.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20200123132823.1117486-3-damien.hedde@greensocs.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Provide a temporary device_legacy_reset function doing what
device_reset does to prepare for the transition with Resettable
API.
All occurrence of device_reset in the code tree are also replaced
by device_legacy_reset.
The new resettable API has different prototype and semantics
(resetting child buses as well as the specified device). Subsequent
commits will make the changeover for each call site individually; once
that is complete device_legacy_reset() will be removed.
Signed-off-by: Damien Hedde <damien.hedde@greensocs.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20200123132823.1117486-2-damien.hedde@greensocs.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The num-lines property of the TYPE_OR_GATE device sets the number
of input lines it has. An assert() in or_irq_realize() restricts
this to the maximum supported by the implementation. However we
got the condition in the assert wrong: it should be using <=,
because num-lines == MAX_OR_LINES is permitted, and means that
all entries from 0 to MAX_OR_LINES-1 in the s->levels[] array
are used.
We didn't notice this previously because no user has so far
needed that many input lines.
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Message-id: 20200120142235.10432-1-peter.maydell@linaro.org
While loading the executable, some platforms (like AVR) need to
detect CPU type that executable is built for - and, with this patch,
this is enabled by reading the field 'e_flags' of the ELF header of
the executable in question. The change expands functionality of
the following functions:
- load_elf()
- load_elf_as()
- load_elf_ram()
- load_elf_ram_sym()
The argument added to these functions is called 'pflags' and is of
type 'uint32_t*' (that matches 'pointer to 'elf_word'', 'elf_word'
being the type of the field 'e_flags', in both 32-bit and 64-bit
variants of ELF header). Callers are allowed to pass NULL as that
argument, and in such case no lookup to the field 'e_flags' will
happen, and no information will be returned, of course.
CC: Richard Henderson <rth@twiddle.net>
CC: Peter Maydell <peter.maydell@linaro.org>
CC: Edgar E. Iglesias <edgar.iglesias@gmail.com>
CC: Michael Walle <michael@walle.cc>
CC: Thomas Huth <huth@tuxfamily.org>
CC: Laurent Vivier <laurent@vivier.eu>
CC: Philippe Mathieu-Daudé <f4bug@amsat.org>
CC: Aleksandar Rikalo <aleksandar.rikalo@rt-rk.com>
CC: Aurelien Jarno <aurelien@aurel32.net>
CC: Jia Liu <proljc@gmail.com>
CC: David Gibson <david@gibson.dropbear.id.au>
CC: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
CC: BALATON Zoltan <balaton@eik.bme.hu>
CC: Christian Borntraeger <borntraeger@de.ibm.com>
CC: Thomas Huth <thuth@redhat.com>
CC: Artyom Tarasenko <atar4qemu@gmail.com>
CC: Fabien Chouteau <chouteau@adacore.com>
CC: KONRAD Frederic <frederic.konrad@adacore.com>
CC: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Aleksandar Rikalo <aleksandar.rikalo@rt-rk.com>
Signed-off-by: Michael Rolnik <mrolnik@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Message-Id: <1580079311-20447-24-git-send-email-aleksandar.markovic@rt-rk.com>
Use class properties facilities to add properties to the class during
device_class_set_props().
qdev_property_add_static() must be adapted as PropertyInfo now
operates with classes (and not instances), so we must
set_default_value() on the ObjectProperty, before calling its init()
method on the object instance.
Also, PropertyInfo.create() is now exclusively used for class
properties. Fortunately, qdev_property_add_static() is only used in
target/arm/cpu.c so far, which doesn't use "link" properties (that
require create()).
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20200110153039.1379601-22-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move the one-user function to the place it is being used.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200110153039.1379601-5-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
All callers use error_abort, and even the function itself calls with
error_abort.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20200110153039.1379601-4-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The function is already documented in the header.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20200110153039.1379601-3-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
To ease review/modifications of this Makefile, group generic
objects first, then system-mode specific ones, and finally
peripherals (which are only used in system-mode).
No logical changes introduced here.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200118140619.26333-7-philmd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The user-mode code does not use this API, restrict it
to the system-mode.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20200118140619.26333-6-philmd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Similarly to what we already do with qdev, use a helper to overload the
reset QOM methods of the parent in children classes, for clarity.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <157650847239.354886.2782881118916307978.stgit@bahia.lan>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Define the new macro VMSTATE_INSTANCE_ID_ANY for callers who wants to
auto-generate the vmstate instance ID. Previously it was hard coded
as -1 instead of this macro. It helps to change this default value in
the follow up patches. No functional change.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
If the redirected device has this capability, Windows guest may
place the device into D2 and expect it to wake when the device
becomes active, but this will never happen. For example, when
internal Bluetooth adapter is redirected, keyboards and mice
connected to it do not work. Current commit removes this
capability (starting from machine 5.0)
Set 'usb-redir.suppress-remote-wake' property to 'off' to keep
'remote wake' as is or to 'on' to remove 'remote wake' on
4.2 or earlier.
Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com>
Message-id: 20200108091044.18055-3-yuri.benditovich@daynix.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
If the redirected device has this capability, Windows guest may
place the device into D2 and expect it to wake when the device
becomes active, but this will never happen. For example, when
internal Bluetooth adapter is redirected, keyboards and mice
connected to it do not work. Current commit removes this
capability (starting from machine 5.0)
Set 'usb-host.suppress-remote-wake' property to 'off' to keep
'remote wake' as is or to 'on' to remove 'remote wake' on
4.2 or earlier.
Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com>
Message-id: 20200108091044.18055-2-yuri.benditovich@daynix.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Hi,
QDEV_PROP_PTR is marked in multiple places as "FIXME/TODO/remove
me". In most cases, it can be easily replaced with QDEV_PROP_LINK when
the pointer points to an Object.
There are a few places where such substitution isn't possible. For
those places, it seems reasonable to use a specific setter method
instead, and keep the user_creatable = false. In other places,
proper usage of qdev or other facilies is the solution.
The serial code wasn't converted to qdev, which makes it a bit more
archaic to deal with. Let's convert it first, so we can more easily
embed it from other devices, and re-export some properties and drop
QDEV_PROP_PTR usage.
-----BEGIN PGP SIGNATURE-----
iQJQBAABCAA6FiEEh6m9kz+HxgbSdvYt2ujhCXWWnOUFAl4UnUYcHG1hcmNhbmRy
ZS5sdXJlYXVAcmVkaGF0LmNvbQAKCRDa6OEJdZac5cfYEACcTfXklXdxLDj94Q5/
d6MxYqZWckO+vyMqOwonodl9BS3clpDDxbYzyfTpqwKS2cVg1eUUBPR7/eioX6zT
grM0rlgsKWJf9UurJwJWw7Zys7dXZMVJ2BdigLUEZrv9hFF15t344qoKgk4wYmBj
2wC7l7j2WZZ0vtXN7IH4/ZXnaN5/kdoPj6BrF0oNSJaq1AjPByQxmOJhvrxVsm6y
gn3la4XbfMIC68qPjcDJAScGXtCWG1Vydw9cFHwRpMfcvPyL70l6FMjIwrLYNQ9b
j1AkcEXeev5nWT+gLGxt+TGXB0Sd2ID9uRYxhyZRA4fdjHFtlWfdOwepOOlSlTO+
yfpf9STDLuDQGLTJyNZpYGGDDcm4xsJ8arD/7/Mq/35BQl9ZUT+m6uC1tDhxEHzf
+AD/Kh8rMptyAjwtqD2XbqyLoaFJCsPjZbjTj3SY08WaeqClmaAbSD2eaJiNXy4H
+rFg9P/eOB+71R1AoMKfiBFzdGV6TG5PLZOJ/oN02yqp0oW8eDWYcETB3j0tIgS1
u2WVCS2cd8IqYa+UQ7COOpoX0UwICmIWV64kxioD7uFQiK/1nQYw4UnPHv29qY6k
fTa8jUC5hPiDN1rRYqNpNoVJsstSZfSgpo5jV75sxSyDucupu+SM9qmo3+fBab+q
Eol3Ypz4virkNU8IYCYFFiG4Qg==
=iYVd
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/elmarco/tags/prop-ptr-pull-request' into staging
Clean-ups: qom-ify serial and remove QDEV_PROP_PTR
Hi,
QDEV_PROP_PTR is marked in multiple places as "FIXME/TODO/remove
me". In most cases, it can be easily replaced with QDEV_PROP_LINK when
the pointer points to an Object.
There are a few places where such substitution isn't possible. For
those places, it seems reasonable to use a specific setter method
instead, and keep the user_creatable = false. In other places,
proper usage of qdev or other facilies is the solution.
The serial code wasn't converted to qdev, which makes it a bit more
archaic to deal with. Let's convert it first, so we can more easily
embed it from other devices, and re-export some properties and drop
QDEV_PROP_PTR usage.
# gpg: Signature made Tue 07 Jan 2020 15:01:26 GMT
# gpg: using RSA key 87A9BD933F87C606D276F62DDAE8E10975969CE5
# gpg: issuer "marcandre.lureau@redhat.com"
# gpg: Good signature from "Marc-André Lureau <marcandre.lureau@redhat.com>" [full]
# gpg: aka "Marc-André Lureau <marcandre.lureau@gmail.com>" [full]
# Primary key fingerprint: 87A9 BD93 3F87 C606 D276 F62D DAE8 E109 7596 9CE5
* remotes/elmarco/tags/prop-ptr-pull-request: (37 commits)
qdev/qom: remove some TODO limitations now that PROP_PTR is gone
qdev: remove QDEV_PROP_PTR
qdev: remove PROP_MEMORY_REGION
omap-gpio: remove PROP_PTR
omap-i2c: remove PROP_PTR
omap-intc: remove PROP_PTR
smbus-eeprom: remove PROP_PTR
cris: improve passing PIC interrupt vector to the CPU
mips/cps: fix setting saar property
qdev: use g_strcmp0() instead of open-coding it
leon3: use qdev gpio facilities for the PIL
leon3: use qemu_irq framework instead of callback as property
dp8393x: replace PROP_PTR with PROP_LINK
etraxfs: remove PROP_PTR usage
lance: replace PROP_PTR with PROP_LINK
vmmouse: replace PROP_PTR with PROP_LINK
sm501: make SerialMM a child, export chardev property
mips: use sysbus_mmio_get_region() instead of internal fields
mips: use sysbus_add_io()
mips: baudbase is 115200 by default
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Bugfixes all over the place.
HMAT support.
New flags for vhost-user-blk utility.
Auto-tuning of seg max for virtio storage.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAl4TaMEPHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRpvzgH/2LyDAzCa9h93ikSJjmyUk5FUaqve38daEb3
S3JYjwKxQx7u1ydooKhvBQnBCZ2i3S+k62gfYyKB+nBv8xvjs0Eg5D1YJ5E8hciy
lf5OFGWWtX2iPDjZwQwT13kiJe0o3JRGxJJ6XqTEG+1EYOp7cky/FEv4PD030b9m
I2wROZ/Am+onB9YJX8c0Vv1CG+AryuJNXnvwQzTXEjj4U7bEYUyJwVZaCRyAdWQ3
uYXIZN9VwjVX6BFvy9ZAJbEsUVJvOM1/aQaDqcrLz+VlzRT7bRkKHi2G3vakrm1I
r5OpgyLo84132awCncbSykKDH5o8WaxLaJBjGmuBfasMz9wPzAg=
=uL1o
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
virtio, pci, pc: fixes, features
Bugfixes all over the place.
HMAT support.
New flags for vhost-user-blk utility.
Auto-tuning of seg max for virtio storage.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Mon 06 Jan 2020 17:05:05 GMT
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream: (32 commits)
intel_iommu: add present bit check for pasid table entries
intel_iommu: a fix to vtd_find_as_from_bus_num()
virtio-net: delete also control queue when TX/RX deleted
virtio: reset region cache when on queue deletion
virtio-mmio: update queue size on guest write
tests: add virtio-scsi and virtio-blk seg_max_adjust test
virtio: make seg_max virtqueue size dependent
hw: fix using 4.2 compat in 5.0 machine types for i440fx/q35
vhost-user-scsi: reset the device if supported
vhost-user: add VHOST_USER_RESET_DEVICE to reset devices
hw/pci/pci_host: Let pci_data_[read/write] use unsigned 'size' argument
hw/pci/pci_host: Remove redundant PCI_DPRINTF()
virtio-mmio: Clear v2 transport state on soft reset
ACPI: add expected files for HMAT tests (acpihmat)
tests/bios-tables-test: add test cases for ACPI HMAT
tests/numa: Add case for QMP build HMAT
hmat acpi: Build Memory Side Cache Information Structure(s)
hmat acpi: Build System Locality Latency and Bandwidth Information Structure(s)
hmat acpi: Build Memory Proximity Domain Attributes Structure(s)
numa: Extend CLI to provide memory side cache information
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
No longer used in the tree. The comment about user_creatable is still
quite relevant, but there is already a similar comment in qdev-core.h.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Hi,
With external processes or helpers participating to the VM support, it
becomes necessary to handle their migration. Various options exist to
transfer their state:
1) as the VM memory, RAM or devices (we could say that's how
vhost-user devices can be handled today, they are expected to
restore from ring state)
2) other "vmstate" (as with TPM emulator state blobs)
3) left to be handled by management layer
1) is not practical, since an external processes may legitimatelly
need arbitrary state date to back a device or a service, or may not
even have an associated device.
2) needs ad-hoc code for each helper, but is simple and working
3) is complicated for management layer, QEMU has the migration timing
The proposed "dbus-vmstate" object will connect to a given D-Bus
address, and save/load from org.qemu.VMState1 owners on migration.
Thus helpers can easily have their state migrated with QEMU, without
implementing ad-hoc support (such as done for TPM emulation)
D-Bus is ubiquitous on Linux (it is systemd IPC), and can be made to
work on various other OSes. There are several implementations and good
bindings for various languages. (the tests/dbus-vmstate-test.c is a
good example of how simple the implementation of services can be, even
in C)
dbus-vmstate is put into use by the libvirt series "[PATCH 00/23] Use
a slirp helper process".
v2:
- fix build with broken mingw-glib
-----BEGIN PGP SIGNATURE-----
iQJQBAABCAA6FiEEh6m9kz+HxgbSdvYt2ujhCXWWnOUFAl4TR5ccHG1hcmNhbmRy
ZS5sdXJlYXVAcmVkaGF0LmNvbQAKCRDa6OEJdZac5R6EEACFTd4hDG8i/GnxCFut
MGcTusJr+2IklIT/K0qpLf0axNUoIqycwv8m0T9QhoG8h+9lMykOd1YJpNetT5qK
gifOF2gcPK/9WIdFbX7dLSUAWpzO6fG/RzKK65Nc1uJSnXlb8JV0BU/6FrfCE+3U
Bg5PvVtxxtwejQfQPOI7bPxOqxr/SmjUGcbFgacMAMG0Lm/VG/92kdoC6Z4Xf/bd
FcAeiO2CiPoGXG5zD4WF1emwxnSu65PgcFpSpqvvFlmDbYlTwoMt4VWxTfkAzbAM
IES7j2IbhUEe3p0hvMTqmmsmds1QNCBgnQI/LtQiXPTnbfpBcZ0wT6QsSZXWvHz8
ClA9OAimxyELblTGjD9vsi3G5m2DQS+NdfPOX7hfHouVQzDJJaS8jxDItpPgXwSO
fZ9mUO8ps3N2YTakuKNBP/IzDOuyExrBg80GF+HbEc59Uhj8Yq/awyz1XsqjQzVP
54+TUjwC8HZxVWgMeqiJ1njPTfRJo6uAnguLbfAXj8P9vaXLtsy/3JGsmKiziXXW
XzvQDzhfOMjm7Uo7vN7Hp3X/UYJxnaQ3dViqZnv/gqG6yv+igVlqyrTx2IBhN2NW
DZt3c7VqVUBYFShLgfy0zDjzM/s7mFkQKCFHUsBqIwODugYEc3TTdAa60QYjX5i9
negngax45KM6nF3tq74fJpwWVw==
=M4kD
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/elmarco/tags/dbus-vmstate7-pull-request' into staging
Add dbus-vmstate
Hi,
With external processes or helpers participating to the VM support, it
becomes necessary to handle their migration. Various options exist to
transfer their state:
1) as the VM memory, RAM or devices (we could say that's how
vhost-user devices can be handled today, they are expected to
restore from ring state)
2) other "vmstate" (as with TPM emulator state blobs)
3) left to be handled by management layer
1) is not practical, since an external processes may legitimatelly
need arbitrary state date to back a device or a service, or may not
even have an associated device.
2) needs ad-hoc code for each helper, but is simple and working
3) is complicated for management layer, QEMU has the migration timing
The proposed "dbus-vmstate" object will connect to a given D-Bus
address, and save/load from org.qemu.VMState1 owners on migration.
Thus helpers can easily have their state migrated with QEMU, without
implementing ad-hoc support (such as done for TPM emulation)
D-Bus is ubiquitous on Linux (it is systemd IPC), and can be made to
work on various other OSes. There are several implementations and good
bindings for various languages. (the tests/dbus-vmstate-test.c is a
good example of how simple the implementation of services can be, even
in C)
dbus-vmstate is put into use by the libvirt series "[PATCH 00/23] Use
a slirp helper process".
v2:
- fix build with broken mingw-glib
# gpg: Signature made Mon 06 Jan 2020 14:43:35 GMT
# gpg: using RSA key 87A9BD933F87C606D276F62DDAE8E10975969CE5
# gpg: issuer "marcandre.lureau@redhat.com"
# gpg: Good signature from "Marc-André Lureau <marcandre.lureau@redhat.com>" [full]
# gpg: aka "Marc-André Lureau <marcandre.lureau@gmail.com>" [full]
# Primary key fingerprint: 87A9 BD93 3F87 C606 D276 F62D DAE8 E109 7596 9CE5
* remotes/elmarco/tags/dbus-vmstate7-pull-request:
tests: add dbus-vmstate-test
tests: add migration-helpers unit
dockerfiles: add dbus-daemon to some of latest distributions
configure: add GDBUS_CODEGEN
Add dbus-vmstate object
util: add dbus helper unit
docs: start a document to describe D-Bus usage
vmstate: replace DeviceState with VMStateIf
vmstate: add qom interface to get id
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Before the patch, seg_max parameter was immutable and hardcoded
to 126 (128 - 2) without respect to queue size. This has two negative effects:
1. when queue size is < 128, we have Virtio 1.1 specfication violation:
(2.6.5.3.1 Driver Requirements) seq_max must be <= queue_size.
This violation affects the old Linux guests (ver < 4.14). These guests
crash on these queue_size setups.
2. when queue_size > 128, as was pointed out by Denis Lunev <den@virtuozzo.com>,
seg_max restrics guest's block request length which affects guests'
performance making them issues more block request than needed.
https://lists.gnu.org/archive/html/qemu-devel/2017-12/msg03721.html
To mitigate this two effects, the patch adds the property adjusting seg_max
to queue size automaticaly. Since seg_max is a guest visible parameter,
the property is machine type managable and allows to choose between
old (seg_max = 126 always) and new (seg_max = queue_size - 2) behaviors.
Not to change the behavior of the older VMs, prevent setting the default
seg_max_adjust value for older machine types.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
Message-Id: <20191220140905.1718-2-dplotnikov@virtuozzo.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Replace DeviceState dependency with VMStateIf on vmstate API.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Add an interface to get the instance id, instead of depending on
Device and qdev_get_dev_path().
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Add -numa hmat-cache option to provide Memory Side Cache Information.
These memory attributes help to build Memory Side Cache Information
Structure(s) in ACPI Heterogeneous Memory Attribute Table (HMAT).
Before using hmat-cache option, enable HMAT with -machine hmat=on.
Acked-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Liu Jingqi <jingqi.liu@intel.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20191213011929.2520-4-tao3.xu@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Add -numa hmat-lb option to provide System Locality Latency and
Bandwidth Information. These memory attributes help to build
System Locality Latency and Bandwidth Information Structure(s)
in ACPI Heterogeneous Memory Attribute Table (HMAT). Before using
hmat-lb option, enable HMAT with -machine hmat=on.
Acked-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Liu Jingqi <jingqi.liu@intel.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20191213011929.2520-3-tao3.xu@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
In ACPI 6.3 chapter 5.2.27 Heterogeneous Memory Attribute Table (HMAT),
The initiator represents processor which access to memory. And in 5.2.27.3
Memory Proximity Domain Attributes Structure, the attached initiator is
defined as where the memory controller responsible for a memory proximity
domain. With attached initiator information, the topology of heterogeneous
memory can be described. Add new machine property 'hmat' to enable all
HMAT specific options.
Extend CLI of "-numa node" option to indicate the initiator numa node-id.
In the linux kernel, the codes in drivers/acpi/hmat/hmat.c parse and report
the platform's HMAT tables. Before using initiator option, enable HMAT with
-machine hmat=on.
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Jingqi Liu <jingqi.liu@intel.com>
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20191213011929.2520-2-tao3.xu@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently the SLOF firmware for pseries guests will disable/re-enable
a PCI device multiple times via IO/MEM/MASTER bits of PCI_COMMAND
register after the initial probe/feature negotiation, as it tends to
work with a single device at a time at various stages like probing
and running block/network bootloaders without doing a full reset
in-between.
In QEMU, when PCI_COMMAND_MASTER is disabled we disable the
corresponding IOMMU memory region, so DMA accesses (including to vring
fields like idx/flags) will no longer undergo the necessary
translation. Normally we wouldn't expect this to happen since it would
be misbehavior on the driver side to continue driving DMA requests.
However, in the case of pseries, with iommu_platform=on, we trigger the
following sequence when tearing down the virtio-blk dataplane ioeventfd
in response to the guest unsetting PCI_COMMAND_MASTER:
#2 0x0000555555922651 in virtqueue_map_desc (vdev=vdev@entry=0x555556dbcfb0, p_num_sg=p_num_sg@entry=0x7fffe657e1a8, addr=addr@entry=0x7fffe657e240, iov=iov@entry=0x7fffe6580240, max_num_sg=max_num_sg@entry=1024, is_write=is_write@entry=false, pa=0, sz=0)
at /home/mdroth/w/qemu.git/hw/virtio/virtio.c:757
#3 0x0000555555922a89 in virtqueue_pop (vq=vq@entry=0x555556dc8660, sz=sz@entry=184)
at /home/mdroth/w/qemu.git/hw/virtio/virtio.c:950
#4 0x00005555558d3eca in virtio_blk_get_request (vq=0x555556dc8660, s=0x555556dbcfb0)
at /home/mdroth/w/qemu.git/hw/block/virtio-blk.c:255
#5 0x00005555558d3eca in virtio_blk_handle_vq (s=0x555556dbcfb0, vq=0x555556dc8660)
at /home/mdroth/w/qemu.git/hw/block/virtio-blk.c:776
#6 0x000055555591dd66 in virtio_queue_notify_aio_vq (vq=vq@entry=0x555556dc8660)
at /home/mdroth/w/qemu.git/hw/virtio/virtio.c:1550
#7 0x000055555591ecef in virtio_queue_notify_aio_vq (vq=0x555556dc8660)
at /home/mdroth/w/qemu.git/hw/virtio/virtio.c:1546
#8 0x000055555591ecef in virtio_queue_host_notifier_aio_poll (opaque=0x555556dc86c8)
at /home/mdroth/w/qemu.git/hw/virtio/virtio.c:2527
#9 0x0000555555d02164 in run_poll_handlers_once (ctx=ctx@entry=0x55555688bfc0, timeout=timeout@entry=0x7fffe65844a8)
at /home/mdroth/w/qemu.git/util/aio-posix.c:520
#10 0x0000555555d02d1b in try_poll_mode (timeout=0x7fffe65844a8, ctx=0x55555688bfc0)
at /home/mdroth/w/qemu.git/util/aio-posix.c:607
#11 0x0000555555d02d1b in aio_poll (ctx=ctx@entry=0x55555688bfc0, blocking=blocking@entry=true)
at /home/mdroth/w/qemu.git/util/aio-posix.c:639
#12 0x0000555555d0004d in aio_wait_bh_oneshot (ctx=0x55555688bfc0, cb=cb@entry=0x5555558d5130 <virtio_blk_data_plane_stop_bh>, opaque=opaque@entry=0x555556de86f0)
at /home/mdroth/w/qemu.git/util/aio-wait.c:71
#13 0x00005555558d59bf in virtio_blk_data_plane_stop (vdev=<optimized out>)
at /home/mdroth/w/qemu.git/hw/block/dataplane/virtio-blk.c:288
#14 0x0000555555b906a1 in virtio_bus_stop_ioeventfd (bus=bus@entry=0x555556dbcf38)
at /home/mdroth/w/qemu.git/hw/virtio/virtio-bus.c:245
#15 0x0000555555b90dbb in virtio_bus_stop_ioeventfd (bus=bus@entry=0x555556dbcf38)
at /home/mdroth/w/qemu.git/hw/virtio/virtio-bus.c:237
#16 0x0000555555b92a8e in virtio_pci_stop_ioeventfd (proxy=0x555556db4e40)
at /home/mdroth/w/qemu.git/hw/virtio/virtio-pci.c:292
#17 0x0000555555b92a8e in virtio_write_config (pci_dev=0x555556db4e40, address=<optimized out>, val=1048832, len=<optimized out>)
at /home/mdroth/w/qemu.git/hw/virtio/virtio-pci.c:613
I.e. the calling code is only scheduling a one-shot BH for
virtio_blk_data_plane_stop_bh, but somehow we end up trying to process
an additional virtqueue entry before we get there. This is likely due
to the following check in virtio_queue_host_notifier_aio_poll:
static bool virtio_queue_host_notifier_aio_poll(void *opaque)
{
EventNotifier *n = opaque;
VirtQueue *vq = container_of(n, VirtQueue, host_notifier);
bool progress;
if (!vq->vring.desc || virtio_queue_empty(vq)) {
return false;
}
progress = virtio_queue_notify_aio_vq(vq);
namely the call to virtio_queue_empty(). In this case, since no new
requests have actually been issued, shadow_avail_idx == last_avail_idx,
so we actually try to access the vring via vring_avail_idx() to get
the latest non-shadowed idx:
int virtio_queue_empty(VirtQueue *vq)
{
bool empty;
...
if (vq->shadow_avail_idx != vq->last_avail_idx) {
return 0;
}
rcu_read_lock();
empty = vring_avail_idx(vq) == vq->last_avail_idx;
rcu_read_unlock();
return empty;
but since the IOMMU region has been disabled we get a bogus value (0
usually), which causes virtio_queue_empty() to falsely report that
there are entries to be processed, which causes errors such as:
"virtio: zero sized buffers are not allowed"
or
"virtio-blk missing headers"
and puts the device in an error state.
This patch works around the issue by introducing virtio_set_disabled(),
which sets a 'disabled' flag to bypass checks like virtio_queue_empty()
when bus-mastering is disabled. Since we'd check this flag at all the
same sites as vdev->broken, we replace those checks with an inline
function which checks for either vdev->broken or vdev->disabled.
The 'disabled' flag is only migrated when set, which should be fairly
rare, but to maintain migration compatibility we disable it's use for
older machine types. Users requiring the use of the flag in conjunction
with older machine types can set it explicitly as a virtio-device
option.
NOTES:
- This leaves some other oddities in play, like the fact that
DRIVER_OK also gets unset in response to bus-mastering being
disabled, but not restored (however the device seems to continue
working)
- Similarly, we disable the host notifier via
virtio_bus_stop_ioeventfd(), which seems to move the handling out
of virtio-blk dataplane and back into the main IO thread, and it
ends up staying there till a reset (but otherwise continues working
normally)
Cc: David Gibson <david@gibson.dropbear.id.au>,
Cc: Alexey Kardashevskiy <aik@ozlabs.ru>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Message-Id: <20191120005003.27035-1-mdroth@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Commit aa57020774, by mistake used MachineClass::numa_mem_supported
to check if NUMA is supported by machine and also as unrelated change
set it to true for sbsa-ref board.
Luckily change didn't break machines that support NUMA, as the field
is set to true for them.
But the field is not intended for checking if NUMA is supported and
will be flipped to false within this release for new machine types.
Fix it:
- by using previously used condition
!mc->cpu_index_to_instance_props || !mc->get_default_cpu_node_id
the first time and then use MachineState::numa_state down the road
to check if NUMA is supported
- dropping stray sbsa-ref chunk
Fixes: aa57020774
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1576154936-178362-3-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Currently parse_numa_node() is always called from already numa
enabled context.
Drop unnecessary check if numa is supported.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1576154936-178362-2-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Rename Error ** parameter in check_only_migratable to common errp.
In device_set_realized:
- Move "if (local_err != NULL)" closer to error setters.
- Drop 'Error **local_errp': it doesn't save any LoCs, but it's very
unusual.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20191205174635.18758-10-vsementsov@virtuozzo.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
We don't need Error **, as all callers pass local Error object, which
isn't used after the call. Use Error * instead.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20191205174635.18758-5-vsementsov@virtuozzo.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
fit_load_fdt() passes @errp to fit_image_addr(), then recovers from
ENOENT failures. Passing @errp is wrong, because it works only as
long as @errp is neither @error_fatal nor @error_abort. Error
recovery dereferences @errp. That's also wrong; see the big comment
in error.h. Error recovery can leave *errp pointing to a freed
Error object. Wrong, it must be null on success. Messed up in
commit 3eb99edb48 "loader-fit: Wean off error_printf()".
No caller actually passes such values, or uses *errp on success.
Fix anyway: splice in a local Error *err, and error_propagate().
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20191204093625.14836-8-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
The first machine property to fall is Xen's Intel integrated graphics
passthrough. The "-machine igd-passthru" option does not set anymore
a property on the machine object, but desugars to a GlobalProperty on
accelerator objects.
The setter is very simple, since the value ends up in a
global variable, so this patch also provides an example before the more
complicated cases that follow it.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Drop the "accel" property from MachineState, and instead desugar
"-machine accel=" to a list of "-accel" options.
This has a semantic change due to removing merge_lists from -accel.
For example:
- "-accel kvm -accel tcg" all but ignored "-accel kvm". This is a bugfix.
- "-accel kvm -accel thread=single" ignored "thread=single", since it
applied the option to KVM. Now it fails due to not specifying the
accelerator on "-accel thread=single".
- "-accel tcg -accel thread=single" chose single-threaded TCG, while now
it will fail due to not specifying the accelerator on "-accel
thread=single".
Also, "-machine accel" and "-accel" become incompatible.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Virtio spec 1.1 (and earlier), 5.2.5.2 Driver Requirements: Device
Initialization:
"Devices SHOULD always offer VIRTIO_BLK_F_FLUSH, and MUST offer it if
they offer VIRTIO_BLK_F_CONFIG_WCE"
Currently F_CONFIG_WCE and F_WCE are not connected to each other.
Qemu will advertise F_CONFIG_WCE if config-wce argument is
set for virtio-blk device. And F_WCE is advertised only if
underlying block backend actually has it's caching enabled.
Fix this by advertising F_WCE if F_CONFIG_WCE is also advertised.
To preserve backwards compatibility with newer machine types make this
behaviour governed by "x-enable-wce-if-config-wce" virtio-blk-device
property and introduce hw_compat_4_2 with new property being off by
default for all machine types <= 4.2 (but don't introduce 4.3
machine type itself yet).
Signed-off-by: Evgeny Yakovlev <wrfsh@yandex-team.ru>
Message-Id: <1572978137-189218-1-git-send-email-wrfsh@yandex-team.ru>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
If memory allocation fails when using -mem-path, QEMU is supposed to print
out a message to indicate that fallback to anonymous RAM is deprecated. This
is done with error_printf() which does output buffering. As a consequence,
the message is only printed at the next flush, eg. when quiting QEMU, and
it also lacks a trailing newline:
qemu-system-ppc64: unable to map backing store for guest RAM: Cannot allocate memory
qemu-system-ppc64: warning: falling back to regular RAM allocation
QEMU 4.1.50 monitor - type 'help' for more information
(qemu) q
This is deprecated. Make sure that -mem-path specified path has sufficient resources to allocate -m specified RAM amountgreg@boss02:~/Work/qemu/qemu-spapr$
Add the missing \n to fix both issues.
Fixes: cb79224b7e "deprecate -mem-path fallback to anonymous RAM"
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <157304440026.351774.14607704217028190097.stgit@bahia.lan>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Now all the users of ptimers have converted to the transaction-based
API, we can remove ptimer_init_with_bh() and all the code paths
that are used only by bottom-half based ptimers, and tidy up the
documentation comments to consider the transaction-based API the
only possibility.
The code changes result from:
* s->bh no longer exists
* s->callback is now always non-NULL
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191025142411.17085-1-peter.maydell@linaro.org
This function isn't used anymore.
This reverts commit 22ec3283ef.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Esteban Bosse <estebanbosse@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
- use --enable-plugins @ configure
- low impact introspection (-plugin empty.so to measure overhead)
- plugins cannot alter guest state
- example plugins included in source tree (tests/plugins)
- -d plugin to enable plugin output in logs
- check-tcg runs extra tests when plugins enabled
- documentation in docs/devel/plugins.rst
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAl23BZMACgkQ+9DbCVqe
KkRPegf/QHygZ4ER2jOaWEookxiOEcik+dzQKVGNqLNXeMLvo5fGjGVpFoFxSgfv
ZvCAL4xbW44zsYlVfh59tfn4Tu9qK7s7/qM3WXpHsmuvEuhoWef0Lt2jSe+D46Rs
KeG/aX+rHLUR8rr9eCgE+1/MQmxPUj3VUonkUpNkk2ebBbSNoLSOudB4DD9Vcyl7
Pya1kPvA6W9bwI20ZSWihE7flg13o62Pp+LgAFLrsfxXOxOMkPrU8Pp+B0Dvr+hL
5Oh0clZLhiRi75x+KVGZ90TVsoftdjYoOWGMOudS/+NNmqKT1NTLm0K1WJYyRMQ1
V0ne4/OcGNq7x8gcOx/xs09ADu5/VA==
=UXR/
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/stsquad/tags/pull-tcg-plugins-281019-4' into staging
TCG Plugins initial implementation
- use --enable-plugins @ configure
- low impact introspection (-plugin empty.so to measure overhead)
- plugins cannot alter guest state
- example plugins included in source tree (tests/plugins)
- -d plugin to enable plugin output in logs
- check-tcg runs extra tests when plugins enabled
- documentation in docs/devel/plugins.rst
# gpg: Signature made Mon 28 Oct 2019 15:13:23 GMT
# gpg: using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [full]
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44
* remotes/stsquad/tags/pull-tcg-plugins-281019-4: (57 commits)
travis.yml: enable linux-gcc-debug-tcg cache
MAINTAINERS: add me for the TCG plugins code
scripts/checkpatch.pl: don't complain about (foo, /* empty */)
.travis.yml: add --enable-plugins tests
include/exec: wrap cpu_ldst.h in CONFIG_TCG
accel/stubs: reduce headers from tcg-stub
tests/plugin: add hotpages to analyse memory access patterns
tests/plugin: add instruction execution breakdown
tests/plugin: add a hotblocks plugin
tests/tcg: enable plugin testing
tests/tcg: drop test-i386-fprem from TESTS when not SLOW
tests/tcg: move "virtual" tests to EXTRA_TESTS
tests/tcg: set QEMU_OPTS for all cris runs
tests/tcg/Makefile.target: fix path to config-host.mak
tests/plugin: add sample plugins
linux-user: support -plugin option
vl: support -plugin option
plugin: add qemu_plugin_outs helper
plugin: add qemu_plugin_insn_disas helper
plugin: expand the plugin_init function to include an info block
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In "b06424de62 migration: Disable hotplug/unplug during migration" we
added a check to disable unplug for all devices until we have figured
out what works. For failover primary devices qdev_unplug() is called
from the migration handler, i.e. during migration.
This patch adds a flag to DeviceState which is set to false for all
devices and makes an exception for PCI devices that are also
primary devices in a failover pair.
Signed-off-by: Jens Freimann <jfreimann@redhat.com>
Message-Id: <20191029114905.6856-8-jfreimann@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This adds support for hiding a device to the qbus and qdev APIs. The
first user of this will be the virtio-net failover feature but the API
introduced with this patch could be used to implement other features as
well, for example hiding pci devices when a pci bus is powered off.
qdev_device_add() is modified to check for a failover_pair_id
argument in the option string. A DeviceListener callback
should_be_hidden() is added. It can be used by a standby device to
inform qdev that this device should not be added now. The standby device
handler can store the device options to plug the device in at a later
point in time.
One reason for hiding the device is that we don't want to expose both
devices to the guest kernel until the respective virtio feature bit
VIRTIO_NET_F_STANDBY was negotiated and we know that the devices will be
handled correctly by the guest.
More information on the kernel feature this is using:
https://www.kernel.org/doc/html/latest/networking/net_failover.html
An example where the primary device is a vfio-pci device and the standby
device is a virtio-net device:
A device is hidden when it has an "failover_pair_id" option, e.g.
-device virtio-net-pci,...,failover=on,...
-device vfio-pci,...,failover_pair_id=net1,...
Signed-off-by: Jens Freimann <jfreimann@redhat.com>
Message-Id: <20191029114905.6856-2-jfreimann@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Add MachineClass::auto_enable_numa field. When it is true, a NUMA node
is expected to be created implicitly.
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20190905083238.1799-1-tao3.xu@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Provide the new transaction-based API. If a ptimer is created
using ptimer_init() rather than ptimer_init_with_bh(), then
instead of providing a QEMUBH, it provides a pointer to the
callback function directly, and has opted into the transaction
API. All calls to functions which modify ptimer state:
- ptimer_set_period()
- ptimer_set_freq()
- ptimer_set_limit()
- ptimer_set_count()
- ptimer_run()
- ptimer_stop()
must be between matched calls to ptimer_transaction_begin()
and ptimer_transaction_commit(). When ptimer_transaction_commit()
is called it will evaluate the state of the timer after all the
changes in the transaction, and call the callback if necessary.
In the old API the individual update functions generally would
call ptimer_trigger() immediately, which would schedule the QEMUBH.
In the new API the update functions will instead defer the
"set s->next_event and call ptimer_reload()" work to
ptimer_transaction_commit().
Because ptimer_trigger() can now immediately call into the
device code which may then call other ptimer functions that
update ptimer_state fields, we must be more careful in
ptimer_reload() not to cache fields from ptimer_state across
the ptimer_trigger() call. (This was harmless with the QEMUBH
mechanism as the BH would not be invoked until much later.)
We use assertions to check that:
* the functions modifying ptimer state are not called outside
a transaction block
* ptimer_transaction_begin() and _commit() calls are paired
* the transaction API is not used with a QEMUBH ptimer
There is some slight repetition of code:
* most of the set functions have similar looking "if s->bh
call ptimer_reload, otherwise set s->need_reload" code
* ptimer_init() and ptimer_init_with_bh() have similar code
We deliberately don't try to avoid this repetition, because
it will all be deleted when the QEMUBH version of the API
is removed.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191008171740.9679-3-peter.maydell@linaro.org
Currently the ptimer design uses a QEMU bottom-half as its
mechanism for calling back into the device model using the
ptimer when the timer has expired. Unfortunately this design
is fatally flawed, because it means that there is a lag
between the ptimer updating its own state and the device
callback function updating device state, and guest accesses
to device registers between the two can return inconsistent
device state.
We want to replace the bottom-half design with one where
the guest device's callback is called either immediately
(when the ptimer triggers by timeout) or when the device
model code closes a transaction-begin/end section (when the
ptimer triggers because the device model changed the
ptimer's count value or other state). As the first step,
rename ptimer_init() to ptimer_init_with_bh(), to free up
the ptimer_init() name for the new API. We can then convert
all the ptimer users away from ptimer_init_with_bh() before
removing it entirely.
(Commit created with
git grep -l ptimer_init | xargs sed -i -e 's/ptimer_init/ptimer_init_with_bh/'
and three overlong lines folded by hand.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191008171740.9679-2-peter.maydell@linaro.org
Both, "rom->addr" and "addr" are derived from the binary image
that can be loaded with the "-kernel" paramer. The code in
rom_copy() then calculates:
d = dest + (rom->addr - addr);
and uses "d" as destination in a memcpy() some lines later. Now with
bad kernel images, it is possible that rom->addr is smaller than addr,
thus "rom->addr - addr" gets negative and the memcpy() then tries to
copy contents from the image to a bad memory location. This could
maybe be used to inject code from a kernel image into the QEMU binary,
so we better fix it with an additional sanity check here.
Cc: qemu-stable@nongnu.org
Reported-by: Guangming Liu
Buglink: https://bugs.launchpad.net/qemu/+bug/1844635
Message-Id: <20190925130331.27825-1-thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
With the merge of notdirty handling into store_helper,
the last user of cpu->mem_io_vaddr was removed.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>