This pull request supersedes the one from 2018-12-13.
This is a revised first ppc pull request for qemu-4.0. Highlights
are:
* Most of the code for the POWER9 "XIVE" interrupt controller
(not complete yet, but we're getting there)
* A number of g_new vs. g_malloc cleanups
* Some IRQ wiring cleanups
* A fix for how we advertise NUMA nodes to the guest for pseries
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEdfRlhq5hpmzETofcbDjKyiDZs5IFAlwce1QACgkQbDjKyiDZ
s5JqlhAAhES3+UoNHa/eTf/o2OOgIgZ3A2FVP1kGQbb3Q415hxx1blpYkBDk6sUg
POEgwbj8QMvJ8npOICb2NnHNsAhKRfgGxx/lqVgxLPTDdwGBq7Jr1lfhyX4D99WD
C2oLtJWvQrA7yIsDzurMjJpFvw8SYSogppom4lqE5667pm6U0j7JggFJkwIo+VAj
jzl706vvB6/EL3PHZ8eCzsxT2oRpxxMStE3lJ1JPpKc60mFb5gkXMk3hura1L8Ez
t4NEN9I4ePivXlh6YYDp7Pv5l9JSzKV7Uu8xrYeMdz33e0jUWyoERrdhM51mI4s1
WGoQm6eL6p0jngAUPYtAdIGC6ZGaCMT5rkoDZ4K+us94kVvdqzWjQyRNp84GpQq0
Z/sxJaTSK2DZMnQL3LE19upk7XkB5uBgnjs5T5FcFia7bIDG3p8MY4VwIM4dRum9
WuirEUJRKg28eTTnuK9NQX2+MEnrRWc/FSNaBLjxrijD4C4jHogXTpssNprYnkV7
HgkQ2MaidcnNLftfOUeBr0aTx+rGqtUB56Xas1UK+WqykKVfRdZ6hnbRg30FJvQ/
X4SIc/QZcLcA78C/SvCuXa1uqWqlrZMhN2e5r+eiEXaFYyriWgMYe9w+mXKbrTsb
ZPkbax6xz1esqOZ15ytCTneQONyhMXy5iFDfb0khO4DO0uXq4dA=
=IN1s
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-4.0-20181221' into staging
ppc patch queue 2018-12-21
This pull request supersedes the one from 2018-12-13.
This is a revised first ppc pull request for qemu-4.0. Highlights
are:
* Most of the code for the POWER9 "XIVE" interrupt controller
(not complete yet, but we're getting there)
* A number of g_new vs. g_malloc cleanups
* Some IRQ wiring cleanups
* A fix for how we advertise NUMA nodes to the guest for pseries
# gpg: Signature made Fri 21 Dec 2018 05:34:12 GMT
# gpg: using RSA key 6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-4.0-20181221: (40 commits)
MAINTAINERS: PPC: add a XIVE section
spapr: change default CPU type to POWER9
spapr: introduce an 'ic-mode' machine option
spapr: add an extra OV5 field to the sPAPR IRQ backend
spapr: add a 'reset' method to the sPAPR IRQ backend
spapr: extend the sPAPR IRQ backend for XICS migration
spapr: allocate the interrupt thread context under the CPU core
spapr: add device tree support for the XIVE exploitation mode
spapr: add hcalls support for the XIVE exploitation interrupt mode
spapr: introduce a new machine IRQ backend for XIVE
spapr-iommu: Always advertise the maximum possible DMA window size
spapr/xive: use the VCPU id as a NVT identifier
spapr/xive: introduce a XIVE interrupt controller
ppc/xive: notify the CPU when the interrupt priority is more privileged
ppc/xive: introduce a simplified XIVE presenter
ppc/xive: introduce the XIVE interrupt thread context
ppc/xive: add support for the END Event State Buffers
Changes requirement for "vsubsbs" instruction
spapr: export and rename the xics_max_server_number() routine
spapr: introduce a spapr_irq_init() routine
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
VTD fixes
IR and split irqchip are now the default for Q35
ACPI refactoring
hotplug refactoring
new names for virtio devices
multiple pcie link width/speeds
PCI fixes
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJcG967AAoJECgfDbjSjVRpUlMH/1wy8WW7Br/4JxlWUPsfZTqZ
0Lg2n59wuFzRVS+gLotp6bUaJGR9xn9fKjI1wfD59oVrDTKyauuw82v0OityEs33
ZquFecuJvP6k5N40DkqA9YJjKSw7nUmLrsyrD0t2H43npikP2aD9f6yootrr3oVV
nPwBvyT9ykIBQc7IzsHDiLw3EPdIf+9IR7+l+PLptzV0zK43Jfwi/nzHIQq00UMz
eLM/ejQLIx4BZJnYGS0Cy6v3K7cS3k45LHDY0cGc0id2jHFN2vdQyHCF9I1ps72q
pSlhMaLEwvQSYyre6iFTG5uuvyIPWv3LOkaBEwMSA5B/HXuEb2RKHThYzS9dc68=
=OwW7
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pci, pc, virtio: fixes, features
VTD fixes
IR and split irqchip are now the default for Q35
ACPI refactoring
hotplug refactoring
new names for virtio devices
multiple pcie link width/speeds
PCI fixes
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Thu 20 Dec 2018 18:26:03 GMT
# gpg: using RSA key 281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream: (44 commits)
x86-iommu: turn on IR by default if proper
x86-iommu: switch intr_supported to OnOffAuto type
q35: set split kernel irqchip as default
pci: Adjust PCI config limit based on bus topology
spapr_pci: perform unplug via the hotplug handler
pci/shpc: perform unplug via the hotplug handler
pci: Reuse pci-bridge hotplug handler handlers for pcie-pci-bridge
pci/pcie: perform unplug via the hotplug handler
pci/pcihp: perform unplug via the hotplug handler
pci/pcihp: overwrite hotplug handler recursively from the start
pci/pcihp: perform check for bus capability in pre_plug handler
s390x/pci: rename hotplug handler callbacks
pci/shpc: rename hotplug handler callbacks
pci/pcie: rename hotplug handler callbacks
hw/i386: Remove deprecated machines pc-0.10 and pc-0.11
hw: acpi: Remove AcpiRsdpDescriptor and fix tests
hw: acpi: Export and share the ARM RSDP build
hw: arm: Support both legacy and current RSDP build
hw: arm: Convert the RSDP build to the buid_append_foo() API
hw: arm: Carry RSDP specific data through AcpiRsdpData
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This option is used to select the interrupt controller mode (XICS or
XIVE) with which the machine will operate. XICS being the default
mode for now.
When running a machine with the XIVE interrupt mode backend, the guest
OS is required to have support for the XIVE exploitation mode. In the
case of legacy OS, the mode selected by CAS should be XICS and the OS
should fail to boot. However, QEMU could possibly detect it, terminate
the boot process and reset to stop in the SLOF firmware. This is not
yet handled.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The interrupt modes supported by the hypervisor are advertised to the
guest with new bits definitions of the option vector 5 of property
"ibm,arch-vec-5-platform-support. The byte 23 bits 0-1 of the OV5 are
defined as follow :
0b00 PAPR 2.7 and earlier (Legacy systems)
0b01 XIVE Exploitation mode only
0b10 Either available
If the client/guest selects the XIVE interrupt mode, it informs the
hypervisor by returning the value 0b01 in byte 23 bits 0-1. A 0b00
value indicates the use of the XICS interrupt mode (Legacy systems).
The sPAPR IRQ backend is extended with these definitions and the
values are directly used to populate the "ibm,arch-vec-5-platform-support"
property. The interrupt mode is advertised under TCG and under KVM.
Although a KVM XIVE device is not yet available, the machine can still
operate with kernel_irqchip=off. However, we apply a restriction on
the CPU which is required to be a POWER9 when a XIVE interrupt
controller is in use.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
For the time being, the XIVE reset handler updates the OS CAM line of
the vCPU as it is done under a real hypervisor when a vCPU is
scheduled to run on a HW thread. This will let the XIVE presenter
engine find a match among the NVTs dispatched on the HW threads.
This handler will become even more useful when we introduce the
machine supporting both interrupt modes, XIVE and XICS. In this
machine, the interrupt mode is chosen by the CAS negotiation process
and activated after a reset.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
[dwg: Fix style nits]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Introduce a new sPAPR IRQ handler to handle resend after migration
when the machine is using a KVM XICS interrupt controller model.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Each interrupt mode has its own specific interrupt presenter object,
that we store under the CPU object, one for XICS and one for XIVE.
Extend the sPAPR IRQ backend with a new handler to support them both.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The XIVE interface for the guest is described in the device tree under
the "interrupt-controller" node. A couple of new properties are
specific to XIVE :
- "reg"
contains the base address and size of the thread interrupt
managnement areas (TIMA), for the User level and for the Guest OS
level. Only the Guest OS level is taken into account today.
- "ibm,xive-eq-sizes"
the size of the event queues. One cell per size supported, contains
log2 of size, in ascending order.
- "ibm,xive-lisn-ranges"
the IRQ interrupt number ranges assigned to the guest for the IPIs.
and also under the root node :
- "ibm,plat-res-int-priorities"
contains a list of priorities that the hypervisor has reserved for
its own use. OPAL uses the priority 7 queue to automatically
escalate interrupts for all other queues (DD2.X POWER9). So only
priorities [0..6] are allowed for the guest.
Extend the sPAPR IRQ backend with a new handler to populate the DT
with the appropriate "interrupt-controller" node.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
[dwg: Fix style nits]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The different XIVE virtualization structures (sources and event queues)
are configured with a set of Hypervisor calls :
- H_INT_GET_SOURCE_INFO
used to obtain the address of the MMIO page of the Event State
Buffer (ESB) entry associated with the source.
- H_INT_SET_SOURCE_CONFIG
assigns a source to a "target".
- H_INT_GET_SOURCE_CONFIG
determines which "target" and "priority" is assigned to a source
- H_INT_GET_QUEUE_INFO
returns the address of the notification management page associated
with the specified "target" and "priority".
- H_INT_SET_QUEUE_CONFIG
sets or resets the event queue for a given "target" and "priority".
It is also used to set the notification configuration associated
with the queue, only unconditional notification is supported for
the moment. Reset is performed with a queue size of 0 and queueing
is disabled in that case.
- H_INT_GET_QUEUE_CONFIG
returns the queue settings for a given "target" and "priority".
- H_INT_RESET
resets all of the guest's internal interrupt structures to their
initial state, losing all configuration set via the hcalls
H_INT_SET_SOURCE_CONFIG and H_INT_SET_QUEUE_CONFIG.
- H_INT_SYNC
issue a synchronisation on a source to make sure all notifications
have reached their queue.
Calls that still need to be addressed :
H_INT_SET_OS_REPORTING_LINE
H_INT_GET_OS_REPORTING_LINE
See the code for more documentation on each hcall.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
[dwg: Folded in fix for field accessors]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The XIVE IRQ backend uses the same layout as the new XICS backend but
covers the full range of the IRQ number space. The IRQ numbers for the
CPU IPIs are allocated at the bottom of this space, below 4K, to
preserve compatibility with XICS which does not use that range.
This should be enough given that the maximum number of CPUs is 1024
for the sPAPR machine under QEMU. For the record, the biggest POWER8
or POWER9 system has a maximum of 1536 HW threads (16 sockets, 192
cores, SMT8).
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
sPAPRXive models the XIVE interrupt controller of the sPAPR machine.
It inherits from the XiveRouter and provisions storage for the routing
tables :
- Event Assignment Structure (EAS)
- Event Notification Descriptor (END)
The sPAPRXive model incorporates an internal XiveSource for the IPIs
and for the interrupts of the virtual devices of the guest. This model
is consistent with XIVE architecture which also incorporates an
internal IVSE for IPIs and accelerator interrupts in the IVRE
sub-engine.
The sPAPRXive model exports two memory regions, one for the ESB
trigger and management pages used to control the sources and one for
the TIMA pages. They are mapped by default at the addresses found on
chip 0 of a baremetal system. This is also consistent with the XIVE
architecture which defines a Virtualization Controller BAR for the
internal IVSE ESB pages and a Thread Managment BAR for the TIMA.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
[dwg: Fold in field accessor fixes]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The last sub-engine of the XIVE architecture is the Interrupt
Virtualization Presentation Engine (IVPE). On HW, the IVRE and the
IVPE share elements, the Power Bus interface (CQ), the routing table
descriptors, and they can be combined in the same HW logic. We do the
same in QEMU and combine both engines in the XiveRouter for
simplicity.
When the IVRE has completed its job of matching an event source with a
Notification Virtual Target (NVT) to notify, it forwards the event
notification to the IVPE sub-engine. The IVPE scans the thread
interrupt contexts of the Notification Virtual Targets (NVT)
dispatched on the HW processor threads and if a match is found, it
signals the thread. If not, the IVPE escalates the notification to
some other targets and records the notification in a backlog queue.
The IVPE maintains the thread interrupt context state for each of its
NVTs not dispatched on HW processor threads in the Notification
Virtual Target table (NVTT).
The model currently only supports single NVT notifications.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
[dwg: Folded in fix for field accessors]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Each POWER9 processor chip has a XIVE presenter that can generate four
different exceptions to its threads:
- hypervisor exception,
- O/S exception
- Event-Based Branch (EBB)
- msgsnd (doorbell).
Each exception has a state independent from the others called a Thread
Interrupt Management context. This context is a set of registers which
lets the thread handle priority management and interrupt acknowledgment
among other things. The most important ones being :
- Interrupt Priority Register (PIPR)
- Interrupt Pending Buffer (IPB)
- Current Processor Priority (CPPR)
- Notification Source Register (NSR)
These registers are accessible through a specific MMIO region, called
the Thread Interrupt Management Area (TIMA), four aligned pages, each
exposing a different view of the registers. First page (page address
ending in 0b00) gives access to the entire context and is reserved for
the ring 0 view for the physical thread context. The second (page
address ending in 0b01) is for the hypervisor, ring 1 view. The third
(page address ending in 0b10) is for the operating system, ring 2
view. The fourth (page address ending in 0b11) is for user level, ring
3 view.
The thread interrupt context is modeled with a XiveTCTX object
containing the values of the different exception registers. The TIMA
region is mapped at the same address for each CPU.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The Event Notification Descriptor (END) XIVE structure also contains
two Event State Buffers providing further coalescing of interrupts,
one for the notification event (ESn) and one for the escalation events
(ESe). A MMIO page is assigned for each to control the EOI through
loads only. Stores are not allowed.
The END ESBs are modeled through an object resembling the 'XiveSource'
It is stateless as the END state bits are backed into the XiveEND
structure under the XiveRouter and the MMIO accesses follow the same
rules as for the XiveSource ESBs.
END ESBs are not supported by the Linux drivers neither on OPAL nor on
sPAPR. Nevetherless, it provides a mean to study the question in the
future and validates a bit more the XIVE model.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
[dwg: Fold in a later fix for field access]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The XIVE sPAPR IRQ backend will use it to define the number of ENDs of
the IC controller.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Initialize the MSI bitmap from it as this will be necessary for the
sPAPR IRQ backend for XIVE.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
To complete the event routing, the IVRE sub-engine uses a second table
containing Event Notification Descriptor (END) structures.
An END specifies on which Event Queue (EQ) the event notification
data, defined in the associated EAS, should be posted when an
exception occurs. It also defines which Notification Virtual Target
(NVT) should be notified.
The Event Queue is a memory page provided by the O/S defining a
circular buffer, one per server and priority couple, containing Event
Queue entries. These are 4 bytes long, the first bit being a
'generation' bit and the 31 following bits the END Data field. They
are pulled by the O/S when the exception occurs.
The END Data field is a way to set an invariant logical event source
number for an IRQ. On sPAPR machines, it is set with the
H_INT_SET_SOURCE_CONFIG hcall when the EISN flag is used.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
[dwg: Fold in a later fix from Cédric fixing field accessors]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The XiveRouter models the second sub-engine of the XIVE architecture :
the Interrupt Virtualization Routing Engine (IVRE).
The IVRE handles event notifications of the IVSE and performs the
interrupt routing process. For this purpose, it uses a set of tables
stored in system memory, the first of which being the Event Assignment
Structure (EAS) table.
The EAT associates an interrupt source number with an Event Notification
Descriptor (END) which will be used in a second phase of the routing
process to identify a Notification Virtual Target.
The XiveRouter is an abstract class which needs to be inherited from
to define a storage for the EAT, and other upcoming tables.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
[dwg: Folded in parts of a later fix by Cédric fixing field access]
[dwg: Fix style nits]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The XiveNotifier offers a simple interface, between the XiveSource
object and the main interrupt controller of the machine. It will
forward event notifications to the XIVE Interrupt Virtualization
Routing Engine (IVRE).
Signed-off-by: Cédric Le Goater <clg@kaod.org>
[dwg: Adjust type name string for XiveNotifier]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The 'sent' status of the LSI interrupt source is modeled with the 'P'
bit of the ESB and the assertion status of the source is maintained
with an extra bit under the main XiveSource object. The type of the
source is stored in the same array for practical reasons.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
[dwg: Fix style nit]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The first sub-engine of the overall XIVE architecture is the Interrupt
Virtualization Source Engine (IVSE). An IVSE can be integrated into
another logic, like in a PCI PHB or in the main interrupt controller
to manage IPIs.
Each IVSE instance is associated with an Event State Buffer (ESB) that
contains a two bit state entry for each possible event source. When an
event is signaled to the IVSE, by MMIO or some other means, the
associated interrupt state bits are fetched from the ESB and
modified. Depending on the resulting ESB state, the event is forwarded
to the IVRE sub-engine of the controller doing the routing.
Each supported ESB entry is associated with either a single or a
even/odd pair of pages which provides commands to manage the source:
to EOI, to turn off the source for instance.
On a sPAPR machine, the O/S will obtain the page address of the ESB
entry associated with a source and its characteristic using the
H_INT_GET_SOURCE_INFO hcall. On PowerNV, a similar OPAL call is used.
The xive_source_notify() routine is in charge forwarding the source
event notification to the routing engine. It will be filled later on.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The OpenPIC have 5 outputs per connected CPU. The machine init code hence
needs a bi-dimensional array (smp_cpu lines, 5 columns) to wire up the irqs
between the PIC and the CPUs.
The current code first allocates an array of smp_cpus pointers to qemu_irq
type, then it allocates another array of smp_cpus * 5 qemu_irq and fills the
first array with pointers to each line of the second array. This is rather
convoluted.
Simplify the logic by introducing a structured type that describes all the
OpenPIC outputs for a single CPU, ie, fixed size of 5 qemu_irq, and only
allocate a smp_cpu sized array of those.
This also allows to use g_new(T, n) instead of g_malloc(sizeof(T) * n)
as recommended in HACKING.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The watermark bits are set in the interrupt pending register according
to the configuration of txcnt and rxcnt in the txctrl and rxctrl
registers.
Since the UART TX does not implement a FIFO, the txwm bit is set as long
as the TX watermark level is greater than zero.
Signed-off-by: Nathaniel Graff <nathaniel.graff@sifive.com>
Reviewed-by: Michael Clark <mjc@sifive.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Connect the gpex PCIe device based on the device tree included in the
HiFive Unleashed ROM.
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Increase the number of interrupts to match the HiFive Unleashed board.
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Switch the intr_supported variable from a boolean to OnOffAuto type so
that we can know whether the user specified it or not. With that
we'll have a chance to help the user to choose more wisely where
possible. Introduce x86_iommu_ir_supported() to mask these changes.
No functional change at all.
Signed-off-by: Peter Xu <peterx@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Starting from QEMU 4.0, let's specify "split" as the default value for
kernel-irqchip.
So for QEMU>=4.0 we'll have: allowed=Y,required=N,split=Y
for QEMU<=3.1 we'll have: allowed=Y,required=N,split=N
(omitting all the "kernel_irqchip_" prefix)
Note that this will let the default q35 machine type to depend on
Linux version 4.4 or newer because that's where split irqchip is
introduced in kernel. But it's fine since we're boosting supported
Linux version for QEMU 4.0 to around Linux 4.5. For more information
please refer to the discussion on AMD's RDTSCP:
https://lore.kernel.org/lkml/20181210181328.GA762@zn.tnic/
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Introduce and use the "unplug" callback.
This is a preparation for multi-stage hotplug handlers, whereby the bus
hotplug handler is overwritten by the machine hotplug handler. This handler
will then pass control to the bus hotplug handler. So to get this running
cleanly, we also have to make sure to go via the hotplug handler chain when
actually unplugging a device after an unplug request. Lookup the hotplug
handler and call "unplug".
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
These functions are essentially the same, we only have to use
object_get_typename() for reporting errors. So let's share the
implementation of hotplug handler callbacks.
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Introduce and use the "unplug" callback.
This is a preparation for multi-stage hotplug handlers, whereby the bus
hotplug handler is overwritten by the machine hotplug handler. This handler
will then pass control to the bus hotplug handler. So to get this running
cleanly, we also have to make sure to go via the hotplug handler chain when
actually unplugging a device after an unplug request. Lookup the hotplug
handler and call "unplug".
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Introduce and use the "unplug" callback.
This is a preparation for multi-stage hotplug handlers, whereby the bus
hotplug handler is overwritten by the machine hotplug handler. This handler
will then pass control to the bus hotplug handler. So to get this running
cleanly, we also have to make sure to go via the hotplug handler chain when
actually unplugging a device after an unplug request. Lookup the hotplug
handler and call "unplug".
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Perform the check in the pre_plug handler. In addition, we need the
capability only if the device is actually hotplugged (and not created
during machine initialization). This is a preparation for coldplugging
pci devices via that hotplug handler.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The callbacks are also called for cold plugged devices. Drop the "hot"
to better match the actual callback names.
While at it, also rename shpc_device_hotplug_common() to
shpc_device_plug_common().
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The callbacks are also called for cold plugged devices. Drop the "hot"
to better match the actual callback names.
While at it, also rename pcie_cap_slot_hotplug_common() to
pcie_cap_slot_plug_common().
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The only remaining AcpiRsdpDescriptor users are the ACPI utils for the
BIOS table tests.
We remove that dependency and can thus remove the structure itself.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Since "s390x/tcg: avoid overflows in time2tod/tod2time", the
time2tod() function tries to deal with the 9 uppermost bits in the
time value, but uses the wrong mask for this: 0xff80000000000000 should
be used instead of 0xff10000000000000 here.
Fixes: 14055ce53c
Cc: qemu-stable@nongnu.org
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1544792887-14575-1-git-send-email-thuth@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
[CH: tweaked commit message]
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.
This commit was created with scripts/clean-includes, with the changes
to the following files manually reverted:
contrib/libvhost-user/libvhost-user-glib.h
contrib/libvhost-user/libvhost-user.c
contrib/libvhost-user/libvhost-user.h
linux-user/mips64/cpu_loop.c
linux-user/mips64/signal.c
linux-user/sparc64/cpu_loop.c
linux-user/sparc64/signal.c
linux-user/x86_64/cpu_loop.c
linux-user/x86_64/signal.c
target/s390x/gen-features.c
tests/migration/s390x/a-b-bios.c
tests/test-rcu-simpleq.c
tests/test-rcu-tailq.c
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20181204172535.2799-1-armbru@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Acked-by: Yuval Shaia <yuval.shaia@oracle.com>
Acked-by: Viktor Prutyanov <viktor.prutyanov@phystech.edu>
Now that build_rsdp() supports building both legacy and current RSDP
tables, we can move it to a generic folder (hw/acpi) and have the i386
ACPI code reuse it in order to reduce code duplication.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
That will allow us to generalize the ARM build_rsdp() routine to support
both legacy RSDP (The current i386 implementation) and extended RSDP
(The ARM implementation).
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Support DMA read/write draining should be easy for existing VT-d
emulation since the emulation itself does not have any request queue
there so we don't need to do anything to flush the un-commited queue.
What we need to do is to declare the support.
These capabilities are required to pass Windows SVVP test program. It
is verified that when with parameters "x-aw-bits=48,caching-mode=off"
we can pass the Windows SVVP test with this patch applied. Otherwise
we'll fail with:
IOMMU[0] - DWD (DMA write draining) not supported
IOMMU[0] - DWD (DMA read draining) not supported
Segment 0 has no DMA remapping capable IOMMU units
However since these bits are not declared support for QEMU<=3.1, we'll
need a compatibility bit for it and we turn this on by default only
for QEMU>=4.0.
Please refer to VT-d spec 6.5.4 for more information.
CC: Yu Wang <wyu@redhat.com>
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1654550
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Change the default speed and width for new machine types to the
fastest and widest currently supported. This should be compatible to
the PCIe 4.0 spec. Pre-QEMU-4.0 machine types remain at 2.5GT/s, x1
width.
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add fields allowing the PCIe link speed and width of a PCIESlot to
be configured, with an instance_post_init callback on the root port
parent class to set defaults. This allows child classes to set these
via properties or via their own instance_init callback, without
requiring all implementions to support arbitrary user selected values.
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Tested-by: Geoffrey McRae <geoff@hostfission.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Create properties to be able to define speeds and widths for PCIe
links. The only tricky bit here is that our get and set callbacks
translate from the fixed QAPI automagic enums to those we define
in PCI code to represent the actual register segment value.
Cc: Eric Blake <eblake@redhat.com>
Tested-by: Geoffrey McRae <geoff@hostfission.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The PCIe link speed and width between a downstream device and its
upstream port is negotiated on real hardware and susceptible to
dynamic changes due to signal issues and power management. In the
emulated device case there is no real hardware link, but we still
might wish to have some consistency between endpoint and downstream
port via a virtual negotiation. There is of course a real link for
assigned devices and this same virtual negotiation allows the
downstream port to match the endpoint, synchronizing on every read
to support underlying physical hardware dynamically adjusting the
link.
This negotiation is intentionally unidirectional for compatibility.
If the endpoint exceeds the capabilities of the downstream port or
there is no endpoint device, the downstream port reports negotiation
to its maximum speed and width, matching the previous case where
negotiation was absent. De-tuning the endpoint to match a virtual
link doesn't seem to benefit anyone and is a condition we've thus
far reported without functional issues.
Note that PCI_EXP_LNKSTA is already ignored for migration
compatibility via pcie_cap_v1_fill().
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Tested-by: Geoffrey McRae <geoff@hostfission.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In preparation for reporting higher virtual link speeds and widths,
create enums and macros to help us manage them.
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Tested-by: Geoffrey McRae <geoff@hostfission.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
SMBIOS is just another firmware interface used by some QEMU models.
We will later introduce more firmware interfaces in this subdirectory.
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
All the consumers of "hw/smbios/ipmi.h" are located in hw/smbios/.
There is no need to have this include publicly exposed,
reduce the visibility by moving it in hw/smbios/.
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The qmp/hmp command 'system_wakeup' is simply a direct call to
'qemu_system_wakeup_request' from vl.c. This function verifies if
runstate is SUSPENDED and if the wake up reason is valid before
proceeding. However, no error or warning is thrown if any of those
pre-requirements isn't met. There is no way for the caller to
differentiate between a successful wakeup or an error state caused
when trying to wake up a guest that wasn't suspended.
This means that system_wakeup is silently failing, which can be
considered a bug. Adding error handling isn't an API break in this
case - applications that didn't check the result will remain broken,
the ones that check it will have a chance to deal with it.
Adding to that, the commit before previous created a new QMP API called
query-current-machine, with a new flag called wakeup-suspend-support,
that indicates if the guest has the capability of waking up from suspended
state. Although such guest will never reach SUSPENDED state and erroring
it out in this scenario would suffice, it is more informative for the user
to differentiate between a failure because the guest isn't suspended versus
a failure because the guest does not have support for wake up at all.
All this considered, this patch changes qmp_system_wakeup to check if
the guest is capable of waking up from suspend, and if it is suspended.
After this patch, this is the output of system_wakeup in a guest that
does not have wake-up from suspend support (ppc64):
(qemu) system_wakeup
wake-up from suspend is not supported by this guest
(qemu)
And this is the output of system_wakeup in a x86 guest that has the
support but isn't suspended:
(qemu) system_wakeup
Unable to wake up: guest is not in suspended state
(qemu)
Reported-by: Balamuruhan S <bala24@linux.vnet.ibm.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20181205194701.17836-4-danielhb413@gmail.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
When issuing the qmp/hmp 'system_wakeup' command, what happens in a
nutshell is:
- qmp_system_wakeup_request set runstate to RUNNING, sets a wakeup_reason
and notify the event
- in the main_loop, all vcpus are paused, a system reset is issued, all
subscribers of wakeup_notifiers receives a notification, vcpus are then
resumed and the wake up QAPI event is fired
Note that this procedure alone doesn't ensure that the guest will awake
from SUSPENDED state - the subscribers of the wake up event must take
action to resume the guest, otherwise the guest will simply reboot. At
this moment, only the ACPI machines via acpi_pm1_cnt_init and xen_hvm_init
have wake-up from suspend support.
However, only the presence of 'system_wakeup' is required for QGA to
support 'guest-suspend-ram' and 'guest-suspend-hybrid' at this moment.
This means that the user/management will expect to suspend the guest using
one of those suspend commands and then resume execution using system_wakeup,
regardless of the support offered in system_wakeup in the first place.
This patch creates a new API called query-current-machine [1], that holds
a new flag called 'wakeup-suspend-support' that indicates if the guest
supports wake up from suspend via system_wakeup. The machine is considered
to implement wake-up support if a call to a new 'qemu_register_wakeup_support'
is made during its init, as it is now being done inside acpi_pm1_cnt_init
and xen_hvm_init. This allows for any other machine type to declare wake-up
support regardless of ACPI state or wakeup_notifiers subscription, making easier
for newer implementations that might have their own mechanisms in the future.
This is the expected output of query-current-machine when running a x86
guest:
{"execute" : "query-current-machine"}
{"return": {"wakeup-suspend-support": true}}
Running the same x86 guest, but with the --no-acpi option:
{"execute" : "query-current-machine"}
{"return": {"wakeup-suspend-support": false}}
This is the output when running a pseries guest:
{"execute" : "query-current-machine"}
{"return": {"wakeup-suspend-support": false}}
With this extra tool, management can avoid situations where a guest
that does not have proper suspend/wake capabilities ends up in
inconsistent state (e.g.
https://github.com/open-power-host-os/qemu/issues/31).
[1] the decision of creating the query-current-machine API is based
on discussions in the QEMU mailing list where it was decided that
query-target wasn't a proper place to store the wake-up flag, neither
was query-machines because this isn't a static property of the
machine object. This new API can then be used to store other
dynamic machine properties that are scattered around the code
ATM. More info at:
https://lists.gnu.org/archive/html/qemu-devel/2018-05/msg04235.html
Reported-by: Balamuruhan S <bala24@linux.vnet.ibm.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20181205194701.17836-2-danielhb413@gmail.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Needed so the patch after next can add ShutdownCause to QMP events
SHUTDOWN and RESET.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Message-Id: <20181205110131.23049-2-d.csapak@proxmox.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
- Return success from patch_reloc
- Preserve 32-bit values as zero-extended on x86_64
- Make bswap during memory ops as optional
- Cleanup xxhash
- Revert constant pooling for tcg/sparc/
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJcFxchAAoJEGTfOOivfiFfBUcIALmEeTTRkDtY8rCX0Thegd6g
O9roAEHvSu2BS3Zd3EwA+mu5OxcL8WeZY2LYBodFlCCsl/yQ09Lv7QmxrGtX7WNx
VF96BftTxYFGVC3Xc6+Q16/dSYM4qcWLuDxAE9BAh47m9NvTjPq+9ntEJMlalIDh
My8ANyGByBZeUeBXJuNReJcsGP5eUmNyuaM+aOlMjcVJeFAtvFacwkKpJdLPDM53
feDEiKhRWCkZq1ll4yFtuVTc+dQeYfLnPk8bkJcv7UAJnYIveXZk/eJcs5/vYjCx
8aePb9PwjbYrgXJgbo8mgVhgLBmakObQa8lJvlc3IZfIMp8OK/6au3TDXDSQAts=
=4Kdn
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20181216' into staging
- Remove retranslation remenents
- Return success from patch_reloc
- Preserve 32-bit values as zero-extended on x86_64
- Make bswap during memory ops as optional
- Cleanup xxhash
- Revert constant pooling for tcg/sparc/
# gpg: Signature made Mon 17 Dec 2018 03:25:21 GMT
# gpg: using RSA key 64DF38E8AF7E215F
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>"
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A 05C0 64DF 38E8 AF7E 215F
* remotes/rth/tags/pull-tcg-20181216: (33 commits)
xxhash: match output against the original xxhash32
include: move exec/tb-hash-xx.h to qemu/xxhash.h
exec: introduce qemu_xxhash{2,4,5,6,7}
qht-bench: document -p flag
tcg: Drop nargs from tcg_op_insert_{before,after}
tcg/mips: Improve the add2/sub2 command to use TCG_TARGET_REG_BITS
tcg: Add TCG_TARGET_HAS_MEMORY_BSWAP
tcg/optimize: Optimize bswap
tcg: Clean up generic bswap64
tcg: Clean up generic bswap32
tcg/i386: Add setup_guest_base_seg for FreeBSD
tcg/i386: Precompute all guest_base parameters
tcg/i386: Assume 32-bit values are zero-extended
tcg/i386: Implement INDEX_op_extr{lh}_i64_i32 for 32-bit guests
tcg/i386: Propagate is64 to tcg_out_qemu_ld_slow_path
tcg/i386: Propagate is64 to tcg_out_qemu_ld_direct
tcg/s390x: Return false on failure from patch_reloc
tcg/ppc: Return false on failure from patch_reloc
tcg/arm: Return false on failure from patch_reloc
tcg/aarch64: Return false on failure from patch_reloc
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
These will gain some users very soon.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
This paves the way for upcoming work.
Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Change the order in which we extract a/b and c/d to
match the output of the upstream xxhash32.
Tested with:
https://github.com/cota/xxhash/tree/qemu
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Before moving them all to include/qemu/xxhash.h.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
- qcow2: Decompression worker threads
- dmg: lzfse compression support
- file-posix: Simplify delegation to worker thread
- Don't pass flags to bdrv_reopen_queue()
- iotests: make 235 work on s390 (and others)
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJcE4wNAAoJEH8JsnLIjy/WpJQP/39XmFQr/UO/Z7fsQNJD7Kbn
yUzAunMt7r7nfyuC5CP7a57apjKzbLHIbKDKrI8v2/SHysZ2zvjGx9QFCYNM44P7
XRmwd/fJJUqcyaDZDjiIHZtfSvVQB09xOjl62K9b6tVYCTztBwqVzY9uE4oA0coh
tAofAwG8vHYYxhjkPxKaftBv/GO/a9jB1Dk6DG7cX4FUm0lwEnGcT3ZmRNUBRAQ4
F0HfG+OubqljHOSR3VN3PPoienDwQOTsroqhIL4R0Jeb6I/1IVyeO56C4WYrfn9L
Tjgsu1v/te4F+7/BBICQKp5y9nNYrg6uPlC4cD/st/xZQe0oMUHEGcSESm61wOc5
bP8A5D7iiCn1c3kZXrPVyuvUQBn3fIJUOgVHQ7Oa4x2i9VcjpzQKAL2Wuu9NEgwc
Acn9lj9ey3rZwcJisCyOchn5sG/M4dYstHP8aAUafeSpAvsXje+hPKnWe0+SqxZx
btmVt6Suh205fP86w9POeNzy1la69FzF/xqe3Eohl5mEZsylL5jT0w9CfAzJSJrz
dDhgnelgQZ0/YcoEc1pqqQ8EP+9EJuIzjB7mEaCfZUmylq7mL/QvWgtjSbIr1yFG
RFvg6wTqcnrtOKoLvLSfw64QJXgDFwQ3cZ7Wl8XakZNPMfffndk9AThQxBBgofqg
XOyuW5gg3g3xzZrQswsf
=XKq9
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches:
- qcow2: Decompression worker threads
- dmg: lzfse compression support
- file-posix: Simplify delegation to worker thread
- Don't pass flags to bdrv_reopen_queue()
- iotests: make 235 work on s390 (and others)
# gpg: Signature made Fri 14 Dec 2018 10:55:09 GMT
# gpg: using RSA key 7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream: (42 commits)
block/mirror: add missing coroutine_fn annotations
iotests: make 235 work on s390 (and others)
block: Assert that flags are up-to-date in bdrv_reopen_prepare()
block: Remove assertions from update_flags_from_options()
block: Stop passing flags to bdrv_reopen_queue_child()
block: Remove flags parameter from bdrv_reopen_queue()
block: Clean up reopen_backing_file() in block/replication.c
qemu-io: Put flag changes in the options QDict in reopen_f()
block: Drop bdrv_reopen()
block: Use bdrv_reopen_set_read_only() in the mirror driver
block: Use bdrv_reopen_set_read_only() in external_snapshot_commit()
block: Use bdrv_reopen_set_read_only() in qmp_change_backing_file()
block: Use bdrv_reopen_set_read_only() in stream_start/complete()
block: Use bdrv_reopen_set_read_only() in bdrv_commit()
block: Use bdrv_reopen_set_read_only() in commit_start/complete()
block: Use bdrv_reopen_set_read_only() in bdrv_backing_update_filename()
block: Add bdrv_reopen_set_read_only()
file-posix: Avoid aio_worker() for QEMU_AIO_IOCTL
file-posix: Switch to .bdrv_co_ioctl
file-posix: Remove paio_submit_co()
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJcE0VvAAoJEDhwtADrkYZTLCkP/RvRR9iTcoM98kcFNjqBZRQa
rUbSNBavxwzutPiT40WcNhg7hc0Uaptve8oMkGcfyyTh9UyhdOe8WNPxTos96vYt
GtUhNhknGlvP4A7Zjs6KIIhl084MtPkpuPERkXZL4lgNrIw8BrFoj5hkZ3UIvItf
14oA1o6Zf9UxN1Yt12lZnG9N8t4ld5IKhkXh/FQ6OJNHz9GrhPq4A7vd4ipBRBjt
PjvXVOYCEkiHRfJ3Qv5Thk2C1xzLRFusA5ff1rju324KGPoM8oZ+xGSUVqD0hhMe
Kpzv4a6HV7SuM1fqJoZrF87VOhAO9bpxzIHUp83FhpKGDH4xqppDWYno/+9imPDA
DAHUaOeaKpX6O4ttB96jRwTEOAbq3TzPqtYiyRaXhbtCc0dKi0HxHmIpwS4KNkHK
Y3VuoTavarMfuLl2gDO+9PJhHxol8g0oYiaxXddW0svgnSM3xBTz/hGE2duStHTb
DSWDVB/oVIOyR8eWSglUnc+OOJrxSkiaJelSU730Uc6kIk7hiY8PFQiwqebsI6uq
IOABDG1/W0FkSRNl5QwXnGlD0eUzl1ySm2zvsgvJrC8ooAhzjjWdkcwtEdEYxlUj
KqH+8ZFP+mOckrW9boqYPVqOL4GzNMnK23vEoidurhyShsmiCTyk+jckiJrl/IMy
OlwA850MKVJ3W3+knR0I
=bymN
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2018-12-13-v2' into staging
QAPI patches for 2018-12-13
# gpg: Signature made Fri 14 Dec 2018 05:53:51 GMT
# gpg: using RSA key 3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg: aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653
* remotes/armbru/tags/pull-qapi-2018-12-13-v2: (32 commits)
qapi: add conditions to REPLICATION type/commands on the schema
qapi: add more conditions to SPICE
qapi: add condition to variants documentation
qapi: add 'If:' condition to struct members documentation
qapi: add 'If:' condition to enum values documentation
qapi: Add #if conditions to generated code members
qapi: add 'if' to alternate members
qapi: add 'if' to union members
qapi: Add 'if' to implicit struct members
qapi: add a dictionary form for TYPE
qapi-events: add 'if' condition to implicit event enum
qapi: add 'if' to enum members
qapi: add a dictionary form with 'name' key for enum members
qapi: improve reporting of unknown or missing keys
qapi: factor out checking for keys
tests: print enum type members more like object type members
qapi: change enum visitor and gen_enum* to take QAPISchemaMember
qapi: Do not define enumeration value explicitly
qapi: break long lines at 'data' member
qapi: rename QAPISchemaEnumType.values to .members
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* Convert various devices from sysbus init to instance_init
* Remove the now unused sysbus init support entirely
* Allow AArch64 processors to boot from a kernel placed over 4GB
* hw: arm: musicpal: drop TYPE_WM8750 in object_property_set_link()
* versal: minor fixes to virtio-mmio instantation
* arm: Implement the ARMv8.1-HPD extension
* arm: Implement the ARMv8.2-AA32HPD extension
* arm: Implement the ARMv8.1-LOR extension (as the trivial
"no limited ordering regions provided" minimum)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABCAAGBQJcEnIpAAoJEDwlJe0UNgzeLJ8P/j1KGpnnOy4Cdxal4zRd8sWF
iMMVzuzUzcrMWy0gFCHsioSxsvlAidNnPp2Vbf4wmmZnoresKMWvojPke8RWJsL3
4X80cVTYDjjwIVSvXs9SntWQmLREffPOSNlAIP2WfPq+5sjxzrytcXB1Nc7V/zKJ
9b7R1a4ea1ZET+C3c9QMf4VwAoo/jf5VzA7gE4f8ePYwKH7HluiJSDhUaUrxsnZr
ibjQCF+/4DYkI5DGKVRltR6vPcsKUJomn7ImQylIQkkyCiA3WjFJ5Mc+BHYOj3pm
UbW/sxI6ONjoW6KHwg/15R3UZFhzTkQMUHGY6n6oLosN4IoPt3c7vUtnNjtqaU1D
+EBZHdUMYnZMJp2XD1Nyv9iR0v/A9MI1ldx0fBjqPsFGx48DOKTYwBloiz+0o2z7
g3GC/Tjpcs37GrieNuJ7HB1NefNPW2Hk1xitTPegMfjO8ukg3tccCuY9KCBlAnOe
hGJsrl0NM4E/s98PEMEEgcZf/fmE2fCNZgLPAGOYXNHZku1reLg6yCIpIZSusLOd
gLmndngGZbWm39h6uBrEthnZ+3ktRe+T7ERAKsv/o2p06XWF0tbBd0AjQvnOBRgR
uYFJ416xVOYULXme+oJO0Vt6mM41UstACKCtUOkk3jmIY3xmAxGfxu6nC/p+iIR6
5djxiqi/JqccdpafWF2V
=fIbS
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20181213' into staging
target-arm queue:
* Convert various devices from sysbus init to instance_init
* Remove the now unused sysbus init support entirely
* Allow AArch64 processors to boot from a kernel placed over 4GB
* hw: arm: musicpal: drop TYPE_WM8750 in object_property_set_link()
* versal: minor fixes to virtio-mmio instantation
* arm: Implement the ARMv8.1-HPD extension
* arm: Implement the ARMv8.2-AA32HPD extension
* arm: Implement the ARMv8.1-LOR extension (as the trivial
"no limited ordering regions provided" minimum)
# gpg: Signature made Thu 13 Dec 2018 14:52:25 GMT
# gpg: using RSA key 3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg: aka "Peter Maydell <pmaydell@gmail.com>"
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE
* remotes/pmaydell/tags/pull-target-arm-20181213: (37 commits)
target/arm: Implement the ARMv8.1-LOR extension
target/arm: Use arm_hcr_el2_eff more places
target/arm: Introduce arm_hcr_el2_eff
target/arm: Implement the ARMv8.2-AA32HPD extension
target/arm: Implement the ARMv8.1-HPD extension
target/arm: Tidy scr_write
target/arm: Fix HCR_EL2.TGE check in arm_phys_excp_target_el
target/arm: Add SCR_EL3 bits up to ARMv8.5
target/arm: Add HCR_EL2 bits up to ARMv8.5
target/arm: Move id_aa64mmfr* to ARMISARegisters
hw/arm: versal: Correct the nr of IRQs to 192
hw/arm: versal: Use IRQs 111 - 118 for virtio-mmio
hw/arm: versal: Reduce number of virtio-mmio instances
hw/arm: versal: Remove bogus virtio-mmio creation
core/sysbus: remove the SysBusDeviceClass::init path
xen_backend: remove xen_sysdev_init() function
usb/tusb6010: Convert sysbus init function to realize function
timer/puv3_ost: Convert sysbus init function to realize function
timer/grlib_gptimer: Convert sysbus init function to realize function
timer/etraxfs_timer: Convert sysbus init function to realize function
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add a documentation comment for load_image_size().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20181130151712.2312-11-peter.maydell@linaro.org
The load_image() function is now no longer used anywhere, so
we can remove it completely. (Use load_image_size() or
g_file_get_contents() instead.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20181130151712.2312-10-peter.maydell@linaro.org
Currently the load_elf function in elf_ops.h uses
cpu_physical_memory_write() to write the ELF file to
memory if it is not handling it as a ROM blob. This
means we ignore the AddressSpace that the function
is passed to define where it should be loaded.
Use address_space_write() instead.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20181122172653.3413-4-peter.maydell@linaro.org
The API of cpu_physical_memory_write_rom() is odd, because it
takes an AddressSpace, unlike all the other cpu_physical_memory_*
access functions. Rename it to address_space_write_rom(), and
bring its API into line with address_space_write().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 20181122133507.30950-3-peter.maydell@linaro.org
Now that all callers are passing all flag changes as QDict options,
the flags parameter is no longer necessary, so we can get rid of it.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
No one is using this function anymore, so we can safely remove it.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Most callers of bdrv_reopen() only use it to switch a BlockDriverState
between read-only and read-write, so this patch adds a new function
that does just that.
We also want to get rid of the flags parameter in the bdrv_reopen()
API, so this function sets the "read-only" option and passes the
original flags (which will then be updated in bdrv_reopen_prepare()).
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
No real reason to keep using the callback based mechanism here when the
rest of the file-posix driver is coroutine based. Changing it brings
ioctls more in line with how other request types work.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The input visitor has some problems right now, especially
- unsigned type "Range" is used to process signed ranges, resulting in
inconsistent behavior and ugly/magical code
- uint64_t are parsed like int64_t, so big uint64_t values are not
supported and error messages are misleading
- lists/ranges of int64_t are accepted although no list is parsed and
we should rather report an error
- lists/ranges are preparsed using int64_t, making it hard to
implement uint64_t values or uint64_t lists
- types that don't support lists don't bail out
- visiting beyond the end of a list is not handled properly
- we don't actually parse lists, we parse *sets*: members are sorted,
and duplicates eliminated
So let's rewrite it by getting rid of usage of the type "Range" and
properly supporting lists of int64_t and uint64_t (including ranges of
both types), fixing the above mentioned issues.
Lists of other types are not supported and will properly report an
error. Virtual walks are now supported.
Tests have to be fixed up:
- Two BUGs were hardcoded that are fixed now
- The string-input-visitor now actually returns a parsed list and not
an ordered set.
Please note that no users/callers have to be fixed up. Candidates using
visit_type_uint16List() and friends are:
- backends/hostmem.c:host_memory_backend_set_host_nodes()
-- Code can deal with duplicates/unsorted lists
- numa.c::query_memdev()
-- via object_property_get_uint16List(), the list will still be sorted
and without duplicates (via host_memory_backend_get_host_nodes())
- qapi-visit.c::visit_type_Memdev_members()
- qapi-visit.c::visit_type_NumaNodeOptions_members()
- qapi-visit.c::visit_type_RockerOfDpaGroup_members
- qapi-visit.c::visit_type_RxFilterInfo_members()
-- Not used with string-input-visitor.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20181121164421.20780-7-david@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
qemu_strtosz() & friends reject NaNs, but happily accept infinities.
They shouldn't. Fix that.
The fix makes use of qemu_strtod_finite(). To avoid ugly casts,
change the @end parameter of qemu_strtosz() & friends from char **
to const char **.
Also, add two test cases, testing that "inf" and "NaN" are properly
rejected. While at it, also fixup the function documentation.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20181121164421.20780-3-david@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Let's provide a wrapper for strtod().
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20181121164421.20780-2-david@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Correct the nr of IRQs to 192.
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20181129163655.20370-5-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Use IRQs 111 - 118 for virtio-mmio. The interrupts we're currently
using 160+ are not available in the Versal GIC.
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20181129163655.20370-4-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The two thing that should be handled are cipher and ivgen. For ivgen
the solution is just mutex, as iv calculations should not be long in
comparison with encryption/decryption. And for cipher let's just keep
per-thread ciphers.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Just like on other architectures, we should stop the clock while the guest
is not running. This is already properly done for TCG. Right now, doing an
offline migration (stop, migrate, cont) can easily trigger stalls in the
guest.
Even doing a
(hmp) stop
... wait 2 minutes ...
(hmp) cont
will already trigger stalls.
So whenever the guest stops, backup the KVM TOD. When continuing to run
the guest, restore the KVM TOD.
One special case is starting a simple VM: Reading the TOD from KVM to
stop it right away until the guest is actually started means that the
time of any simple VM will already differ to the host time. We can
simply leave the TOD running and the guest won't be able to recognize
it.
For migration, we actually want to keep the TOD stopped until really
starting the guest. To be able to catch most errors, we should however
try to set the TOD in addition to simply storing it. So we can still
catch basic migration problems.
If anything goes wrong while backing up/restoring the TOD, we have to
ignore it (but print a warning). This is then basically a fallback to
old behavior (TOD remains running).
I tested this very basically with an initrd:
1. Start a simple VM. Observed that the TOD is kept running. Old
behavior.
2. Ordinary live migration. Observed that the TOD is temporarily
stopped on the destination when setting the new value and
correctly started when finally starting the guest.
3. Offline live migration. (stop, migrate, cont). Observed that the
TOD will be stopped on the source with the "stop" command. On the
destination, the TOD is temporarily stopped when setting the new
value and correctly started when finally starting the guest via
"cont".
4. Simple stop/cont correctly stops/starts the TOD. (multiple stops
or conts in a row have no effect, so works as expected)
In the future, we might want to send the guest a special kind of time sync
interrupt under some conditions, so it can synchronize its tod to the
host tod. This is interesting for migration scenarios but also when we
get time sync interrupts ourselves. This however will most probably have
to be handled in KVM (e.g. when the tods differ too much) and is not
desired e.g. when debugging the guest (single stepping should not
result in permanent time syncs). I consider something like that an add-on
on top of this basic "don't break the guest" handling.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20181130094957.4121-1-david@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Out-of-band command execution was introduced in commit cf869d5317.
Unfortunately, we ran into a regression, and had to turn it into an
experimental option for 2.12 (commit be933ffc23).
http://lists.gnu.org/archive/html/qemu-devel/2018-03/msg06231.html
The regression has since been fixed (commit 951702f39c "monitor: bind
dispatch bh to iohandler context"). A thorough re-review of OOB
commands led to a few more issues, which have also been addressed.
This patch partly reverts be933ffc23 (monitor: new parameter "x-oob"),
and makes QMP monitors again offer capability "oob" whenever they can
provide it, i.e. when the monitor's character device is capable of
running in an I/O thread.
Some trivial touch-up in the test code is required to make sure qmp-test
won't break.
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20181009062718.1914-4-peterx@redhat.com>
[Conflict with "monitor: check if chardev can switch gcontext for OOB"
resolved, commit message updated]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Clang 3.4 considers duplicate typedef in ppc4xx_i2c.h and
bitbang_i2c.h an error even if they are identical. Move it to a common
place to allow building with this clang version.
Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The code that used it has already been removed a while ago with commit
dc41aa7d34 ("tcg: Remove GET_TCGV_* and MAKE_TCGV_*").
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Since we require GCC version 4.8 or newer now, we can be sure that
the builtin functions are always available on GCC. And for Clang,
we can check the availablility with __has_builtin instead.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
When a QMP client sends in-band commands more quickly that we can
process them, we can either queue them without limit (QUEUE), drop
commands when the queue is full (DROP), or suspend receiving commands
when the queue is full (SUSPEND). None of them is ideal:
* QUEUE lets a misbehaving client make QEMU eat memory without bounds.
Not such a hot idea.
* With DROP, the client has to cope with dropped in-band commands. To
inform the client, we send a COMMAND_DROPPED event then. The event is
flawed by design in two ways: it's ambiguous (see commit d621cfe0a1),
and it brings back the "eat memory without bounds" problem.
* With SUSPEND, the client has to manage the flow of in-band commands to
keep the monitor available for out-of-band commands.
We currently DROP. Switch to SUSPEND.
Managing the flow of in-band commands to keep the monitor available for
out-of-band commands isn't really hard: just count the number of
"outstanding" in-band commands (commands sent minus replies received),
and if it exceeds the limit, hold back additional ones until it drops
below the limit again.
Note that we need to be careful pairing the suspend with a resume, or
else the monitor will hang, possibly forever. And here since we need to
make sure both:
(1) popping request from the req queue, and
(2) reading length of the req queue
will be in the same critical section, we let the pop function take the
corresponding queue lock when there is a request, then we release the
lock from the caller.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20181009062718.1914-2-peterx@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
QEMU_CHAR_FEATURE_GCONTEXT declares the character device can switch
GMainContext.
Assert we don't switch context when the character device doesn't
provide this feature. Character device users must not violate this
restriction. In particular, user configurations that violate them
must be rejected.
Existing frontend that rely on context switching would now assert() if
the backend doesn't allow it (instead of silently producing undesired
events in the default context). Following patches improve the
situation by reporting an error earlier instead, on the frontend side.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20181205203737.9011-4-marcandre.lureau@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJcD/usAAoJEPMMOL0/L748br0P/iLL5RjzHJ+vrBsphRNPZ0eM
1wvgWJwvo+4JicebsnTWDmElprgetu2+disXyxSJhBOllSb7lwmxKR2OyHlicu5x
RDSk3CTZZuI/CqN08MlEVZiCuCT1LZuJ8Y0RzXBAsJlT51ZpvwprbXO1oyMjjx2P
UhXFuYIa8Wk+8+zuFnYI4nKPo3o8ra8OrtI2AdIneQ8zWEMvJCWhHqrZHeyuHOzb
N5bYEXi3JSIT2qyHyZlFNmXjPCNuMxhrrBc1yjmo6KJgHkVUgvn61hdod4BzvLsd
DWAfdEamBgP4HuU2fUTVFAYXirK+A4tM+ROblJ/Z/V7RHifoulrdedQzrqdM7FA5
6f4SUo+SSjqY1CYnS+zXp5USu6/ciaYZv9jE7W0WKjXt5kCsy9rlT0V8Q4RlOQji
ZoT6LpcVj0qnfdPdYwdWgpqWbr8G5Y3Xm91a+XLSqzj+xfFJQ6h7fCnF1/Ngn0Ep
o3LOtiJCDSma4maFemV7qhWfuaa20vUwfbRKeOvnTirDUv6oXpsP19kScbH06DeD
Hs9aFgi7XmFWNypWHeZNqy00UwiZmb2GcpqL/vAVwkqdMTgttfBdh5P4srRwwM2D
OQROJaHaBya8mzDO7BldapQOGXRHu9UDCP9gkK+BGGu4Edu0U+eRL/08GoFRM1M4
ra2Dn/AQTIW6lk0N9He0
=+od8
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/vivier2/tags/trivial-patches-pull-request' into staging
Trivial patches (2018-12-11)
# gpg: Signature made Tue 11 Dec 2018 18:02:20 GMT
# gpg: using RSA key F30C38BD3F2FBE3C
# gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>"
# gpg: aka "Laurent Vivier <laurent@vivier.eu>"
# gpg: aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>"
# Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F 5173 F30C 38BD 3F2F BE3C
* remotes/vivier2/tags/trivial-patches-pull-request: (30 commits)
Fixes i386 xchgq test
maint: Grammar fix to mailmap
MAINTAINERS: Update email address for Fam Zheng
cutils: Assert in-range base for string-to-integer conversions
util: vfio-helpers: use ARRAY_SIZE in qemu_vfio_init_pci()
target: hax: fix errors in comment
MAINTAINERS: Use my work email to review Build and test automation patches
MAINTAINERS: Add a missing entry for the NVDIMM device
MAINTAINERS: Add a missing entry to the QMP section
MAINTAINERS: Add a missing entry to SPICE
MAINTAINERS: Add missing entries for the MPS2 machine
MAINTAINERS: Add missing entries for the Canon DIGIC machine
MAINTAINERS: Add missing entries to the vhost section
MAINTAINERS: Add missing entries to the PC Chipset section
MAINTAINERS: Add a missing entry for the sun4m machines
MAINTAINERS: Add a missing entry for the Old World machines
MAINTAINERS: Add a missing entry for the Xilinx S3A-DSP 1800 machine
MAINTAINERS: Add missing entries for the Jazz machine
MAINTAINERS: Add missing entries for the Xilinx ZynqMP machine
MAINTAINERS: Add a missing entry to the SPARC CPU
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Instead of trying to implement something that isn't well specified,
remove it. (it would be tricky to implement, since a class struct is
memcpy on children types...)
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20181204142023.15982-7-marcandre.lureau@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The function is only used by a test, move it there.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20181204142023.15982-6-marcandre.lureau@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
global_props is only used for Xen xen_compat_props. It's a static
array of GlobalProperty, like machine globals in SET_MACHINE_COMPAT().
Let's register the globals the same way, without extra copy allocation.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20181204142023.15982-5-marcandre.lureau@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Instead of accepting any Object*, change user_creatable_complete() to
require a UserCreatable*. Modify the callers to pass the appropriate
argument, removing redundant dynamic cast checks in object creation.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20181204142023.15982-4-marcandre.lureau@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Including all machine types that might have a pcie-root-port.
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Message-Id: <154394083644.28192.8501647946108201466.stgit@gimli.home>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
[ehabkost: fixed accidental recursion at spapr_machine_3_1_class_options()]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
There's no reason to violate our naming conventions by having a
struct with a different name than its typedef. Messed up since
its introduction in commit 8c85901e, but made more obvious when
commit 3bfe5716 promoted it to typedefs.h.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20181115211752.1295571-3-eblake@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
This makes their function more clear and prevents conflicts when adding
the actual devices to the machine state, if necessary.
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20181107152434.22219-1-minyard@acm.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
If there are no changes, let's use a const pointer.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20181023152306.3123-4-david@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
We try to detect and drop too large packet (>INT_MAX) in 1592a99470
("net: ignore packet size greater than INT_MAX") during packet
delivering. Unfortunately, this is not sufficient as we may hit
another integer overflow when trying to queue such large packet in
qemu_net_queue_append_iov():
- size of the allocation may overflow on 32bit
- packet->size is integer which may overflow even on 64bit
Fixing this by moving the check to qemu_sendv_packet_async() which is
the entrance of all networking codes and reduce the limit to
NET_BUFSIZE to be more conservative. This works since:
- For the callers that call qemu_sendv_packet_async() directly, they
only care about if zero is returned to determine whether to prevent
the source from producing more packets. A callback will be triggered
if peer can accept more then source could be enabled. This is
usually used by high speed networking implementation like virtio-net
or netmap.
- For the callers that call qemu_sendv_packet() that calls
qemu_sendv_packet_async() indirectly, they often ignore the return
value. In this case qemu will just the drop packets if peer can't
receive.
Qemu will copy the packet if it was queued. So it was safe for both
kinds of the callers to assume the packet was sent.
Since we move the check from qemu_deliver_packet_iov() to
qemu_sendv_packet_async(), it would be safer to make
qemu_deliver_packet_iov() static to prevent any external user in the
future.
This is a revised patch of CVE-2018-17963.
Cc: qemu-stable@nongnu.org
Cc: Li Qiang <liq3ea@163.com>
Fixes: 1592a99470 ("net: ignore packet size greater than INT_MAX")
Reported-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 20181204035347.6148-2-jasowang@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Because they are supposed to remain const.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20181114132931.22624-1-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
GNUTLS takes a paranoid approach when seeing 0 bytes returned by the
underlying OS read() function. It will consider this an error and
return GNUTLS_E_PREMATURE_TERMINATION instead of propagating the 0
return value. It expects apps to arrange for clean termination at
the protocol level and not rely on seeing EOF from a read call to
detect shutdown. This is to harden apps against a malicious 3rd party
causing termination of the sockets layer.
This is unhelpful for the QEMU NBD code which does have a clean
protocol level shutdown, but still relies on seeing 0 from the I/O
channel read in the coroutine handling incoming replies.
The upshot is that when using a plain NBD connection shutdown is
silent, but when using TLS, the client spams the console with
Cannot read from TLS channel: Broken pipe
The NBD connection has, however, called qio_channel_shutdown()
at this point to indicate that it is done with I/O. This gives
the opportunity to optimize the code such that when the channel
has been shutdown in the read direction, the error code
GNUTLS_E_PREMATURE_TERMINATION gets turned into a '0' return
instead of an error.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20181119134228.11031-1-berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Add the spapr cap SPAPR_CAP_NESTED_KVM_HV to be used to control the
availability of nested kvm-hv to the level 1 (L1) guest.
Assuming a hypervisor with support enabled an L1 guest can be allowed to
use the kvm-hv module (and thus run it's own kvm-hv guests) by setting:
-machine pseries,cap-nested-hv=true
or disabled with:
-machine pseries,cap-nested-hv=false
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The spapr-rng device is suboptimal when compared to virtio-rng, so
users might want to disable it in their builds. Thus let's introduce
a proper CONFIG switch to allow us to compile QEMU without this device.
The function spapr_rng_populate_dt is required for linking, so move it
to a different location.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Add documentation for the qemu_thread_atexit_add() and
qemu_thread_atexit_remove() functions.
We include a (previously undocumented) constraint that notifiers
may not be called if a thread is exiting because the entire
process is exiting. This is fine for our current use because
the callers use it only for cleaning up resources which go away
on process exit (memory, Win32 fibers), and we will need the
flexibility for the new posix implementation.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20181105135538.28025-2-peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Emulation of the block limits VPD page called back into scsi-disk.c,
which however expected the request to be for a SCSIDiskState and
accessed a scsi-generic device outside the bounds of its struct
(namely to retrieve s->max_unmap_size and s->max_io_size).
To avoid this, move the emulation code to a separate function that
takes a new SCSIBlockLimits struct and marshals it into the VPD
response format.
Reported-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add a new flag to mark memory region that are used as non-volatile, by
NVDIMM for example. That bit is propagated down to the flat view, and
reflected in HMP info mtree with a "nv-" prefix on the memory type.
This way, guest_phys_blocks_region_add() can skip the NV memory
regions for dumps and TCG memory clear in a following patch.
Cc: dgilbert@redhat.com
Cc: imammedo@redhat.com
Cc: pbonzini@redhat.com
Cc: guangrong.xiao@linux.intel.com
Cc: mst@redhat.com
Cc: xiaoguangrong.eric@gmail.com
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20181003114454.5662-2-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The 'q35' machine type implements an Intel Series 3 chipset,
of which there are several variants:
https://www.intel.com/Assets/PDF/datasheet/316966.pdf
The key difference between the 82P35 MCH ('p35', PCI device ID 0x29c0)
and 82Q35 GMCH ('q35', PCI device ID 0x29b0) variants is that the latter
has an integrated graphics adapter. QEMU does not implement integrated
graphics, so uses the PCI ID for the 82P35 chipset, despite calling the
machine type 'q35'. Thus we rename the PCI device ID constant to reflect
reality, to avoid confusing future developers. The new name more closely
matches what pci.ids reports it to be:
$ grep P35 /usr/share/hwdata/pci.ids | grep 29
29c0 82G33/G31/P35/P31 Express DRAM Controller
29c1 82G33/G31/P35/P31 Express PCI Express Root Port
29c4 82G33/G31/P35/P31 Express MEI Controller
29c5 82G33/G31/P35/P31 Express MEI Controller
29c6 82G33/G31/P35/P31 Express PT IDER Controller
29c7 82G33/G31/P35/P31 Express Serial KT Controller
$ grep Q35 /usr/share/hwdata/pci.ids | grep 29
29b0 82Q35 Express DRAM Controller
29b1 82Q35 Express PCI Express Root Port
29b2 82Q35 Express Integrated Graphics Controller
29b3 82Q35 Express Integrated Graphics Controller
29b4 82Q35 Express MEI Controller
29b5 82Q35 Express MEI Controller
29b6 82Q35 Express PT IDER Controller
29b7 82Q35 Express Serial KT Controller
Arguably the QEMU machine type should be named 'p35'. At this point in
time, however, it is not worth the churn for management applications &
documentation to worry about renaming it.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20180830105757.10577-1-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
AMD IOMMU VAPIC support + fixes all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJb4IrKAAoJECgfDbjSjVRp9xEIAIT25r0SeThU32cl8955dBu3
L2q2e+4du4KcwrC1a65mhBeATFtRthL/cWFHf1rvmwsp1t6ib+uVBH/3ezH1b48o
rhrPjysYGbX+M/gxHv8uBM01JnMnmsaZVJv2iAifkO1fjJ5VCWXqJt89y7VryeUz
LRzN1Zzq84umDXUuqptBKI8MF8ySwqnRHCE6YrbpTAppaJRY8zIyWkQzMd+Ls9m/
Rwuo6QiySD4z5WrnL2hpvUCQw2qDTct9xDNrlGpxL1JVvOgo5Y5VFkF2X9IP7qap
TIC7Y9cfUjGNf8ferYsydgzpyTjFrBMUqqcu65HjUlpACXwwwrLHPScfpT37VJI=
=WPCi
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pci, pc, virtio: fixes, features
AMD IOMMU VAPIC support + fixes all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Mon 05 Nov 2018 18:24:10 GMT
# gpg: using RSA key 281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream: (33 commits)
vhost-scsi: prevent using uninitialized vqs
piix_pci: fix i440fx data sheet link
piix: use TYPE_FOO constants than string constats
i440fx: use ARRAY_SIZE for pam_regions
pci_bridge: fix typo in comment
hw/pci: Add missing include
hw/pci-bridge/ioh3420: Remove unuseful header
hw/pci-bridge/xio3130: Remove unused functions
tests/bios-tables-test: add 64-bit PCI MMIO aperture round-up test on Q35
bios-tables-test: prepare expected files for mmio64
hw/pci-host/x86: extend the 64-bit PCI hole relative to the fw-assigned base
hw/pci-host/x86: extract get_pci_hole64_start_value() helpers
pci-testdev: add optional memory bar
MAINTAINERS: list "tests/acpi-test-data" files in ACPI/SMBIOS section
x86_iommu/amd: Enable Guest virtual APIC support
x86_iommu/amd: Add interrupt remap support when VAPIC is enabled
i386: acpi: add IVHD device entry for IOAPIC
x86_iommu/amd: Add interrupt remap support when VAPIC is not enabled
x86_iommu/amd: Prepare for interrupt remap support
x86_iommu/amd: make the address space naming consistent with intel-iommu
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Noted while refactoring:
CC mips-softmmu/hw/mips/gt64xxx_pci.o
In file included from include/hw/pci-host/gt64xxx.h:2,
from hw/mips/gt64xxx_pci.c:30:
include/hw/pci/pci_bus.h:23:5: error: unknown type name ‘PCIIOMMUFunc’
PCIIOMMUFunc iommu_fn;
^~~~~~~~~~~~
include/hw/pci/pci_bus.h:27:5: error: unknown type name ‘pci_set_irq_fn’
pci_set_irq_fn set_irq;
^~~~~~~~~~~~~~
include/hw/pci/pci_bus.h:28:5: error: unknown type name ‘pci_map_irq_fn’
pci_map_irq_fn map_irq;
^~~~~~~~~~~~~~
include/hw/pci/pci_bus.h:29:5: error: unknown type name ‘pci_route_irq_fn’
pci_route_irq_fn route_intx_to_irq;
^~~~~~~~~~~~~~~~
include/hw/pci/pci_bus.h:31:24: error: ‘PCI_SLOT_MAX’ undeclared here (not in a function)
PCIDevice *devices[PCI_SLOT_MAX * PCI_FUNC_MAX];
^~~~~~~~~~~~
include/hw/pci/pci_bus.h:31:39: error: ‘PCI_FUNC_MAX’ undeclared here (not in a function)
PCIDevice *devices[PCI_SLOT_MAX * PCI_FUNC_MAX];
^~~~~~~~~~~~
make[1]: *** [rules.mak:69: hw/mips/gt64xxx_pci.o] Error 1
make: *** [Makefile:482: subdir-mips-softmmu] Error 2
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The vtd_generate_msi_message() in intel-iommu is used to construct a MSI
Message from IRQ. A similar function will be needed when we add interrupt
remapping support in amd-iommu. Moving the function in common file to
avoid the code duplication. Rename it to x86_iommu_irq_to_msi_message().
There is no logic changes in the code flow.
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Suggested-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Tom Lendacky <Thomas.Lendacky@amd.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The lookup table for power-of-two sizes was added in commit 540b849261
for the purpose of having convenient shortcuts for these sizes in cases
when the literal number has to be present at compile time, and
expressions as '(1 * KiB)' can not be used. One such case is the
stringification of sizes. Beyond that, it is convenient to use these
shortcuts for all power-of-two sizes, even if they don't have to be
literal numbers.
Despite its convenience, this table introduced 55 lines of "dumb" code,
the purpose and origin of which are obscure without reading the message
of the commit which introduced it. This patch fixes that by adding a
comment to the code itself with a brief explanation for the reasoning
behind this table. This comment includes the short AWK script that
generated the table, so that anyone who's interested could make sure
that the values in it are correct (otherwise these values look as if
they were typed manually).
Signed-off-by: Leonid Bloch <lbloch@janustech.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This adds some whitespace into the option help (including indentation)
and puts angle brackets around the type names. Furthermore, the list
name is no longer printed as part of every line, but only once in
advance, and only if the caller did not print a caption already.
This patch also restores the description alignment we had before commit
9cbef9d68e, just at 24 instead of 16 characters like we used to.
This increase is because now we have the type and two spaces of
indentation before the description, and with a usual type name length of
three chracters, this sums up to eight additional characters -- which
means that we now need 24 characters to get the same amount of padding
for most options. Also, 24 is a third of 80, which makes it kind of a
round number in terminal terms.
Finally, this patch amends the reference output of iotest 082 to match
the changes (and thus makes it pass again).
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Some block drivers have traditionally changed their node to read-only
mode without asking the user. This behaviour has been marked deprecated
since 2.11, expecting users to provide an explicit read-only=on option.
Now that we have auto-read-only=on, enable these drivers to make use of
the option.
This is the only use of bdrv_set_read_only(), so we can make it a bit
more specific and turn it into a bdrv_apply_auto_read_only() that is
more convenient for drivers to use.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
If a management application builds the block graph node by node, the
protocol layer doesn't inherit its read-only option from the format
layer any more, so it must be set explicitly.
Backing files should work on read-only storage, but at the same time, a
block job like commit should be able to reopen them read-write if they
are on read-write storage. However, without option inheritance, reopen
only changes the read-only option for the root node (typically the
format layer), but not the protocol layer, so reopening fails (the
format layer wants to get write permissions, but the protocol layer is
still read-only).
A simple workaround for the problem in the management tool would be to
open the protocol layer always read-write and to make only the format
layer read-only for backing files. However, sometimes the file is
actually stored on read-only storage and we don't know whether the image
can be opened read-write (for example, for NBD it depends on the server
we're trying to connect to). This adds an option that makes QEMU try to
open the image read-write, but allows it to degrade to a read-only mode
without returning an error.
The documentation for this option is consciously phrased in a way that
allows QEMU to switch to a better model eventually: Instead of trying
when the image is first opened, making the read-only flag dynamic and
changing it automatically whenever the first BLK_PERM_WRITE user is
attached or the last one is detached would be much more useful
behaviour.
Unfortunately, this more useful behaviour is also a lot harder to
implement, and libvirt needs a solution now before it can switch to
-blockdev, so let's start with this easier approach for now.
Instead of adding a new auto-read-only option, turning the existing
read-only into an enum (with a bool alternate for compatibility) was
considered, but it complicated the implementation to the point that it
didn't seem to be worth it.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
The divdeu instruction was added to ISA 2.06 (Power7).
Exclude this block from older cpus.
Fixes: 27ae5109a2 (softfloat: Specialize udiv_qrnnd for ppc64)
Reported-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Add a model of Xilinx Versal SoC.
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 20181102131913.1535-2-edgar.iglesias@xilinx.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Wire up nRF51 UART in the corresponding SoC.
Signed-off-by: Julia Suvorova <jusual@mail.ru>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Not implemented: CTS/NCTS, PSEL*.
Signed-off-by: Julia Suvorova <jusual@mail.ru>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* MSR-based feature support for
MSR_IA32_ARCH_CAPABILITIES bits (Robert Hoo)
* Cascadelake-Server CPU model (Tao Xu)
* Add PKU on Skylake-Server CPU model (Tao Xu)
* Correct cpu_x86_cpuid(0xd) (Sebastian Andrzej Siewior)
* Remove dead code (Peter Maydell)
-----BEGIN PGP SIGNATURE-----
iQIcBAABCAAGBQJb2balAAoJECgHk2+YTcWm+mEP/1Ktfs6rcn5M2YaSNEGJK3PH
Xr8Jr1bqNHpE+e0pDdWp+kp/DRaidYqbiP9gzF5ogxruh5PHphYuTxIl1B7wCpY7
1l7UNnyeOCjwIBf/Izyw2CWAZWR2bgjjUzFYAdV/5gZY+L+qw9/EbQ7Cjya56O8M
z5Y/HyZhNKUkhjtmWGMTfvyVz0hnRZQwQ6JpDpgMD7yDeiVNDEIXXVfTaPUlbOHh
NQvz3o0V436PZJ/nFDt54PppL1iW9WfpdDF0ueHVrH5fp+99ryWiBEv2zuTDWOcG
dzdGuj0VCoW2t9U03+rrZqwqfHRLV2G1gtA7dY6GoqnZs8MHIIrzNUKfmUFgWSSL
10esCfiDaOhIEg9/VJMQusGcDqMvJTPl6Ic4NSSvoTe/Qxz2jKgt3UlgAMMwuMjQ
Z4zjThgiwPiUfXW2U3dxPGKBMqAqygrOpwqbUzGFIQlc5knMpexIe3ahqEOh1kXY
0HqU3pIKekHYKMPMb/GkHiZmdFPec82oPiHW/F7ROBnK+yb+I1yy2O7EvQFXUX44
7k2288ItxTGY0nWwD/JUjlMYQ4/7i4+4QpNBz4hLpiBvn2STbhnOFBy/9P8BI0Gd
8fHDDQDn4e1O/6IlZtOeD7eYwFlM4xYyLkWckL27qm1FMA3WcSfFTJWrYHGMK8n1
mNpf+zYyWWwiwgUeOGWC
=hWk7
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/ehabkost/tags/x86-next-pull-request' into staging
x86 queue, 2018-10-30
* MSR-based feature support for
MSR_IA32_ARCH_CAPABILITIES bits (Robert Hoo)
* Cascadelake-Server CPU model (Tao Xu)
* Add PKU on Skylake-Server CPU model (Tao Xu)
* Correct cpu_x86_cpuid(0xd) (Sebastian Andrzej Siewior)
* Remove dead code (Peter Maydell)
# gpg: Signature made Wed 31 Oct 2018 14:05:25 GMT
# gpg: using RSA key 2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6
* remotes/ehabkost/tags/x86-next-pull-request:
i386: Add PKU on Skylake-Server CPU model
i386: Add new model of Cascadelake-Server
x86: define a new MSR based feature word -- FEATURE_WORDS_ARCH_CAPABILITIES
x86: Data structure changes to support MSR based features
kvm: Add support to KVM_GET_MSR_FEATURE_INDEX_LIST and KVM_GET_MSRS system ioctl
target/i386: Remove #ifdeffed-out icebp debugging hack
i386: correct cpu_x86_cpuid(0xd)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This is essentially redundant with tlb_c.dirty.
Tested-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Especially for guests with large numbers of tlbs, like ARM or PPC,
we may well not use all of them in between flush operations.
Remember which tlbs have been used since the last flush, and
avoid any useless flushing.
Tested-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Our only statistic so far was "full" tlb flushes, where all mmu_idx
are flushed at the same time.
Now count "partial" tlb flushes where sets of mmu_idx are flushed,
but the set is not maximal. Account one per mmu_idx flushed, as
that is the unit of work performed.
We don't actually count elided flushes yet, but go ahead and change
the interface presented to the monitor all at once.
Tested-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The rest of the tlb victim cache is per-tlb,
the next use index should be as well.
Tested-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The set of large pages in the kernel is probably not the same
as the set of large pages in the application. Forcing one
range to cover both will flush more often than necessary.
This allows tlb_flush_page_async_work to flush just the one
mmu_idx implicated, which in turn allows us to remove
tlb_check_page_and_flush_by_mmuidx_async_work.
Tested-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Protect it with the tlb_lock instead of using atomics.
The move puts it in or near the same cacheline as the lock;
using the lock means we don't need a second atomic operation
in order to perform the update. Which makes it cheap to also
update pending_flush in tlb_flush_by_mmuidx_async_work.
Tested-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This is the first of several moves to reduce the size of the
CPU_COMMON_TLB macro and improve some locality of refernce.
Tested-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
As the release document ref below link (page 13):
https://software.intel.com/sites/default/files/managed/c5/15/\
architecture-instruction-set-extensions-programming-reference.pdf
PKU is supported in Skylake Server (Only Server) and later, and
on Intel(R) Xeon(R) Processor Scalable Family. So PKU is supposed
to be in Skylake-Server CPU model. And PKU's CPUID has been
exposed to QEMU. But PKU can't be find in Skylake-Server CPU
model in the code. So this patch will fix this issue in
Skylake-Server CPU model.
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <5014b57f834dcfa8fd3781504d98dcf063d54fde.1540801392.git.tao3.xu@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Add kvm_get_supported_feature_msrs() to get supported MSR feature index list.
Add kvm_arch_get_supported_msr_feature() to get each MSR features value.
Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
Message-Id: <1539578845-37944-2-git-send-email-robert.hu@linux.intel.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(Thank you to Thomas Huth)
v2: fix 32bit build with updated patch (v3) from Philippe Mathieu-Daudé
built in a 32bit debian sid chroot
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJb2D8VAAoJEPMMOL0/L748dZEP/11pPehjPPYVxesxM++pFeuf
2EOrLuOTkwlRX23itj2JHv8UTY3YZR9Z8kkF3SWe7qYfp4kB4dTEYjnJY5Im6fWQ
TUbC9D9SivknOOPyQUtGXZQRN8D8m6V4hN2ZcoXC2M48GT23/uqUWBwCKYeHxdLf
iJQFmhwDnXSZr+D0l9mpMK2vBsZ5ywcbne8GufTtrkz7Dq9A0nDWVc/XUEHzzahf
C+6r2fRPjtImxIjhAGQeAEzOk5tYnqK/3kXjy6T4UygvnZw0pkAS1rIb3hvlzm1e
kBlbA+pgL0kKumMmT9LBR4Os4hlL95URUF+BDNGa3EusImSL/wmhsawslQbfxVyv
5at3VKIdvPXr7GQvmhaJ3dllXiQixX7A+axevkwyZkuIcYLnuhvh6bCR3ap+4mq/
GRk4vwXStS6S8rDLAzo4GA4DsE4EDYJSnU13wMEaj1L9sYPVg1224AgCjnlIBbQa
ntGD3lY7+nG5q1BeVfZXmpNZ4+N4TSpu2uEBxNvWY2/YkaouleQXJ8W4eFirB1Eo
G8TN2fbroLcKgxhOlpvgFrfrgs8T5ZprpqQnvpE2h6M2Nu4JWJq4008q3uIPOwTy
o9MrquqOjdG0+OBHr8Ji5HwDKex68NRQhl8BYhqtPhi/+XycDo47YSodNBfw2U/Q
Ec9301/TQjBcvCBLEzrt
=sHPv
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/vivier2/tags/qemu-trivial-for-3.1-pull-request' into staging
QEMU trivial patches collected between June and October 2018
(Thank you to Thomas Huth)
v2: fix 32bit build with updated patch (v3) from Philippe Mathieu-Daudé
built in a 32bit debian sid chroot
# gpg: Signature made Tue 30 Oct 2018 11:23:01 GMT
# gpg: using RSA key F30C38BD3F2FBE3C
# gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>"
# gpg: aka "Laurent Vivier <laurent@vivier.eu>"
# gpg: aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>"
# Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F 5173 F30C 38BD 3F2F BE3C
* remotes/vivier2/tags/qemu-trivial-for-3.1-pull-request:
milkymist-minimac2: Use qemu_log_mask(GUEST_ERROR) instead of error_report
ppc: move at24c to its own CONFIG_ symbol
hw/intc/gicv3: Remove useless parenthesis around DIV_ROUND_UP macro
hw/pci-host: Remove useless parenthesis around DIV_ROUND_UP macro
tests/bios-tables-test: Remove an useless cast
xen: Use the PCI_DEVICE macro
qobject: Catch another straggler for use of qdict_put_str()
configure: Support pkg-config for zlib
tests: Fix typos in comments and help message (found by codespell)
cpu.h: fix a typo in comment
linux-user: fix comment s/atomic_write/atomic_set/
qemu-iotests: make 218 executable
scripts/qemu.py: remove trailing quotes on docstring
scripts/decodetree.py: remove unused imports
docs/devel/testing.rst: add missing newlines after code block
qemu-iotests: fix filename containing checks
tests/tcg/README: fix location for lm32 tests
memory.h: fix typos in comments
vga_int: remove unused function protype
configs/alpha: Remove unused CONFIG_PARALLEL_ISA switch
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJb1yMkAAoJENSXKoln91plSrUIAIp8e63jdI/YX8gIp0iEVZmJ
+QDAfgTRc3/zvIFYie4A4mEnEj6c8iwmrvINalxQ+tZDtNcMLU8zI+0bz2YxwgiT
1YbVrhNPJxqx65YOqwEAQ/vjlCC3iVtTP6s6eKpR5MZRBLUWrkuEub6gDWpKxrK0
lfSRXS8Bj2gAOzefxeLIcFhBcV/z8hlRe7wxGpSjmPcJ36G3Bv28nyV+LbfmCsTb
QekIrEUtxlSqNJbb1apZHP1754mKURc43KoH6ZdXWXQWj2RedARltIfVxbprR0bK
huYwwSSl1fD7ltvJW1gXGYKdRABUbvTMeRsheA7YwGXlIjeQLOAnkwc8ZwQkidU=
=A7R3
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/amarkovic/tags/mips-queue-october-2018-part-4' into staging
MIPS queue for October 2018, part 4
# gpg: Signature made Mon 29 Oct 2018 15:11:32 GMT
# gpg: using RSA key D4972A8967F75A65
# gpg: Good signature from "Aleksandar Markovic <amarkovic@wavecomp.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 8526 FBF1 5DA3 811F 4A01 DD75 D497 2A89 67F7 5A65
* remotes/amarkovic/tags/mips-queue-october-2018-part-4: (27 commits)
linux-user: Add prctl() PR_SET_FP_MODE and PR_GET_FP_MODE implementations
linux-user: Determine the desired FPU mode from MIPS.abiflags
linux-user: Read and set FP ABI value from MIPS abiflags
linux-user: Extract MIPS abiflags from ELF file
linux-user: Extend image_info struct with MIPS fp_abi and interp_fp_abi fields
elf: Define MIPS_ABI_FP_UNKNOWN macro
target/mips: Amend MXU ASE overview note
target/mips: Move MXU_EN check one level higher
target/mips: Add emulation of MXU instructions S32LDD and S32LDDR
target/mips: Add emulation of MXU instructions Q8MUL and Q8MULSU
target/mips: Add emulation of MXU instruction D16MAC
target/mips: Add emulation of MXU instruction D16MUL
target/mips: Add emulation of MXU instruction S8LDD
target/mips: Move MUL, S32M2I, S32I2M handling out of main MXU switch
target/mips: Add emulation of MXU instructions S32I2M and S32M2I
target/mips: Add emulation of non-MXU MULL within MXU decoding engine
target/mips: Add bit encoding for MXU operand getting pattern 'optn3'
target/mips: Add bit encoding for MXU operand getting pattern 'optn2'
target/mips: Add bit encoding for MXU execute add/sub pattern 'eptn2'
target/mips: Add bit encoding for MXU accumulate add/sub 2-bit pattern 'aptn2'
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This patch aims to bring the following behavior:
1. We don't load bitmaps, when started in inactive mode. It's the case
of incoming migration. In this case we wait for bitmaps migration
through migration channel (if 'dirty-bitmaps' capability is enabled) or
for invalidation (to load bitmaps from the image).
2. We don't remove persistent bitmaps on inactivation. Instead, we only
remove bitmaps after storing. This is the only way to restore bitmaps,
if we decided to resume source after [failed] migration with
'dirty-bitmaps' capability enabled (which means, that bitmaps were not
stored).
3. We load bitmaps on open and any invalidation, it's ok for all cases:
- normal open
- migration target invalidation with dirty-bitmaps capability
(bitmaps are migrating through migration channel, the are not
stored, so they should have IN_USE flag set and will be skipped
when loading. However, it would fail if bitmaps are read-only[1])
- migration target invalidation without dirty-bitmaps capability
(normal load of the bitmaps, if migrated with shared storage)
- source invalidation with dirty-bitmaps capability
(skip because IN_USE)
- source invalidation without dirty-bitmaps capability
(bitmaps were dropped, reload them)
[1]: to accurately handle this, migration of read-only bitmaps is
explicitly forbidden in this patch.
New mechanism for not storing bitmaps when migrate with dirty-bitmaps
capability is introduced: migration filed in BdrvDirtyBitmap.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Instead of both frozen and qmp_locked checks, wrap it into one check.
frozen implies the bitmap is split in two (for backup), and shouldn't
be modified. qmp_locked implies it's being used by another operation,
like being exported over NBD. In both cases it means we shouldn't allow
the user to modify it in any meaningful way.
Replace any usages where we check both frozen and qmp_locked with the
new check.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20181002230218.13949-2-jsnow@redhat.com
[w/edits Suggested-By: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>]
Signed-off-by: John Snow <jsnow@redhat.com>
Add backup parameter to bdrv_merge_dirty_bitmap() to be used then with
bdrv_restore_dirty_bitmap() if it needed to restore the bitmap after
merge operation.
This is needed to implement bitmap merge transaction action in further
commit.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Use more generic names to reuse the function for bitmap merge in the
following commit.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Add MIPS_ABI_FP_UNKNOWN as QEMU internal value to represent
unknown fp_abi (based on kernel mips/include/asm/elf.h definition)
Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Signed-off-by: Stefan Markovic <smarkovic@wavecomp.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJb0iQKAAoJENSXKoln91plUngH/icGvr5sa6JbT/4bDP20Wv7y
gwJ8Ax6kKDU4Z/JbBt+2diXVRrPXCF6xt/dvcaWCnxKyjIZN0i2azHv75jtMEA5t
+khdqqREzTZ8RiEI+u0r+OkSNJ3837O+ahQFdRxjqSDIScC8mcwW8h1md9ThjzbQ
yBhRvNo8QkXGGx9MCWZ7kUGkPnJDQnL0jGiFj0xhtyDSGXfnnOpUgpQKRWu5cQzl
Q7JKFPQgt676kd6UyG7f+xYw/a6uERmMBWp30CfN6bP4bPcdFHdUlgIM60VRAfhA
qYA4led5sWcuqmA96PoZIOc+05/8Q8NkgP+nYbXeMkW8/9QOCPa/30p7QayOpZA=
=F4up
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/amarkovic/tags/mips-queue-oct-2018-part-3' into staging
MIPS queue for October 2018 - part 3
# gpg: Signature made Thu 25 Oct 2018 21:14:02 BST
# gpg: using RSA key D4972A8967F75A65
# gpg: Good signature from "Aleksandar Markovic <amarkovic@wavecomp.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 8526 FBF1 5DA3 811F 4A01 DD75 D497 2A89 67F7 5A65
* remotes/amarkovic/tags/mips-queue-oct-2018-part-3:
target/mips: Add disassembler support for nanoMIPS
target/mips: Implement emulation of nanoMIPS EVA instructions
target/mips: Add nanoMIPS CRC32 instruction pool
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Found by reading the code.
Signed-off-by: Li Qiang <liq3ea@gmail.com>
Message-Id: <1536150548-2797-1-git-send-email-liq3ea@gmail.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Add disassembler support for nanoMIPS.
Reviewed-by: Stefan Markovic <smarkovic@wavecomp.com>
Signed-off-by: Matthew Fortune <matthew.fortune@mips.com>
Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
With the new memory device functions in place, we can factor out
unplugging of memory devices completely.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20181005092024.14344-16-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
With the new memory device functions in place, we can factor out
plugging of memory devices completely.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20181005092024.14344-15-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
With all required memory device class functions in place, we can factor
out pre_plug handling of memory devices. Take proper care of errors. We
still have to carry along legacy_align required for pc compatibility
handling.
We will factor out tracing of the address separately in a follow-up
patch.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20181005092024.14344-14-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
To be able to factor out address assignment of memory devices, we will
have to read (get_addr()) and write (set_addr()) the address.
We can't use properties for this purpose, as properties are device
specific. E.g. while the address property for a DIMM is called "addr", it
might be called differently (e.g. "memaddr") for other devices.
Especially virtio based memory devices cannot use "addr" as that is already
reserved and used for the address on the bus (for the proxy device).
Also, it might be possible to have memory devices without address
properties (e.g. internal DIMM-like thingies).
In contrast to get_addr(), we expect that set_addr() can fail.
Keep it simple for now for pc-dimm and simply set the static property, that
will fail once realized.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20181005092024.14344-13-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
There are no remaining users of get_region_size() except
memory_device_get_region_size() itself. We can make
memory_device_get_region_size() work directly on get_memory_region()
instead and drop get_region_size().
In addition, we can now use memory_device_get_region_size() in pc-dimm
code to implement get_plugged_size()"
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20181005092024.14344-12-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The memory region is necessary for plugging/unplugging a memory device.
The region size (via get_region_size()) is no longer sufficient, as
besides the alignment, also the region itself is required in order to
add it to the device memory region of the machine via
- memory_region_add_subregion
- memory_region_del_subregion
So, to factor out plugging/unplugging of memory devices from pc-dimm
code, we have to factor out access to the memory region first.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20181005092024.14344-11-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
We will factor out get_memory_region() from pc-dimm to memory device code
soon. Once that is done, get_region_size() can be implemented
generically and essentially be replaced by
memory_device_get_region_size (and work only on get_memory_region()).
We have some users of get_memory_region() (spapr and pc-dimm code) that are
only interested in the size. So let's rework them to use
memory_device_get_region_size() first, then we can factor out
get_memory_region() and eventually remove get_region_size() without
touching the same code multiple times.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20181005092024.14344-10-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Document the functions. Don't document get_region_size(), as we will be
dropping/replacing that one soon.
Use same documentation style as in include/exec/memory.h, but don't
document the parameters, as they are self-explanatory.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20181005092024.14344-9-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Let's properly forward the errors, so errors from get_region_size() /
get_plugged_size() can be handled.
Users right now call both functions after the device has been realized,
which is will never fail, so it is fine to continue using error_abort.
While at it, remove a leftover error check (suggested by Igor).
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20181005092024.14344-8-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
We're plugging/unplugging a PCDIMMDevice, so directly pass this type
instead of a more generic DeviceState.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20181005092024.14344-5-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJbzcCHAAoJEDhwtADrkYZT3YsP/2qE4HNY/htj3IP6vNJuSaqw
CLPRTz7zWmUBTE6FqSkvLsq3X2BMFFLeaIPA9EFcbyn2km6qPqBYgg9ElXXvPZBm
6hDeRIoC8FdRD0Apozd5MGC94/lE47PheDRV8V+4KrGLaaMXEPxMZ0wP4AfdS5pS
6Pt2xuF7nPu1+OWVxMk0fXadGjGLEuOQQmTh3B21J5RaynQ3gtd6h7XFC/LJyOGG
LC/6GyPc0h7KU83VnvrRjH/EOpu1wENgrsvWsS0sem8op35Z+i9jU5BfCp4qFkDy
gCHHUEyEeyexS+W+Tj87eBtK2gfrqQx9ovo8CIsWcUwpKbdD6AMK4FKGsDNMNHab
Kg5u/M+O8nHCB7DuursF+3mqEbZHb05cfKe6JEtiq49EuORMV5hp4Ap966noSwTw
UEU0NJNA1p8EdmXVudyyyYR7wpoSSmZpoenA+bJ3nthK8K0KcU4RUGk6ZEbxfJy+
7ENl+3R2IxmxzgXv/x0tz0uFisaVW1rltTXtMte+ElQsO0qy74iHdfR7JHsmLxj9
CO/ABMVoYsWq2OJv8pWLrdKpT4v3HQLJdHhknyu0ZcJGDyICqX29ULLEhPrNEZvW
rxVxAkiemlaqxlUjbrM46CDQQm+w03OCnk7aCYcV4oK+u5+o3mCag705gMPErapZ
6uOE3fAjiWw43sA31mek
=kPZX
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/armbru/tags/pull-error-2018-10-22' into staging
Error reporting patches for 2018-10-22
# gpg: Signature made Mon 22 Oct 2018 13:20:23 BST
# gpg: using RSA key 3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg: aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653
* remotes/armbru/tags/pull-error-2018-10-22: (40 commits)
error: Drop bogus "use error_setg() instead" admonitions
vpc: Fail open on bad header checksum
block: Clean up bdrv_img_create()'s error reporting
vl: Simplify call of parse_name()
vl: Fix exit status for -drive format=help
blockdev: Convert drive_new() to Error
vl: Assert drive_new() does not fail in default_drive()
fsdev: Clean up error reporting in qemu_fsdev_add()
spice: Clean up error reporting in add_channel()
tpm: Clean up error reporting in tpm_init_tpmdev()
numa: Clean up error reporting in parse_numa()
vnc: Clean up error reporting in vnc_init_func()
ui: Convert vnc_display_init(), init_keyboard_layout() to Error
ui/keymaps: Fix handling of erroneous include files
vl: Clean up error reporting in device_init_func()
vl: Clean up error reporting in parse_fw_cfg()
vl: Clean up error reporting in mon_init_func()
vl: Clean up error reporting in machine_set_property()
vl: Clean up error reporting in chardev_init_func()
qom: Clean up error reporting in user_creatable_add_opts_foreach()
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In several places we use assert(FEATURE), and assume that if FEATURE
is disabled, all following code is removed as unreachable. Which allows
us to compile-out functions that are only present with FEATURE, and
have a link-time failure if the functions remain used.
MinGW does not mark its internal function _assert() as noreturn, so the
compiler cannot see when code is unreachable, which leads to link errors
for this host that are not present elsewhere.
The current build-time failure concerns 62823083b8, but I remember
having seen this same error before. Fix it once and for all for MinGW.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20181022181623.8810-1-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Version: GnuPG v1
iQEcBAABAgAGBQJbyUxzAAoJEO8Ells5jWIR24kIAKgKwm2Y4DU5vvaFVBzaK0/F
97aUnnXDGQ9FjBC6xG2Eo99fCRCbQ3AGHKlFR/sdPhK/5RaV6Zes7DJuYGlmE5wk
u+nieFDL3d+B3kmoJI2Uy2h2dQEtGwTGvLi+WnjrDteCIDAapiaN1ZWOnN5OvP3a
c903tU1l5oZsgr/4SGONXn86dw6JaWR8VD+GW6yFQgiB4+icGxMJrLTFTG2Ef4RG
5WFwqmFlTddTWWdcGDA+myLVLgakWwkxmFK75doeNDcNBjazZ+V8b0PH2Ph+aGXZ
9AGH2l8W2xPs1TLTJFUbya+2XnRbOTiFl6klVUZvDH3/74nY+kOVtPpljFjiBn0=
=AhoN
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging
# gpg: Signature made Fri 19 Oct 2018 04:16:03 BST
# gpg: using RSA key EF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211
* remotes/jasowang/tags/net-pull-request: (26 commits)
qemu-options: Fix bad "macaddr" property in the documentation
e1000: indicate dropped packets in HW counters
net: ignore packet size greater than INT_MAX
pcnet: fix possible buffer overflow
rtl8139: fix possible out of bound access
ne2000: fix possible out of bound access in ne2000_receive
clean up callback when del virtqueue
docs: Add COLO status diagram to COLO-FT.txt
COLO: quick failover process by kick COLO thread
COLO: notify net filters about checkpoint/failover event
filter-rewriter: handle checkpoint and failover event
filter: Add handle_event method for NetFilterClass
COLO: flush host dirty ram from cache
savevm: split the process of different stages for loadvm/savevm
qapi: Add new command to query colo status
qapi/migration.json: Rename COLO unknown mode to none mode.
qmp event: Add COLO_EXIT event to notify users while exited COLO
COLO: Flush memory data from ram cache
ram/COLO: Record the dirty pages that SVM received
COLO: Load dirty pages into SVM's RAM cache firstly
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Calling error_report() from within a function that takes an Error **
argument is suspicious. drive_new() calls error_report() even though
it can run within drive_init_func(), which takes an Error ** argument.
drive_init_func()'s caller main(), via qemu_opts_foreach(), is fine
with it, but clean it up anyway:
* Convert drive_new() to Error
* Update add_init_drive() to report the error received from
drive_new()
* Make main() pass &error_fatal through qemu_opts_foreach(),
drive_init_func() to drive_new()
* Make default_drive() pass &error_abort through qemu_opts_foreach(),
drive_init_func() to drive_new()
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Max Reitz <mreitz@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20181017082702.5581-34-armbru@redhat.com>
Calling error_report() in a function that takes an Error ** argument
is suspicious. tpm_init_tpmdev() does that, and then fails without
setting an error. Its caller main(), via tpm_init() and
qemu_opts_foreach(), is fine with it, but clean it up anyway.
Cc: Stefan Berger <stefanb@linux.vnet.ibm.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Message-Id: <20181017082702.5581-30-armbru@redhat.com>
Calling error_report() in a function that takes an Error ** argument
is suspicious. parse_numa() does that, and then fails without setting
an error. Its caller main(), via qemu_opts_foreach(), is fine with
it, but clean it up anyway.
While there, give parse_numa() internal linkage.
Cc: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20181017082702.5581-29-armbru@redhat.com>
The previous commit changed vfio's warning messages from
vfio warning: DEV-NAME: Could not frobnicate
to
warning: vfio DEV-NAME: Could not frobnicate
To match this change, change error messages from
vfio error: DEV-NAME: On fire
to
vfio DEV-NAME: On fire
Note the loss of "error". If we think marking error messages that way
is a good idea, we should mark *all* error messages, i.e. make
error_report() print it.
Cc: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Message-Id: <20181017082702.5581-7-armbru@redhat.com>
The vfio code reports warnings like
error_report(WARN_PREFIX "Could not frobnicate", DEV-NAME);
where WARN_PREFIX is defined so the message comes out as
vfio warning: DEV-NAME: Could not frobnicate
This usage predates the introduction of warn_report() & friends in
commit 97f40301f1. It's time to convert to that interface. Since
these functions already prefix the message with "warning: ", replace
WARN_PREFIX by VFIO_MSG_PREFIX, so the messages come out like
warning: vfio DEV-NAME: Could not frobnicate
The next commit will replace ERR_PREFIX.
Cc: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20181017082702.5581-6-armbru@redhat.com>
From include/qapi/error.h:
* Pass an existing error to the caller with the message modified:
* error_propagate(errp, err);
* error_prepend(errp, "Could not frobnicate '%s': ", name);
Fei Li pointed out that doing error_propagate() first doesn't work
well when @errp is &error_fatal or &error_abort: the error_prepend()
is never reached.
Since I doubt fixing the documentation will stop people from getting
it wrong, introduce error_propagate_prepend(), in the hope that it
lures people away from using its constituents in the wrong order.
Update the instructions in error.h accordingly.
Convert existing error_prepend() next to error_propagate to
error_propagate_prepend(). If any of these get reached with
&error_fatal or &error_abort, the error messages improve. I didn't
check whether that's the case anywhere.
Cc: Fei Li <fli@suse.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20181017082702.5581-2-armbru@redhat.com>
qerror.h contains leftovers from the now-defunct QError API.
There's only a handful of string macros left, and no one is supposed
to add anything else. The check-qerror.sh script was used to make sure
that all definitions on the qerror.c and qerror.h files were sorted
alphabetically. The former was removed three years ago, and the latter
is now in a different location, so the script doesn't even work (as
a matter of fact the alphabetical order was broken last time someone
added a macro -also in 2015- and no one seemed to notice).
There's no point in fixing this script so let's just remove it.
The rogue macro is also moved to its correct location.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-Id: <20181017151738.20299-1-berto@igalia.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Add handling of POST_MESSAGE hypercall. For that, add an interface to
regsiter a handler for the messages arrived from the guest on a
particular connection id (IOW set up a message connection in Hyper-V
speak).
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20180921082217.29481-10-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add handling of SIGNAL_EVENT hypercall. For that, provide an interface
to associate an EventNotifier with an event connection number, so that
it's signaled when the SIGNAL_EVENT hypercall with the matching
connection ID is called by the guest.
Support for using KVM functionality for this will be added in a followup
patch.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20180921082217.29481-8-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add infrastructure to signal SynIC event flags by atomically setting the
corresponding bit in the event flags page and firing a SINT if
necessary.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20180921082217.29481-7-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add infrastructure to deliver SynIC messages to the SynIC message page.
Note that KVM may also want to deliver (SynIC timer) messages to the
same message slot.
The problem is that the access to a SynIC message slot is controlled by
the value of its .msg_type field which indicates if the slot is being
owned by the hypervisor (zero) or by the guest (non-zero).
This leaves no room for synchronizing multiple concurrent producers.
The simplest way to deal with this for both KVM and QEMU is to only
deliver messages in the vcpu thread. KVM already does this; this patch
makes it for QEMU, too.
Specifically,
- add a function for posting messages, which only copies the message
into the staging buffer if its free, and schedules a work on the
corresponding vcpu to actually deliver it to the guest slot;
- instead of a sint ack callback, set up the sint route with a message
status callback. This function is called in a bh whenever there are
updates to the message slot status: either the vcpu made definitive
progress delivering the message from the staging buffer (succeeded or
failed) or the guest issued EOM; the status is passed as an argument
to the callback.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20180921082217.29481-6-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Certain configurations do not allow SynIC to be used in QEMU. In
particular,
- when hyperv_vpindex is off, SINT routes can't be used as they refer to
the destination vCPU by vp_index
- older KVM (which doesn't expose KVM_CAP_HYPERV_SYNIC2) zeroes out
SynIC message and event pages on every msr load, breaking migration
OTOH in-KVM users of SynIC -- SynIC timers -- do work in those
configurations, and we shouldn't stop the guest from using them.
To cover both scenarios, introduce an X86CPU property that makes CPU
init code to skip creation of the SynIC object (and thus disables any
SynIC use in QEMU) but keeps the KVM part of the SynIC working.
The property is clear by default but is set via compat logic for older
machine types.
As a result, when hv_synic and a modern machine type are specified, QEMU
will refuse to run unless vp_index is on and the kernel is recent
enough. OTOH with an older machine type QEMU will run fine with
hv_synic=on against an older kernel and/or without vp_index enabled but
will disallow the in-QEMU uses of SynIC (in e.g. VMBus).
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20180921082217.29481-4-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Make Hyper-V SynIC a device which is attached as a child to a CPU. For
now it only makes SynIC visibile in the qom hierarchy, and maintains its
internal fields in sync with the respecitve msrs of the parent cpu (the
fields will be used in followup patches).
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20180921082217.29481-3-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A significant part of hyperv.c is not actually tied to x86, and can
be moved to hw/.
This will allow to maintain most of Hyper-V and VMBus
target-independent, and to avoid conflicts with inclusion of
arch-specific headers down the road in VMBus implementation.
Also this stuff can now be opt-out with CONFIG_HYPERV.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20180921082041.29380-4-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some parts of the Hyper-V hypervisor-guest interface appear to be
target-independent, so move them into a proper header.
Not that Hyper-V ARM64 emulation is around the corner but it seems more
conveninent to have most of Hyper-V and VMBus target-independent, and
allows to avoid conflicts with inclusion of arch-specific headers down
the road in VMBus implementation.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20180921082041.29380-2-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When [2] was fixed it was agreed that adding and calling post_plug()
callback after device_reset() was low risk approach to hotfix issue
right before release. So it was merged instead of moving already
existing plug() callback after device_reset() is called which would
be more risky and require all plug() callbacks audit.
Looking at the current plug() callbacks, it doesn't seem that moving
plug() callback after device_reset() is breaking anything, so here
goes agreed upon [3] proper fix which essentially reverts [1][2]
and moves plug() callback after device_reset().
This way devices always comes to plug() stage, after it's been fully
initialized (including being reset), which fixes race condition [2]
without need for an extra post_plug() callback.
1. (25e897881 "qdev: add HotplugHandler->post_plug() callback")
2. (8449bcf94 "virtio-scsi: fix hotplug ->reset() vs event race")
3. https://www.mail-archive.com/qemu-devel@nongnu.org/msg549915.html
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1539696820-273275-1-git-send-email-imammedo@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Pierre Morel<pmorel@linux.ibm.com>
Acked-by: Pierre Morel<pmorel@linux.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
accel_init_machine sets *(acc->allowed) to true if acc->init_machine(ms)
succeeds. There's no need to have both hvf_allowed and hvf_disabled.
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Message-Id: <20181018143051.48508-1-r.bolshakov@yadro.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Adds EXTERNAL attribute definition to qemu timers subsystem and assigns
it to virtual clock timers, used in slirp (ICMP IPv6) and ui (key queue).
Virtual clock processing in rr mode can use this attribute instead of a
separate clock type.
Fixes: 87f4fe7653
Fixes: 775a412bf8
Fixes: 9888091404
Signed-off-by: Artem Pisarenko <artem.k.pisarenko@gmail.com>
Message-Id: <e771f96ab94e86b54b9a783c974f2af3009fe5d1.1539764043.git.artem.k.pisarenko@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Attributes are simple flags, associated with individual timers for their
whole lifetime. They intended to be used to mark individual timers for
special handling when they fire.
New/init functions family in timer interface updated and refactored (new
'attribute' argument added, timer_list replaced with timer_list_group+type
combinations, comments improved to avoid info duplication). Also existing
aio interface extended with attribute-enabled variants of functions,
which create/initialize timers.
Signed-off-by: Artem Pisarenko <artem.k.pisarenko@gmail.com>
Message-Id: <f47b81dbce734e9806f9516eba8ca588e6321c2f.1539764043.git.artem.k.pisarenko@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
That patch series introduced new virtual clock type for use in external
subsystems. It breaks desired behavior in non-record/replay usage
scenarios due to a small change to existing behavior. Processing of
virtual timers belonging to new clock type is kicked off to the main
loop, which makes these timers asynchronous with vCPU thread and,
in icount mode, with whole guest execution. This breaks expected
determinism in non-record/replay icount mode of emulation where these
"external subsystems" are isolated from the host (i.e. they are
external only to guest core, not to the entire emulation environment).
Example for slirp ("user" backend for network device):
User runs qemu in icount mode with rtc clock=vm without any external
communication interfaces but with "-netdev user,restrict=on". It expects
deterministic execution, because network services are emulated inside
qemu and isolated from host. There are no reasons to get reply from DHCP
server with different delay or something like that.
The next patches revert reimplements the same changes in a better way.
This reverts commit 87f4fe7653.
This reverts commit 775a412bf8.
This reverts commit 9888091404.
Signed-off-by: Artem Pisarenko <artem.k.pisarenko@gmail.com>
Message-Id: <18b1e7c8f155fe26976f91be06bde98eef6f8751.1539764043.git.artem.k.pisarenko@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Filter needs to process the event of checkpoint/failover or
other event passed by COLO frame.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Zhang Chen <zhangckid@gmail.com>
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
We should not load PVM's state directly into SVM, because there maybe some
errors happen when SVM is receving data, which will break SVM.
We need to ensure receving all data before load the state into SVM. We use
an extra memory to cache these data (PVM's ram). The ram cache in secondary side
is initially the same as SVM/PVM's memory. And in the process of checkpoint,
we cache the dirty pages of PVM into this ram cache firstly, so this ram cache
always the same as PVM's memory at every checkpoint, then we flush this cached ram
to SVM after we receive all PVM's state.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Zhang Chen <zhangckid@gmail.com>
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
We need to know if migration is going into COLO state for
incoming side before start normal migration.
Instead by using the VMStateDescription to send colo_state
from source side to destination side, we use MIG_CMD_ENABLE_COLO
to indicate whether COLO is enabled or not.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Zhang Chen <zhangckid@gmail.com>
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
While do checkpoint, we need to flush all the unhandled packets,
By using the filter notifier mechanism, we can easily to notify
every compare object to do this process, which runs inside
of compare threads as a coroutine.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Zhang Chen <zhangckid@gmail.com>
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
GCC7+ will no longer advertise support for 16-byte __atomic operations
if only cmpxchg is supported, as for x86_64. Fortunately, x86_64 still
has support for __sync_compare_and_swap_16 and we can make use of that.
AArch64 does not have, nor ever has had such support, so open-code it.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Isolate the computation of an index from an address into a
helper before we change that function.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
[ cota: convert tlb_vaddr_to_host; use atomic_read on addr_write ]
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <20181009175129.17888-2-cota@braap.org>
Currently we rely on atomic operations for cross-CPU invalidations.
There are two cases that these atomics miss: cross-CPU invalidations
can race with either (1) vCPU threads flushing their TLB, which
happens via memset, or (2) vCPUs calling tlb_reset_dirty on their TLB,
which updates .addr_write with a regular store. This results in
undefined behaviour, since we're mixing regular and atomic ops
on concurrent accesses.
Fix it by using tlb_lock, a per-vCPU lock. All updaters of tlb_table
and the corresponding victim cache now hold the lock.
The readers that do not hold tlb_lock must use atomic reads when
reading .addr_write, since this field can be updated by other threads;
the conversion to atomic reads is done in the next patch.
Note that an alternative fix would be to expand the use of atomic ops.
However, in the case of TLB flushes this would have a huge performance
impact, since (1) TLB flushes can happen very frequently and (2) we
currently use a full memory barrier to flush each TLB entry, and a TLB
has many entries. Instead, acquiring the lock is barely slower than a
full memory barrier since it is uncontended, and with a single lock
acquisition we can flush the entire TLB.
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <20181009174557.16125-6-cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Paves the way for the addition of a per-TLB lock.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <20181009174557.16125-4-cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
When we implemented per-vCPU TCG contexts, we forgot to also
distribute the tcg_time counter, which has remained as a global
accessed without any serialization, leading to potentially missed
counts.
Fix it by distributing the field over the TCG contexts, embedding
it into TCGProfile with a field called "cpu_exec_time", which is more
descriptive than "tcg_time". Add a function to query this value
directly, and for completeness, fill in the field in
tcg_profile_snapshot, even though its callers do not use it.
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <20181010144853.13005-5-cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Regarding R5900 CPU, some sources indicate that the Emotion Engine
ISA/ASE was designed by Toshiba and licensed to Sony. Others sources
claim it was a joint effort. It therefore makes sense to refer to
the CPU as "Toshiba/Sony R5900".
Also, remove and "'s" in the line for some other CPU, for the sake
of consistency.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reported-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Fredrik Noring <noring@nocrew.org>
Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Add Mips_elf_abiflags_v0 structure to elf.h. The source of information
is kernel header arch/mips/include/asm/elf.h.
Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Signed-off-by: Stefan Markovic <smarkovic@wavecomp.com>
Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>