This patch introduces new QEMU_CLOCK_VIRTUAL_RT clock, which
should be used for icount warping. In the next patch, it
will be used to avoid a huge icount warp when a virtual
machine is stopped for a long time.
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In this case, QEMU might longjmp out of cpu-exec.c and miss the final
cleanup in cpu_exec_nocache. Do this manually through a new compile
flag.
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Separate accessing the instruction counter from the compensation for
speed and halting that are introduced by qemu_icount_bias. This
introduces new infrastructure used by the record/replay patches.
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch sets can_do_io function to allow reading icount
within cpu-exec, but outside TB execution.
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Exception index is reset at every entry at every entry into cpu_exec()
function. This may cause missing the exceptions while replaying them.
This patch moves exception_index reset to the locations where they are
processed.
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In icount mode cpu_exec_nocache function is used to execute part of the
existing TB. At the end of cpu_exec_nocache newly created TB is deleted.
Sometimes io_read function needs to recompile current TB and restart TB
lookup and execution. After that tb_find_fast function finds old (bigger)
TB again. This TB cannot be executed (because icount is not big enough)
and cpu_exec_nocache is called again. Such a loop continues over and over.
This patch deletes old TB and avoids finding it in the TB cache.
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The QEMU block layer has a limit of INT_MAX bytes per transfer.
Expose it in the block limits VPD page for both regular transfers
and WRITE SAME.
Reported-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
g_new(T, n) is neater than g_malloc(sizeof(T) * n). It's also safer,
for two reasons. One, it catches multiplication overflowing size_t.
Two, it returns T * rather than void *, which lets the compiler catch
more type errors.
This commit only touches allocations with size arguments of the form
sizeof(T).
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
g_new(T, n) is neater than g_malloc(sizeof(T) * n). It's also safer,
for two reasons. One, it catches multiplication overflowing size_t.
Two, it returns T * rather than void *, which lets the compiler catch
more type errors.
This commit only touches allocations with size arguments of the form
sizeof(T).
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add xsaves related definition, it also adds corresponding part
to kvm_get/put, and vmstate.
Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These represent xsave-related capabilities of the processor, and KVM may
or may not support them.
Add feature bits so that they are considered by "-cpu ...,enforce", and use
the new feature work instead of calling kvm_arch_get_supported_cpuid.
Bit 3 (XSAVES) is not migratables because it requires saving MSR_IA32_XSS.
Neither KVM nor any commonly available hardware supports it anyway.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini reported that Coverity reports an uninitialized pad value.
Let's use a designated initializer for kvm_irq_routing_entry to avoid
this false positive. This is similar to kvm_irqchip_add_msi_route and
other users of kvm_irq_routing_entry.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
struct kvm_fpu contains an alignment padding on s390x. Let's use a
designated initializer to avoid false positives from valgrind/memcheck.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
struct kvm_vcpu_events contains reserved fields. Let's use a
designated initializer to avoid false positives in valgrind.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
struct kvm_msrs contains a pad field. Let's use a designated
initializer on the info part to avoid false positives from
valgrind/memcheck.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
struct kvm_msrs contains padding bytes. Let's use a designated
initializer on the info part to avoid false positives from
valgrind/memcheck. Do the same for generic MSRS, the TSC and
feature control.
We also need to zero out the reserved fields in the entries.
We do this in kvm_msr_entry_set as suggested by Paolo. This
avoids a big memset that a designated initializer on the
full structure would do.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
struct kvm_xcrs contains padding bytes. Let's use a designated
initializer to avoid false positives from valgrind/memcheck.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
struct kvm_pit_state2 contains pad fields. Let's use a designated
initializer to avoid false positives from valgrind/memcheck.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
kvm_clock_data contains pad fields. Let's use a designated
initializer to avoid false positives from valgrind/memcheck.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
struct kvm_dirty_log contains padding fields that trigger false
positives in valgrind. Let's use a designated initializer to avoid
false positives from valgrind/memcheck.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use the kvm_resamplefds_enabled function
Signed-off-by: Eric Auger <eric.auger@linaro.org>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Compute kvm_irqfds_allowed by checking the KVM_CAP_IRQFD extension.
Remove direct settings in architecture specific files.
Add a new kvm_resamplefds_allowed variable, initialized by
checking the KVM_CAP_IRQFD_RESAMPLE extension. Add a corresponding
kvm_resamplefds_enabled() function.
A special notice for s390 where KVM_CAP_IRQFD was not immediatly
advirtised when irqfd capability was introduced in the kernel.
KVM_CAP_IRQ_ROUTING was advertised instead.
This was fixed in "KVM: s390: announce irqfd capability",
ebc3226202d5956a5963185222982d435378b899 whereas irqfd support
was brought in 84223598778ba08041f4297fda485df83414d57e,
"KVM: s390: irq routing for adapter interrupts". Both commits
first appear in 3.15 so there should not be any kernel
version impacted by this QEMU modification.
Signed-off-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch simplifies the AES code, by directly accessing the newly added
S-Box, InvS-Box and InvMixColumns tables instead of recreating them by
using the AES_Te and AES_Td tables.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The name is set to "qemu".
Signed-off-by: Drew DeVault <sir@cmpwn.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Drew DeVault <sircmpwn@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Ongoing discussions on how we are going to specify the console,
so tag the command as experiental so we can refine things in
the 2.3 development cycle.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1416923657-10614-1-git-send-email-armbru@redhat.com
[Spell out "not a stable API", and x- the QAPI schema, too]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Amos Kong <akong@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
A bunch of bugfixes for 2.2.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJUc4AjAAoJECgfDbjSjVRp084IAIYh48pK0MrCTSrDh2/3UCkN
Z+htfYS9uEpJHbIeJrn4u1e2Z90bgcovq8Cg36pUeGZtI5CVZYv/lmPjcOwJEruX
RIveOes87S58rp39mM2/24k1lp2u6VyS+QlrQlmO/ukKEz4ABhy0yF70yTuldX3b
RQ6d1aa/MaEsEx/iTWcQo6cDIdeWUYg8Mt34fSLfcphbHcSC02t8VCOX9cOR4jR9
80Lf78LGEdTCMexrcxs392o7npZGJLzCTWXLRtf6Q4/uPBM8Cmq60yt8JI0WCqbT
ram3XLv9u7GwPgeAp5HPKVV/LxRsDf5HjET3fW8QwbrUP2BvQ5PCgzKCUY6sT20=
=akFH
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pc, pci, misc bugfixes
A bunch of bugfixes for 2.2.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Mon 24 Nov 2014 18:59:47 GMT using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
* remotes/mst/tags/for_upstream:
pc: acpi: mark all possible CPUs as enabled in SRAT
pcie: fix improper use of negative value
pcie: fix typo in pcie_cap_deverr_init()
target-i386: move generic memory hotplug methods to DSDTs
acpi-build: mark RAM dirty on table update
hw/pci: fix crash on shpc error flow
pc: count in 1Gb hugepage alignment when sizing hotplug-memory container
pc: explicitly check maxmem limit when adding DIMM
pc: pc-dimm: use backend alignment during address auto allocation
pc: align DIMM's address/size by backend's alignment value
memory: expose alignment used for allocating RAM as MemoryRegion API
pc: limit DIMM address and size to page aligned values
pc: make pc_dimm_plug() more readble
pc: kvm: check if KVM has free memory slots to avoid abort()
qemu-char: fix tcp_get_fds
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
If QEMU is started with -numa ... Windows only notices that
CPU has been hot-added but it will not online such CPUs.
It's caused by the fact that possible CPUs are flagged as
not enabled in SRAT and Windows honoring that information
doesn't use corresponding CPU.
ACPI 5.0 Spec regarding to flag says:
"
Table 5-47 Local APIC Flags
...
Enabled: if zero, this processor is unusable, and the operating system
support will not attempt to use it.
"
Fix QEMU to adhere to spec and mark possible CPUs as enabled
in SRAT.
With that Windows onlines hot-added CPUs as expected.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reported-by:
https://bugs.launchpad.net/qemu/+bug/1393440
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This makes it simpler to keep the SSDT byte-for-byte identical for a
given machine type, which is a goal we want to have for 2.2 and newer
types.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
acpi build modifies internal FW CFG RAM on first access
but we forgot to mark it dirty.
If this RAM has been migrated already, it won't be
migrated again, returning corrupted tables to guest.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
If the pci bridge enters in error flow as part
of init process it will only delete the shpc mmio
subregion but not remove it from the properties list,
resulting in segmentation fault when the bridge runs
the exit function.
Example: add a pci bridge without specifing the chassis number:
<qemu-bin> ... -device pci-bridge,id=p1
Result:
(qemu) qemu-system-x86_64: -device pci-bridge,id=p1: Bridge chassis not specified. Each bridge is required to be assigned a unique chassis id > 0.
qemu-system-x86_64: -device pci-bridge,id=p1: Device
initialization failed.
Segmentation fault (core dumped)
if (child->class->unparent) {
#0 0x00005555558d629b in object_finalize_child_property (obj=0x555556d2e830, name=0x555556d30630 "shpc-mmio[0]", opaque=0x555556a42fc8) at qom/object.c:1078
#1 0x00005555558d4b1f in object_property_del_all (obj=0x555556d2e830) at qom/object.c:367
#2 0x00005555558d4ca1 in object_finalize (data=0x555556d2e830) at qom/object.c:412
#3 0x00005555558d55a1 in object_unref (obj=0x555556d2e830) at qom/object.c:720
#4 0x000055555572c907 in qdev_device_add (opts=0x5555563544f0) at qdev-monitor.c:566
#5 0x0000555555744f16 in device_init_func (opts=0x5555563544f0, opaque=0x0) at vl.c:2213
#6 0x00005555559cf5f0 in qemu_opts_foreach (list=0x555555e0f8e0 <qemu_device_opts>, func=0x555555744efa <device_init_func>, opaque=0x0, abort_on_failure=1) at util/qemu-option.c:1057
#7 0x000055555574a11b in main (argc=16, argv=0x7fffffffdde8, envp=0x7fffffffde70) at vl.c:423
Unparent the shpc mmio region as part of shpc cleanup.
Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Amos Kong <akong@redhat.com>
if DIMMs with different size/alignment are interleaved
in creation order, it could lead to hotplug-memory
container fragmentation and following inability to use
all RAM upto maxmem.
For example:
-m 4G,slots=3,maxmem=7G
-object memory-backend-file,id=mem-1,size=256M,mem-path=/pagesize-2MB
-device pc-dimm,id=mem1,memdev=mem-1
-object memory-backend-file,id=mem-2,size=1G,mem-path=/pagesize-1GB
-device pc-dimm,id=mem2,memdev=mem-2
-object memory-backend-file,id=mem-3,size=256M,mem-path=/pagesize-2MB
-device pc-dimm,id=mem3,memdev=mem-3
fragments hotplug-memory container and doesn't allow
to use 1GB hugepage backend to consume remainig 1Gb.
To ease managment factor count in max 1Gb alignment for
each memory slot when sizing hotplug-memory region so
that regadless of fragmentaion it would be possible to
add max aligned DIMM.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently maxmem limit is not checked and depends on
hotplug region container not being able to fit more RAM
than maxmem. Do check explicitly so that it would
be possible to change hotplug container size later
to deal with fragmentation.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJUczTcAAoJEL/70l94x66D/dEIAKMy5aOkmcPHcptzpK77LBJC
H6FuYmdUOvLlzrPPV9DeWmFQDN8i1ySvJKRQ0F1KKAE9hm4vKl/w6cibY3abI/VB
xu/c7o8XTeLt5f+3OFeqSvRX1Wx/3Fj+v0lauHzAjyVZxd3mQ1wOxygzQtSkoJPq
akJEgb0jURN8MfiqwMl1ws1iSdRhPE4pR/q63GgOCNZFF9O0ADZueO+SNztBxTTp
t2xYiNqjeG+b+AKrn0bkFdAQvLD4jzBIyOmQClwdQsd6Qwc88/246FJD6OdBwBBW
Zp3HAeLuxvw8aWGfs+x4FRLDAeOJ+BPfwlyJJiyb9DuzpA/ayL0qQfWXcX9pY3g=
=8ow4
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
Three patches to fix ExtINT for the QEMU implementation of the local APIC.
# gpg: Signature made Mon 24 Nov 2014 13:38:36 GMT using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini/tags/for-upstream:
apic: fix incorrect handling of ExtINT interrupts wrt processor priority
apic: fix loss of IPI due to masked ExtINT
apic: avoid getting out of halted state on masked PIC interrupts
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This fixes another failure with ExtINT, demonstrated by QNX. The failure
mode is as follows:
- IPI sent to cpu 0 (bit set in APIC irr)
- IPI accepted by cpu 0 (bit cleared in irr, set in isr)
- IPI sent to cpu 0 (bit set in both irr and isr)
- PIC interrupt sent to cpu 0
The PIC interrupt causes CPU_INTERRUPT_HARD to be set, but
apic_irq_pending observes that the highest pending APIC interrupt priority
(the IPI) is the same as the processor priority (since the IPI is still
being handled), so apic_get_interrupt returns a spurious interrupt rather
than the pending PIC interrupt. The result is an endless sequence of
spurious interrupts, since nothing will clear CPU_INTERRUPT_HARD.
Instead, ExtINT interrupts should have ignored the processor priority.
Calling apic_check_pic early in apic_get_interrupt ensures that
apic_deliver_pic_intr is called instead of delivering the spurious
interrupt. apic_deliver_pic_intr then clears CPU_INTERRUPT_HARD if needed.
Reported-by: Richard Bilson <rbilson@qnx.com>
Tested-by: Richard Bilson <rbilson@qnx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch fixes an obscure failure of the QNX kernel on QEMU x86 SMP.
In QNX, all hardware interrupts come via the PIC, and are delivered by
the cpu 0 LAPIC in ExtINT mode, while IPIs are delivered by the LAPIC
in fixed mode.
This bug happens as follows:
- cpu 0 masks a particular PIC interrupt
- IPI sent to cpu 0 (CPU_INTERRUPT_HARD is set)
- before the IPI is accepted, the masked interrupt line is asserted by the
device
Since the interrupt is masked, apic_deliver_pic_intr will clear
CPU_INTERRUPT_HARD. The IPI will still be set in the APIC irr, but since
CPU_INTERRUPT_HARD is not set the cpu will not notice. Depending on the
scenario this can cause a system hang, i.e. if cpu 0 is expected to unmask
the interrupt.
In order to fix this, do a full check of the APIC before an EXTINT
is acknowledged. This can result in clearing CPU_INTERRUPT_HARD, but
can also result in delivering the lost IPI.
Reported-by: Richard Bilson <rbilson@qnx.com>
Tested-by: Richard Bilson <rbilson@qnx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
After the next patch, if a masked PIC interrupts causes CPU_INTERRUPT_POLL
to be set, the CPU will spuriously get out of halted state. While this
is technically valid, we should avoid that.
Make CPU_INTERRUPT_POLL run apic_update_irq in the right thread and then
look at CPU_INTERRUPT_HARD. If CPU_INTERRUPT_HARD does not get set,
do not report the CPU as having work.
Also move the handling of software-disabled APIC from apic_update_irq
to apic_irq_pending, and always trigger CPU_INTERRUPT_POLL. This will
be important once we will add a case that resets CPU_INTERRUPT_HARD
from apic_update_irq. We want to run it even if we go through
CPU_INTERRUPT_POLL, and even if the local APIC is software disabled.
Reported-by: Richard Bilson <rbilson@qnx.com>
Tested-by: Richard Bilson <rbilson@qnx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This reverts commit 000c4dfff4.
The main reason for reverting this commit before the 2.2 release is that
it adds a QAPI interface that we don't want to keep: The 'nocow' flag
doesn't generally make sense for block nodes, but only for the raw-posix
driver. It should therefore be part of ImageInfoSpecific rather than
ImageInfo.
The commit contains more problems, but unlike the API stability issue
they wouldn't justify reverting it.
Conflicts:
block/qapi.c
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Performance wise it's better to align GVA by the backend's
page size.
Also do not allow to create DIMM device with suboptimal
size (i.e. not aligned to backends page size) to aviod
memory loss.
Do above only for 2.2 and newer machine types to avoid
breaking working configs with 2.1 machine type.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
introduce memory_region_get_alignment() that returns
underlying memory block alignment or 0 if it's not
relevant/implemented for backend.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>