Armv8-A removes UNPREDICTABLE for R13 for these cases.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191117090621.32425-3-richard.henderson@linaro.org
[PMM: changed ENABLE_ARCH_8 checks to check a new bool 'v8a',
since these cases are still UNPREDICTABLE for v8M]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
There was too much cut and paste between ldrexd and strexd,
as ldrexd does prohibit two output registers the same.
Fixes: af28822899
Reported-by: Michael Goffioul <michael.goffioul@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191117090621.32425-2-richard.henderson@linaro.org
Reviewed-by: Robert Foley <robert.foley@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The Linux kernel PHY driver sets AN_RESTART in the BMCR of the
PHY when autonegotiation is started.
Recently the kernel started to read back the PHY's AN_RESTART
bit and now checks whether the autonegotiation is complete and
the bit was cleared [1]. Otherwise the link status is down.
The emulated PHY needs to clear AN_RESTART immediately to inform
the kernel driver about the completion of autonegotiation phase.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c36757eb9dee
Signed-off-by: Linus Ziegert <linus.ziegert+qemu@holoplot.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20191104181604.21943-1-linus.ziegert+qemu@holoplot.com
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
A few configuration register writes need not update the spi bus state, so just
return after the register write.
Signed-off-by: Sai Pavan Boddu <sai.pavan.boddu@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Tested-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1573830705-14579-1-git-send-email-sai.pavan.boddu@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Coverity reports, in sve_zcr_get_valid_len,
"Subtract operation overflows on operands
arm_cpu_vq_map_next_smaller(cpu, start_vq + 1U) and 1U"
First, the aarch32 stub version of arm_cpu_vq_map_next_smaller,
returning 0, does exactly what Coverity reports. Remove it.
Second, the aarch64 version of arm_cpu_vq_map_next_smaller has
a set of asserts, but they don't cover the case in question.
Further, there is a fair amount of extra arithmetic needed to
convert from the 0-based zcr register, to the 1-base vq form,
to the 0-based bitmap, and back again. This can be simplified
by leaving the value in the 0-based form.
Finally, use test_bit to simplify the common case, where the
length in the zcr registers is in fact a supported length.
Reported-by: Coverity (CID 1407217)
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-id: 20191118091414.19440-1-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The current PL031 RTCICR register implementation always clears the
IRQ pending status on a register write, regardless of the value the
guest writes.
To justify that behavior, it references the ARM926EJ-S Development
Chip Reference Manual (DDI0287B) and indicates that said document
states that any write clears the internal IRQ state. It is indeed
true that in section 11.1 this document says:
"The interrupt is cleared by writing any data value to the
interrupt clear register RTCICR".
However, later in section 11.2.2 it contradicts itself by saying:
"Writing 1 to bit 0 of RTCICR clears the RTCINTR flag."
The latter statement matches the PL031 TRM (DDI0224C), which says:
"Writing 1 to bit position 0 clears the corresponding interrupt.
Writing 0 has no effect."
Let's assume that the self-contradictory DDI0287B is in error, and
follow the reference manual for the device itself, by making the
register write-one-to-clear.
Reported-by: Hendrik Borghorst <hborghor@amazon.de>
Signed-off-by: Alexander Graf <graf@amazon.com>
Message-id: 20191104115228.30745-1-graf@amazon.com
[PMM: updated commit message to note that DDI0287B says two
conflicting things]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Commit 369b41359a broke timer interrupt
reinjection when there is no period change by the guest. In that
case, old_period is 0, which ends up zeroing irq_coalesced (counter of
reinjected interrupts).
The consequence is Windows 7 is unable to synchronize time via NTP.
Easily reproducible by playing a fullscreen video with cirrus and VNC.
Fix by passing s->period when periodic_timer_update is called due to
expiration of the timer. With this change, old_period == 0 only
means that the periodic timer was off.
Reported-by: Marcelo Tosatti <mtosatti@redhat.com>
Co-developed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It's an old compatibility shim that just delegates to scsi-cd or scsi-hd.
Just like ide-drive, we don't need this.
Acked-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Attempting to migrate a VM using the microvm machine class results in the source
QEMU aborting with the following message/backtrace:
target/i386/machine.c:955:tsc_khz_needed: Object 0x555556608fa0 is not an
instance of type generic-pc-machine
abort()
object_class_dynamic_cast_assert()
vmstate_save_state_v()
vmstate_save_state()
vmstate_save()
qemu_savevm_state_complete_precopy()
migration_thread()
migration_thread()
migration_thread()
qemu_thread_start()
start_thread()
clone()
The access to the machine class returned by MACHINE_GET_CLASS() in
tsc_khz_needed() is crashing as it is trying to dereference a different
type of machine class object (TYPE_PC_MACHINE) to that of this microVM.
This can be resolved by extending the changes in the following commit
f0bb276bf8 ("hw/i386: split PCMachineState deriving X86MachineState from it")
and moving the save_tsc_khz field in PCMachineClass to X86MachineClass.
Fixes: f0bb276bf8 ("hw/i386: split PCMachineState deriving X86MachineState from it")
Signed-off-by: Liam Merwick <liam.merwick@oracle.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-Id: <1574075605-25215-1-git-send-email-liam.merwick@oracle.com>
Reviewed-by: Sergio Lopez <slp@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add a new section explaining the particularities of the microvm
machine type for triggering a guest-initiated shut down.
Signed-off-by: Sergio Lopez <slp@redhat.com>
Message-Id: <20191115161338.42864-3-slp@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Fix the alignment of the items in the "Limitations" section.
Signed-off-by: Sergio Lopez <slp@redhat.com>
Message-Id: <20191115161338.42864-2-slp@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
hw/vfio/display.c needs the EDID subsystem, select it.
Cc: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When CONFIG_IDE_ISA is disabled, compilation currently fails:
hw/i386/pc_piix.c: In function ‘pc_init1’:
hw/i386/pc_piix.c:81:9: error: unused variable ‘i’ [-Werror=unused-variable]
Move the variable declaration to the right code block to avoid
this problem.
Fixes: 4501d317b5 ("hw/i386/pc: Extract pc_i8259_create()")
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20191115145049.26868-1-thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
TSX Async Abort (TAA) is a side channel attack on internal buffers in
some Intel processors similar to Microachitectural Data Sampling (MDS).
Some future Intel processors will use the ARCH_CAP_TAA_NO bit in the
IA32_ARCH_CAPABILITIES MSR to report that they are not vulnerable to
TAA. Make this bit available to guests.
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In microvm_fix_kernel_cmdline(), fw_cfg_modify_string() is duplicating
cmdline instead of taking ownership of it. Free it afterwards to avoid
leaking it.
Reported-by: Coverity (CID 1407218)
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Sergio Lopez <slp@redhat.com>
Message-Id: <20191112163423.91884-1-slp@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Recent git versions support worktrees where .git is not a directory but
a file with a path to the .git repository; however the get_maintainer.pl
script only recognises the .git directory, let's fix it.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Tested-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20191112034532.69079-1-aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When a monitor's queue is filled up in handle_qmp_command()
it gets suspended. It's the dispatcher bh's job currently to
resume the monitor, which it does after processing an event
from the queue. However, it is possible for a
CHR_EVENT_CLOSED event to be processed before before the bh
is scheduled, which will clear the queue without resuming
the monitor, thereby preventing the dispatcher from reaching
the resume() call.
Any new connections to the qmp socket will be accept()ed and
show the greeting, but will not respond to any messages sent
afterwards (as they will not be read from the
still-suspended socket).
Fix this by resuming the monitor when clearing a queue which
was filled up.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Message-Id: <20191115085914.21287-1-w.bumiller@proxmox.com>
Run the core of the test twice, once without iothreads, and again
with, for more coverage of both setups.
Suggested-by: Nir Soffer <nsoffer@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20191114213415.23499-5-eblake@redhat.com>
We generally include relevant HMP input in .out files, by virtue of
the fact that HMP echoes its input. But QMP does not, so we have to
explicitly inject it in the output stream (appropriately filtered to
keep the tests passing), in order to make it easier to read .out files
to see what behavior is being tested (especially true where the output
file is a sequence of {'return': {}}).
Suggested-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20191114213415.23499-4-eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Up to now, all it took to cause a lot of iotest failures was to have a
background process such as 'nbdkit -p 10810 null' running, because we
hard-coded the TCP port. Switching to a Unix socket eliminates this
contention. We still have TCP coverage in test 233, and that test is
more careful to not pick a hard-coded port.
Add a comment explaining where the format layer applies when using
NBD as protocol (until NBD gains support for a resize extension, we
only pipe raw bytes over the wire).
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20191114213415.23499-3-eblake@redhat.com>
[eblake: Tweak socket name per Max Reitz' review]
This test has been broken since 3.0. It used TEST_IMG to influence
the name of a file created during _make_test_img, but commit 655ae6bb
changed things so that the wrong file name is being created, which
then caused _launch_qemu to fail. In the meantime, the set of events
issued for the actions of the test has increased.
Why haven't we noticed the failure? Because the test rarely gets run:
'./check -qcow2 173' is insufficient (that defaults to using file protocol)
'./check -nfs 173' is insufficient (that defaults to using raw format)
so the test is only run with:
./check -qcow2 -nfs 173
Note that we already have a number of other problems with -nfs:
./check -nfs (fails 18/30)
./check -qcow2 -nfs (fails 45/76 after this patch, if exports does
not permit 'insecure')
and it's not on my priority list to fix those. Rather, I found this
because of my next patch's work on tests using _send_qemu_cmd.
Fixes: 655ae6b
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20191114213415.23499-2-eblake@redhat.com>
Qemu as server currently won't accept export names larger than 256
bytes, nor create dirty bitmap names longer than 1023 bytes, so most
uses of qemu as client or server have no reason to get anywhere near
the NBD spec maximum of a 4k limit per string.
However, we weren't actually enforcing things, ignoring when the
remote side violates the protocol on input, and also having several
code paths where we send oversize strings on output (for example,
qemu-nbd --description could easily send more than 4k). Tighten
things up as follows:
client:
- Perform bounds check on export name and dirty bitmap request prior
to handing it to server
- Validate that copied server replies are not too long (ignoring
NBD_INFO_* replies that are not copied is not too bad)
server:
- Perform bounds check on export name and description prior to
advertising it to client
- Reject client name or metadata query that is too long
- Adjust things to allow full 4k name limit rather than previous
256 byte limit
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20191114024635.11363-4-eblake@redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
We document that for qcow2 persistent bitmaps, the name cannot exceed
1023 bytes. It is inconsistent if transient bitmaps do not have to
abide by the same limit, and it is unlikely that any existing client
even cares about using bitmap names this long. It's time to codify
that ALL bitmaps managed by qemu (whether persistent in qcow2 or not)
have a documented maximum length.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20191114024635.11363-3-eblake@redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
As long as we limit NBD names to 256 bytes (the bare minimum permitted
by the standard), stack-allocation works for parsing a name received
from the client. But as mentioned in a comment, we eventually want to
permit up to the 4k maximum of the NBD standard, which is too large
for stack allocation; so switch everything in the server to use heap
allocation. For now, there is no change in actually supported name
length.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20191114024635.11363-2-eblake@redhat.com>
[eblake: fix uninit variable compile failure]
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Coverity warns that we store the address of a stack variable through a
pointer passed in by the caller, which would let the caller trivially
trigger use-after-free if that stored value is still present when we
finish execution. However, the way coroutines work is that after our
call to qemu_coroutine_yield(), control is temporarily continued in
the caller prior to our function concluding, and in order to resume
our coroutine, the caller must poll until the variable has been set to
NULL. Thus, we can add an assert that we do not leak stack storage to
the caller on function exit.
Fixes: Coverity CID 1406474
CC: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20191111203524.21912-1-eblake@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
The test for an NBD client. The NBD server is disconnected after the
client write request. The NBD client should reconnect and complete
the write operation.
Suggested-by: Denis V. Lunev <den@openvz.org>
Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Eric Blake <eblake@redhat.com>
Message-Id: <1573529976-815699-1-git-send-email-andrey.shinkevich@virtuozzo.com>
hw/vfio/display.c needs the EDID subsystem, select it.
Cc: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
When an error occurs in migrate_add_blocker() it sets a
negative return value and uses error pointer we pass in.
Instead of just looking at the error pointer check for a negative return
value and avoid a coverity error because the return value is
set but never used. This fixes CID 1407219.
Reported-by: Coverity (CID 1407219)
Fixes: f045a0104c ("vfio: unplug failover primary device before migration")
Signed-off-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
When user tries to hotplug a VFIO device, but the operation fails
somewhere in the middle (in my testing it failed because of
RLIMIT_MEMLOCK forbidding more memory allocation), then a double
free occurs. In vfio_realize() the vdev->migration_blocker is
allocated, then something goes wrong which causes control to jump
onto 'error' label where the error is freed. But the pointer is
left pointing to invalid memory. Later, when
vfio_instance_finalize() is called, the memory is freed again.
In my testing the second hunk was sufficient to fix the bug, but
I figured the first hunk doesn't hurt either.
==169952== Invalid read of size 8
==169952== at 0xA47DCD: error_free (error.c:266)
==169952== by 0x4E0A18: vfio_instance_finalize (pci.c:3040)
==169952== by 0x8DF74C: object_deinit (object.c:606)
==169952== by 0x8DF7BE: object_finalize (object.c:620)
==169952== by 0x8E0757: object_unref (object.c:1074)
==169952== by 0x45079C: memory_region_unref (memory.c:1779)
==169952== by 0x45376B: do_address_space_destroy (memory.c:2793)
==169952== by 0xA5C600: call_rcu_thread (rcu.c:283)
==169952== by 0xA427CB: qemu_thread_start (qemu-thread-posix.c:519)
==169952== by 0x80A8457: start_thread (in /lib64/libpthread-2.29.so)
==169952== by 0x81C96EE: clone (in /lib64/libc-2.29.so)
==169952== Address 0x143137e0 is 0 bytes inside a block of size 48 free'd
==169952== at 0x4A342BB: free (vg_replace_malloc.c:530)
==169952== by 0xA47E05: error_free (error.c:270)
==169952== by 0x4E0945: vfio_realize (pci.c:3025)
==169952== by 0x76A4FF: pci_qdev_realize (pci.c:2099)
==169952== by 0x689B9A: device_set_realized (qdev.c:876)
==169952== by 0x8E2C80: property_set_bool (object.c:2080)
==169952== by 0x8E0EF6: object_property_set (object.c:1272)
==169952== by 0x8E3FC8: object_property_set_qobject (qom-qobject.c:26)
==169952== by 0x8E11DB: object_property_set_bool (object.c:1338)
==169952== by 0x5E7BDD: qdev_device_add (qdev-monitor.c:673)
==169952== by 0x5E81E5: qmp_device_add (qdev-monitor.c:798)
==169952== by 0x9E18A8: do_qmp_dispatch (qmp-dispatch.c:132)
==169952== Block was alloc'd at
==169952== at 0x4A35476: calloc (vg_replace_malloc.c:752)
==169952== by 0x51B1158: g_malloc0 (in /usr/lib64/libglib-2.0.so.0.6000.6)
==169952== by 0xA47357: error_setv (error.c:61)
==169952== by 0xA475D9: error_setg_internal (error.c:97)
==169952== by 0x4DF8C2: vfio_realize (pci.c:2737)
==169952== by 0x76A4FF: pci_qdev_realize (pci.c:2099)
==169952== by 0x689B9A: device_set_realized (qdev.c:876)
==169952== by 0x8E2C80: property_set_bool (object.c:2080)
==169952== by 0x8E0EF6: object_property_set (object.c:1272)
==169952== by 0x8E3FC8: object_property_set_qobject (qom-qobject.c:26)
==169952== by 0x8E11DB: object_property_set_bool (object.c:1338)
==169952== by 0x5E7BDD: qdev_device_add (qdev-monitor.c:673)
Fixes: f045a0104c ("vfio: unplug failover primary device before migration")
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Test that doing a second blockdev-snapshot doesn't make the first
overlay's backing file go away.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
bs->options and bs->explicit_options shouldn't contain any options for
child nodes. bdrv_open_inherited() takes care to remove any options that
match a child name after opening the image and the same is done when
reopening.
However, we miss the case of 'backing': null, which is a child option,
but results in no child being created. This means that a 'backing': null
remains in bs->options and bs->explicit_options.
A typical use for 'backing': null is in live snapshots: blockdev-add for
the qcow2 overlay makes sure not to open the backing file (because it is
already opened and blockdev-snapshot will attach it). After doing a
blockdev-snapshot, bs->options and bs->explicit_options become
inconsistent with the actual state (bs has a backing file now, but the
options still say null). On the next occasion that the image is
reopened, e.g. switching it from read-write to read-only when another
snapshot is taken, the option will take effect again and the node
incorrectly loses its backing file.
Fix bdrv_open_inherited() to remove the 'backing' option from
bs->options and bs->explicit_options even for the case where it
specifies that no backing file is wanted.
Reported-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Tested-by: Peter Krempa <pkrempa@redhat.com>
The variable for error messages to be displayed is $results, not
$reason. Fix 'check' to print the "no qualified output" error message
again instead of having a failure without any message telling the user
why it failed.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
From the two values compared, make it obvious which is found at path, and
which is expected.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Due to lchs support merge in upstream seabios gone wrong (applied v3
instead of v4) here is another seabios snapshot update with the
mis-merge fixed up, so lchs support should actually work in -rc2.
Also picked up two tpm bugfixes.
git shortlog from previous snapshot
===================================
Gerd Hoffmann (4):
Revert "geometry: Apply LCHS values for boot devices"
Revert "config: Add toggle for bootdevice information"
Revert "geometry: Add boot_lchs_find_*() utility functions"
Revert "geometry: Read LCHS from fw_cfg"
Sam Eiderman (4):
geometry: Read LCHS from fw_cfg
boot: Build ata and scsi paths in function
geometry: Add boot_lchs_find_*() utility functions
geometry: Apply LCHS values for boot devices
Stefan Berger (2):
tpm: Require a response to have minimum size of a valid response header
tcgbios: Check for enough bytes returned from TPM2_GetCapability
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
With the Quadra 800 emulation, mos6522 timers processing can consume
until 70% of the host CPU time with an idle guest (I guess the problem
should also happen with PowerMac emulation).
On a recent system, it can be painless (except if you look at top), but
on an old host like a PowerMac G5 the guest kernel can be terribly slow
during the boot sequence (for instance, unpacking initramfs can take 15
seconds rather than only 3 seconds).
We can avoid this CPU overload by enabling QEMU internal timers only if
the mos6522 counter interrupts are enabled. Sometime the guest kernel
wants to read the counters values, but we don't need the timers to
update the counters.
With this patch applied, an idle Q800 consumes only 3% of host CPU time
(and the guest can boot in a decent time).
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20191102154919.17775-1-laurent@vivier.eu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
We have to set the default model of all machine classes, not just for
the active one. Otherwise, "query-machines" will indicate the wrong
CPU model (e.g. "power9_v2.0-powerpc64-cpu" instead of
"host-powerpc64-cpu") as "default-cpu-type".
s390x already fixed this in de60a92e "s390x/kvm: Set default cpu model for
all machine classes". This patch applies a similar fix for the pseries-*
machine types on ppc64.
Doing a
{"execute":"query-machines"}
under KVM now results in
{
"hotpluggable-cpus": true,
"name": "pseries-4.2",
"numa-mem-supported": true,
"default-cpu-type": "host-powerpc64-cpu",
"is-default": true,
"cpu-max": 1024,
"deprecated": false,
"alias": "pseries"
},
{
"hotpluggable-cpus": true,
"name": "pseries-4.1",
"numa-mem-supported": true,
"default-cpu-type": "host-powerpc64-cpu",
"cpu-max": 1024,
"deprecated": false
},
...
Libvirt probes all machines via "-machine none,accel=kvm:tcg" and will
currently see the wrong CPU model under KVM.
Reported-by: Jiři Denemark <jdenemar@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Jiri Denemark <jdenemar@redhat.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Since "spapr: Render full FDT on ibm,client-architecture-support" we build
the entire flatten device tree (FDT) twice - at the reset time and
when "ibm,client-architecture-support" (CAS) is called. The full FDT from
CAS is then applied on top of the SLOF internal device tree.
This is mostly ok, however there is a case when the QEMU is started with
-initrd and for some reason the guest decided to move/unpack the init RAM
disk image - the guest correctly notifies SLOF about the change but
at CAS it is overridden with the QEMU initial location addresses and
the guest may fail to boot if the original initrd memory was changed.
This fixes the problem by only adding the /chosen node at the reset time
to prevent the original QEMU's linux,initrd-start/linux,initrd-end to
override the updated addresses.
This only treats /chosen differently as we know there is a special case
already and it is unlikely anything else will need to change /chosen at CAS
we are better off not touching /chosen after we handed it over to SLOF.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20191024041308.5673-1-aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
CPU_FOREACH() can race with vCPU hotplug/unplug on sPAPR machines, ie.
we may try to print out info about a vCPU with a NULL presenter pointer.
Check that in order to prevent QEMU from crashing.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157192725327.3146912.12047076483178652551.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
When a VCPU gets connected to the XIVE interrupt controller, we add a
const link targetting the CPU object to the TCTX object. Similar links
are added to the ICP object when using the XICS interrupt controller.
As explained in <qom/object.h>:
* The caller must ensure that @target stays alive as long as
* this property exists. In the case @target is a child of @obj,
* this will be the case. Otherwise, the caller is responsible for
* taking a reference.
We're in the latter case for both XICS and XIVE. Add the missing
calls to object_ref() and object_unref().
This doesn't fix any known issue because the life cycle of the TCTX or
ICP happens to be shorter than the one of the CPU or XICS fabric, but
better safe than sorry.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <157192724770.3146912.15400869269097231255.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
SpaprInterruptControllerClass and PnvChipClass have an intc_create() method
that calls the appropriate routine, ie. icp_create() or xive_tctx_create(),
to establish the link between the VCPU and the presenter component of the
interrupt controller during realize.
There aren't any symmetrical call to be called when the VCPU gets unrealized
though. It is assumed that object_unparent() is the only thing to do.
This is questionable because the parenting logic around the CPU and
presenter objects is really an implementation detail of the interrupt
controller. It shouldn't be open-coded in the machine code.
Fix this by adding an intc_destroy() method that undoes what was done in
intc_create(). Also NULLify the presenter pointers to avoid having
stale pointers around. This will allow to reliably check if a vCPU has
a valid presenter.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157192724208.3146912.7254684777515287626.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>