* unlocked MMIO support in KVM
* support for compilation with ICC
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQEcBAABCAAGBQJVmnuoAAoJEL/70l94x66DKTUH/RFrc20KXRkn/Pb/8qHY/wFz
Wt3YaS5VYPHElHbxHSdpwlV3K50FAX4QaC25Dnw4dsTelyxe5k7od+I7x8PQxD9v
3N+mFFF1BV6PqXTPVnUCnb14EXprJX524E97O6Z3lDGcwSLHDxeveSCk3IvMFErz
JzP3vtigSvtdPPQXlGcndP/r1EXeVjgNIsZ+NKaI/kmoSz1fHFrCN8hTnrxA9RSI
ZPhfmgHI5EMFtAf/HiZID6GSHOHajgeRT2bIiiy1okS++no0uRZlVMvcnFNPZHoG
e9XCGBXJSdmCoi7sIgShXirKszxYkRTbCyxxjz6aYfhrQzo0h+Yn9OPuvQrgynE=
=+YEv
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* more of Peter Crosthwaite's multiarch preparation patches
* unlocked MMIO support in KVM
* support for compilation with ICC
# gpg: Signature made Mon Jul 6 13:59:20 2015 BST using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini/tags/for-upstream:
exec: skip MMIO regions correctly in cpu_physical_memory_write_rom_internal
Stop including qemu-common.h in memory.h
kvm: Switch to unlocked MMIO
acpi: mark PMTIMER as unlocked
kvm: Switch to unlocked PIO
kvm: First step to push iothread lock out of inner run loop
memory: let address_space_rw/ld*/st* run outside the BQL
exec: pull qemu_flush_coalesced_mmio_buffer() into address_space_rw/ld*/st*
memory: Add global-locking property to memory regions
main-loop: introduce qemu_mutex_iothread_locked
main-loop: use qemu_mutex_lock_iothread consistently
Fix irq route entries exceeding KVM_MAX_IRQ_ROUTES
cpu-defs: Move out TB_JMP defines
include/exec: Move tb hash functions out
include/exec: Move standard exceptions to cpu-all.h
cpu-defs: Move CPU_TEMP_BUF_NLONGS to tcg
memory_mapping: Rework cpu related includes
cutils: allow compilation with icc
qemu-common: add VEC_OR macro
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Including qemu-common.h from other header files is generally a bad
idea, because it means it's very easy to end up with a circular
dependency. For instance, if we wanted to include memory.h from
qom/cpu.h we'd end up with this loop:
memory.h -> qemu-common.h -> cpu.h -> cpu-qom.h -> qom/cpu.h -> memory.h
Remove the include from memory.h. This requires us to fix up a few
other files which were inadvertently getting declarations indirectly
through memory.h.
The biggest change is splitting the fprintf_function typedef out
into its own header so other headers can get at it without having
to include qemu-common.h.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <1435933104-15216-1-git-send-email-peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Whenever we touch the access control registers, we have to make sure that
the values will make it into kvm. Otherwise the change will simply be lost.
When synchronizing qemu and kvm, a normal KVM_PUT_RUNTIME_STATE does not take
care of these registers. Let's simply trigger a KVM_PUT_FULL_STATE sync,
so the values will directly be written to kvm. The performance overhead can
be ignored and this is much cleaner than manually writing these registers to kvm
via our two supported ways.
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
This opens the path to get rid of the iothread lock on vmexits in KVM
mode. On x86, the in-kernel irqchips has to be used because we otherwise
need to synchronize APIC and other per-cpu state accesses that could be
changed concurrently.
Regarding pre/post-run callbacks, s390x and ARM should be fine without
specific locking as the callbacks are empty. MIPS and POWER require
locking for the pre-run callback.
For the handle_exit callback, it is non-empty in x86, POWER and s390.
Some POWER cases could do without the locking, but it is left in
place for now.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1434646046-27150-7-git-send-email-pbonzini@redhat.com>
In particular, don't include it into headers.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
disas does not need to access the CPU env for any reason. Change the
APIs to accept CPU pointers instead. Small change pattern needs to be
applied to all target translate.c. This brings us closer to making
disas.o a common-obj and less architecture specific in general.
Cc: Richard Henderson <rth@twiddle.net>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: "Edgar E. Iglesias" <edgar.iglesias@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Michael Walle <michael@walle.cc>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Leon Alrae <leon.alrae@imgtec.com>
Cc: Jia Liu <proljc@gmail.com>
Cc: Alexander Graf <agraf@suse.de>
Cc: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Cc: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Acked-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
This patch adds support for PER Breaking-Event-Address register. Like
real hardware, it save the current PSW address when the PSW address is
changed by an instruction. We have to take care of optimizations QEMU
does, a branch to the next instruction is still a branch.
This register is copied to low core memory when a program exception
happens.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
For the instruction-fetch nullification event, we just reuse the
existing instruction-fetch code and trigger the exception immediately
in that case.
There is no need to save the CPU state in the TCG code as it has been
saved by the previous instruction before calling the per_check_exception
helper.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
This PER event happens each time the STURA or STURG instructions are
used. As they use helpers, we can just save the event in the PER code
there, if enabled.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
For the PER storage-alteration event we can use the QEMU watchpoint
infrastructure. When PER is enabled or PER control register changed we
enable the corresponding watchpoints. When a watchpoint arises we can
save the event. Unfortunately the current code does not provide the
address space used to trigger the watchpoint. For now we assume it comes
from the default ASC.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
For the PER instruction-fetch, we can't use the QEMU breakpoint
infrastructure as it triggers for a single address and not a full
address range, and as it actually stop before the instruction and
not before.
We therefore call an helper with the just fetched instruction address,
which check if the address is within the PER address range. If it is
the case, an event is recorded and will be signaled through an
exception.
Note that we implement here the PER-3 behaviour, that is an invalid
opcode is not considered as an instruction fetch. Without PER-3 this
behavious is undefined.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
For the PER successful-branching event support, we can't rely on any
QEMU infrastucture. We therefore call an helper in all places where
a branch can be taken. We have to pay attention to the branch to next
case, as it's still a taken branch.
We don't need to care about the cases using goto_tb, as we have disabled
them in the previous patch.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
This patch add basic support to generate PER exceptions. It adds two
fields to the cpu structure to record for the PER address and PER
code & ATMID values. When an exception is triggered and a PER event is
pending, the two PER values are copied to the lowcore area.
At the end of an instruction, an helper is checking for a possible
pending PER event and triggers an exception in that case. For that to
work with branches, we need to disable TB chaining when PER is
activated. Fortunately it's already in the TB flags.
Finally in case of a SERVICE CALL exception, we need to trigger the PER
exception immediately after.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
This function checks if an address is in between the PER starting
address and the PER ending address, taking care of a possible
address range loop.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
This function returns the ATMID field that is stored in the
per_perc_atmid lowcore entry.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
mvc_fast_memmove is bypassing the softmmu functions, getting the
physical source and destination addresses using the mmu_translate
function and accessing the corresponding physical memory. This
prevents watchpoints to work correctly.
Instead use the tlb_vaddr_to_host function to get the host addresses
corresponding to the guest source and destination addresses through the
softmmu code and fallback to the byte level code in case the
corresponding address are not in the QEMU TLB or being examined through
a watchpoint. As a bonus it works even for area crossing pages by
splitting the are into chunks contained in a single page, bringing some
performances improvements. We can therefore remove the 8-byte
loads/stores method, as it is now quite unlikely to be used.
At the same time change the name of the function to fast_memmove as it's
not specific to mvc and use the same argument order as the C memmove
function.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
mvc_fast_memset is bypassing the softmmu functions, getting the
physical address using the mmu_translate function and accessing the
corresponding physical memory. This prevents watchpoints to work
correctly.
Instead use the tlb_vaddr_to_host function to get the host address
corresponding to the guest address through the softmmu code and fallback
to the byte level code in case the corresponding address is not in the
QEMU TLB or being examined through a watchpoint. As a bonus it works
even for area crossing pages by splitting the are into chunks contained
in a single page, bringing some performances improvements.
At the same time change the name of the function to fast_memset as it's
not specific to mvc and use the same argument order as the C memset
function.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
This patch adds a function to adjust the length of a transfer so that
it doesn't cross a page boundary in softmmu mode. It does nothing in
user mode.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
The code handling the I/O instructions for KVM decodes the instruction
itself. In TCG mode also pass the full instruction word to the helpers.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
DIAG IPL is already implemented for KVM, but not wired from TCG. For
that change the format of the instruction so that we can get R1 and R3
numbers in addition to the function code.
The diag function can change plenty of things, including CC, so we
should enter with a static CC. Also it doesn't set the value of general
register 2 to 0 as in the current code. We also need to exit the CPU
loop after a reset, which means a new PSW.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
The s390_cpu_initial_reset function zeroes a big part of the CPU state
structure, including CPU_COMMON, and thus the QEMU TLB structure. As
they should not be initialized with zeroes only, we need to call the
tlb_flush to initialize it correctly.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
env->io_index[] should be set to -1 during CPU reset to mark the
I/O interrupt queue as empty.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
env->ext_index should be initialized to -1 to mark the external
interrupt queue as emtpy. This should not be done in s390_cpu_initfn
as all the interrupt fields are later reset to 0 by the memset in
s390_cpu_initial_reset or s390_cpu_full_reset. Move the initialization
there.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
In TCG mode we should store the CC value in env->cc_op. However do it
inconditionnaly because:
- the tcg_enabled function is not inlined
- it's probably faster to always store the value, especially given it
is likely in the same cache line than env->psw.mask.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
This remove the corresponding error messages in TCG mode, and allow to
simplify the s390_assign_subch_ioeventfd() function.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
The ioinst_schib_valid gets a SCHIB in guest endianness, we should
byteswap the fields we access.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
The I/O-Interruption Subclass field corresponds to bits 2 to 5 (BE
notation) of the Interruption-Identification Word. The value should
be shift by 27 instead of 24.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
We create optional sections with this patch. But we already have
optional subsections. Instead of having two mechanism that do the
same, we can just generalize it.
For subsections we just change:
- Add a needed function to VMStateDescription
- Remove VMStateSubsection (after removal of the needed function
it is just a VMStateDescription)
- Adjust the whole tree, moving the needed function to the corresponding
VMStateDescription
Signed-off-by: Juan Quintela <quintela@redhat.com>
Intercept the diag288 requests from kvm guests, and hand the
requested command to the diag288 watchdog device for further
handling.
Signed-off-by: Xu Wang <gesaint@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
commit 46c804def4 ("s390x: move fpu regs into a subsection
of the vmstate") moved the fprs into a subsection and bumped
the version number. This will allow to not transfer fprs in
the future if necessary. Add a comment to mark the return true
as intentional.
CC: Juan Quintela <quintela@redhat.com>
CC: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <1433758884-2997-1-git-send-email-borntraeger@de.ibm.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
We allocate ram_size / PAGE_SIZE storage keys, so we need to make sure that
we only access that many. Unfortunately the code can overrun this array by
one, potentially overwriting unrelated memory.
Fix it by limiting storage keys to their scope.
Signed-off-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
The MVC instruction and the memmove C funtion do not have the same
semantic when memory areas overlap:
MVC: When the operands overlap, the result is obtained as if the
operands were processed one byte at a time and each result byte were
stored immediately after fetching the necessary operand byte.
memmove: Copying takes place as though the bytes in src are first copied
into a temporary array that does not overlap src or dest, and the bytes
are then copied from the temporary array to dest.
The behaviour is therefore the same when the destination is at a lower
address than the source, but not in the other case. This is actually a
trick for propagating a value to an area. While the current code detects
that and call memset in that case, it only does for 1-byte value. This
trick can and is used for propagating two or more bytes to an area.
In the softmmu case, the call to mvc_fast_memmove is correct as the
above tests verify that source and destination are each within a page,
and both in a different page. The part doing the move 8 bytes by 8 bytes
is wrong and we need to check that if the source and destination
overlap, they do with a distance of minimum 8 bytes before copying 8
bytes at a time.
In the user code, we should check check that the destination is at a
lower address than source or than the end of the source is at a lower
address than the destination before calling memmove. In the opposite
case we fallback to the same code as the softmmu one. Note that l
represents (length - 1).
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
mvcp and mvcs helper get access to the physical memory by a call to
mmu_translate for the virtual to real conversion and then using ldb_phys
and stb_phys to physically access the data. In practice this is quite
slow because it bypasses the QEMU softmmu TLB and because stb_phys calls
try to invalidate the corresponding memory for each access.
Instead use cpu_ldb_{primary,secondary} for the loads and
cpu_stb_{primary,secondary} for the stores. Ideally this should be
further optimized by a call to memcpy, but that already improves the
boot time of a guest by a factor 1.8.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
s390_cpu_handle_mmu_fault currently looks at the current ASC mode
defined in PSW mask instead of the MMU index. This prevent emulating
easily instructions using a specific ASC mode. Fix that by using the
MMU index converted back to ASC using the just added cpu_mmu_idx_to_asc
function.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
Use constants to define the MMU indexes, and add a function to do
the reverse conversion of cpu_mmu_index.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
Besides RISBHG and RISBLG, all high-word instructions are not
implemented. Fix that.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
At the same time move the trap code from op_ct into gen_trap and use it
for all new functions. The value needs to be stored back to register
before the exception, but also before the brcond (as we don't use
temp locals). That's why we can't use wout helper.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
RISBGN is the same as RISBG, but without setting the condition code.
CLT and CLGT are the same as CLRT and CLGRT, but using memory for the
second operand.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
This complete the floating point support sign handling facility.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
It is part of the basic zArchitecture instructions.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
It is part of the basic zArchitecture instructions. Allow it to be call
from EXECUTE.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
This is needed to pass the gcc.c-torture/execute/ieee/20010114-2.c test
in the gcc testsuite.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
It belongs to the DFP rounding facility.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
STORE CLOCK FAST should be in the SCF facility.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
Change to match the PoP. In practice both format RIL-a and RIL-b have
the same fields. They differ on the way we decode the fields, and it's
done correctly in QEMU.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Alexander Graf <agraf@suse.de>