Diag318 fencing needs to be determined on the current VM PV state and
not on the state that the VM has when we create the CPU model.
Fixes: fabdada935 ("s390: guest support for diagnose 0x318")
Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Tested-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Collin Walling <walling@linux.ibm.com>
Acked-by: David Hildenbrand <david@redhat.com>
Message-Id: <20201022103135.126033-3-frankja@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
DIAGNOSE 0x318 (diag318) is an s390 instruction that allows the storage
of diagnostic information that is collected by the firmware in the case
of hardware/firmware service events.
QEMU handles the instruction by storing the info in the CPU state. A
subsequent register sync will communicate the data to the hypervisor.
QEMU handles the migration via a VM State Description.
This feature depends on the Extended-Length SCCB (els) feature. If
els is not present, then a warning will be printed and the SCLP bit
that allows the Linux kernel to execute the instruction will not be
set.
Availability of this instruction is determined by byte 134 (aka fac134)
bit 0 of the SCLP Read Info block. This coincidentally expands into the
space used for CPU entries, which means VMs running with the diag318
capability may not be able to read information regarding all CPUs
unless the guest kernel supports an extended-length SCCB.
This feature is not supported in protected virtualization mode.
Signed-off-by: Collin Walling <walling@linux.ibm.com>
Acked-by: Janosch Frank <frankja@linux.ibm.com>
Acked-by: Thomas Huth <thuth@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Message-Id: <20200915194416.107460-9-walling@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
As more features and facilities are added to the Read SCP Info (RSCPI)
response, more space is required to store them. The space used to store
these new features intrudes on the space originally used to store CPU
entries. This means as more features and facilities are added to the
RSCPI response, less space can be used to store CPU entries.
With the Extended-Length SCCB (ELS) facility, a KVM guest can execute
the RSCPI command and determine if the SCCB is large enough to store a
complete reponse. If it is not large enough, then the required length
will be set in the SCCB header.
The caller of the SCLP command is responsible for creating a
large-enough SCCB to store a complete response. Proper checking should
be in place, and the caller should execute the command once-more with
the large-enough SCCB.
This facility also enables an extended SCCB for the Read CPU Info
(RCPUI) command.
When this facility is enabled, the boundary violation response cannot
be a result from the RSCPI, RSCPI Forced, or RCPUI commands.
In order to tolerate kernels that do not yet have full support for this
feature, a "fixed" offset to the start of the CPU Entries within the
Read SCP Info struct is set to allow for the original 248 max entries
when this feature is disabled.
Additionally, this is introduced as a CPU feature to protect the guest
from migrating to a machine that does not support storing an extended
SCCB. This could otherwise hinder the VM from being able to read all
available CPU entries after migration (such as during re-ipl).
Signed-off-by: Collin Walling <walling@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Message-Id: <20200915194416.107460-7-walling@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
We need some little help in the code to reduce the valgrind noise.
This patch does this with some designated initializers for the cpu
model features and subfunctions.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-Id: <20200429074201.100924-1-borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The unpack facility is an indication that diagnose 308 subcodes 8-10
are available to the guest. That means, that the guest can put itself
into protected mode.
Once it is in protected mode, the hardware stops any attempt of VM
introspection by the hypervisor.
Some features are currently not supported in protected mode:
* vfio devices
* Migration
* Huge page backings
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20200319131921.2367-17-frankja@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
SCLP for a protected guest is done over the SIDAD, so we need to use
the s390_cpu_pv_mem_* functions to access the SIDAD instead of guest
memory when reading/writing SCBs.
To not confuse the sclp emulation, we set 0x4000 as the SCCB address,
since the function that injects the sclp external interrupt would
reject a zero sccb address.
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <20200319131921.2367-10-frankja@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
For protected guests, we need to put the STSI emulation results into
the SIDA, so SIE will write them into the guest at the next entry.
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20200319131921.2367-9-frankja@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Protected guests save the instruction control blocks in the SIDA
instead of QEMU/KVM directly accessing the guest's memory.
Let's introduce new functions to access the SIDA.
The memops for doing so are available with KVM_CAP_S390_PROTECTED, so
let's check for that.
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20200319131921.2367-8-frankja@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Protected VMs no longer intercept with code 4 for an instruction
interception. Instead they have codes 104 and 108 for protected
instruction interception and protected instruction notification
respectively.
The 104 mirrors the 4 interception.
The 108 is a notification interception to let KVM and QEMU know that
something changed and we need to update tracking information or
perform specific tasks. It's currently taken for the following
instructions:
* spx (To inform about the changed prefix location)
* sclp (On incorrect SCCB values, so we can inject a IRQ)
* sigp (All but "stop and store status")
* diag308 (Subcodes 0/1)
Of these exits only sclp errors, state changing sigps and diag308 will
reach QEMU. QEMU will do its parts of the job, while the ultravisor
has done the instruction part of the job.
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20200319131921.2367-7-frankja@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The unpack facility provides the means to setup a protected guest. A
protected guest cannot be introspected by the hypervisor or any
user/administrator of the machine it is running on.
Protected guests are encrypted at rest and need a special boot
mechanism via diag308 subcode 8 and 10.
Code 8 sets the PV specific IPLB which is retained separately from
those set via code 5.
Code 10 is used to unpack the VM into protected memory, verify its
integrity and start it.
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Co-developed-by: Christian Borntraeger <borntraeger@de.ibm.com> [Changes
to machine]
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20200323083606.24520-1-frankja@linux.ibm.com>
[CH: fixed up KVM_PV_VM_ -> KVM_PV_]
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The cpu number reporting is handled by KVM and QEMU only fills in the
VM name, uuid and other values.
Unfortunately KVM doesn't report reserved cpus and doesn't even know
they exist until the are created via the ioctl.
So let's fix up the cpu values after KVM has written its values to the
3.2.2 sysib. To be consistent, we use the same code to retrieve the cpu
numbers as the STSI TCG code in target/s390x/misc_helper.c:HELPER(stsi).
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-Id: <20200331110123.3774-1-frankja@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Up to now we only had an ioctl to reset vcpu data QEMU couldn't reach
for the initial reset, which was also called for the clear reset. To
be architecture compliant, we also need to clear local interrupts on a
normal reset.
Because of this and the upcoming protvirt support we need to add
ioctls for the missing clear and normal resets.
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Message-Id: <20200214151636.8764-3-frankja@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The AIS feature has been disabled late in the v2.10 development cycle since
there were some issues with migration (see commit 3f2d07b3b0 -
"s390x/ais: for 2.10 stable: disable ais facility"). We originally wanted
to enable it again for newer machine types, but apparently we forgot to do
this so far. Let's do it now for the machines that support proper CPU models.
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1756946
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20200122101437.5069-1-thuth@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Tested-by: Matthew Rosato <mjrosato@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The KVMState struct is opaque, so provide accessors for the fields
that will be moved from current_machine to the accelerator. For now
they just forward to the machine object, but this will change.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It defaults to returning 0 anyway and that return value is not
necessary, as 0 is also the default rc that the caller would return.
While doing that we can simplify the logic a bit and return early if
we inject a PGM exception.
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20191129091713.4582-1-frankja@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
We have to set the default model of all machine classes, not just for
the active one. Otherwise, "query-machines" will indicate the wrong
CPU model ("qemu-s390x-cpu" instead of "host-s390x-cpu") as
"default-cpu-type".
Doing a
{"execute":"query-machines"}
under KVM now results in
{"return": [
{
"hotpluggable-cpus": true,
"name": "s390-ccw-virtio-4.0",
"numa-mem-supported": false,
"default-cpu-type": "host-s390x-cpu",
"cpu-max": 248,
"deprecated": false},
{
"hotpluggable-cpus": true,
"name": "s390-ccw-virtio-2.7",
"numa-mem-supported": false,
"default-cpu-type": "host-s390x-cpu",
"cpu-max": 248,
"deprecated": false
} ...
Libvirt probes all machines via "-machine none,accel=kvm:tcg" and will
currently see the wrong CPU model under KVM.
Reported-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Fixes: b6805e127c ("s390x: use generic cpu_model parsing")
Cc: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20191021100515.6978-1-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Instead of splitting at an unaligned address, we can simply split at
4TB.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
s390 was trying to solve limited KVM memslot size issue by abusing
memory_region_allocate_system_memory(), which breaks API contract
where the function might be called only once.
Beside an invalid use of API, the approach also introduced migration
issue, since RAM chunks for each KVM_SLOT_MAX_BYTES are transferred in
migration stream as separate RAMBlocks.
After discussion [1], it was agreed to break migration from older
QEMU for guest with RAM >8Tb (as it was relatively new (since 2.12)
and considered to be not actually used downstream).
Migration should keep working for guests with less than 8TB and for
more than 8TB with QEMU 4.2 and newer binary.
In case user tries to migrate more than 8TB guest, between incompatible
QEMU versions, migration should fail gracefully due to non-exiting
RAMBlock ID or RAMBlock size mismatch.
Taking in account above and that now KVM code is able to split too
big MemorySection into several memslots, partially revert commit
(bb223055b s390-ccw-virtio: allow for systems larger that 7.999TB)
and use kvm_set_max_memslot_size() to set KVMSlot size to
KVM_SLOT_MAX_BYTES.
1) [PATCH RFC v2 4/4] s390: do not call memory_region_allocate_system_memory() multiple times
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20190924144751.24149-5-imammedo@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Since QEMU v2.10, the KVM acceleration does not work on older kernels
anymore since the code accidentally requires the KVM_CAP_DEVICE_CTRL
capability now - it should have been optional instead.
Instead of fixing the bug, we asked in the ChangeLog of QEMU 2.11 - 3.0
that people should speak up if they still need support of QEMU running
with KVM on older kernels, but seems like nobody really complained.
Thus let's make this official now and turn it into a proper error
message, telling the users to use at least kernel 3.15 now.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20190913091443.27565-1-thuth@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
sysemu/sysemu.h is a rather unfocused dumping ground for stuff related
to the system-emulator. Evidence:
* It's included widely: in my "build everything" tree, changing
sysemu/sysemu.h still triggers a recompile of some 1100 out of 6600
objects (not counting tests and objects that don't depend on
qemu/osdep.h, down from 5400 due to the previous two commits).
* It pulls in more than a dozen additional headers.
Split stuff related to run state management into its own header
sysemu/runstate.h.
Touching sysemu/sysemu.h now recompiles some 850 objects. qemu/uuid.h
also drops from 1100 to 850, and qapi/qapi-types-run-state.h from 4400
to 4200. Touching new sysemu/runstate.h recompiles some 500 objects.
Since I'm touching MAINTAINERS to add sysemu/runstate.h anyway, also
add qemu/main-loop.h.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190812052359.30071-30-armbru@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
[Unbreak OS-X build]
In my "build everything" tree, changing qemu/main-loop.h triggers a
recompile of some 5600 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h). It includes block/aio.h,
which in turn includes qemu/event_notifier.h, qemu/notify.h,
qemu/processor.h, qemu/qsp.h, qemu/queue.h, qemu/thread-posix.h,
qemu/thread.h, qemu/timer.h, and a few more.
Include qemu/main-loop.h only where it's needed. Touching it now
recompiles only some 1700 objects. For block/aio.h and
qemu/event_notifier.h, these numbers drop from 5600 to 2800. For the
others, they shrink only slightly.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190812052359.30071-21-armbru@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
In my "build everything" tree, changing hw/hw.h triggers a recompile
of some 2600 out of 6600 objects (not counting tests and objects that
don't depend on qemu/osdep.h).
The previous commits have left only the declaration of hw_error() in
hw/hw.h. This permits dropping most of its inclusions. Touching it
now recompiles less than 200 objects.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190812052359.30071-19-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
The global smp variables in s390x are replaced with smp machine properties.
A local variable of the same name would be introduced in the declaration
phase if it's used widely in the context OR replace it on the spot if it's
only used once. No semantic changes.
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Message-Id: <20190518205428.90532-7-like.xu@linux.intel.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
[ehabkost: fix build failure at VCPU_IRQ_BUF_SIZE]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
fixup! hw/s390x: Replace global smp variables with machine smp properties
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Simiar to how kvm_init_vcpu() calls kvm_arch_init_vcpu() to perform
arch-dependent initialisation, introduce kvm_arch_destroy_vcpu()
to be called from kvm_destroy_vcpu() to perform arch-dependent
destruction.
This was added because some architectures (Such as i386)
currently do not free memory that it have allocated in
kvm_arch_init_vcpu().
Suggested-by: Maran Wilson <maran.wilson@oracle.com>
Reviewed-by: Maran Wilson <maran.wilson@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Message-Id: <20190619162140.133674-3-liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
CPU_DoubleU is primarily used to reinterpret between integer and floats.
We don't really need this functionality. So let's just keep it simple
and use an uint64_t.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Provide the MSA9 facility (stfle.155). This also contains pckmo
subfunctions for key wrapping. Keep them in a separate group to disable
those as a block if necessary. This is for example needed when disabling
key wrapping via the HMC.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <20190429090250.7648-5-borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Right now we configure the pagesize quite early, when initializing KVM.
This is long before system memory is actually allocated via
memory_region_allocate_system_memory(), and therefore memory backends
marked as mapped.
Instead, let's configure the maximum page size after initializing
memory in s390_memory_init(). cap_hpage_1m is still properly
configured before creating any CPUs, and therefore before configuring
the CPU model and eventually enabling CMMA.
This is not a fix but rather a preparation for the future, when initial
memory might reside on memory backends (not the case for s390x right now)
We will replace qemu_getrampagesize() soon by a function that will always
return the maximum page size (not the minimum page size, which only
works by pure luck so far, as there are no memory backends).
Acked-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190417113143.5551-2-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
kvm_s390_mem_op() can fail in two ways: when !cap_mem_op, it returns
-ENOSYS, and when kvm_vcpu_ioctl() fails, it returns -errno set by
ioctl(). Its caller s390_cpu_virt_mem_rw() recovers from both
failures.
kvm_s390_mem_op() prints "KVM_S390_MEM_OP failed" with error_printf()
in the latter failure mode. Since this is obviously a warning, use
warn_report().
Perhaps the reporting should be left to the caller. It could warn on
failure other than -ENOSYS.
Cc: Thomas Huth <thuth@redhat.com>
Cc: qemu-s390x@nongnu.org
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20190417190641.26814-9-armbru@redhat.com>
We tried to make pci support optional on s390x in the past;
unfortunately, we still require the s390 phb to be created
unconditionally due to backwards compatibility issues.
Instead of sinking more effort into this (including compat
handling for older machines etc.) for non-obvious gains, let's
just make CONFIG_PCI something that is always set on s390x.
Note that you can still fence off pci for the _guest_ if you
provide a cpu model without the zpci feature.
Message-Id: <20190211113255.3837-1-cohuck@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The license information in these files is rather confusing. The text
declares LGPL first, but then says that contributions after 2012 are
licensed under the GPL instead. How should the average user who just
downloaded the release tarball know which part is now GPL and which
is LGPL?
Looking at the text of the LGPL (see COPYING.LIB in the top directory),
the license clearly states how this should be done instead:
"3. You may opt to apply the terms of the ordinary GNU General Public
License instead of this License to a given copy of the Library. To do
this, you must alter all the notices that refer to this License, so
that they refer to the ordinary GNU General Public License, version 2,
instead of to this License."
Thus let's clean up the confusing statements and use the proper GPL
text only.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1549456893-16589-1-git-send-email-thuth@redhat.com>
Acked-by: Laurent Vivier <laurent@vivier.eu>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
I plan to deprecate -mem-path option and replace it with memory-backend,
for that it's necessary to get rid of mem_path global variable.
Do it for s390x case, replacing it with alternative way to enable
1Mb hugepages capability.
Todo that replace qemu_mempath_getpagesize() with qemu_getrampagesize()
which also checks for -mem-path provided RAM.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-Id: <1548834906-133241-1-git-send-email-imammedo@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Let's use the KVM_SET_DEVICE_ATTR ioctl to enable hardware
interpretation of AP instructions executed on the guest.
If the S390_FEAT_AP feature is switched on for the guest,
AP instructions must be interpreted by default; otherwise,
they will be intercepted.
This attribute setting may be overridden by a device. For example,
a device may want to provide AP instructions to the guest (i.e.,
S390_FEAT_AP turned on), but it may want to emulate them. In this
case, the AP instructions executed on the guest must be
intercepted; so when the device is realized, it must disable
interpretation.
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Tested-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <20181010170309.12045-4-akrowiak@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Move it into TCG-only code and provide a stub. Turn it into noreturn.
As Richard noted, we currently don't log the psw.addr before restoring
the state, fix that by moving (duplicating) the qemu_log_mask in the
tcg/kvm handlers.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180927130303.12236-2-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
As the kernel has no way of disallowing the start of a huge page
backed VM, we can migrate a running huge backed VM to a host that has
no huge page KVM support.
Let's glue huge page support support to the 3.1 machine, so we do not
migrate to a destination host that doesn't have QEMU huge page support
and can stop migration if KVM doesn't indicate support.
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Message-Id: <20180928093435.198573-1-frankja@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
QEMU has had huge page support for a longer time already, but KVM
memory management under s390x needed some changes to work with huge
backings.
Now that we have support, let's enable it if requested and
available. Otherwise we now properly tell the user if there is no
support and back out instead of failing to run the VM later on.
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180802070201.257406-1-frankja@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Provide the etoken facility. We need to handle cpu model, migration and
clear reset.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Janosch Frank <frankja@linux.ibm.com>
Message-Id: <20180731090448.36662-3-borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Let's do this for completeness reason, although we don't support e.g.
PCDIMM/NVDIMM, which would use the alignment for placing the memory
region in guest physical memory. But maybe someday we would want to
support something like this - then we don't forget about this if
allowing multiple allocations in legacy_s390_alloc().
Use the same alignment as we would set in qemu_anon_ram_alloc(). Our
fixed address satisfies this alignment (1MB). This implicitly sets the
alignment of the underlying memory region.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180628113817.30814-3-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
We always allocate at a fixed address, a second allocation can therefore
of course never work. We would simply overwrite mappings.
This can e.g. happen in s390_memory_init(), if trying to allocate more
than > 8TB. Let's just bail out, as there is no need for supporting it
(legacy handling for z/VM).
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180628113817.30814-2-david@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
We are going to factor out the TOD into a separate device and use const
pointers for device class functions where possible. We are passing right
now ordinary pointers that should never be touched when setting the TOD.
Let's just pass the values directly.
Note that s390_set_clock() will be removed in a follow-on patch and
therefore its calling convention is not changed.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180627134410.4901-3-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Calling pause_all_vcpus()/resume_all_vcpus() from a VCPU thread might
not be the best idea. As pause_all_vcpus() temporarily drops the qemu
mutex, two parallel calls to pause_all_vcpus() can be active at a time,
resulting in a deadlock. (either by two VCPUs or by the main thread and a
VCPU)
Let's handle it via the main loop instead, as suggested by Paolo. If we
would have two parallel reset requests by two different VCPUs at the
same time, the last one would win.
We use the existing ipl device to handle it. The nice side effect is
that we can get rid of reipl_requested.
This change implies that all reset handling now goes via the common
path, so "no-reboot" handling is now active for all kinds of reboots.
Let's execute any CPU initialization code on the target CPU using
run_on_cpu.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180424101859.10239-1-david@redhat.com>
Acked-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
We have a call to cpu_synchronize_state() on every kvm_arch_handle_exit().
Let's remove the ones that are no longer needed.
Remaining places (for s390x) are in
- target/s390x/sigp.c, on the target CPU
- target/s390x/cpu.c:s390_cpu_get_crash_info()
While at it, use kvm_cpu_synchronize_state() instead of
cpu_synchronize_state() in KVM code. (suggested by Thomas Huth)
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180412093521.2469-1-david@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Manually having to use cpu_synchronize_state() is error prone. And as
Christian Borntraeger discovered, e.g. handle_diag() is currently
missing a cpu_synchronize_state(), as decode_basedisp_s() uses a
general purpose register value internally.
So let's do an overall cpu_synchronize_state(), which fixes at least the
one mentioned BUG. We will clean up the superfluous cpu_synchronize_state()
calls later.
We now also call it (although maybe not neded) for
- KVM_EXIT_S390_RESET -> s390_reipl_request()
- KVM_EXIT_DEBUG -> kvm_arch_handle_debug_exit()
- unmanagable/unimplemented intercepts
- ICPT_CPU_STOP -> do_stop_interrupt() -> cpu gets halted
- Scenarios where we inject an operation exception
- handle_stsi()
I don't think any of these are performance critical. Especially as we
have all information directly contained in kvm_run, there are no
additional IOCTLs to issue on modern kernels.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180406093552.13016-1-david@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
In my "build everything" tree, a change to the types in
qapi-schema.json triggers a recompile of about 4800 out of 5100
objects.
The previous commit split up qmp-commands.h, qmp-event.h, qmp-visit.h,
qapi-types.h. Each of these headers still includes all its shards.
Reduce compile time by including just the shards we actually need.
To illustrate the benefits: adding a type to qapi/migration.json now
recompiles some 2300 instead of 4800 objects. The next commit will
improve it further.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180211093607.27351-24-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
[eblake: rebase to master]
Signed-off-by: Eric Blake <eblake@redhat.com>
Not needed anymore after removal of the memory hotplug code.
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Presently s390x is the only architecture not exposing specific
CPU information via QMP query-cpus. Upstream discussion has shown
that it could make sense to report the architecture specific CPU
state, e.g. to detect that a CPU has been stopped.
With this change the output of query-cpus will look like this on
s390:
[
{"arch": "s390", "current": true,
"props": {"core-id": 0}, "cpu-state": "operating", "CPU": 0,
"qom_path": "/machine/unattached/device[0]",
"halted": false, "thread_id": 63115},
{"arch": "s390", "current": false,
"props": {"core-id": 1}, "cpu-state": "stopped", "CPU": 1,
"qom_path": "/machine/unattached/device[1]",
"halted": true, "thread_id": 63116}
]
This change doesn't add the s390-specific data to HMP 'info cpus'.
A follow-on patch will remove all architecture specific information
from there.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1518797321-28356-2-git-send-email-mihajlov@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
This patch is the s390 implementation of guest crash information,
similar to commit d187e08dc4 ("i386/cpu: add crash-information QOM
property") and the related commits. We will detect several crash
reasons, with the "disabled wait" being the most important one, since
this is used by all s390 guests as a "panic like" notification.
Demonstrate these ways with examples as follows.
1. crash-information QOM property;
Run qemu with -qmp unix:qmp-sock,server, then use utility "qmp-shell"
to execute "qom-get" command, and might get the result like,
(QEMU) (QEMU) qom-get path=/machine/unattached/device[0] \
property=crash-information
{"return": {"core": 0, "reason": "disabled-wait", "psw-mask": 562956395872256, \
"type": "s390", "psw-addr": 1102832}}
2. GUEST_PANICKED event reporting;
Run qemu with a socket option, and telnet or nc to that,
-chardev socket,id=qmp,port=4444,host=localhost,server \
-mon chardev=qmp,mode=control,pretty=on \
Negotiating the mode by { "execute": "qmp_capabilities" }, and the crash
information will be reported on a guest crash event like,
{
"timestamp": {
"seconds": 1518004739,
"microseconds": 552563
},
"event": "GUEST_PANICKED",
"data": {
"action": "pause",
"info": {
"core": 0,
"psw-addr": 1102832,
"reason": "disabled-wait",
"psw-mask": 562956395872256,
"type": "s390"
}
}
}
3. log;
Run qemu with the parameters: -D <logfile> -d guest_errors, to
specify the logfile and log item. The results might be,
Guest crashed on cpu 0: disabled-wait
PSW: 0x0002000180000000 0x000000000010d3f0
Co-authored-by: Jing Liu <liujbjl@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <20180209122543.25755-1-borntraeger@de.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[CH: tweaked qapi comment]
Signed-off-by: Cornelia Huck <cohuck@redhat.com>