Commit Graph

154 Commits

Author SHA1 Message Date
Harsh Prateek Bora
cfb52d07f5 target/ppc: handle vcpu hotplug failure gracefully
On ppc64, the PowerVM hypervisor runs with limited memory and a VCPU
creation during hotplug may fail during kvm_ioctl for KVM_CREATE_VCPU,
leading to termination of guest since errp is set to &error_fatal while
calling kvm_init_vcpu. This unexpected behaviour can be avoided by
pre-creating and parking vcpu on success or return error otherwise.
This enables graceful error delivery for any vcpu hotplug failures while
the guest can keep running.

Also introducing KVM AccelCPUClass to init cpu_target_realize for kvm.

Tested OK by repeatedly doing a hotplug/unplug of vcpus as below:

 #virsh setvcpus hotplug 40
 #virsh setvcpus hotplug 70
error: internal error: unable to execute QEMU command 'device_add':
kvmppc_cpu_realize: vcpu hotplug failed with -12

Signed-off by: Harsh Prateek Bora <harshpb@linux.ibm.com>

Reported-by: Anushree Mathur <anushree.mathur@linux.vnet.ibm.com>
Suggested-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Suggested-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Tested-by: Anushree Mathur <anushree.mathur@linux.vnet.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-07-26 09:21:06 +10:00
Alex Bennée
5b7d54d4ed gdbstub: move enums into separate header
This is an experiment to further reduce the amount we throw into the
exec headers. It might not be as useful as I initially thought because
just under half of the users also need gdbserver_start().

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240620152220.2192768-3-alex.bennee@linaro.org>
2024-06-24 10:14:17 +01:00
Nicholas Piggin
c700b5e162 spapr: avoid overhead of finding vhyp class in critical operations
PPC_VIRTUAL_HYPERVISOR_GET_CLASS is used in critical operations like
interrupts and TLB misses and is quite costly. Running the
kvm-unit-tests sieve program with radix MMU enabled thrashes the TCG
TLB and spends a lot of time in TLB and page table walking code. The
test takes 67 seconds to complete with a lot of time being spent in
code related to finding the vhyp class:

   12.01%  [.] g_str_hash
    8.94%  [.] g_hash_table_lookup
    8.06%  [.] object_class_dynamic_cast
    6.21%  [.] address_space_ldq
    4.94%  [.] __strcmp_avx2
    4.28%  [.] tlb_set_page_full
    4.08%  [.] address_space_translate_internal
    3.17%  [.] object_class_dynamic_cast_assert
    2.84%  [.] ppc_radix64_xlate

Keep a pointer to the class and avoid this lookup. This reduces the
execution time to 40 seconds.

Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24 08:57:50 +10:00
Paolo Bonzini
566abdb4d9 kvm: ppc: disable sPAPR code if CONFIG_PSERIES is disabled
target/ppc/kvm.c calls out to code in hw/ppc/spapr*.c; that code is
not present and fails to link if CONFIG_PSERIES is not enabled.
Adjust kvm.c to depend on CONFIG_PSERIES instead of TARGET_PPC64,
and compile out anything that requires cap_papr, because only
the pseries machine will call kvmppc_set_papr().

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-03 15:47:47 +02:00
Paolo Bonzini
a99c0c66eb KVM: remove kvm_arch_cpu_check_are_resettable
Board reset requires writing a fresh CPU state.  As far as KVM is
concerned, the only thing that blocks reset is that CPU state is
encrypted; therefore, kvm_cpus_are_resettable() can simply check
if that is the case.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-04-23 17:35:25 +02:00
Peter Maydell
51e31f2140 * PAPR nested hypervisor host implementation for spapr TCG
* excp_helper.c code cleanups and improvements
 * Move more ops to decodetree
 * Deprecate pseries-2.12 machines and P9 and P10 DD1.0 CPUs
 * Document running Linux on AmigaNG
 * Update dt feature advertising POWER CPUs.
 * Add P10 PMU SPRs
 * Improve pnv topology calculation for SMT8 CPUs.
 * Various bug fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEETkN92lZhb0MpsKeVZ7MCdqhiHK4FAmXwiT8ACgkQZ7MCdqhi
 HK7C/w//XxEO2bQTFPLFDTrP/voq7pcX8XeQNVyXCkXYjvsbu05oQow50k+Y5UAE
 US4MFjt8jFz0vuIKuKyoA3kG41zDSOzoX4TQXMM+tyTWbuFF3KAyfizb1xE6SYAN
 xJEGvmiXv/EgoSBD7BTKQp1tMPdIGZLwSdYiA0lmOo7YaMCgYAXaujW5hnNjQecT
 873sN+10pHtQY++mINtD9Nfb6AcDGMWw0b+bykqIXhNRkI8IGOS4WF4vAuMBrwfe
 UM00wDnNRb86Dk14bv2XVNDr6/i0VRtUMwM4yiptrQ1TQx18LZaPSQFYjQfPaan7
 LwN4QkMFnBX54yJ7Npvjvu8BCBF47kwOVu4CIAFJ4sIm0WfTmozDpPttwcZ5w7Ve
 iXDOB9ECAB4pQ2rCgbSNG8MYUZgoHHOuThqolOP0Vh9NHRRJxpdw6CyAbmCGftc0
 lvRDPFiKp8xmCNJ/j3XzoUdHoG7NMwpUmHv9ruGU18SdQ8hyJN9AcQGWYrB4v0RV
 /hs2RAbwntG7ahkcwd8uy5aFw88Wph/uGXPXc49EWj7i49vHeIV2y5+gtthMywje
 qqjFXkistXuF+JHVnyoYmqqCyXaHX5CEwtawMv4EQeaJs76bLhMeMTKKl9rRp8qB
 DtbIZphO8iMsocrBnje48sA5HR0PM+H4HTjw10i8R0fLlWitaIY=
 =XnY5
 -----END PGP SIGNATURE-----

Merge tag 'pull-ppc-for-9.0-2-20240313' of https://gitlab.com/npiggin/qemu into staging

* PAPR nested hypervisor host implementation for spapr TCG
* excp_helper.c code cleanups and improvements
* Move more ops to decodetree
* Deprecate pseries-2.12 machines and P9 and P10 DD1.0 CPUs
* Document running Linux on AmigaNG
* Update dt feature advertising POWER CPUs.
* Add P10 PMU SPRs
* Improve pnv topology calculation for SMT8 CPUs.
* Various bug fixes.

# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCgAdFiEETkN92lZhb0MpsKeVZ7MCdqhiHK4FAmXwiT8ACgkQZ7MCdqhi
# HK7C/w//XxEO2bQTFPLFDTrP/voq7pcX8XeQNVyXCkXYjvsbu05oQow50k+Y5UAE
# US4MFjt8jFz0vuIKuKyoA3kG41zDSOzoX4TQXMM+tyTWbuFF3KAyfizb1xE6SYAN
# xJEGvmiXv/EgoSBD7BTKQp1tMPdIGZLwSdYiA0lmOo7YaMCgYAXaujW5hnNjQecT
# 873sN+10pHtQY++mINtD9Nfb6AcDGMWw0b+bykqIXhNRkI8IGOS4WF4vAuMBrwfe
# UM00wDnNRb86Dk14bv2XVNDr6/i0VRtUMwM4yiptrQ1TQx18LZaPSQFYjQfPaan7
# LwN4QkMFnBX54yJ7Npvjvu8BCBF47kwOVu4CIAFJ4sIm0WfTmozDpPttwcZ5w7Ve
# iXDOB9ECAB4pQ2rCgbSNG8MYUZgoHHOuThqolOP0Vh9NHRRJxpdw6CyAbmCGftc0
# lvRDPFiKp8xmCNJ/j3XzoUdHoG7NMwpUmHv9ruGU18SdQ8hyJN9AcQGWYrB4v0RV
# /hs2RAbwntG7ahkcwd8uy5aFw88Wph/uGXPXc49EWj7i49vHeIV2y5+gtthMywje
# qqjFXkistXuF+JHVnyoYmqqCyXaHX5CEwtawMv4EQeaJs76bLhMeMTKKl9rRp8qB
# DtbIZphO8iMsocrBnje48sA5HR0PM+H4HTjw10i8R0fLlWitaIY=
# =XnY5
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 12 Mar 2024 16:56:31 GMT
# gpg:                using RSA key 4E437DDA56616F4329B0A79567B30276A8621CAE
# gpg: Good signature from "Nicholas Piggin <npiggin@gmail.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 4E43 7DDA 5661 6F43 29B0  A795 67B3 0276 A862 1CAE

* tag 'pull-ppc-for-9.0-2-20240313' of https://gitlab.com/npiggin/qemu: (38 commits)
  spapr: nested: Introduce cap-nested-papr for Nested PAPR API
  spapr: nested: Introduce H_GUEST_RUN_VCPU hcall.
  spapr: nested: Use correct source for parttbl info for nested PAPR API.
  spapr: nested: Introduce H_GUEST_[GET|SET]_STATE hcalls.
  spapr: nested: Initialize the GSB elements lookup table.
  spapr: nested: Extend nested_ppc_state for nested PAPR API
  spapr: nested: Introduce H_GUEST_CREATE_VCPU hcall.
  spapr: nested: Introduce H_GUEST_[CREATE|DELETE] hcalls.
  spapr: nested: Introduce H_GUEST_[GET|SET]_CAPABILITIES hcalls.
  spapr: nested: Document Nested PAPR API
  spapr: nested: keep nested-hv related code restricted to its API.
  spapr: nested: Introduce SpaprMachineStateNested to store related info.
  spapr: nested: move nested part of spapr_get_pate into spapr_nested.c
  spapr: nested: register nested-hv api hcalls only for cap-nested-hv
  target/ppc: Remove interrupt handler wrapper functions
  target/ppc: Clean up ifdefs in excp_helper.c, part 3
  target/ppc: Clean up ifdefs in excp_helper.c, part 2
  target/ppc: Clean up ifdefs in excp_helper.c, part 1
  target/ppc: Add gen_exception_err_nip() function
  target/ppc: Readability improvements in exception handlers
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-03-13 12:37:27 +00:00
Nicholas Piggin
8f054d9ee8 ppc: Drop support for POWER9 and POWER10 DD1 chips
The POWER9 DD1 and POWER10 DD1 chips are not public and are no longer of
any use in QEMU. Remove them.

Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-03-13 02:47:04 +10:00
Philippe Mathieu-Daudé
794511bc51 target/ppc: Prefer fast cpu_env() over slower CPU QOM cast macro
Mechanical patch produced running the command documented
in scripts/coccinelle/cpu_env.cocci_template header.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240129164514.73104-22-philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-03-12 12:04:24 +01:00
Philippe Mathieu-Daudé
ee1004bba6 bulk: Access existing variables initialized to &S->F when available
When a variable is initialized to &struct->field, use it
in place. Rationale: while this makes the code more concise,
this also helps static analyzers.

Mechanical change using the following Coccinelle spatch script:

 @@
 type S, F;
 identifier s, m, v;
 @@
      S *s;
      ...
      F *v = &s->m;
      <+...
 -    &s->m
 +    v
      ...+>

Inspired-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240129164514.73104-2-philmd@linaro.org>
Acked-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
[thuth: Dropped hunks that need a rebase, and fixed sizeof() in pmu_realize()]
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-03-12 11:46:16 +01:00
Thomas Huth
97c2fc5076 target/ppc/kvm: Replace variable length array in kvmppc_read_hptes()
HPTES_PER_GROUP is 8 and HASH_PTE_SIZE_64 is 16, so we don't waste
too many bytes by always allocating the maximum amount of bytes on
the stack here to get rid of the variable length array.

Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Message-ID: <20240221162636.173136-3-thuth@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-02-23 08:13:52 +01:00
Thomas Huth
aba594da96 target/ppc/kvm: Replace variable length array in kvmppc_save_htab()
To be able to compile QEMU with -Wvla (to prevent potential security
issues), we need to get rid of the variable length array in the
kvmppc_save_htab() function. Replace it with a heap allocation instead.

Message-ID: <20240221162636.173136-2-thuth@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-02-23 08:13:52 +01:00
Stefan Hajnoczi
195801d700 system/cpus: rename qemu_mutex_lock_iothread() to bql_lock()
The Big QEMU Lock (BQL) has many names and they are confusing. The
actual QemuMutex variable is called qemu_global_mutex but it's commonly
referred to as the BQL in discussions and some code comments. The
locking APIs, however, are called qemu_mutex_lock_iothread() and
qemu_mutex_unlock_iothread().

The "iothread" name is historic and comes from when the main thread was
split into into KVM vcpu threads and the "iothread" (now called the main
loop thread). I have contributed to the confusion myself by introducing
a separate --object iothread, a separate concept unrelated to the BQL.

The "iothread" name is no longer appropriate for the BQL. Rename the
locking APIs to:
- void bql_lock(void)
- void bql_unlock(void)
- bool bql_locked(void)

There are more APIs with "iothread" in their names. Subsequent patches
will rename them. There are also comments and documentation that will be
updated in later patches.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Acked-by: Fabiano Rosas <farosas@suse.de>
Acked-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Acked-by: Peter Xu <peterx@redhat.com>
Acked-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Acked-by: Hyman Huang <yong.huang@smartx.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-id: 20240102153529.486531-2-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2024-01-08 10:45:43 -05:00
Philippe Mathieu-Daudé
aa6edf97ce sysemu/kvm: Restrict kvmppc_get_radix_page_info() to ppc targets
kvm_get_radix_page_info() is only defined for ppc targets (in
target/ppc/kvm.c). The declaration is not useful in other targets,
reduce its scope.
Rename using the 'kvmppc_' prefix following other declarations
from target/ppc/kvm_ppc.h.

Suggested-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20231003070427.69621-2-philmd@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2023-11-07 12:13:28 +01:00
Cédric Le Goater
0edc9e45f3 target/ppc: Clean up local variable shadowing in kvm_arch_*_registers()
Remove extra 'i' variable to fix this warning :

  ../target/ppc/kvm.c: In function ‘kvm_arch_put_registers’:
  ../target/ppc/kvm.c:963:13: warning: declaration of ‘i’ shadows a previous local [-Wshadow=compatible-local]
    963 |         int i;
        |             ^
  ../target/ppc/kvm.c:906:9: note: shadowed declaration is here
    906 |     int i;
        |         ^
  ../target/ppc/kvm.c: In function ‘kvm_arch_get_registers’:
  ../target/ppc/kvm.c:1265:13: warning: declaration of ‘i’ shadows a previous local [-Wshadow=compatible-local]
   1265 |         int i;
        |             ^
  ../target/ppc/kvm.c:1212:9: note: shadowed declaration is here
   1212 |     int i, ret;
        |         ^

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-ID: <20231006053526.1031252-1-clg@kaod.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2023-10-06 13:27:48 +02:00
jianchunfu
76d93e1467 target/ppc: Fix the order of kvm_enable judgment about kvmppc_set_interrupt()
It's unnecessary for non-KVM accelerators(TCG, for example),
to call this function, so change the order of kvm_enable() judgment.

The static inline function that returns -1 directly does not work
 in TCG's situation.

Signed-off-by: jianchunfu <chunfu.jian@shingroup.cn>
Tested-by: Gautam Menghani <gautam@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2023-09-06 11:19:33 +02:00
Anton Johansson
b8a6eb1862 sysemu/kvm: Use vaddr for kvm_arch_[insert|remove]_hw_breakpoint
Changes the signature of the target-defined functions for
inserting/removing kvm hw breakpoints. The address and length arguments
are now of vaddr type, which both matches the type used internally in
accel/kvm/kvm-all.c and makes the api target-agnostic.

Signed-off-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230807155706.9580-4-anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-08-24 11:21:35 -07:00
Akihiko Odaki
5e0d65909c kvm: Introduce kvm_arch_get_default_type hook
kvm_arch_get_default_type() returns the default KVM type. This hook is
particularly useful to derive a KVM type that is valid for "none"
machine model, which is used by libvirt to probe the availability of
KVM.

For MIPS, the existing mips_kvm_type() is reused. This function ensures
the availability of VZ which is mandatory to use KVM on the current
QEMU.

Cc: qemu-stable@nongnu.org
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-id: 20230727073134.134102-2-akihiko.odaki@daynix.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: added doc comment for new function]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2023-08-22 17:31:02 +01:00
Cédric Le Goater
c4550e6e98 target/ppc: Fix timer register accessors when !KVM
When the Timer Control and Timer Status registers are modified, avoid
calling the KVM backend when not available

Signed-off-by: Cédric Le Goater <clg@kaod.org>
2023-06-25 22:41:30 +02:00
Nicholas Piggin
ccc5a4c5e1 spapr: Add SPAPR_CAP_AIL_MODE_3 for AIL mode 3 support for H_SET_MODE hcall
The behaviour of the Address Translation Mode on Interrupt resource is
not consistently supported by all CPU versions or all KVM versions: KVM
HV does not support mode 2, and does not support mode 3 on POWER7 or
early POWER9 processesors. KVM PR only supports mode 0. TCG supports all
modes (0, 2, 3) on CPUs with support for the corresonding LPCR[AIL] mode.
This leads to inconsistencies in guest behaviour and could cause problems
migrating guests.

This was not noticable for Linux guests for a long time because the
kernel only uses modes 0 and 3, and it used to consider AIL-3 to be
advisory in that it would always keep the AIL-0 vectors around, so it
did not matter whether or not interrupts were delivered according to
the AIL mode. Recent Linux guests depend on AIL mode 3 working as
specified in order to support the SCV facility interrupt. If AIL-3 can
not be provided, then H_SET_MODE must return an error to Linux so it can
disable the SCV facility (failure to do so can lead to userspace being
able to crash the guest kernel).

Add the ail-mode-3 capability to specify that AIL-3 is supported. AIL-0
is implied as the baseline, and AIL-2 is no longer supported by spapr.
AIL-2 is not known to be used by any software, but support in TCG could
be restored with an ail-mode-2 capability quite easily if a regression
is reported.

Modify the H_SET_MODE Address Translation Mode on Interrupt resource
handler to check capabilities and correctly return error if not
supported.

KVM has a cap to advertise support for AIL-3.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20230515160216.394612-1-npiggin@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2023-05-28 07:13:54 -03:00
Harsh Prateek Bora
2060436aab ppc: spapr: cleanup cr get/set with helpers.
The bits in cr reg are grouped into eight 4-bit fields represented
by env->crf[8] and the related calculations should be abstracted to
keep the calling routines simpler to read. This is a step towards
cleaning up the related/calling code for better readability.

Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230503093619.2530487-2-harshpb@linux.ibm.com>
[danielhb: add 'const' modifier to fix linux-user build]
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2023-05-05 12:34:22 -03:00
Philippe Mathieu-Daudé
60f5fadd13 target/ppc/kvm: Remove unused "sysbus.h" header
Nothing requires SysBus declarations here.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221217172907.8364-6-philmd@linaro.org>
2023-02-27 22:29:01 +01:00
Paolo Bonzini
3dba0a335c kvm: allow target-specific accelerator properties
Several hypervisor capabilities in KVM are target-specific.  When exposed
to QEMU users as accelerator properties (i.e. -accel kvm,prop=value), they
should not be available for all targets.

Add a hook for targets to add their own properties to -accel kvm, for
now no such property is defined.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20220929072014.20705-3-chenyi.qiang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-10 09:23:16 +02:00
Murilo Opsfelder Araujo
1a42c69237 target/ppc/kvm: Skip current and parent directories in kvmppc_find_cpu_dt
Some systems have /proc/device-tree/cpus/../clock-frequency. However,
this is not the expected path for a CPU device tree directory.

Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com>
Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20220712210810.35514-1-muriloo@linux.ibm.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-07-18 13:59:43 -03:00
Víctor Colombo
ca241959cd target/ppc: Remove msr_ts macro
msr_ts macro hides the usage of env->msr, which is a bad
behavior. Substitute it with FIELD_EX64 calls that explicitly use
env->msr as a parameter.

Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220504210541.115256-19-victor.colombo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-05-05 15:36:17 -03:00
Víctor Colombo
0939b8f8df target/ppc: Remove msr_ee macro
msr_ee macro hides the usage of env->msr, which is a bad behavior
Substitute it with FIELD_EX64 calls that explicitly use env->msr
as a parameter.

Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220504210541.115256-8-victor.colombo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-05-05 15:36:17 -03:00
Daniel Henrique Barboza
55baf4b584 target/ppc: init 'rmmu_info' in kvm_get_radix_page_info()
Init the struct to avoid Valgrind complaints about unitialized bytes,
such as this one:

==39549== Syscall param ioctl(generic) points to uninitialised byte(s)
==39549==    at 0x55864E4: ioctl (in /usr/lib64/libc.so.6)
==39549==    by 0xD1F7EF: kvm_vm_ioctl (kvm-all.c:3035)
==39549==    by 0xAF8F5B: kvm_get_radix_page_info (kvm.c:276)
==39549==    by 0xB00533: kvmppc_host_cpu_class_init (kvm.c:2369)
==39549==    by 0xD3DCE7: type_initialize (object.c:366)
==39549==    by 0xD3FACF: object_class_foreach_tramp (object.c:1071)
==39549==    by 0x502757B: g_hash_table_foreach (in /usr/lib64/libglib-2.0.so.0.7000.5)
==39549==    by 0xD3FC1B: object_class_foreach (object.c:1093)
==39549==    by 0xB0141F: kvm_ppc_register_host_cpu_type (kvm.c:2613)
==39549==    by 0xAF87E7: kvm_arch_init (kvm.c:157)
==39549==    by 0xD1E2A7: kvm_init (kvm-all.c:2595)
==39549==    by 0x8E6E93: accel_init_machine (accel-softmmu.c:39)
==39549==  Address 0x1fff00e208 is on thread 1's stack
==39549==  in frame #2, created by kvm_get_radix_page_info (kvm.c:267)
==39549==  Uninitialised value was created by a stack allocation
==39549==    at 0xAF8EE8: kvm_get_radix_page_info (kvm.c:267)

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20220331001717.616938-5-danielhb413@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-05-05 15:36:16 -03:00
Daniel Henrique Barboza
b339427cfc target/ppc: init 'sregs' in kvmppc_put_books_sregs()
Init 'sregs' to avoid Valgrind complaints about uninitialized bytes
from kvmppc_put_books_sregs():

==54059== Thread 3:
==54059== Syscall param ioctl(generic) points to uninitialised byte(s)
==54059==    at 0x55864E4: ioctl (in /usr/lib64/libc.so.6)
==54059==    by 0xD1FA23: kvm_vcpu_ioctl (kvm-all.c:3053)
==54059==    by 0xAFB18B: kvmppc_put_books_sregs (kvm.c:891)
==54059==    by 0xAFB47B: kvm_arch_put_registers (kvm.c:949)
==54059==    by 0xD1EDA7: do_kvm_cpu_synchronize_post_init (kvm-all.c:2766)
==54059==    by 0x481AF3: process_queued_cpu_work (cpus-common.c:343)
==54059==    by 0x4EF247: qemu_wait_io_event_common (cpus.c:412)
==54059==    by 0x4EF343: qemu_wait_io_event (cpus.c:436)
==54059==    by 0xD21E83: kvm_vcpu_thread_fn (kvm-accel-ops.c:54)
==54059==    by 0xFFEBF3: qemu_thread_start (qemu-thread-posix.c:556)
==54059==    by 0x54E6DC3: start_thread (in /usr/lib64/libc.so.6)
==54059==    by 0x5596C9F: clone (in /usr/lib64/libc.so.6)
==54059==  Address 0x799d1cc is on thread 3's stack
==54059==  in frame #2, created by kvmppc_put_books_sregs (kvm.c:851)
==54059==  Uninitialised value was created by a stack allocation
==54059==    at 0xAFAEB0: kvmppc_put_books_sregs (kvm.c:851)

This happens because Valgrind does not consider the 'sregs'
initialization done by kvm_vcpu_ioctl() at the end of the function.

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20220331001717.616938-4-danielhb413@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-05-05 15:36:16 -03:00
Daniel Henrique Barboza
59411579b2 target/ppc: init 'lpcr' in kvmppc_enable_cap_large_decr()
'lpcr' is used as an input of kvm_get_one_reg(). Valgrind doesn't
understand that and it returns warnings as such for this function:

==55240== Thread 1:
==55240== Conditional jump or move depends on uninitialised value(s)
==55240==    at 0xB011E4: kvmppc_enable_cap_large_decr (kvm.c:2546)
==55240==    by 0x92F28F: cap_large_decr_cpu_apply (spapr_caps.c:523)
==55240==    by 0x930C37: spapr_caps_cpu_apply (spapr_caps.c:921)
==55240==    by 0x955D3B: spapr_reset_vcpu (spapr_cpu_core.c:73)
==55240==    by 0x95612B: spapr_cpu_core_reset (spapr_cpu_core.c:209)
==55240==    by 0x95619B: spapr_cpu_core_reset_handler (spapr_cpu_core.c:218)
==55240==    by 0xD3605F: qemu_devices_reset (reset.c:69)
==55240==    by 0x92112B: spapr_machine_reset (spapr.c:1641)
==55240==    by 0x4FBD63: qemu_system_reset (runstate.c:444)
==55240==    by 0x62812B: qdev_machine_creation_done (machine.c:1247)
==55240==    by 0x5064C3: qemu_machine_creation_done (vl.c:2725)
==55240==    by 0x5065DF: qmp_x_exit_preconfig (vl.c:2748)
==55240==  Uninitialised value was created by a stack allocation
==55240==    at 0xB01158: kvmppc_enable_cap_large_decr (kvm.c:2540)

Init 'lpcr' to avoid this warning.

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20220331001717.616938-3-danielhb413@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-05-05 15:36:16 -03:00
Daniel Henrique Barboza
942069e0d2 target/ppc: initialize 'val' union in kvm_get_one_spr()
Valgrind isn't convinced that we are initializing the values we assign
to env->spr[spr] because it doesn't understand that the 'val' union is
being written by the kvm_vcpu_ioctl() that follows (via struct
kvm_one_reg).

This results in Valgrind complaining about uninitialized values every
time we use env->spr in a conditional, like this instance:

==707578== Thread 1:
==707578== Conditional jump or move depends on uninitialised value(s)
==707578==    at 0xA10A40: hreg_compute_hflags_value (helper_regs.c:106)
==707578==    by 0xA10C9F: hreg_compute_hflags (helper_regs.c:173)
==707578==    by 0xA110F7: hreg_store_msr (helper_regs.c:262)
==707578==    by 0xA051A3: ppc_cpu_reset (cpu_init.c:7168)
==707578==    by 0xD4730F: device_transitional_reset (qdev.c:799)
==707578==    by 0xD4A11B: resettable_phase_hold (resettable.c:182)
==707578==    by 0xD49A77: resettable_assert_reset (resettable.c:60)
==707578==    by 0xD4994B: resettable_reset (resettable.c:45)
==707578==    by 0xD458BB: device_cold_reset (qdev.c:296)
==707578==    by 0x48FBC7: cpu_reset (cpu-common.c:114)
==707578==    by 0x97B5EB: spapr_reset_vcpu (spapr_cpu_core.c:38)
==707578==    by 0x97BABB: spapr_cpu_core_reset (spapr_cpu_core.c:209)
==707578==  Uninitialised value was created by a stack allocation
==707578==    at 0xB11F08: kvm_get_one_spr (kvm.c:543)

Initializing 'val' has no impact in the logic and makes Valgrind output
more bearable.

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20220331001717.616938-2-danielhb413@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-05-05 15:36:16 -03:00
Fabiano Rosas
f290a23868 target/ppc: Improve KVM hypercall trace
Before:

  kvm_handle_papr_hcall handle PAPR hypercall
  kvm_handle_papr_hcall handle PAPR hypercall
  kvm_handle_papr_hcall handle PAPR hypercall
  kvm_handle_papr_hcall handle PAPR hypercall
  kvm_handle_papr_hcall handle PAPR hypercall
  kvm_handle_papr_hcall handle PAPR hypercall

After:

  kvm_handle_papr_hcall 0x3a8
  kvm_handle_papr_hcall 0x3ac
  kvm_handle_papr_hcall 0x108
  kvm_handle_papr_hcall 0x104
  kvm_handle_papr_hcall 0x104
  kvm_handle_papr_hcall 0x108

Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20220325223316.276494-1-farosas@linux.ibm.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-04-20 18:00:30 -03:00
Marc-André Lureau
0f9668e0c1 Remove qemu-common.h include from most units
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20220323155743.1585078-33-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-06 14:31:55 +02:00
Marc-André Lureau
8e3b0cbb72 Replace qemu_real_host_page variables with inlined functions
Replace the global variables with inlined helper functions. getpagesize() is very
likely annotated with a "const" function attribute (at least with glibc), and thus
optimization should apply even better.

This avoids the need for a constructor initialization too.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20220323155743.1585078-12-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-06 10:50:38 +02:00
Marc-André Lureau
e03b56863d Replace config-time define HOST_WORDS_BIGENDIAN
Replace a config-time define with a compile time condition
define (compatible with clang and gcc) that must be declared prior to
its usage. This avoids having a global configure time define, but also
prevents from bad usage, if the config header wasn't included before.

This can help to make some code independent from qemu too.

gcc supports __BYTE_ORDER__ from about 4.6 and clang from 3.2.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
[ For the s390x parts I'm involved in ]
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220323155743.1585078-7-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-06 10:50:37 +02:00
Bharata B Rao
82123b756a target/ppc: Support for H_RPT_INVALIDATE hcall
If KVM_CAP_RPT_INVALIDATE KVM capability is enabled, then

- indicate the availability of H_RPT_INVALIDATE hcall to the guest via
  ibm,hypertas-functions property.
- Enable the hcall

Both the above are done only if the new sPAPR machine capability
cap-rpt-invalidate is set.

Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
Message-Id: <20210706112440.1449562-3-bharata@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-07-09 11:01:06 +10:00
Greg Kurz
9cbcfb5924 target/ppc/kvm: Cache timebase frequency
Each vCPU core exposes its timebase frequency in the DT. When running
under KVM, this means parsing /proc/cpuinfo in order to get the timebase
frequency of the host CPU.

The parsing appears to slow down the boot quite a bit with higher number
of cores:

# of cores     seconds spent in spapr_dt_cpus()
      8                  0.550122
     16                  1.342375
     32                  2.850316
     64                  5.922505
     96                  9.109224
    128                 12.245504
    256                 24.957236
    384                 37.389113

The timebase frequency of the host CPU is identical for all
cores and it is an invariant for the VM lifetime. Cache it
instead of doing the same expensive parsing again and again.

Rename kvmppc_get_tbfreq() to kvmppc_get_tbfreq_procfs() and
rename the 'retval' variable to make it clear it is used as
fallback only. Come up with a new version of kvmppc_get_tbfreq()
that calls kvmppc_get_tbfreq_procfs() only once and keep the
value in a static.

Zero is certainly not a valid value for the timebase frequency.
Treat atoi() returning zero as another parsing error and return
the fallback value instead. This allows kvmppc_get_tbfreq() to
use zero as an indicator that kvmppc_get_tbfreq_procfs() hasn't
been called yet.

With this patch applied:

    384                 0.518382

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <161600382766.1780699.6787739229984093959.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-03-31 11:10:50 +11:00
Tom Lendacky
92a5199b29 sev/i386: Don't allow a system reset under an SEV-ES guest
An SEV-ES guest does not allow register state to be altered once it has
been measured. When an SEV-ES guest issues a reboot command, Qemu will
reset the vCPU state and resume the guest. This will cause failures under
SEV-ES. Prevent that from occuring by introducing an arch-specific
callback that returns a boolean indicating whether vCPUs are resettable.

Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: Aleksandar Rikalo <aleksandar.rikalo@syrmia.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: David Hildenbrand <david@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>
Message-Id: <1ac39c441b9a3e970e9556e1cc29d0a0814de6fd.1611682609.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-02-16 17:15:39 +01:00
David Gibson
6c8ebe30ea spapr: Add PEF based confidential guest support
Some upcoming POWER machines have a system called PEF (Protected
Execution Facility) which uses a small ultravisor to allow guests to
run in a way that they can't be eavesdropped by the hypervisor.  The
effect is roughly similar to AMD SEV, although the mechanisms are
quite different.

Most of the work of this is done between the guest, KVM and the
ultravisor, with little need for involvement by qemu.  However qemu
does need to tell KVM to allow secure VMs.

Because the availability of secure mode is a guest visible difference
which depends on having the right hardware and firmware, we don't
enable this by default.  In order to run a secure guest you need to
create a "pef-guest" object and set the confidential-guest-support
property to point to it.

Note that this just *allows* secure guests, the architecture of PEF is
such that the guest still needs to talk to the ultravisor to enter
secure mode.  Qemu has no direct way of knowing if the guest is in
secure mode, and certainly can't know until well after machine
creation time.

To start a PEF-capable guest, use the command line options:
    -object pef-guest,id=pef0 -machine confidential-guest-support=pef0

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
2021-02-08 16:57:38 +11:00
zhaolichang
136fbf654d ppc/: fix some comment spelling errors
I found that there are many spelling errors in the comments of qemu/target/ppc.
I used spellcheck to check the spelling errors and found some errors in the folder.

Signed-off-by: zhaolichang <zhaolichang@huawei.com>
Reviewed-by: David Edmondson <david.edmondson@oracle.com>
Message-Id: <20201009064449.2336-3-zhaolichang@huawei.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-10-28 01:08:53 +11:00
Greg Kurz
0a06e4d626 target/ppc: Fix kvmppc_load_htab_chunk() error reporting
If kvmppc_load_htab_chunk() fails, its return value is propagated up
to vmstate_load(). It should thus be a negative errno, not -1 (which
maps to EPERM and would lure the user into thinking that the problem
is necessarily related to a lack of privilege).

Return the error reported by KVM or ENOSPC in case of short write.
While here, propagate the error message through an @errp argument
and have the caller to print it with error_report_err() instead
of relying on fprintf().

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <160371604713.305923.5264900354159029580.stgit@bahia.lan>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-10-28 01:08:53 +11:00
Laurent Vivier
aef92d87c5 pseries: fix kvmppc_set_fwnmi()
QEMU issues the ioctl(KVM_CAP_PPC_FWNMI) on the first vCPU.

If the first vCPU is currently running, the vCPU mutex is held
and the ioctl() cannot be done and waits until the mutex is released.
This never happens and the VM is stuck.

To avoid this deadlock, issue the ioctl on the same vCPU doing the
RTAS call.

The problem can be reproduced by booting a guest with several vCPUs
(the probability to have the problem is (n - 1) / n,  n = # of CPUs),
and then by triggering a kernel crash with "echo c >/proc/sysrq-trigger".

On the reboot, the kernel hangs after:

...
[    0.000000] -----------------------------------------------------
[    0.000000] ppc64_pft_size    = 0x0
[    0.000000] phys_mem_size     = 0x48000000
[    0.000000] dcache_bsize      = 0x80
[    0.000000] icache_bsize      = 0x80
[    0.000000] cpu_features      = 0x0001c06f8f4f91a7
[    0.000000]   possible        = 0x0003fbffcf5fb1a7
[    0.000000]   always          = 0x00000003800081a1
[    0.000000] cpu_user_features = 0xdc0065c2 0xaee00000
[    0.000000] mmu_features      = 0x3c006041
[    0.000000] firmware_features = 0x00000085455a445f
[    0.000000] physical_start    = 0x8000000
[    0.000000] -----------------------------------------------------
[    0.000000] numa:   NODE_DATA [mem 0x47f33c80-0x47f3ffff]

Fixes: ec010c0066 ("ppc/spapr: KVM FWNMI should not be enabled until guest requests it")
Cc: npiggin@gmail.com
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20200724083533.281700-1-lvivier@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-07-27 11:09:25 +10:00
Ganesh Goudar
211a7784b9 target/ppc: Fix wrong interpretation of the disposition flag.
Bitwise AND with kvm_run->flags to evaluate if we recovered from
MCE or not is not correct, As disposition in kvm_run->flags is a
two-bit integer value and not a bit map, So check for equality
instead of bitwise AND.

Without the fix qemu treats any unrecoverable mce error as recoverable
and ends up in a mce loop inside the guest, Below are the MCE logs before
and after the fix.

Before fix:

[   66.775757] MCE: CPU0: Initiator CPU
[   66.775891] MCE: CPU0: Unknown
[   66.776587] MCE: CPU0: machine check (Harmless) Host UE Indeterminate [Recovered]
[   66.776857] MCE: CPU0: NIP: [c0080000000e00b8] mcetest_tlbie+0xb0/0x128 [mcetest_tlbie]

After fix:

[ 20.650577] CPU: 0 PID: 1415 Comm: insmod Tainted: G M O 5.6.0-fwnmi-arv+ #11
[ 20.650618] NIP: c0080000023a00e8 LR: c0080000023a00d8 CTR: c000000000021fe0
[ 20.650660] REGS: c0000001fffd3d70 TRAP: 0200 Tainted: G M O (5.6.0-fwnmi-arv+)
[ 20.650708] MSR: 8000000002a0b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 42000222 XER: 20040000
[ 20.650758] CFAR: c00000000000b940 DAR: c0080000025e00e0 DSISR: 00000200 IRQMASK: 0
[ 20.650758] GPR00: c0080000023a00d8 c0000001fddd79a0 c0080000023a8500 0000000000000039
[ 20.650758] GPR04: 0000000000000001 0000000000000000 0000000000000000 0000000000000007
[ 20.650758] GPR08: 0000000000000007 c0080000025e00e0 0000000000000000 00000000000000f7
[ 20.650758] GPR12: 0000000000000000 c000000001900000 c00000000101f398 c0080000025c052f
[ 20.650758] GPR16: 00000000000003a8 c0080000025c0000 c0000001fddd7d70 c0000000015b7940
[ 20.650758] GPR20: 000000000000fff1 c000000000f72c28 c0080000025a0988 0000000000000000
[ 20.650758] GPR24: 0000000000000100 c0080000023a05d0 c0000000001f1d70 0000000000000000
[ 20.650758] GPR28: c0000001fde20000 c0000001fd02b2e0 c0080000023a0000 c0080000025e0000
[ 20.651178] NIP [c0080000023a00e8] mcetest_tlbie+0xe8/0xf0 [mcetest_tlbie]
[ 20.651220] LR [c0080000023a00d8] mcetest_tlbie+0xd8/0xf0 [mcetest_tlbie]
[ 20.651262] Call Trace:
[ 20.651280] [c0000001fddd79a0] [c0080000023a00d8] mcetest_tlbie+0xd8/0xf0 [mcetest_tlbie] (unreliable)
[ 20.651340] [c0000001fddd7a10] [c00000000001091c] do_one_initcall+0x6c/0x2c0
[ 20.651390] [c0000001fddd7af0] [c0000000001f7998] do_init_module+0x90/0x298
[ 20.651433] [c0000001fddd7b80] [c0000000001f61a8] load_module+0x1f58/0x27a0
[ 20.651476] [c0000001fddd7d40] [c0000000001f6c70] __do_sys_finit_module+0xe0/0x100
[ 20.651526] [c0000001fddd7e20] [c00000000000b9d0] system_call+0x5c/0x68
[ 20.651567] Instruction dump:
[ 20.651594] e8410018 3c620000 e8638020 480000cd e8410018 3c620000 e8638028 480000bd
[ 20.651646] e8410018 7be904e4 39400000 612900e0 <7d434a64> 4bffff74 3c4c0001 38428410
[ 20.651699] ---[ end trace 4c40897f016b4340 ]---
[ 20.653310]
Bus error
[ 20.655575] MCE: CPU0: machine check (Harmless) Host UE Indeterminate [Not recovered]
[ 20.655575] MCE: CPU0: NIP: [c0080000023a00e8] mcetest_tlbie+0xe8/0xf0 [mcetest_tlbie]
[ 20.655576] MCE: CPU0: Initiator CPU
[ 20.655576] MCE: CPU0: Unknown

Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
Message-Id: <20200408170944.16003-1-ganeshgr@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-04-17 10:38:29 +10:00
Nicholas Piggin
ec010c0066 ppc/spapr: KVM FWNMI should not be enabled until guest requests it
The KVM FWNMI capability should be enabled with the "ibm,nmi-register"
rtas call. Although MCEs from KVM will be delivered as architected
interrupts to the guest before "ibm,nmi-register" is called, KVM has
different behaviour depending on whether the guest has enabled FWNMI
(it attempts to do more recovery on behalf of a non-FWNMI guest).

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20200325142906.221248-2-npiggin@gmail.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-04-07 08:55:10 +10:00
David Gibson
6a84737c80 spapr,ppc: Simplify signature of kvmppc_rma_size()
This function calculates the maximum size of the RMA as implied by the
host's page size of structure of the VRMA (there are a number of other
constraints on the RMA size which will supersede this one in many
circumstances).

The current interface takes the current RMA size estimate, and clamps it
to the VRMA derived size.  The only current caller passes in an arguably
wrong value (it will match the current RMA estimate in some but not all
cases).

We want to fix that, but for now just keep concerns separated by having the
KVM helper function just return the VRMA derived limit, and let the caller
combine it with other constraints.  We call the new function
kvmppc_vrma_limit() to more clearly indicate its limited responsibility.

The helper should only ever be called in the KVM enabled case, so replace
its !CONFIG_KVM stub with an assert() rather than a dummy value.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cedric Le Goater <clg@fr.ibm.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2020-03-17 09:41:15 +11:00
Aravinda Prasad
81fe70e443 target/ppc: Build rtas error log upon an MCE
Upon a machine check exception (MCE) in a guest address space,
KVM causes a guest exit to enable QEMU to build and pass the
error to the guest in the PAPR defined rtas error log format.

This patch builds the rtas error log, copies it to the rtas_addr
and then invokes the guest registered machine check handler. The
handler in the guest takes suitable action(s) depending on the type
and criticality of the error. For example, if an error is
unrecoverable memory corruption in an application inside the
guest, then the guest kernel sends a SIGBUS to the application.
For recoverable errors, the guest performs recovery actions and
logs the error.

Signed-off-by: Aravinda Prasad <arawinda.p@gmail.com>
[Assume SLOF has allocated enough room for rtas error log]
Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20200130184423.20519-5-ganeshgr@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-02-03 11:33:10 +11:00
Aravinda Prasad
9ac703ac5f target/ppc: Handle NMI guest exit
Memory error such as bit flips that cannot be corrected
by hardware are passed on to the kernel for handling.
If the memory address in error belongs to guest then
the guest kernel is responsible for taking suitable action.
Patch [1] enhances KVM to exit guest with exit reason
set to KVM_EXIT_NMI in such cases. This patch handles
KVM_EXIT_NMI exit.

[1] https://www.spinics.net/lists/kvm-ppc/msg12637.html
    (e20bbd3d and related commits)

Signed-off-by: Aravinda Prasad <arawinda.p@gmail.com>
Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <20200130184423.20519-4-ganeshgr@linux.ibm.com>
[dwg: #ifdefs to fix compile for 32-bit target]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-02-03 11:33:10 +11:00
Aravinda Prasad
9d953ce447 ppc: spapr: Introduce FWNMI capability
Introduce fwnmi an spapr capability and add a helper function
which tries to enable it, which would be used by following patch
of the series. This patch by itself does not change the existing
behavior.

Signed-off-by: Aravinda Prasad <arawinda.p@gmail.com>
[eliminate cap_ppc_fwnmi, add fwnmi cap to migration state
 and reprhase the commit message]
Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20200130184423.20519-3-ganeshgr@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-02-03 11:33:10 +11:00
Fabiano Rosas
6e0552a3a7 target/ppc: Clarify the meaning of return values in kvm_handle_debug
The kvm_handle_debug function can return 0 to go back into the guest
or return 1 to notify the gdbstub thread and pass control to GDB.

Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <20200110151344.278471-2-farosas@linux.ibm.com>
Tested-by: Leonardo Bras <leonardo@ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-02-02 14:07:57 +11:00
Philippe Mathieu-Daudé
4f7f589381 accel: Replace current_machine->accelerator by current_accel() wrapper
We actually want to access the accelerator, not the machine, so
use the current_accel() wrapper instead.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200121110349.25842-10-philmd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-24 20:59:11 +01:00
Bharata B Rao
0b73197881 ppc/spapr: Don't call KVM_SVM_OFF ioctl on TCG
Invoking KVM_SVM_OFF ioctl for TCG guests will lead to a QEMU crash.
Fix this by ensuring that we don't call KVM_SVM_OFF ioctl on TCG.

Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Fixes: 4930c1966249 ("ppc/spapr: Support reboot of secure pseries guest")
Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
Message-Id: <20200102054155.13175-1-bharata@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-01-08 11:01:59 +11:00
Bharata B Rao
905db91697 ppc/spapr: Support reboot of secure pseries guest
A pseries guest can be run as a secure guest on Ultravisor-enabled
POWER platforms. When such a secure guest is reset, we need to
release/reset a few resources both on ultravisor and hypervisor side.
This is achieved by invoking this new ioctl KVM_PPC_SVM_OFF from the
machine reset path.

As part of this ioctl, the secure guest is essentially transitioned
back to normal mode so that it can reboot like a regular guest and
become secure again.

This ioctl has no effect when invoked for a normal guest. If this ioctl
fails for a secure guest, the guest is terminated.

Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
Message-Id: <20191219031445.8949-3-bharata@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-01-08 11:01:59 +11:00