Commit Graph

120 Commits

Author SHA1 Message Date
Greg Kurz
9cbcfb5924 target/ppc/kvm: Cache timebase frequency
Each vCPU core exposes its timebase frequency in the DT. When running
under KVM, this means parsing /proc/cpuinfo in order to get the timebase
frequency of the host CPU.

The parsing appears to slow down the boot quite a bit with higher number
of cores:

# of cores     seconds spent in spapr_dt_cpus()
      8                  0.550122
     16                  1.342375
     32                  2.850316
     64                  5.922505
     96                  9.109224
    128                 12.245504
    256                 24.957236
    384                 37.389113

The timebase frequency of the host CPU is identical for all
cores and it is an invariant for the VM lifetime. Cache it
instead of doing the same expensive parsing again and again.

Rename kvmppc_get_tbfreq() to kvmppc_get_tbfreq_procfs() and
rename the 'retval' variable to make it clear it is used as
fallback only. Come up with a new version of kvmppc_get_tbfreq()
that calls kvmppc_get_tbfreq_procfs() only once and keep the
value in a static.

Zero is certainly not a valid value for the timebase frequency.
Treat atoi() returning zero as another parsing error and return
the fallback value instead. This allows kvmppc_get_tbfreq() to
use zero as an indicator that kvmppc_get_tbfreq_procfs() hasn't
been called yet.

With this patch applied:

    384                 0.518382

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <161600382766.1780699.6787739229984093959.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-03-31 11:10:50 +11:00
Tom Lendacky
92a5199b29 sev/i386: Don't allow a system reset under an SEV-ES guest
An SEV-ES guest does not allow register state to be altered once it has
been measured. When an SEV-ES guest issues a reboot command, Qemu will
reset the vCPU state and resume the guest. This will cause failures under
SEV-ES. Prevent that from occuring by introducing an arch-specific
callback that returns a boolean indicating whether vCPUs are resettable.

Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: Aleksandar Rikalo <aleksandar.rikalo@syrmia.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: David Hildenbrand <david@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>
Message-Id: <1ac39c441b9a3e970e9556e1cc29d0a0814de6fd.1611682609.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-02-16 17:15:39 +01:00
David Gibson
6c8ebe30ea spapr: Add PEF based confidential guest support
Some upcoming POWER machines have a system called PEF (Protected
Execution Facility) which uses a small ultravisor to allow guests to
run in a way that they can't be eavesdropped by the hypervisor.  The
effect is roughly similar to AMD SEV, although the mechanisms are
quite different.

Most of the work of this is done between the guest, KVM and the
ultravisor, with little need for involvement by qemu.  However qemu
does need to tell KVM to allow secure VMs.

Because the availability of secure mode is a guest visible difference
which depends on having the right hardware and firmware, we don't
enable this by default.  In order to run a secure guest you need to
create a "pef-guest" object and set the confidential-guest-support
property to point to it.

Note that this just *allows* secure guests, the architecture of PEF is
such that the guest still needs to talk to the ultravisor to enter
secure mode.  Qemu has no direct way of knowing if the guest is in
secure mode, and certainly can't know until well after machine
creation time.

To start a PEF-capable guest, use the command line options:
    -object pef-guest,id=pef0 -machine confidential-guest-support=pef0

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
2021-02-08 16:57:38 +11:00
zhaolichang
136fbf654d ppc/: fix some comment spelling errors
I found that there are many spelling errors in the comments of qemu/target/ppc.
I used spellcheck to check the spelling errors and found some errors in the folder.

Signed-off-by: zhaolichang <zhaolichang@huawei.com>
Reviewed-by: David Edmondson <david.edmondson@oracle.com>
Message-Id: <20201009064449.2336-3-zhaolichang@huawei.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-10-28 01:08:53 +11:00
Greg Kurz
0a06e4d626 target/ppc: Fix kvmppc_load_htab_chunk() error reporting
If kvmppc_load_htab_chunk() fails, its return value is propagated up
to vmstate_load(). It should thus be a negative errno, not -1 (which
maps to EPERM and would lure the user into thinking that the problem
is necessarily related to a lack of privilege).

Return the error reported by KVM or ENOSPC in case of short write.
While here, propagate the error message through an @errp argument
and have the caller to print it with error_report_err() instead
of relying on fprintf().

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <160371604713.305923.5264900354159029580.stgit@bahia.lan>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-10-28 01:08:53 +11:00
Laurent Vivier
aef92d87c5 pseries: fix kvmppc_set_fwnmi()
QEMU issues the ioctl(KVM_CAP_PPC_FWNMI) on the first vCPU.

If the first vCPU is currently running, the vCPU mutex is held
and the ioctl() cannot be done and waits until the mutex is released.
This never happens and the VM is stuck.

To avoid this deadlock, issue the ioctl on the same vCPU doing the
RTAS call.

The problem can be reproduced by booting a guest with several vCPUs
(the probability to have the problem is (n - 1) / n,  n = # of CPUs),
and then by triggering a kernel crash with "echo c >/proc/sysrq-trigger".

On the reboot, the kernel hangs after:

...
[    0.000000] -----------------------------------------------------
[    0.000000] ppc64_pft_size    = 0x0
[    0.000000] phys_mem_size     = 0x48000000
[    0.000000] dcache_bsize      = 0x80
[    0.000000] icache_bsize      = 0x80
[    0.000000] cpu_features      = 0x0001c06f8f4f91a7
[    0.000000]   possible        = 0x0003fbffcf5fb1a7
[    0.000000]   always          = 0x00000003800081a1
[    0.000000] cpu_user_features = 0xdc0065c2 0xaee00000
[    0.000000] mmu_features      = 0x3c006041
[    0.000000] firmware_features = 0x00000085455a445f
[    0.000000] physical_start    = 0x8000000
[    0.000000] -----------------------------------------------------
[    0.000000] numa:   NODE_DATA [mem 0x47f33c80-0x47f3ffff]

Fixes: ec010c0066 ("ppc/spapr: KVM FWNMI should not be enabled until guest requests it")
Cc: npiggin@gmail.com
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20200724083533.281700-1-lvivier@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-07-27 11:09:25 +10:00
Ganesh Goudar
211a7784b9 target/ppc: Fix wrong interpretation of the disposition flag.
Bitwise AND with kvm_run->flags to evaluate if we recovered from
MCE or not is not correct, As disposition in kvm_run->flags is a
two-bit integer value and not a bit map, So check for equality
instead of bitwise AND.

Without the fix qemu treats any unrecoverable mce error as recoverable
and ends up in a mce loop inside the guest, Below are the MCE logs before
and after the fix.

Before fix:

[   66.775757] MCE: CPU0: Initiator CPU
[   66.775891] MCE: CPU0: Unknown
[   66.776587] MCE: CPU0: machine check (Harmless) Host UE Indeterminate [Recovered]
[   66.776857] MCE: CPU0: NIP: [c0080000000e00b8] mcetest_tlbie+0xb0/0x128 [mcetest_tlbie]

After fix:

[ 20.650577] CPU: 0 PID: 1415 Comm: insmod Tainted: G M O 5.6.0-fwnmi-arv+ #11
[ 20.650618] NIP: c0080000023a00e8 LR: c0080000023a00d8 CTR: c000000000021fe0
[ 20.650660] REGS: c0000001fffd3d70 TRAP: 0200 Tainted: G M O (5.6.0-fwnmi-arv+)
[ 20.650708] MSR: 8000000002a0b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 42000222 XER: 20040000
[ 20.650758] CFAR: c00000000000b940 DAR: c0080000025e00e0 DSISR: 00000200 IRQMASK: 0
[ 20.650758] GPR00: c0080000023a00d8 c0000001fddd79a0 c0080000023a8500 0000000000000039
[ 20.650758] GPR04: 0000000000000001 0000000000000000 0000000000000000 0000000000000007
[ 20.650758] GPR08: 0000000000000007 c0080000025e00e0 0000000000000000 00000000000000f7
[ 20.650758] GPR12: 0000000000000000 c000000001900000 c00000000101f398 c0080000025c052f
[ 20.650758] GPR16: 00000000000003a8 c0080000025c0000 c0000001fddd7d70 c0000000015b7940
[ 20.650758] GPR20: 000000000000fff1 c000000000f72c28 c0080000025a0988 0000000000000000
[ 20.650758] GPR24: 0000000000000100 c0080000023a05d0 c0000000001f1d70 0000000000000000
[ 20.650758] GPR28: c0000001fde20000 c0000001fd02b2e0 c0080000023a0000 c0080000025e0000
[ 20.651178] NIP [c0080000023a00e8] mcetest_tlbie+0xe8/0xf0 [mcetest_tlbie]
[ 20.651220] LR [c0080000023a00d8] mcetest_tlbie+0xd8/0xf0 [mcetest_tlbie]
[ 20.651262] Call Trace:
[ 20.651280] [c0000001fddd79a0] [c0080000023a00d8] mcetest_tlbie+0xd8/0xf0 [mcetest_tlbie] (unreliable)
[ 20.651340] [c0000001fddd7a10] [c00000000001091c] do_one_initcall+0x6c/0x2c0
[ 20.651390] [c0000001fddd7af0] [c0000000001f7998] do_init_module+0x90/0x298
[ 20.651433] [c0000001fddd7b80] [c0000000001f61a8] load_module+0x1f58/0x27a0
[ 20.651476] [c0000001fddd7d40] [c0000000001f6c70] __do_sys_finit_module+0xe0/0x100
[ 20.651526] [c0000001fddd7e20] [c00000000000b9d0] system_call+0x5c/0x68
[ 20.651567] Instruction dump:
[ 20.651594] e8410018 3c620000 e8638020 480000cd e8410018 3c620000 e8638028 480000bd
[ 20.651646] e8410018 7be904e4 39400000 612900e0 <7d434a64> 4bffff74 3c4c0001 38428410
[ 20.651699] ---[ end trace 4c40897f016b4340 ]---
[ 20.653310]
Bus error
[ 20.655575] MCE: CPU0: machine check (Harmless) Host UE Indeterminate [Not recovered]
[ 20.655575] MCE: CPU0: NIP: [c0080000023a00e8] mcetest_tlbie+0xe8/0xf0 [mcetest_tlbie]
[ 20.655576] MCE: CPU0: Initiator CPU
[ 20.655576] MCE: CPU0: Unknown

Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
Message-Id: <20200408170944.16003-1-ganeshgr@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-04-17 10:38:29 +10:00
Nicholas Piggin
ec010c0066 ppc/spapr: KVM FWNMI should not be enabled until guest requests it
The KVM FWNMI capability should be enabled with the "ibm,nmi-register"
rtas call. Although MCEs from KVM will be delivered as architected
interrupts to the guest before "ibm,nmi-register" is called, KVM has
different behaviour depending on whether the guest has enabled FWNMI
(it attempts to do more recovery on behalf of a non-FWNMI guest).

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20200325142906.221248-2-npiggin@gmail.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-04-07 08:55:10 +10:00
David Gibson
6a84737c80 spapr,ppc: Simplify signature of kvmppc_rma_size()
This function calculates the maximum size of the RMA as implied by the
host's page size of structure of the VRMA (there are a number of other
constraints on the RMA size which will supersede this one in many
circumstances).

The current interface takes the current RMA size estimate, and clamps it
to the VRMA derived size.  The only current caller passes in an arguably
wrong value (it will match the current RMA estimate in some but not all
cases).

We want to fix that, but for now just keep concerns separated by having the
KVM helper function just return the VRMA derived limit, and let the caller
combine it with other constraints.  We call the new function
kvmppc_vrma_limit() to more clearly indicate its limited responsibility.

The helper should only ever be called in the KVM enabled case, so replace
its !CONFIG_KVM stub with an assert() rather than a dummy value.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cedric Le Goater <clg@fr.ibm.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2020-03-17 09:41:15 +11:00
Aravinda Prasad
81fe70e443 target/ppc: Build rtas error log upon an MCE
Upon a machine check exception (MCE) in a guest address space,
KVM causes a guest exit to enable QEMU to build and pass the
error to the guest in the PAPR defined rtas error log format.

This patch builds the rtas error log, copies it to the rtas_addr
and then invokes the guest registered machine check handler. The
handler in the guest takes suitable action(s) depending on the type
and criticality of the error. For example, if an error is
unrecoverable memory corruption in an application inside the
guest, then the guest kernel sends a SIGBUS to the application.
For recoverable errors, the guest performs recovery actions and
logs the error.

Signed-off-by: Aravinda Prasad <arawinda.p@gmail.com>
[Assume SLOF has allocated enough room for rtas error log]
Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20200130184423.20519-5-ganeshgr@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-02-03 11:33:10 +11:00
Aravinda Prasad
9ac703ac5f target/ppc: Handle NMI guest exit
Memory error such as bit flips that cannot be corrected
by hardware are passed on to the kernel for handling.
If the memory address in error belongs to guest then
the guest kernel is responsible for taking suitable action.
Patch [1] enhances KVM to exit guest with exit reason
set to KVM_EXIT_NMI in such cases. This patch handles
KVM_EXIT_NMI exit.

[1] https://www.spinics.net/lists/kvm-ppc/msg12637.html
    (e20bbd3d and related commits)

Signed-off-by: Aravinda Prasad <arawinda.p@gmail.com>
Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <20200130184423.20519-4-ganeshgr@linux.ibm.com>
[dwg: #ifdefs to fix compile for 32-bit target]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-02-03 11:33:10 +11:00
Aravinda Prasad
9d953ce447 ppc: spapr: Introduce FWNMI capability
Introduce fwnmi an spapr capability and add a helper function
which tries to enable it, which would be used by following patch
of the series. This patch by itself does not change the existing
behavior.

Signed-off-by: Aravinda Prasad <arawinda.p@gmail.com>
[eliminate cap_ppc_fwnmi, add fwnmi cap to migration state
 and reprhase the commit message]
Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20200130184423.20519-3-ganeshgr@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-02-03 11:33:10 +11:00
Fabiano Rosas
6e0552a3a7 target/ppc: Clarify the meaning of return values in kvm_handle_debug
The kvm_handle_debug function can return 0 to go back into the guest
or return 1 to notify the gdbstub thread and pass control to GDB.

Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <20200110151344.278471-2-farosas@linux.ibm.com>
Tested-by: Leonardo Bras <leonardo@ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-02-02 14:07:57 +11:00
Philippe Mathieu-Daudé
4f7f589381 accel: Replace current_machine->accelerator by current_accel() wrapper
We actually want to access the accelerator, not the machine, so
use the current_accel() wrapper instead.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200121110349.25842-10-philmd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-24 20:59:11 +01:00
Bharata B Rao
0b73197881 ppc/spapr: Don't call KVM_SVM_OFF ioctl on TCG
Invoking KVM_SVM_OFF ioctl for TCG guests will lead to a QEMU crash.
Fix this by ensuring that we don't call KVM_SVM_OFF ioctl on TCG.

Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Fixes: 4930c1966249 ("ppc/spapr: Support reboot of secure pseries guest")
Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
Message-Id: <20200102054155.13175-1-bharata@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-01-08 11:01:59 +11:00
Bharata B Rao
905db91697 ppc/spapr: Support reboot of secure pseries guest
A pseries guest can be run as a secure guest on Ultravisor-enabled
POWER platforms. When such a secure guest is reset, we need to
release/reset a few resources both on ultravisor and hypervisor side.
This is achieved by invoking this new ioctl KVM_PPC_SVM_OFF from the
machine reset path.

As part of this ioctl, the secure guest is essentially transitioned
back to normal mode so that it can reboot like a regular guest and
become secure again.

This ioctl has no effect when invoked for a normal guest. If this ioctl
fails for a secure guest, the guest is terminated.

Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
Message-Id: <20191219031445.8949-3-bharata@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-01-08 11:01:59 +11:00
Fabiano Rosas
38fc68a489 target/ppc: Remove unused PPC_INPUT_INT defines
They were added in "16415335be Use correct input constant" with a
single use in kvm_arch_pre_run but that function's implementation was
removed by "1e8f51e856 ppc: remove idle_timer logic".

Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <20191218014616.686124-1-farosas@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-01-08 11:01:59 +11:00
Peter Maydell
4800819827 * More uses of RCU_READ_LOCK_GUARD (Dave, myself)
* QOM doc improvments (Greg)
 * Cleanups from the Meson conversion (Marc-André)
 * Support for multiple -accel options (myself)
 * Many x86 machine cleanup (Philippe, myself)
 * tests/migration-test cleanup (Juan)
 * PC machine removal and next round of deprecation (Thomas)
 * kernel-doc integration (Peter, myself)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJd+YJGAAoJEL/70l94x66D0YYIAIZpS6i6NYJC8KHCl49fjI7U
 qHDN7MiKYTU+l3i0+iGmQL6XN5ClAY0pXkY5LBFIDpsohHR5f4jdrIKjyvcHzuIM
 gx/NLsiA45/niHYrn/hEo0P7CwGTrrdWL+SVmScnKcwYiBzMO/uYblxlbUBKLPNn
 eGaKQmEkvlUBR9GS6S1+jYg8234ZRZ4+12t5dqqADBQ7Kc0wn6KC5yebIoQxCgVc
 9F5Ezdkl7befrTI7El3EC6aT18bKhIBZIs1PT/hzqzlGFhBuKM7uKDb43Yx8c7XQ
 bk5vzHmblPAgQyK4OETQ+DM745AOk6vBiJZbR9nrDUXWvUkrEXTQZMJKU0FXdlE=
 =hyYX
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* More uses of RCU_READ_LOCK_GUARD (Dave, myself)
* QOM doc improvments (Greg)
* Cleanups from the Meson conversion (Marc-André)
* Support for multiple -accel options (myself)
* Many x86 machine cleanup (Philippe, myself)
* tests/migration-test cleanup (Juan)
* PC machine removal and next round of deprecation (Thomas)
* kernel-doc integration (Peter, myself)

# gpg: Signature made Wed 18 Dec 2019 01:35:02 GMT
# gpg:                using RSA key BFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (87 commits)
  vga: cleanup mapping of VRAM for non-PCI VGA
  hw/display: Remove "rombar" hack from vga-pci and vmware_vga
  hw/pci: Remove the "command_serr_enable" property
  hw/audio: Remove the "use_broken_id" hack from the AC97 device
  hw/i386: Remove the deprecated machines 0.12 up to 0.15
  hw/pci-host: Add Kconfig entry to select the IGD Passthrough Host Bridge
  hw/pci-host/i440fx: Extract the IGD passthrough host bridge device
  hw/pci-host/i440fx: Use definitions instead of magic values
  hw/pci-host/i440fx: Use size_t to iterate over ARRAY_SIZE()
  hw/pci-host/i440fx: Extract PCII440FXState to "hw/pci-host/i440fx.h"
  hw/pci-host/i440fx: Correct the header description
  Fix some comment spelling errors.
  target/i386: remove unused pci-assign codes
  WHPX: refactor load library
  migration: check length directly to make sure the range is aligned
  memory: include MemoryListener documentation and some missing function parameters
  docs: add memory API reference
  memory.h: Silence kernel-doc complaints
  docs: Create bitops.rst as example of kernel-docs
  bitops.h: Silence kernel-doc complaints
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-12-20 11:20:25 +00:00
Vladimir Sementsov-Ogievskiy
0c115681a5 ppc: make Error **errp const where it is appropriate
Mostly, Error ** is for returning error from the function, so the
callee sets it. However kvmppc_hint_smt_possible gets already filled
errp parameter. It doesn't change the pointer itself, only change the
internal state of referenced Error object. So we can make it Error
*const * errp, to stress the behavior. It will also help coccinelle
script (in future) to distinguish such cases from common errp usage.

While there, rename the function to
kvmppc_error_append_smt_possible_hint().

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <20191205174635.18758-8-vsementsov@virtuozzo.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Commit message replaced]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2019-12-18 08:43:19 +01:00
Markus Armbruster
1a639fdf96 Revert "ppc: well form kvmppc_hint_smt_possible error hint helper"
This reverts commit cdcca22aab.

Commit cdcca22aab is a superseded version of the next commit that
crept in by accident.  Revert it, so the final version applies.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
2019-12-18 08:40:09 +01:00
Paolo Bonzini
4376c40ded kvm: introduce kvm_kernel_irqchip_* functions
The KVMState struct is opaque, so provide accessors for the fields
that will be moved from current_machine to the accelerator.  For now
they just forward to the machine object, but this will change.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-12-17 19:32:45 +01:00
Vladimir Sementsov-Ogievskiy
cdcca22aab ppc: well form kvmppc_hint_smt_possible error hint helper
Make kvmppc_hint_smt_possible hint append helper well formed:
rename errp to errp_in, as it is IN-parameter here (which is unusual
for errp), rename function to be kvmppc_error_append_*_hint.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20191127191434.20945-1-vsementsov@virtuozzo.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-12-17 10:39:48 +11:00
David Gibson
165dc3edd7 spapr/kvm: Set default cpu model for all machine classes
We have to set the default model of all machine classes, not just for
the active one. Otherwise, "query-machines" will indicate the wrong
CPU model (e.g. "power9_v2.0-powerpc64-cpu" instead of
"host-powerpc64-cpu") as "default-cpu-type".

s390x already fixed this in de60a92e "s390x/kvm: Set default cpu model for
all machine classes".  This patch applies a similar fix for the pseries-*
machine types on ppc64.

Doing a
    {"execute":"query-machines"}
under KVM now results in
    {
      "hotpluggable-cpus": true,
      "name": "pseries-4.2",
      "numa-mem-supported": true,
      "default-cpu-type": "host-powerpc64-cpu",
      "is-default": true,
      "cpu-max": 1024,
      "deprecated": false,
      "alias": "pseries"
    },
    {
      "hotpluggable-cpus": true,
      "name": "pseries-4.1",
      "numa-mem-supported": true,
      "default-cpu-type": "host-powerpc64-cpu",
      "cpu-max": 1024,
      "deprecated": false
    },
    ...

Libvirt probes all machines via "-machine none,accel=kvm:tcg" and will
currently see the wrong CPU model under KVM.

Reported-by: Jiři Denemark <jdenemar@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Jiri Denemark <jdenemar@redhat.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
2019-11-18 11:50:39 +01:00
Wei Yang
038adc2f58 core: replace getpagesize() with qemu_real_host_page_size
There are three page size in qemu:

  real host page size
  host page size
  target page size

All of them have dedicate variable to represent. For the last two, we
use the same form in the whole qemu project, while for the first one we
use two forms: qemu_real_host_page_size and getpagesize().

qemu_real_host_page_size is defined to be a replacement of
getpagesize(), so let it serve the role.

[Note] Not fully tested for some arch or device.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <20191013021145.16011-3-richardw.yang@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-26 15:38:06 +02:00
Alexey Kardashevskiy
972bd57689 ppc/kvm: Skip writing DPDES back when in run time state
On POWER8 systems the Directed Privileged Door-bell Exception State
register (DPDES) stores doorbell pending status, one bit per a thread
of a core, set by "msgsndp" instruction. The register is shared among
threads of the same core and KVM on POWER9 emulates it in a similar way
(POWER9 does not have DPDES).

DPDES is shared but QEMU assumes all SPRs are per thread so the only safe
way to write DPDES back to VCPU before running a guest is doing so
while all threads are pulled out of the guest so DPDES cannot change.
There is only one situation when this condition is met: incoming migration
when all threads are stopped. Otherwise any QEMU HMP/QMP command causing
kvm_arch_put_registers() (for example printing registers or dumping memory)
can clobber DPDES in a race with other vcpu threads.

This changes DPDES handling so it is not written to KVM at runtime.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20190923084110.34643-1-aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-10-04 10:25:23 +10:00
Shivaprasad G Bhat
1e8f51e856 ppc: remove idle_timer logic
The logic is broken for multiple vcpu guests, also causing memory leak.
The logic is in place to handle kvm not having KVM_CAP_PPC_IRQ_LEVEL,
which is part of the kernel now since 2.6.37. Instead of fixing the
leak, drop the redundant logic which is not excercised on new kernels
anymore. Exit with error on older kernels.

Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Message-Id: <156406409479.19996.7606556689856621111.stgit@lep8c.aus.stglabs.ibm.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-21 17:17:39 +10:00
Markus Armbruster
d5938f29fe Clean up inclusion of sysemu/sysemu.h
In my "build everything" tree, changing sysemu/sysemu.h triggers a
recompile of some 5400 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h).

Almost a third of its inclusions are actually superfluous.  Delete
them.  Downgrade two more to qapi/qapi-types-run-state.h, and move one
from char/serial.h to char/serial.c.

hw/semihosting/config.c, monitor/monitor.c, qdev-monitor.c, and
stubs/semihost.c define variables declared in sysemu/sysemu.h without
including it.  The compiler is cool with that, but include it anyway.

This doesn't reduce actual use much, as it's still included into
widely included headers.  The next commit will tackle that.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190812052359.30071-27-armbru@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
2019-08-16 13:31:53 +02:00
Markus Armbruster
db72581598 Include qemu/main-loop.h less
In my "build everything" tree, changing qemu/main-loop.h triggers a
recompile of some 5600 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h).  It includes block/aio.h,
which in turn includes qemu/event_notifier.h, qemu/notify.h,
qemu/processor.h, qemu/qsp.h, qemu/queue.h, qemu/thread-posix.h,
qemu/thread.h, qemu/timer.h, and a few more.

Include qemu/main-loop.h only where it's needed.  Touching it now
recompiles only some 1700 objects.  For block/aio.h and
qemu/event_notifier.h, these numbers drop from 5600 to 2800.  For the
others, they shrink only slightly.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190812052359.30071-21-armbru@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-08-16 13:31:52 +02:00
Markus Armbruster
650d103d3e Include hw/hw.h exactly where needed
In my "build everything" tree, changing hw/hw.h triggers a recompile
of some 2600 out of 6600 objects (not counting tests and objects that
don't depend on qemu/osdep.h).

The previous commits have left only the declaration of hw_error() in
hw/hw.h.  This permits dropping most of its inclusions.  Touching it
now recompiles less than 200 objects.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190812052359.30071-19-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-08-16 13:31:52 +02:00
Markus Armbruster
ca77ee28e0 Include migration/qemu-file-types.h a lot less
In my "build everything" tree, changing migration/qemu-file-types.h
triggers a recompile of some 2600 out of 6600 objects (not counting
tests and objects that don't depend on qemu/osdep.h).

The culprit is again hw/hw.h, which supposedly includes it for
convenience.

Include migration/qemu-file-types.h only where it's needed.  Touching
it now recompiles less than 200 objects.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190812052359.30071-10-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-08-16 13:31:52 +02:00
Greg Kurz
9723295a72 ppc: Introduce kvmppc_set_reg_tb_offset() helper
Introduce a KVM helper and its stub instead of guarding the code with
CONFIG_KVM.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <156051055736.224162.11641594431517798715.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Cédric Le Goater
7701aeed0f target/ppc: fix compile error in kvmppc_define_rtas_kernel_token()
gcc9 reports :

In file included from /usr/include/string.h:494,
                 from ./include/qemu/osdep.h:101,
                 from ./target/ppc/kvm.c:17:
In function ‘strncpy’,
    inlined from ‘kvmppc_define_rtas_kernel_token’ at ./target/ppc/kvm.c:2648:5:
/usr/include/bits/string_fortified.h:106:10: error: ‘__builtin_strncpy’ specified bound 120 equals destination size [-Werror=stringop-truncation]
  106 |   return __builtin___strncpy_chk (__dest, __src, __len, __bos (__dest));
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190615081252.28602-1-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Liran Alon
b1115c9991 KVM: Introduce kvm_arch_destroy_vcpu()
Simiar to how kvm_init_vcpu() calls kvm_arch_init_vcpu() to perform
arch-dependent initialisation, introduce kvm_arch_destroy_vcpu()
to be called from kvm_destroy_vcpu() to perform arch-dependent
destruction.

This was added because some architectures (Such as i386)
currently do not free memory that it have allocated in
kvm_arch_init_vcpu().

Suggested-by: Maran Wilson <maran.wilson@oracle.com>
Reviewed-by: Maran Wilson <maran.wilson@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Message-Id: <20190619162140.133674-3-liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-06-21 02:29:39 +02:00
Richard Henderson
db70b31144 target/ppc: Use env_cpu, env_archcpu
Cleanup in the boilerplate that each target must define.
Replace ppc_env_get_cpu with env_archcpu.  The combination
CPU(ppc_env_get_cpu) should have used ENV_GET_CPU to begin;
use env_cpu now.

Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-10 07:03:42 -07:00
Cédric Le Goater
38afd772f8 spapr/xive: add KVM support
This introduces a set of helpers when KVM is in use, which create the
KVM XIVE device, initialize the interrupt sources at a KVM level and
connect the interrupt presenters to the vCPU.

They also handle the initialization of the TIMA and the source ESB
memory regions of the controller. These have a different type under
KVM. They are 'ram device' memory mappings, similarly to VFIO, exposed
to the guest and the associated VMAs on the host are populated
dynamically with the appropriate pages using a fault handler.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190513084245.25755-3-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29 11:39:45 +10:00
Boxuan Li
228152c27e target/ppc/kvm: Fix trace typo
Signed-off-by: Boxuan Li <liboxuan@connect.hku.hk>
Message-Id: <20190430172842.27369-1-liboxuan@connect.hku.hk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29 11:39:44 +10:00
Peter Maydell
9ec34ecc97 ppc patch queue 2019-04-26
Here's the first ppc target pull request for qemu-4.1.  This has a
 number of things that have accumulated while qemu-4.0 was frozen.
 
  * A number of emulated MMU improvements from Ben Herrenschmidt
 
  * Assorted cleanups fro Greg Kurz
 
  * A large set of mostly mechanical cleanups from me to make target/ppc
    much closer to compliant with the modern coding style
 
  * Support for passthrough of NVIDIA GPUs using NVLink2
 
 As well as some other assorted fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEdfRlhq5hpmzETofcbDjKyiDZs5IFAlzCnusACgkQbDjKyiDZ
 s5LfhhAAuem5UBGKPKPj33c87HC+GGG+S4y89ic3ebyKplWulGgouHCa4Dnc7Y5m
 9MfIEcljRDpuRJCEONo6yg9aaRb3cW2Go9TpTwxmF8o1suG/v5bIQIdiRbBuMa2t
 yhNujVg5kkWSU1G4mCZjL9FS2ADPsxsKZVd73DPEqjlNJg981+2qtSnfR8SXhfnk
 dSSKxyfC6Hq1+uhGkLI+xtft+BCTWOstjz+efHpZ5l2mbiaMeh7zMKrIXXy/FtKA
 ufIyxbZznMS5MAZk7t90YldznfwOCqfh3di1kx8GTZ40LkBKbuI5LLHTG0sT75z5
 LHwFuLkBgWmS8RyIRRh9opr7ifrayHx8bQFpW368Qu+PbPzUCcTVIrWUfPmaNR74
 CkYJvhiYZfTwKtUeP7b2wUkHpZF4KINI4TKNaS4QAlm3DNbO67DFYkBrytpXsSzv
 smEpe+sqlbY40olw9q4ESP80r+kGdEPLkRjfdj0R7qS4fsqAH1bjuSkNqlPaCTJQ
 hNsoz2D+f56z0bBq4x8FRzDpqnBkdy4x6PlLxkJuAaV7WAtvq7n7tiMA3TRr/rIB
 OYFP2xPNajjP8MfyOB94+S4WDltmsgXoM7HyyvrKp2JBpe7mFjpep5fMp5GUpweV
 OOYrTsN1Nuu3kFpeimEc+IOyp1BWXnJF4vHhKTOqHeqZEs5Fgus=
 =RpAK
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-4.1-20190426' into staging

ppc patch queue 2019-04-26

Here's the first ppc target pull request for qemu-4.1.  This has a
number of things that have accumulated while qemu-4.0 was frozen.

 * A number of emulated MMU improvements from Ben Herrenschmidt

 * Assorted cleanups fro Greg Kurz

 * A large set of mostly mechanical cleanups from me to make target/ppc
   much closer to compliant with the modern coding style

 * Support for passthrough of NVIDIA GPUs using NVLink2

As well as some other assorted fixes.

# gpg: Signature made Fri 26 Apr 2019 07:02:19 BST
# gpg:                using RSA key 75F46586AE61A66CC44E87DC6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" [full]
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>" [full]
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" [full]
# gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>" [unknown]
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-4.1-20190426: (36 commits)
  target/ppc: improve performance of large BAT invalidations
  ppc/hash32: Rework R and C bit updates
  ppc/hash64: Rework R and C bit updates
  ppc/spapr: Use proper HPTE accessors for H_READ
  target/ppc: Don't check UPRT in radix mode when in HV real mode
  target/ppc/kvm: Convert DPRINTF to traces
  target/ppc/trace-events: Fix trivial typo
  spapr: Drop duplicate PCI swizzle code
  spapr_pci: Get rid of duplicate code for node name creation
  target/ppc: Style fixes for translate/spe-impl.inc.c
  target/ppc: Style fixes for translate/vmx-impl.inc.c
  target/ppc: Style fixes for translate/vsx-impl.inc.c
  target/ppc: Style fixes for translate/fp-impl.inc.c
  target/ppc: Style fixes for translate.c
  target/ppc: Style fixes for translate_init.inc.c
  target/ppc: Style fixes for monitor.c
  target/ppc: Style fixes for mmu_helper.c
  target/ppc: Style fixes for mmu-hash64.[ch]
  target/ppc: Style fixes for mmu-hash32.[ch]
  target/ppc: Style fixes for misc_helper.c
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-04-27 21:34:46 +01:00
Greg Kurz
8d83cbf101 target/ppc/kvm: Convert DPRINTF to traces
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155445152490.302073.17033451726459859333.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-04-26 11:37:57 +10:00
David Gibson
c995e942bf target/ppc: Style fixes for kvm_ppc.h and kvm.c
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
2019-04-26 10:42:38 +10:00
David Hildenbrand
905b7ee4d6 exec: Introduce qemu_maxrampagesize() and rename qemu_getrampagesize()
Rename qemu_getrampagesize() to qemu_minrampagesize(). While at it,
properly rename find_max_supported_pagesize() to
find_min_backend_pagesize().

s390x is actually interested into the maximum ram pagesize, so
introduce and use qemu_maxrampagesize().

Add a TODO, indicating that looking at any mapped memory backends is not
100% correct in some cases.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190417113143.5551-3-david@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2019-04-25 13:47:27 +02:00
David Gibson
ce2918cbc3 spapr: Use CamelCase properly
The qemu coding standard is to use CamelCase for type and structure names,
and the pseries code follows that... sort of.  There are quite a lot of
places where we bend the rules in order to preserve the capitalization of
internal acronyms like "PHB", "TCE", "DIMM" and most commonly "sPAPR".

That was a bad idea - it frequently leads to names ending up with hard to
read clusters of capital letters, and means they don't catch the eye as
type identifiers, which is kind of the point of the CamelCase convention in
the first place.

In short, keeping type identifiers look like CamelCase is more important
than preserving standard capitalization of internal "words".  So, this
patch renames a heap of spapr internal type names to a more standard
CamelCase.

In addition to case changes, we also make some other identifier renames:
  VIOsPAPR* -> SpaprVio*
    The reverse word ordering was only ever used to mitigate the capital
    cluster, so revert to the natural ordering.
  VIOsPAPRVTYDevice -> SpaprVioVty
  VIOsPAPRVLANDevice -> SpaprVioVlan
    Brevity, since the "Device" didn't add useful information
  sPAPRDRConnector -> SpaprDrc
  sPAPRDRConnectorClass -> SpaprDrcClass
    Brevity, and makes it clearer this is the same thing as a "DRC"
    mentioned in many other places in the code

This is 100% a mechanical search-and-replace patch.  It will, however,
conflict with essentially any and all outstanding patches touching the
spapr code.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 14:33:05 +11:00
Suraj Jitindar Singh
68f9f70841 target/ppc/spapr: Enable H_PAGE_INIT in-kernel handling
The H_CALL H_PAGE_INIT can be used to zero or copy a page of guest
memory. Enable the in-kernel H_PAGE_INIT handler.

The in-kernel handler takes half the time to complete compared to
handling the H_CALL in userspace.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Message-Id: <20190306060608.19935-1-sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 14:33:04 +11:00
Fabiano Rosas
468e3a1a94 target/ppc: Refactor kvm_handle_debug
There are four scenarios being handled in this function:

- single stepping
- hardware breakpoints
- software breakpoints
- fallback (no debug supported)

A future patch will add code to handle specific single step and
software breakpoints cases so let's split each scenario into its own
function now to avoid hurting readability.

Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20190228225759.21328-5-farosas@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 14:33:04 +11:00
Fabiano Rosas
2cbd158131 target/ppc: Move handling of hardware breakpoints to a separate function
This is in preparation for a refactoring of the kvm_handle_debug
function in the next patch.

Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <20190228225759.21328-4-farosas@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 14:33:04 +11:00
Suraj Jitindar Singh
8ff43ee404 target/ppc/spapr: Add SPAPR_CAP_CCF_ASSIST
Introduce a new spapr_cap SPAPR_CAP_CCF_ASSIST to be used to indicate
the requirement for a hw-assisted version of the count cache flush
workaround.

The count cache flush workaround is a software workaround which can be
used to flush the count cache on context switch. Some revisions of
hardware may have a hardware accelerated flush, in which case the
software flush can be shortened. This cap is used to set the
availability of such hardware acceleration for the count cache flush
routine.

The availability of such hardware acceleration is indicated by the
H_CPU_CHAR_BCCTR_FLUSH_ASSIST flag being set in the characteristics
returned from the KVM_PPC_GET_CPU_CHAR ioctl.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Message-Id: <20190301031912.28809-2-sjitindarsingh@gmail.com>
[dwg: Small style fixes]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 12:07:49 +11:00
Suraj Jitindar Singh
399b2896d4 target/ppc/spapr: Add workaround option to SPAPR_CAP_IBS
The spapr_cap SPAPR_CAP_IBS is used to indicate the level of capability
for mitigations for indirect branch speculation. Currently the available
values are broken (default), fixed-ibs (fixed by serialising indirect
branches) and fixed-ccd (fixed by diabling the count cache).

Introduce a new value for this capability denoted workaround, meaning that
software can work around the issue by flushing the count cache on
context switch. This option is available if the hypervisor sets the
H_CPU_BEHAV_FLUSH_COUNT_CACHE flag in the cpu behaviours returned from
the KVM_PPC_GET_CPU_CHAR ioctl.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Message-Id: <20190301031912.28809-1-sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 12:07:49 +11:00
Suraj Jitindar Singh
7d050527e3 target/ppc: Implement large decrementer support for KVM
Implement support to allow KVM guests to take advantage of the large
decrementer introduced on POWER9 cpus.

To determine if the host can support the requested large decrementer
size, we check it matches that specified in the ibm,dec-bits device-tree
property. We also need to enable it in KVM by setting the LPCR_LD bit in
the LPCR. Note that to do this we need to try and set the bit, then read
it back to check the host allowed us to set it, if so we can use it but
if we were unable to set it the host cannot support it and we must not
use the large decrementer.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190301024317.22137-3-sjitindarsingh@gmail.com>
[dwg: Small style fixes]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 12:07:49 +11:00
Greg Kurz
52b73c09bf target/ppc/kvm: Drop useless include directive
It has been there since the enablement of PR KVM for PAPR, ie, commit
f61b4bedaf in 2011. Not sure why at that time, but it is definitely
not needed with the current code.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-04 18:44:17 +11:00
Mark Cave-Ayland
ef96e3ae96 target/ppc: move FP and VMX registers into aligned vsr register array
The VSX register array is a block of 64 128-bit registers where the first 32
registers consist of the existing 64-bit FP registers extended to 128-bit
using new VSR registers, and the last 32 registers are the VMX 128-bit
registers as show below:

            64-bit               64-bit
    +--------------------+--------------------+
    |        FP0         |                    |  VSR0
    +--------------------+--------------------+
    |        FP1         |                    |  VSR1
    +--------------------+--------------------+
    |        ...         |        ...         |  ...
    +--------------------+--------------------+
    |        FP30        |                    |  VSR30
    +--------------------+--------------------+
    |        FP31        |                    |  VSR31
    +--------------------+--------------------+
    |                  VMX0                   |  VSR32
    +-----------------------------------------+
    |                  VMX1                   |  VSR33
    +-----------------------------------------+
    |                  ...                    |  ...
    +-----------------------------------------+
    |                  VMX30                  |  VSR62
    +-----------------------------------------+
    |                  VMX31                  |  VSR63
    +-----------------------------------------+

In order to allow for future conversion of VSX instructions to use TCG vector
operations, recreate the same layout using an aligned version of the existing
vsr register array.

Since the old fpr and avr register arrays are removed, the existing callers
must also be updated to use the correct offset in the vsr register array. This
also includes switching the relevant VMState fields over to using subarrays
to make sure that migration is preserved.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-01-09 09:28:14 +11:00
Suraj Jitindar Singh
b9a477b725 ppc/spapr_caps: Add SPAPR_CAP_NESTED_KVM_HV
Add the spapr cap SPAPR_CAP_NESTED_KVM_HV to be used to control the
availability of nested kvm-hv to the level 1 (L1) guest.

Assuming a hypervisor with support enabled an L1 guest can be allowed to
use the kvm-hv module (and thus run it's own kvm-hv guests) by setting:
-machine pseries,cap-nested-hv=true
or disabled with:
-machine pseries,cap-nested-hv=false

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-11-08 13:08:35 +11:00