Initialization of the env->sps structure at the end of instance_init is
specific to the 64-bit hash MMU, so move the code into a helper function
in mmu-hash64.c.
We also create a corresponding function to be called at finalize time -
it's empty for now, but we'll need it shortly.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
CPU definitions for cpus with the 64-bit hash MMU can include a table of
available pagesizes. If this isn't supplied ppc_cpu_instance_init() will
fill it in a fallback table based on the POWERPC_MMU_64K bit in mmu_model.
However, it turns out all the cpus which support 64K pages already include
an explicit table of page sizes, so there's no point to the fallback table
including 64k pages.
That removes the only place which tests POWERPC_MMU_64K, so we can remove
it. Which in turn allows some logic to be removed from
kvm_fixup_page_sizes().
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
In most cases we prefer to pass a PowerPCCPU rather than the (embedded)
CPUPPCState.
For ppc_hash64_update_{rmls,vrma}() change to take "cpu" instead of "env".
For ppc_hash64_set_{dsi,isi}() remove the redundant "env" parameter.
In theory this makes more work for the functions, but since "cs", "cpu"
and "env" are related by at most constant offsets, the compiler should be
able to optimize out the difference at effectively zero cost.
helper_*() functions are left alone - since they're more closely tied to
the TCG generated code, passing "env" is still the standard there.
While we're there, fix an incorrect indentation.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
The #if isn't necessary, because there's a suitable one inside
ppc_cpu_is_valid(). We've already filtered for suitable cpu models in the
functions that search and register them. So by the time we get to realize
having an invalid one indicates a code error, not a user error, so an
assert() is more appropriate than error_setg().
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Because of the various hooks called some variant on "init" - and the rather
greater number that used to exist, I'm always wondering when a function
called simply "*_init" or "*_initfn" will be called.
To make it easier on myself, and maybe others, rename the instance_init
hooks for ppc cpus to *_instance_init(). While we're at it rename the
realize time hooks to *_realize() (from *_realizefn()) which seems to be
the more common current convention.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
There are a couple places (one generic, one target specific) where we need
to get the host page size associated with a particular memory backend. I
have some upcoming code which will add another place which wants this. So,
for convenience, add a helper function to calculate this.
host_memory_backend_pagesize() returns the host pagesize for a given
HostMemoryBackend object.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
qemu_mempath_getpagesize() gets the effective (host side) page size for
a block of memory backed by an mmap()ed file on the host. It requires
the mem_path parameter to be non-NULL.
This ends up meaning all the callers need a different case for handling
anonymous memory (for memory-backend-ram or default memory with -mem-path
is not specified).
We can make all those callers a little simpler by having
qemu_mempath_getpagesize() accept NULL, and treat that as the anonymous
memory case.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
According to the Vector/SIMD extension documentation bit 6 that is
currently masked is valid (listed as transient bit) but bits 7 and 8
should be reserved instead. Fix the mask to match this.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The normal gdb definition of the XER registers is only 32 bit,
and that's what the current version of power64-core.xml also
says (seems copied from gdb's). But qemu's idea of the XER register
is target_ulong (in CPUPPCState, ppc_gdb_register_len and
ppc_cpu_gdb_read_register)
That mismatch leads to the following message when attaching
with gdb:
Truncated register 32 in remote 'g' packet
(and following on that qemu stops responding). The simple fix is
to say the truth in the .xml file. But the better fix is to
actually make it 32bit on the wire, as old gdbs don't support
XML files for describing registers. Also the XER state in qemu
doesn't seem to use the high 32 bits, so sending it off to gdb
doesn't seem worthwhile.
Signed-off-by: Michael Matz <matz@suse.de>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
21b786f "PowerPC: Add TS bits into msr_mask" added the transaction states
to msr_mask for recent POWER CPUs to allow correct migration of machines
that are in certain interim transactional memory states.
This was correct, but unfortunately breaks backwards of pseries-2.7 and
earlier machine types which (stupidly) transferred the msr_mask in the
migration stream and failed if it wasn't equal on each end.
This works around the problem by masking out the new MSR bits in the
compatibility code to send the msr_mask on old machine types.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Greg Kurz <groug@kaod.org>
Tested-by: Lukáš Doktor <ldoktor@redhat.com>
ppc_tr_init_disas_context() correctly sets lazy_tlb_flush to true on
certain CPU models. However, it leaves it uninitialized, instead of
setting it to false on all others.
It wasn't caught before now because we didn't have examples in the tests
that exercised this path. However it can now be caught using clang's
undefined behaviour sanitizer and the sam460ex board.
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
cpu_init(cpu_model) were replaced by cpu_create(cpu_type) so
no users are left, remove it.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au> (ppc)
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1518000027-274608-6-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
it will be used for providing to cpu name resolving class for
parsing cpu model for system and user emulation code.
Along with change add target to null-machine tests, so
that when switch to CPU_RESOLVING_TYPE happens,
it would ensure that null-machine usecase still works.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu> (m68k)
Acked-by: David Gibson <david@gibson.dropbear.id.au> (ppc)
Acked-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> (tricore)
Message-Id: <1518000027-274608-4-git-send-email-imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
[ehabkost: Added macro to riscv too]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
tlbsync also needs to check the Guest Translation Shootdown Enable
(GTSE) bit in the Logical Partition Control Register (LPCR) to
determine at which privilege level it is running.
See commit c6fd28fd57 ("target/ppc: Update tlbie to check privilege
level based on GTSE")
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
During migration, after MSR bits is synced, cpu_post_load() will use
msr_mask to determine which PPC MSR bits will be applied into the target
side. Hardware Transaction Memory(HTM) has been supported since Power8,
but TS0/TS1 bit was not in msr_mask yet. That will prevent target KVM
from loading TM checkpointed values.
This patch adds TS bits into msr_mask for Power8, so that transactional
application can be migrated across qemu.
Signed-off-by: Simon Guo <wei.guo.simon@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Convert cap-ibs (indirect branch speculation) to a custom spapr-cap
type.
All tristate caps have now been converted to custom spapr-caps, so
remove the remaining support for them.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
[dwg: Don't explicitly list "?"/help option, trust convention]
[dwg: Fold tristate removal into here, to not break bisect]
[dwg: Fix minor style problems]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Check the character and character_mask field when setting
cap_ppc_safe_indirect_branch based on the hypervisor response
to KVM_PPC_GET_CPU_CHAR. Previously the mask field wasn't checked
which was incorrect.
Fixes: 8acc2ae5 (target/ppc/kvm: Add cap_ppc_safe_[cache/bounds_check/indirect_branch])
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This is needed before the next patch because the target-dependent kvm stub
uses the existing kvm_openpic_connect_vcpu() declaration, making it impossible
to move the device-specific declarations into the same file without breaking
ppc-linux-user compilation.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
As cpu.h is another typically widely included file which doesn't need
full access to the softfloat API we can remove the includes from here
as well. Where they do need types it's typically for float_status and
the rounding modes so we move that to softfloat-types.h as well.
As a result of not having softfloat in every cpu.h call we now need to
add it to various helpers that do need the full softfloat.h
definitions.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
[For PPC parts]
Acked-by: David Gibson <david@gibson.dropbear.id.au>
A few changes worth noting:
- Didn't migrate ctx->exception to DISAS_* since the exception field is
in many cases architecturally relevant.
- Moved the cross-page check from the end of translate_insn to tb_start.
- Removed the exit(1) after a TCG temp leak; changed the fprintf there to
qemu_log.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
A couple of notes:
- removed ctx->nip in favour of base->pc_next. Yes, it is annoying,
but didn't want to waste its 4 bytes.
- ctx->singlestep_enabled does a lot more than
base.singlestep_enabled; this confused me at first.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180201111846.21846-17-armbru@redhat.com>
The macro expansions of qdict_put_TYPE() and qlist_append_TYPE() need
qbool.h, qnull.h, qnum.h and qstring.h to compile. We include qnull.h
and qnum.h in the headers, but not qbool.h and qstring.h. Works,
because we include those wherever the macros get used.
Open-coding these helpers is of dubious value. Turn them into
functions and drop the includes from the headers.
This cleanup makes the number of objects depending on qapi/qmp/qnum.h
from 4551 (out of 4743) to 46 in my "build everything" tree. For
qapi/qmp/qnull.h, the number drops from 4552 to 21.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180201111846.21846-10-armbru@redhat.com>
This cleanup makes the number of objects depending on qapi/error.h
drop from 1910 (out of 4743) to 1612 in my "build everything" tree.
While there, separate #include from file comment with a blank line,
and drop a useless comment on why qemu/osdep.h is included first.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180201111846.21846-5-armbru@redhat.com>
[Semantic conflict with commit 34e304e975 resolved, OSX breakage fixed]
In order to enable TCE operations support in KVM, we have to inform
the KVM about VFIO groups being attached to specific LIOBNs;
the necessary bits are implemented already by IOMMU MR and VFIO.
This defines get_attr() for the SPAPR TCE IOMMU MR which makes VFIO
call the KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE ioctl and establish
LIOBN-to-IOMMU link.
This changes spapr_tce_set_need_vfio() to avoid TCE table reallocation
if the kernel supports the TCE acceleration.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
[aw - remove unnecessary sys/ioctl.h include]
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Add three new kvm capabilities used to represent the level of host support
for three corresponding workarounds.
Host support for each of the capabilities is queried through the
new ioctl KVM_PPC_GET_CPU_CHAR which returns four uint64 quantities. The
first two, character and behaviour, represent the available
characteristics of the cpu and the behaviour of the cpu respectively.
The second two, c_mask and b_mask, represent the mask of known bits for
the character and beheviour dwords respectively.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
[dwg: Correct some compile errors due to name change in final kernel
patch version]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The MC68040 MMU provides the size of the access that
triggers the page fault.
This size is set in the Special Status Word which
is written in the stack frame of the access fault
exception.
So we need the size in m68k_cpu_unassigned_access() and
m68k_cpu_handle_mmu_fault().
To be able to do that, this patch modifies the prototype of
handle_mmu_fault handler, tlb_fill() and probe_write().
do_unassigned_access() already includes a size parameter.
This patch also updates handle_mmu_fault handlers and
tlb_fill() of all targets (only parameter, no code change).
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20180118193846.24953-2-laurent@vivier.eu>
The hypervisor doorbells are used by skiboot and Linux on POWER9
processors to wake up secondaries.
This adds processor control support to the Server architecture by
reusing the Embedded support. They are very similar, only the bits
definition of the CPU identifier differ.
Still to be done is message broadcast to all threads of the same
processor.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
We know that only one bit (in addition to SO) is going to be set in
the condition register, so do two movconds instead of three setconds,
three shifts and two ORs.
For ppc64-linux-user, the code size reduction is around 5% and the
performance improvement slightly less than 10%. For softmmu, the
improvement is around 5%.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
commit f03a1af581 ("ppc: Fix POWER7 and POWER8 exception definitions")
introduced definitions for the server doorbell exceptions by reusing
the embedded definitions but this adds complexity in the powerpc_excp()
routine. Let's introduce specific definitions for the Server doorbells
exception.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When overwritting a valid TLB entry with a new one, the previous page
were not flushed in QEMU TLB, leading to incoherent mapping. This commit
fixes this.
Signed-off-by: Luc MICHEL <luc.michel@git.antfield.fr>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
We recently had some discussions that were sidetracked for a while, because
nearly everyone misapprehended the purpose of the 'max_threads' field in
the compatiblity modes table. It's all about guest expectations, not host
expectations or support (that's handled elsewhere).
In an attempt to avoid a repeat of that confusion, rename the field to
'max_vthreads' and add an explanatory comment.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Increases the max smt mode to 8 for Power9. That's because KVM supports
smt emulation in this platform so QEMU should allow users to use it as
well.
Today if we try to pass -smp ...,threads=8, QEMU will silently truncate
it to smt4 mode and may cause a crash if we try to perform a cpu
hotplug.
Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
[dwg: Added an explanatory comment]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When constructing the "host" cpu class we modify whether the VMX and VSX
vector extensions and DFP (Decimal Floating Point) are available
based on whether KVM can support those instructions. This can depend on
policy in the host kernel as well as on the actual host cpu capabilities.
However, the way we probe for this is not very nice: we explicitly check
the host's device tree. That works in practice, but it's not really
correct, since the device tree is a property of the host kernel's platform
which we don't really know about. We get away with it because the only
modern POWER platforms happen to encode VMX, VSX and DFP availability in
the device tree in the same way.
Arguably we should have an explicit KVM capability for this, but we haven't
needed one so far. Barring specific KVM policies which don't yet exist,
each of these instruction classes will be available in the guest if and
only if they're available in the qemu userspace process. We can determine
that from the ELF AUX vector we're supplied with.
Once reworked like this, there are no more callers for kvmppc_get_vmx() and
kvmppc_get_dfp() so remove them.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
As stated in the 1ad9f0a464 commit log, the returned entries are not
a whole PTEG. It was not a problem before 1ad9f0a464 as it would read
a single record assuming it contains a whole PTEG but now the code tries
reading the entire PTEG and "if ((n - i) < invalid)" produces negative
values which then are converted to size_t for memset() and that throws
seg fault.
This fixes the math.
While here, fix the last @i increment as well.
Fixes: 1ad9f0a464 "target/ppc: Fix KVM-HV HPTE accessors"
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Also introduce utilities to manipulate bitmasks (originaly from OPAL)
which be will be used in the model of the XIVE interrupt controller.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
These are now trivial sets and tests against NULL. Unwrap.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
applied using ./scripts/clean-includes
not needed since 7ebaf79556
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
and use them in a couple of obvious places. Other macros will be used
in the model of the XIVE interrupt controller.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When a CPU is stopped with the 'stop-self' RTAS call, its state
'halted' is switched to 1 and, in this case, the MSR is not taken into
account anymore in the cpu_has_work() routine. Only the pending
hardware interrupts are checked with their LPCR:PECE* enablement bit.
If the DECR timer fires after 'stop-self' is called and before the CPU
'stop' state is reached, the nearly-dead CPU will have some work to do
and the guest will crash. This case happens very frequently with the
not yet upstream P9 XIVE exploitation mode. In XICS mode, the DECR is
occasionally fired but after 'stop' state, so no work is to be done
and the guest survives.
I suspect there is a race between the QEMU mainloop triggering the
timers and the TCG CPU thread but I could not quite identify the root
cause. To be safe, let's disable in the LPCR all the exceptions which
can cause an exit while the CPU is in power-saving mode and reenable
them when the CPU is started.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
and use the value to define precisely the default value of the LPCR in
the helper routine cpu_ppc_set_papr()
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Occasionally in Linux guests on x86_64 we're seeing logs like:
ppc_set_irq: 0x55b4e0d562f0 n_IRQ 8 level 1 => pending 00000100req 00000004
when they should read:
ppc_set_irq: 0x55b4e0d562f0 n_IRQ 8 level 1 => pending 00000100req 00000002
The "00000004" is CPU_INTERRUPT_EXITTB yet the code calls
cpu_interrupt(cs, CPU_INTERRUPT_HARD) ("00000002") in this function
just before the log message. Something is causing the HARD bit setting
to get lost.
The knock on effect of losing that bit is the decrementer timer interrupts
don't get delivered which causes the guest to sit idle in its idle handler
and 'hang'.
The issue occurs due to races from code which sets CPU_INTERRUPT_EXITTB.
Rather than poking directly into cs->interrupt_request, that code needs to:
a) hold BQL
b) use the cpu_interrupt() helper
This patch fixes the call sites to do this, fixing the hang. The calls
are made from a variety of contexts so a helper function is added to handle
the necessary locking. This can likely be improved and optimised in the future
but it ensures the code is correct and doesn't lockup as it stands today.
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The msr invalidation code (commits 993eb and 2360b) inverts all
bits except MSR_TGPR and MSR_HVB. On non PowerPC 601 processors
this leads to incorrect change of excp_prefix in hreg_store_msr()
function. The problem is that new msr value get multiplied by msr_mask
and inverted msr does not, thus values of MSR_EP bit in new msr value
and inverted msr are distinct, so that excp_prefix changes but should
not.
Signed-off-by: Kurban Mallachiev <mallachiev@ispras.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
cpu->compat_pvr is used to store the current compat mode of the cpu.
On the receiving side during incoming migration we check compatibility
with the compat mode by calling ppc_set_compat(). However we fail to set
the compat mode with the hypervisor since the "new" compat mode doesn't
differ from the current (due to a "cpu->compat_pvr != compat_pvr" check).
This means that kvm runs the vcpus without a compat mode, which is the
incorrect behaviour. The implication being that a compatibility mode
will never be in effect after migration.
To fix this so that the compat mode is correctly set with the
hypervisor, store the desired compat mode and reset cpu->compat_pvr to
zero before calling ppc_set_compat().
Fixes: 5dfaa532 ("ppc: fix ppc_set_compat() with KVM PR")
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Migration of a system under stress (for example, with
"stress-ng --numa 2") triggers on the destination
some kernel watchdog messages like:
NMI watchdog: BUG: soft lockup - CPU#0 stuck for 3489660870s!
NMI watchdog: BUG: soft lockup - CPU#1 stuck for 3489660884s!
This problem appears with the changes introduced by
42043e4 spapr: clock should count only if vm is running
I think this commit only triggers the problem.
Kernel computes the soft lockup duration using the
Virtual Timebase register (VTB), not using the Timebase
Register (TBR, the one 42043e4 stops).
It appears VTB is not migrated, so this patch adds it in
the list of the SPRs to migrate, and fixes the problem.
For the migration, I've tested a migration from qemu-2.8.0 and
pseries-2.8.0 to a patched master (qemu-2.11.0-rc1). The received
VTB is 0 (as is it not initialized by qemu-2.8.0), but the value
seems to be ignored by KVM and a non zero VTB is used by the kernel.
I have no explanation for that, but as the original problem appears
only with SMP system under stress I suspect some problems in KVM
(I think because VTB is shared by all threads of a core).
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>