Commit Graph

58455 Commits

Author SHA1 Message Date
Eric Blake
894e02804c nbd/server: Add helper functions for parsing option payload
Rather than making every callsite perform length sanity checks
and error reporting, add the helper functions nbd_opt_read()
and nbd_opt_drop() that use the length stored in the client
struct; also add an assertion that optlen is 0 before any
option (ie. any previous option was fully handled), complementing
the assertion added in an earlier patch that optlen is 0 after
all negotiation completes.

Note that the call in nbd_negotiate_handle_export_name() does
not use the new helper (in part because the server cannot
reply to NBD_OPT_EXPORT_NAME - it either succeeds or the
connection drops).

Based on patches by Vladimir Sementsov-Ogievskiy.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180110230825.18321-6-eblake@redhat.com>
2018-01-17 20:14:12 -06:00
Eric Blake
41f5dfafbb nbd/server: Add va_list form of nbd_negotiate_send_rep_err()
This will be useful for the next patch.

Based on a patch by Vladimir Sementsov-Ogievskiy

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180110230825.18321-5-eblake@redhat.com>
2018-01-17 20:14:12 -06:00
Eric Blake
32f158a635 nbd/server: Better error for NBD_OPT_EXPORT_NAME failure
When a client abruptly disconnects before we've finished reading
the name sent with NBD_OPT_EXPORT_NAME, we are better off logging
the failure as EIO (we can't communicate with the client), rather
than EINVAL (the client sent bogus data).

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180110230825.18321-4-eblake@redhat.com>
2018-01-17 20:14:12 -06:00
Vladimir Sementsov-Ogievskiy
0cfae925d2 nbd/server: refactor negotiation functions parameters
Instead of passing currently negotiating option and its length to
many of negotiation functions let's just store them on NBDClient
struct to be state-variables of negotiation phase.

This unifies semantics of negotiation functions and allows
tracking changes of remaining option length in future patches.

Asssert that optlen is back to 0 after negotiation (including
old-style connections which don't negotiate), although we need
more patches before we can assert optlen is 0 between options
during negotiation.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20171122101958.17065-2-vsementsov@virtuozzo.com>
[eblake: rebase, commit message tweak, assert !optlen after
negotiation completes]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-01-17 20:14:12 -06:00
Eric Blake
a16a790770 nbd/server: Hoist nbd_reject_length() earlier
No semantic change, but will make it easier for an upcoming patch
to refactor code without having to add forward declarations.  Fix
a poor comment while at it.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180110230825.18321-2-eblake@redhat.com>
2018-01-17 20:14:12 -06:00
Eduardo Habkost
6cfbc54e89 i386: Add EPYC-IBPB CPU model
EPYC-IBPB is a copy of the EPYC CPU model with
just CPUID_8000_0008_EBX_IBPB added.

Cc: Jiri Denemark <jdenemar@redhat.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20180109154519.25634-7-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-01-17 23:54:39 -02:00
Eduardo Habkost
ac96c41354 i386: Add new -IBRS versions of Intel CPU models
The new MSR IA32_SPEC_CTRL MSR was introduced by a recent Intel
microcode updated and can be used by OSes to mitigate
CVE-2017-5715.  Unfortunately we can't change the existing CPU
models without breaking existing setups, so users need to
explicitly update their VM configuration to use the new *-IBRS
CPU model if they want to expose IBRS to guests.

The new CPU models are simple copies of the existing CPU models,
with just CPUID_7_0_EDX_SPEC_CTRL added and model_id updated.

Cc: Jiri Denemark <jdenemar@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20180109154519.25634-6-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-01-17 23:04:31 -02:00
Eduardo Habkost
1b3420e1c4 i386: Add FEAT_8000_0008_EBX CPUID feature word
Add the new feature word and the "ibpb" feature flag.

Based on a patch by Paolo Bonzini.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20180109154519.25634-5-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-01-17 23:04:31 -02:00
Eduardo Habkost
a2381f0934 i386: Add spec-ctrl CPUID bit
Add the feature name and a CPUID_7_0_EDX_SPEC_CTRL macro.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20180109154519.25634-4-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-01-17 23:04:31 -02:00
Paolo Bonzini
a33a2cfe2f i386: Add support for SPEC_CTRL MSR
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20180109154519.25634-3-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-01-17 23:04:31 -02:00
Eduardo Habkost
807e9869b8 i386: Change X86CPUDefinition::model_id to const char*
It is valid to have a 48-character model ID on CPUID, however the
definition of X86CPUDefinition::model_id is char[48], which can
make the compiler drop the null terminator from the string.

If a CPU model happens to have 48 bytes on model_id, "-cpu help"
will print garbage and the object_property_set_str() call at
x86_cpu_load_def() will read data outside the model_id array.

We could increase the array size to 49, but this would mean the
compiler would not issue a warning if a 49-char string is used by
mistake for model_id.

To make things simpler, simply change model_id to be const char*,
and validate the string length using an assert() on
x86_register_cpudef_type().

Reported-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20180109154519.25634-2-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-01-17 23:04:31 -02:00
Haozhong Zhang
c68bcb3a99 target/i386: add clflushopt to "Skylake-Server" cpu model
CPUID_7_0_EBX_CLFLUSHOPT is missed in current "Skylake-Server" cpu
model. Add it to "Skylake-Server" cpu model on pc-i440fx-2.12 and
pc-q35-2.12. Keep it disabled in "Skylake-Server" cpu model on older
machine types.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Message-Id: <20171219033730.12748-3-haozhong.zhang@intel.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-01-17 23:04:31 -02:00
Haozhong Zhang
df47ce8af4 pc: add 2.12 machine types
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Message-Id: <20171219033730.12748-2-haozhong.zhang@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-01-17 23:04:31 -02:00
Luc MICHEL
2e569845bd target-ppc: Fix booke206 tlbwe TLB instruction
When overwritting a valid TLB entry with a new one, the previous page
were not flushed in QEMU TLB, leading to incoherent mapping. This commit
fixes this.

Signed-off-by: Luc MICHEL <luc.michel@git.antfield.fr>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-01-17 09:35:24 +11:00
Cédric Le Goater
0bfc0cf0af target/ppc: add support for POWER9 HILE
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-01-17 09:35:24 +11:00
Cédric Le Goater
fef592f909 ppc/pnv: change initrd address
When skiboot starts, it first clears the CPU structs for all possible
CPUs on a system :

	for (i = 0; i <= cpu_max_pir; i++)
		memset(&cpu_stacks[i].cpu, 0, sizeof(struct cpu_thread));

On POWER9, cpu_max_pir is quite big, 0x7fff, and the skiboot cpu_stacks
array overlaps with the memory region in which QEMU maps the initramfs
file. Move it upwards in memory to keep it safe.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-01-17 09:35:24 +11:00
Cédric Le Goater
c035851ac0 ppc/pnv: fix XSCOM core addressing on POWER9
The XSCOM base address of the core chiplet was wrongly calculated. Use
the OPAL macros to fix that and do a couple of renames.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-01-17 09:35:24 +11:00
Cédric Le Goater
b3b066e9d8 ppc/pnv: introduce pnv*_is_power9() helpers
These are useful when instantiating device models which are shared
between the POWER8 and the POWER9 processor families.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-01-17 09:35:24 +11:00
Cédric Le Goater
09279d7e7b ppc/pnv: change core mask for POWER9
When addressed by XSCOM, the first core has the 0x20 chiplet ID but
the CPU PIR can start at 0x0.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-01-17 09:35:24 +11:00
Cédric Le Goater
83028a2b28 ppc/pnv: use POWER9 DD2 processor
commit 1ed9c8af50 ("target/ppc: Add POWER9 DD2.0 model information")
deprecated the POWER9 model v1.0.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-01-17 09:35:24 +11:00
Cédric Le Goater
459bb44cc4 tests/boot-serial-test: fix powernv support
Recent commit introduced the firmware image skiboot 5.9 which
has a different first line ouput.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-01-17 09:35:24 +11:00
Cédric Le Goater
60e58bd9f0 ppc/pnv: Update skiboot firmware image
This is skiboot 5.9 (commit e0ee24c2). It brings improved POWER9
support among many other things. Built from submodule.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-01-17 09:35:24 +11:00
David Gibson
8904e5a750 spapr: Adjust default VSMT value for better migration compatibility
fa98fbfc "PC: KVM: Support machine option to set VSMT mode" introduced the
"vsmt" parameter for the pseries machine type, which controls the spacing
of the vcpu ids of thread 0 for each virtual core.  This was done to bring
some consistency and stability to how that was done, while still allowing
backwards compatibility for migration and otherwise.

The default value we used for vsmt was set to the max of the host's
advertised default number of threads and the number of vthreads per vcore
in the guest.  This was done to continue running without extra parameters
on older KVM versions which don't allow the VSMT value to be changed.

Unfortunately, even that smaller than before leakage of host configuration
into guest visible configuration still breaks things.  Specifically a guest
with 4 (or less) vthread/vcore will get a different vsmt value when
running on a POWER8 (vsmt==8) and POWER9 (vsmt==4) host.  That means the
vcpu ids don't line up so you can't migrate between them, though you should
be able to.

Long term we really want to make vsmt == smp_threads for sufficiently
new machine types.  However, that means that qemu will then require a
sufficiently recent KVM (one which supports changing VSMT) - that's still
not widely enough deployed to be really comfortable to do.

In the meantime we need some default that will work as often as
possible.  This patch changes that default to 8 in all circumstances.
This does change guest visible behaviour (including for existing
machine versions) for many cases - just not the most common/important
case.

Following is case by case justification for why this is still the least
worst option.  Note that any of the old behaviours can still be duplicated
after this patch, it's just that it requires manual intervention by
setting the vsmt property on the command line.

KVM HV on POWER8 host:
   This is the overwhelmingly common case in production setups, and is
   unchanged by design.  POWER8 hosts will advertise a default VSMT mode
   of 8, and > 8 vthreads/vcore isn't permitted

KVM HV on POWER7 host:
   Will break, but POWER7s allowing KVM were never released to the public.

KVM HV on POWER9 host:
   Not yet released to the public, breaking this now will reduce other
   breakage later.

KVM HV on PowerPC 970:
   Will theoretically break it, but it was barely supported to begin with
   and already required various user visible hacks to work.  Also so old
   that I just don't care.

TCG:
   This is the nastiest one; it means migration of TCG guests (without
   manual vsmt setting) will break.  Since TCG is rarely used in production
   I think this is worth it for the other benefits.  It does also remove
   one more barrier to TCG<->KVM migration which could be interesting for
   debugging applications.

KVM PR:
   As with TCG, this will break migration of existing configurations,
   without adding extra manual vsmt options.  As with TCG, it is rare in
   production so I think the benefits outweigh breakages.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
2018-01-17 09:35:24 +11:00
David Gibson
1f20f2e0ee spapr: Allow some cases where we can't set VSMT mode in the kernel
At present if we require a vsmt mode that's not equal to the kernel's
default, and the kernel doesn't let us change it (e.g. because it's an old
kernel without support) then we always fail.

But in fact we can cope with the kernel having a different vsmt as long as
  a) it's >= the actual number of vthreads/vcore (so that guest threads
     that are supposed to be on the same core act like it)
  b) it's a submultiple of the requested vsmt mode (so that guest threads
     spaced by the vsmt value will act like they're on different cores)

Allowing this case gives us a bit more freedom to adjust the vsmt behaviour
without breaking existing cases.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Tested-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
2018-01-17 09:35:24 +11:00
David Gibson
abbc124753 target/ppc: Clarify compat mode max_threads value
We recently had some discussions that were sidetracked for a while, because
nearly everyone misapprehended the purpose of the 'max_threads' field in
the compatiblity modes table.  It's all about guest expectations, not host
expectations or support (that's handled elsewhere).

In an attempt to avoid a repeat of that confusion, rename the field to
'max_vthreads' and add an explanatory comment.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
2018-01-17 09:35:24 +11:00
Jose Ricardo Ziviani
03ee51d354 ppc: Change Power9 compat table to support at most 8 threads/core
Increases the max smt mode to 8 for Power9. That's because KVM supports
smt emulation in this platform so QEMU should allow users to use it as
well.

Today if we try to pass -smp ...,threads=8, QEMU will silently truncate
it to smt4 mode and may cause a crash if we try to perform a cpu
hotplug.

Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
[dwg: Added an explanatory comment]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-01-17 09:35:24 +11:00
David Gibson
895d5cd620 spapr: Remove unnecessary 'options' field from sPAPRCapabilityInfo
The options field here is intended to list the available values for the
capability.  It's not used yet, because the existing capabilities are
boolean.

We're going to add capabilities that aren't, but in that case the info on
the possible values can be folded into the .description field.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-01-17 09:35:24 +11:00
Suraj Jitindar Singh
4e5fe3688e hw/ppc/spapr_caps: Rework spapr_caps to use uint8 internal representation
Currently spapr_caps are tied to boolean values (on or off). This patch
reworks the caps so that they can have any uint8 value. This allows more
capabilities with various values to be represented in the same way
internally. Capabilities are numbered in ascending order. The internal
representation of capability values is an array of uint8s in the
sPAPRMachineState, indexed by capability number.

Capabilities can have their own name, description, options, getter and
setter functions, type and allow functions. They also each have their own
section in the migration stream. Capabilities are only migrated if they
were explictly set on the command line, with the assumption that
otherwise the default will match.

On migration we ensure that the capability value on the destination
is greater than or equal to the capability value from the source. So
long at this remains the case then the migration is considered
compatible and allowed to continue.

This patch implements generic getter and setter functions for boolean
capabilities. It also converts the existings cap-htm, cap-vsx and
cap-dfp capabilities to this new format.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-01-17 09:35:24 +11:00
David Gibson
2d1fb9bc8e spapr: Handle Decimal Floating Point (DFP) as an optional capability
Decimal Floating Point has been available on POWER7 and later (server)
cpus.  However, it can be disabled on the hypervisor, meaning that it's
not available to guests.

We currently handle this by conditionally advertising DFP support in the
device tree depending on whether the guest CPU model supports it - which
can also depend on what's allowed in the host for -cpu host.  That can lead
to confusion on migration, since host properties are silently affecting
guest visible properties.

This patch handles it by treating it as an optional capability for the
pseries machine type.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
2018-01-17 09:35:24 +11:00
David Gibson
2938664286 spapr: Handle VMX/VSX presence as an spapr capability flag
We currently have some conditionals in the spapr device tree code to decide
whether or not to advertise the availability of the VMX (aka Altivec) and
VSX vector extensions to the guest, based on whether the guest cpu has
those features.

This can lead to confusion and subtle failures on migration, since it makes
a guest visible change based only on host capabilities.  We now have a
better mechanism for this, in spapr capabilities flags, which explicitly
depend on user options rather than host capabilities.

Rework the advertisement of VSX and VMX based on a new VSX capability.  We
no longer bother with a conditional for VMX support, because every CPU
that's ever been supported by the pseries machine type supports VMX.

NOTE: Some userspace distributions (e.g. RHEL7.4) already rely on
availability of VSX in libc, so using cap-vsx=off may lead to a fatal
SIGILL in init.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
2018-01-17 09:35:24 +11:00
David Gibson
3f2ca480eb target/ppc: Clean up probing of VMX, VSX and DFP availability on KVM
When constructing the "host" cpu class we modify whether the VMX and VSX
vector extensions and DFP (Decimal Floating Point) are available
based on whether KVM can support those instructions.  This can depend on
policy in the host kernel as well as on the actual host cpu capabilities.

However, the way we probe for this is not very nice: we explicitly check
the host's device tree.  That works in practice, but it's not really
correct, since the device tree is a property of the host kernel's platform
which we don't really know about.  We get away with it because the only
modern POWER platforms happen to encode VMX, VSX and DFP availability in
the device tree in the same way.

Arguably we should have an explicit KVM capability for this, but we haven't
needed one so far.  Barring specific KVM policies which don't yet exist,
each of these instruction classes will be available in the guest if and
only if they're available in the qemu userspace process.  We can determine
that from the ELF AUX vector we're supplied with.

Once reworked like this, there are no more callers for kvmppc_get_vmx() and
kvmppc_get_dfp() so remove them.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
2018-01-17 09:35:24 +11:00
David Gibson
be85537d65 spapr: Validate capabilities on migration
Now that the "pseries" machine type implements optional capabilities (well,
one so far) there's the possibility of having different capabilities
available at either end of a migration.  Although arguably a user error,
it would be nice to catch this situation and fail as gracefully as we can.

This adds code to migrate the capabilities flags.  These aren't pulled
directly into the destination's configuration since what the user has
specified on the destination command line should take precedence.  However,
they are checked against the destination capabilities.

If the source was using a capability which is absent on the destination,
we fail the migration, since that could easily cause a guest crash or other
bad behaviour.  If the source lacked a capability which is present on the
destination we warn, but allow the migration to proceed.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
2018-01-17 09:35:24 +11:00
David Gibson
ee76a09fc7 spapr: Treat Hardware Transactional Memory (HTM) as an optional capability
This adds an spapr capability bit for Hardware Transactional Memory.  It is
enabled by default for pseries-2.11 and earlier machine types. with POWER8
or later CPUs (as it must be, since earlier qemu versions would implicitly
allow it).  However it is disabled by default for the latest pseries-2.12
machine type.

This means that with the latest machine type, HTM will not be available,
regardless of CPU, unless it is explicitly enabled on the command line.
That change is made on the basis that:

 * This way running with -M pseries,accel=tcg will start with whatever cpu
   and will provide the same guest visible model as with accel=kvm.
     - More specifically, this means existing make check tests don't have
       to be modified to use cap-htm=off in order to run with TCG

 * We hope to add a new "HTM without suspend" feature in the not too
   distant future which could work on both POWER8 and POWER9 cpus, and
   could be enabled by default.

 * Best guesses suggest that future POWER cpus may well only support the
   HTM-without-suspend model, not the (frankly, horribly overcomplicated)
   POWER8 style HTM with suspend.

 * Anecdotal evidence suggests problems with HTM being enabled when it
   wasn't wanted are more common than being missing when it was.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
2018-01-17 09:35:24 +11:00
David Gibson
33face6b89 spapr: Capabilities infrastructure
Because PAPR is a paravirtual environment access to certain CPU (or other)
facilities can be blocked by the hypervisor.  PAPR provides ways to
advertise in the device tree whether or not those features are available to
the guest.

In some places we automatically determine whether to make a feature
available based on whether our host can support it, in most cases this is
based on limitations in the available KVM implementation.

Although we correctly advertise this to the guest, it means that host
factors might make changes to the guest visible environment which is bad:
as well as generaly reducing reproducibility, it means that a migration
between different host environments can easily go bad.

We've mostly gotten away with it because the environments considered mature
enough to be well supported (basically, KVM on POWER8) have had consistent
feature availability.  But, it's still not right and some limitations on
POWER9 is going to make it more of an issue in future.

This introduces an infrastructure for defining "sPAPR capabilities".  These
are set by default based on the machine version, masked by the capabilities
of the chosen cpu, but can be overriden with machine properties.

The intention is at reset time we verify that the requested capabilities
can be supported on the host (considering TCG, KVM and/or host cpu
limitations).  If not we simply fail, rather than silently modifying the
advertised featureset to the guest.

This does mean that certain configurations that "worked" may now fail, but
such configurations were already more subtly broken.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
2018-01-17 09:35:24 +11:00
Alexey Kardashevskiy
a36593e167 target/ppc: Yet another fix for KVM-HV HPTE accessors
As stated in the 1ad9f0a464 commit log, the returned entries are not
a whole PTEG. It was not a problem before 1ad9f0a464 as it would read
a single record assuming it contains a whole PTEG but now the code tries
reading the entire PTEG and "if ((n - i) < invalid)" produces negative
values which then are converted to size_t for memset() and that throws
seg fault.

This fixes the math.

While here, fix the last @i increment as well.

Fixes: 1ad9f0a464 "target/ppc: Fix KVM-HV HPTE accessors"
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-01-17 09:35:24 +11:00
Peter Maydell
8e5dc9ba49 Queued TCG patches
-----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJaXidSAAoJEGTfOOivfiFfMboH/jH6g5hYASXyLQe8Z5fSJpfo
 2+rJ9fS3yTjAIZgEtHsAG0zXHml2zXZFv6U7W8TZZnBRR5X2h3Fv+AOjWKKrGtiY
 VfHxj75iggl9u3g4nFkasmQxB07u27uj82q42pp7K8Dvt8mFgnrzJEclGSdKhb2J
 6LROJZtmzargemrQUigc2txqjT3Ce1lv9Jqii1ArGdl401OC/itWoeEyA+Ttymip
 OQJn3+98uwZjWQzNpxBBaKGS7ao5lSL5zca4l89YKvckrPbuJqT+w3EkJM6112Wu
 A8qibdyLd6MH92vhumNrtjfQUy4n9Yk5PITDweSEYOOvNZpnRqGQyhjzwrcJWq0=
 =lq05
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20180116' into staging

Queued TCG patches

# gpg: Signature made Tue 16 Jan 2018 16:24:50 GMT
# gpg:                using RSA key 0x64DF38E8AF7E215F
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>"
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A  05C0 64DF 38E8 AF7E 215F

* remotes/rth/tags/pull-tcg-20180116:
  tcg/ppc: Allow a 32-bit offset to the constant pool
  tcg/ppc: Support tlb offsets larger than 64k
  tcg/arm: Support tlb offsets larger than 64k
  tcg/arm: Fix double-word comparisons

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-01-16 17:36:39 +00:00
Richard Henderson
030ffe39dd tcg/ppc: Allow a 32-bit offset to the constant pool
We recently relaxed the limit of the number of opcodes that can
appear in a TranslationBlock.  In certain cases this has resulted
in relocation overflow.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-01-16 08:21:56 -08:00
Richard Henderson
4a64e0fd68 tcg/ppc: Support tlb offsets larger than 64k
AArch64 with SVE has an offset of 80k to the 8th TLB.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-01-16 08:21:56 -08:00
Richard Henderson
71f9cee9d0 tcg/arm: Support tlb offsets larger than 64k
AArch64 with SVE has an offset of 80k to the 8th TLB.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-01-16 08:21:56 -08:00
Richard Henderson
7170ac3313 tcg/arm: Fix double-word comparisons
The code sequence we were generating was only good for unsigned
comparisons.  For signed comparisions, use the sequence from gcc.

Fixes booting of ppc64 firmware, with a patch changing the code
sequence for ppc comparisons.

Tested-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-01-16 08:20:39 -08:00
Peter Maydell
c1d5b9add7 * QemuMutex tracing improvements (Alex)
* ram_addr_t optimization (David)
 * SCSI fixes (Fam, Stefan, me)
 * do {} while (0) fixes (Eric)
 * KVM fix for PMU (Jan)
 * memory leak fixes from ASAN (Marc-André)
 * migration fix for HPET, icount, loadvm (Maria, Pavel)
 * hflags fixes (me, Tao)
 * block/iscsi uninitialized variable (Peter L.)
 * full support for GMainContexts in character devices (Peter Xu)
 * more boot-serial-test (Thomas)
 * Memory leak fix (Zhecheng)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJaXgkRAAoJEL/70l94x66DA3EIAI8z8Y+1NAmbLqiHhrrN9Ji/
 b8EHQ8wf0pwwrHuRVKYZvKUU8yvp/CRIoVWZwfeGjRbZC+l7l+BAwdOx42Bj/dUW
 VopNzcJMu3s5SNwoYLvs01OjhciBYNXWTXBkIiErwurF0Ow7oYR7trkLwOw0veSO
 L4qFAGoIBI/7b6BZ3YRQXshhzdSQ6dvHrDness2V1c0crLG+yhvjKJ8PJ2tJyNZO
 DbsrCd7hS6e6liSUqdLj9XgRySFj9R5kgjaLjckjg1SC6kmhLN9hyke8iXgH7uvz
 WGnRPmKjKexFHVYgR0rRFlazcQclAczHuIi/OZe0HLi6trg2YKBkolMaQLQdgfk=
 =HTyS
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* QemuMutex tracing improvements (Alex)
* ram_addr_t optimization (David)
* SCSI fixes (Fam, Stefan, me)
* do {} while (0) fixes (Eric)
* KVM fix for PMU (Jan)
* memory leak fixes from ASAN (Marc-André)
* migration fix for HPET, icount, loadvm (Maria, Pavel)
* hflags fixes (me, Tao)
* block/iscsi uninitialized variable (Peter L.)
* full support for GMainContexts in character devices (Peter Xu)
* more boot-serial-test (Thomas)
* Memory leak fix (Zhecheng)

# gpg: Signature made Tue 16 Jan 2018 14:15:45 GMT
# gpg:                using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (51 commits)
  scripts/analyse-locks-simpletrace.py: script to analyse lock times
  util/qemu-thread-*: add qemu_lock, locked and unlock trace events
  cpu: flush TB cache when loading VMState
  block/iscsi: fix initialization of iTask in iscsi_co_get_block_status
  find_ram_offset: Align ram_addr_t allocation on long boundaries
  find_ram_offset: Add comments and tracing
  cpu_physical_memory_sync_dirty_bitmap: Another alignment fix
  checkpatch: Enforce proper do/while (0) style
  maint: Fix macros with broken 'do/while(0); ' usage
  tests: Avoid 'do/while(false); ' in vhost-user-bridge
  chardev: Clean up previous patch indentation
  chardev: Use goto/label instead of do/break/while(0)
  mips: Tweak location of ';' in macros
  net: Drop unusual use of do { } while (0);
  irq: fix memory leak
  cpus: unify qemu_*_wait_io_event
  icount: fixed saving/restoring of icount warp timers
  scripts/qemu-gdb/timers.py: new helper to dump timer state
  scripts/qemu-gdb: add simple tcg lock status helper
  target-i386: update hflags on Hypervisor.framework
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-01-16 15:45:15 +00:00
Peter Maydell
aae39d24a3 target-arm queue:
* SDHCI: cleanups and minor bug fixes
  * target/arm: minor refactor preparatory to fp16 support
  * omap_ssd, ssi-sd, pl181, milkymist-memcard: reset the SD
    card on controller reset (fixes migration failures)
  * target/arm: Handle page table walk load failures correctly
  * hw/arm/virt: Add virt-2.12 machine type
  * get_phys_addr_pmsav7: Support AP=0b111 for v7M
  * hw/intc/armv7m: Support byte and halfword accesses to CFSR
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABCAAGBQJaXf8rAAoJEDwlJe0UNgzeoi4QALHpLEXjOmIrIxjcbEoN2T6B
 tGdR+ZyNfdKI2lgleG6c5bmdzPaFooDoqOyCIML77AvqVlRjbDGw6KIfNmLfXQ2g
 16bh0D11Qyl5Yv/YBN3Lv7nDPou4/sop6TzP0MSxvDucqksEVvQARuY147bxHkmy
 uz5jhdSkWdePFJRtmIJtm9G1PpOhS6IjCVKR/upWDJkpBqUdIxtPNx8fEgrcK7mb
 MdSU9LVl0p4J4zYjzFCUF8qMIrtbiqlAAVEt894BeZJk+tDYbq245S3oj1psc3XP
 sGRriXNxQnsczkrkyEAq2BHjswyLLd12jG+vtWkj0A0tIlB4jHgB26juZJbqk8Ui
 GzJoJQ7kCA4qsQxm4goONUkUJdmBtgCCQekWc0a87ELhdzd9pus8bDe9ivFmxCOW
 1+i/ZX773GfLyDp5b1HwN5TU6u8J2bg+C4604hGIJ/h8r4Wu+iqH5/LfEum304rS
 Oqx5XQhU3a1dw+4w897CLbBscfVjNsPmgrCOXq4SYGZZ3NjTwyBMOedtB5HF279i
 mI5NZqVBiLoKC+oDjDe7k7oW6FdeEKfuDsOsxeLpIIe/66YwSpgk1bAsbvipcbRG
 4sUIqMEk6/2TQmVY9lnYg5msQK3QGFtPDSzrFnz3T+DswaCto/SFa89yeocAOXqv
 qEcQXDkndUFrEuo6S4/+
 =G4xE
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20180116' into staging

target-arm queue:
 * SDHCI: cleanups and minor bug fixes
 * target/arm: minor refactor preparatory to fp16 support
 * omap_ssd, ssi-sd, pl181, milkymist-memcard: reset the SD
   card on controller reset (fixes migration failures)
 * target/arm: Handle page table walk load failures correctly
 * hw/arm/virt: Add virt-2.12 machine type
 * get_phys_addr_pmsav7: Support AP=0b111 for v7M
 * hw/intc/armv7m: Support byte and halfword accesses to CFSR

# gpg: Signature made Tue 16 Jan 2018 13:33:31 GMT
# gpg:                using RSA key 0x3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>"
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83  15CF 3C25 25ED 1436 0CDE

* remotes/pmaydell/tags/pull-target-arm-20180116: (24 commits)
  sdhci: add a 'dma' property to the sysbus devices
  sdhci: fix the PCI device, using the PCI address space for DMA
  sdhci: Implement write method of ACMD12ERRSTS register
  sdhci: fix CAPAB/MAXCURR registers, both are 64bit and read-only
  sdhci: rename the SDHC_CAPAB register
  sdhci: move MASK_TRNMOD with other SDHC_TRN* defines in "sd-internal.h"
  sdhci: convert the DPRINT() calls into trace events
  sdhci: use qemu_log_mask(UNIMP) instead of fprintf()
  sdhci: refactor common sysbus/pci unrealize() into sdhci_common_unrealize()
  sdhci: refactor common sysbus/pci realize() into sdhci_common_realize()
  sdhci: refactor common sysbus/pci class_init() into sdhci_common_class_init()
  sdhci: use DEFINE_SDHCI_COMMON_PROPERTIES() for common sysbus/pci properties
  sdhci: remove dead code
  sdhci: clean up includes
  target/arm: Add fp16 support to vfp_expand_imm
  target/arm: Split out vfp_expand_imm
  hw/sd/omap_mmc: Reset SD card on controller reset
  hw/sd/ssi-sd: Reset SD card on controller reset
  hw/sd/milkymist-memcard: Reset SD card on controller reset
  hw/sd/pl181: Reset SD card on controller reset
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-01-16 14:18:20 +00:00
Alex Bennée
b5976c2e46 scripts/analyse-locks-simpletrace.py: script to analyse lock times
This script allows analysis of mutex acquisition and hold times based
on a trace file. Given a trace control file of:

  qemu_mutex_lock
  qemu_mutex_locked
  qemu_mutex_unlock

And running with:

  $QEMU $QEMU_ARGS -trace events=./lock-trace

You can analyse the results with:

  ./scripts/analyse-locks-simpletrace.py trace-events-all ./trace-21812

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-01-16 14:54:52 +01:00
Alex Bennée
6c27a0ded9 util/qemu-thread-*: add qemu_lock, locked and unlock trace events
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-01-16 14:54:52 +01:00
Pavel Dovgalyuk
15a356c49a cpu: flush TB cache when loading VMState
Flushing TB cache is required because TBs key in the cache may match
different code which existed in the previous state.

Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Signed-off-by: Maria Klimushenkova <maria.klimushenkova@ispras.ru>
Message-Id: <20180110134846.12940.99993.stgit@pasha-VirtualBox>
[Add comment suggested by Peter Maydell. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
2018-01-16 14:54:52 +01:00
Peter Lieven
79f9c75e17 block/iscsi: fix initialization of iTask in iscsi_co_get_block_status
in case of unaligned requests or on a target that does not support
block provisioning we leave iTask uninitialized and check iTask.task
for NULL later.

Fixes: e38bc23454
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1515425247-21730-1-git-send-email-pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-01-16 14:54:52 +01:00
Dr. David Alan Gilbert
801110ab22 find_ram_offset: Align ram_addr_t allocation on long boundaries
The dirty bitmaps are built from 'long's and there is fast-path code
for synchronising the case where the RAMBlock is aligned to the start
of a long boundary.  Align the allocation to this boundary
to cause the fast path to be used.

Offsets before change:
11398@1515169675.018566:find_ram_offset size: 0x1e0000 @ 0x8000000
11398@1515169675.020064:find_ram_offset size: 0x20000 @ 0x81e0000
11398@1515169675.020244:find_ram_offset size: 0x20000 @ 0x8200000
11398@1515169675.024343:find_ram_offset size: 0x1000000 @ 0x8220000
11398@1515169675.025154:find_ram_offset size: 0x10000 @ 0x9220000
11398@1515169675.027682:find_ram_offset size: 0x40000 @ 0x9230000
11398@1515169675.032921:find_ram_offset size: 0x200000 @ 0x9270000
11398@1515169675.033307:find_ram_offset size: 0x1000 @ 0x9470000
11398@1515169675.033601:find_ram_offset size: 0x1000 @ 0x9471000

after change:
10923@1515169108.818245:find_ram_offset size: 0x1e0000 @ 0x8000000
10923@1515169108.819410:find_ram_offset size: 0x20000 @ 0x8200000
10923@1515169108.819587:find_ram_offset size: 0x20000 @ 0x8240000
10923@1515169108.823708:find_ram_offset size: 0x1000000 @ 0x8280000
10923@1515169108.824503:find_ram_offset size: 0x10000 @ 0x9280000
10923@1515169108.827093:find_ram_offset size: 0x40000 @ 0x92c0000
10923@1515169108.833045:find_ram_offset size: 0x200000 @ 0x9300000
10923@1515169108.833504:find_ram_offset size: 0x1000 @ 0x9500000
10923@1515169108.833787:find_ram_offset size: 0x1000 @ 0x9540000

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20180105170138.23357-3-dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-01-16 14:54:52 +01:00
Dr. David Alan Gilbert
154cc9ea3b find_ram_offset: Add comments and tracing
Add some comments so I can understand the various nested loops.
Add some tracing so I can see what they're doing.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20180105170138.23357-2-dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-01-16 14:54:52 +01:00
Dr. David Alan Gilbert
aa777e297c cpu_physical_memory_sync_dirty_bitmap: Another alignment fix
This code has an optimised, word aligned version, and a boring
unaligned version. My commit f70d345 fixed one alignment issue, but
there's another.

The optimised version operates on 'longs' dealing with (typically) 64
pages at a time, replacing the whole long by a 0 and counting the bits.
If the Ramblock is less than 64bits in length that long can contain bits
representing two different RAMBlocks, but the code will update the
bmap belinging to the 1st RAMBlock only while having updated the total
dirty page count for both.

This probably didn't matter prior to 6b6712ef which split the dirty
bitmap by RAMBlock, but now they're separate RAMBlocks we end up
with a count that doesn't match the state in the bitmaps.

Symptom:
  Migration showing a few dirty pages left to be sent constantly
  Seen on aarch64 and x86 with x86+ovmf

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reported-by: Wei Huang <wei@redhat.com>
Fixes: 6b6712efcc
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-01-16 14:54:52 +01:00
Eric Blake
f4bdc13e49 checkpatch: Enforce proper do/while (0) style
Use of a loop construct for code that is not intended to repeat
does not make much idiomatic sense, except in one place: it is a
common usage in macros in order to wrap arbitrary code with
single-statement semantics.  But when used in a macro, it is more
typical for the caller to supply the trailing ';' when calling
the macro.

Although qemu coding style frowns on bare:
  if (cond)
    statement1;
  else
    statement2;
where extra semicolons actually cause syntax errors, we still
want our macro styles to be easily copied to other projects.
Thus, declare it an error if we encounter any form of 'while (0)'
with a semicolon in the same line.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20171201232433.25193-8-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-01-16 14:54:52 +01:00