The parsing of KVM_SET_LAPIC's input depends on the current value of the
APIC base MSR---which indeed is stored in APICCommonState---but for historical
reasons APIC base is set through KVM_SET_SREGS together with cr8 (which is
really just the APIC TPR) and the actual "special CPU registers".
APIC base must now be set before the actual LAPIC registers, so do that
in kvm_apic_put. It will be set again to the same value with KVM_SET_SREGS,
but that's not a big issue.
This only happens since Linux 4.8, which checks for x2apic mode in
KVM_SET_LAPIC. However it's really a QEMU bug; until the recent
commit 78d6a05 ("x86/lapic: Load LAPIC state at post_load", 2016-09-13)
QEMU was indeed setting APIC base (via KVM_SET_SREGS) before the other
LAPIC registers.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This avoids a double hand-full of magic numbers in the
xsave and xrstor helper functions.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
In user-mode emulation env->idt.base memory is
allocated in linux-user/main.c with
size 8*512 = 4096 (for 64-bit).
When fake interrupt EXCP_SYSCALL is thrown
do_interrupt_user checks destination privilege level
for this fake exception, and tries to read 4 bytes
at address base + (256 * 2^4)=4096, that causes
segfault.
Privlege level was checked only for int's, so lets
read dpl from memory only for this case.
Signed-off-by: Stanislav Shmarov <snarpix@gmail.com>
Message-Id: <1473773008-2588376-1-git-send-email-snarpix@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Make sure reset zeroes TSC_AUX, XCR0, PKRU. Move XSTATE_BV from the
"vmstate only" section to the "KVM only" section.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Load the LAPIC state during post_load (rather than when the CPU
starts).
This allows an interrupt to be delivered from the ioapic to
the lapic prior to cpu loading, in particular the RTC that starts
ticking as soon as we load it's state.
Fixes a case where Windows hangs after migration due to RTC interrupts
disappearing.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some software algorithms are based on the hardware's cache info, for example,
for x86 linux kernel, when cpu1 want to wakeup a task on cpu2, cpu1 will trigger
a resched IPI and told cpu2 to do the wakeup if they don't share low level
cache. Oppositely, cpu1 will access cpu2's runqueue directly if they share llc.
The relevant linux-kernel code as bellow:
static void ttwu_queue(struct task_struct *p, int cpu)
{
struct rq *rq = cpu_rq(cpu);
......
if (... && !cpus_share_cache(smp_processor_id(), cpu)) {
......
ttwu_queue_remote(p, cpu); /* will trigger RES IPI */
return;
}
......
ttwu_do_activate(rq, p, 0); /* access target's rq directly */
......
}
In real hardware, the cpus on the same socket share L3 cache, so one won't
trigger a resched IPIs when wakeup a task on others. But QEMU doesn't present a
virtual L3 cache info for VM, then the linux guest will trigger lots of RES IPIs
under some workloads even if the virtual cpus belongs to the same virtual socket.
For KVM, there will be lots of vmexit due to guest send IPIs.
The workload is a SAP HANA's testsuite, we run it one round(about 40 minuates)
and observe the (Suse11sp3)Guest's amounts of RES IPIs which triggering during
the period:
No-L3 With-L3(applied this patch)
cpu0: 363890 44582
cpu1: 373405 43109
cpu2: 340783 43797
cpu3: 333854 43409
cpu4: 327170 40038
cpu5: 325491 39922
cpu6: 319129 42391
cpu7: 306480 41035
cpu8: 161139 32188
cpu9: 164649 31024
cpu10: 149823 30398
cpu11: 149823 32455
cpu12: 164830 35143
cpu13: 172269 35805
cpu14: 179979 33898
cpu15: 194505 32754
avg: 268963.6 40129.8
The VM's topology is "1*socket 8*cores 2*threads".
After present virtual L3 cache info for VM, the amounts of RES IPIs in guest
reduce 85%.
For KVM, vcpus send IPIs will cause vmexit which is expensive, so it can cause
severe performance degradation. We had tested the overall system performance if
vcpus actually run on sparate physical socket. With L3 cache, the performance
improves 7.2%~33.1%(avg:15.7%).
Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The kvm_pv_unhalt feature doesn't work if kernel_irqchip is
disabled, so we need to report it as unsupported.
Tested-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Commit 9af9e0f, 6daf194d, be62a2eb and 312fd5f got rid of a bunch, but
they keep coming back. checkpatch.pl tries to flag them since commit
5d596c2, but it's not very good at it. Offenders tracked down with
Coccinelle script scripts/coccinelle/err-bad-newline.cocci, an updated
version of the script from commit 312fd5f.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1470224274-31522-2-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
QEMU 2.6 added support for the XSAVE family of instructions, which
includes the XSETBV instruction which allows setting the XCR0
register.
But, when booting Linux kernels with XSAVE support enabled, I was
getting very early crashes where the instruction pointer was set
to 0x3. I tracked it down to a jump instruction generated by this:
gen_jmp_im(s->pc - pc_start);
where s->pc is pointing to the instruction after XSETBV and pc_start
is pointing _at_ XSETBV. Subtract the two and you get 0x3. Whoops.
The fix is to replace this typo with the pattern found everywhere
else in the file when folks want to end the translation buffer.
Richard Henderson confirmed that this is a bug and that this is the
correct fix.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: qemu-stable@nongnu.org
Cc: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
- interrupt remapping for intel iommus
- a bunch of virtio cleanups
- fixes all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJXkQsqAAoJECgfDbjSjVRpanoIAJ9JVlc1aEjt9sa0cSBcs+NQ
J7JmgU9FqFsj+4FrNTouO3AxTjHurd1UAULP1WMPD+V3JpbnHct8r6SCBLQ5EBMN
VOjYo4DwWs1g+DqnQ9WZmbadu06XvYi/yiAKNUzWfZk0MR11D0D/S5hmarNKw0Kq
tGHeTWjGeY4WqFLV7m+qB4+cqkAByn6um99UtUvgLL05RgIEIP2IEMKYZ+rXvAa9
iGUvzqlO7mbq/+LbL18kaWywa4TCwbbd2eSGWaqhX4CuB62Rl33mWTXFcfaYhkyp
Z3FgwaJ09h0lAjSVEbyAuLFMfO/BnMcsoKqwl4xc4vkn/xBCqFtgH9JcEVm3O8U=
=ge2D
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pc, pci, virtio: new features, cleanups, fixes
- interrupt remapping for intel iommus
- a bunch of virtio cleanups
- fixes all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Thu 21 Jul 2016 18:49:30 BST
# gpg: using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream: (57 commits)
intel_iommu: avoid unnamed fields
virtio: Update migration docs
virtio-gpu: Wrap in vmstate
virtio-gpu: Use migrate_add_blocker for virgl migration blocking
virtio-input: Wrap in vmstate
9pfs: Wrap in vmstate
virtio-serial: Wrap in vmstate
virtio-net: Wrap in vmstate
virtio-balloon: Wrap in vmstate
virtio-rng: Wrap in vmstate
virtio-blk: Wrap in vmstate
virtio-scsi: Wrap in vmstate
virtio: Migration helper function and macro
virtio-serial: Remove old migration version support
virtio-net: Remove old migration version support
virtio-scsi: Replace HandleOutput typedef
Revert "mirror: Workaround for unexpected iohandler events during completion"
virtio-scsi: Call virtio_add_queue_aio
virtio-blk: Call virtio_add_queue_aio
virtio: Introduce virtio_add_queue_aio
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In the past, we are doing gsi route commit for each irqchip route
update. This is not efficient if we are updating lots of routes in the
same time. This patch removes the committing phase in
kvm_irqchip_update_msi_route(). Instead, we do explicit commit after all
routes updated.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
One more IEC notifier is added to let msi routes know about the IEC
changes. When interrupt invalidation happens, all registered msi routes
will be updated for all PCI devices.
Since both vfio and vhost are possible gsi route consumers, this patch
will go one step further to keep them safe in split irqchip mode and
when irqfd is enabled.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
[move trace-events lines into target-i386/trace-events]
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Adding two hooks to be notified when adding/removing msi routes. There
are two kinds of MSI routes:
- in kvm_irqchip_add_irq_route(): before assigning IRQFD. Used by
vhost, vfio, etc.
- in kvm_irqchip_send_msi(): when sending direct MSI message, if
direct MSI not allowed, we will first create one MSI route entry
in the kernel, then trigger it.
This patch only hooks the first one (irqfd case). We do not need to
take care for the 2nd one, since it's only used by QEMU userspace
(kvm-apic) and the messages will always do in-time translation when
triggered. While we need to note them down for the 1st one, so that we
can notify the kernel when cache invalidation happens.
Also, we do not hook IOAPIC msi routes (we have explicit notifier for
IOAPIC to keep its cache updated). We only need to care about irqfd
users.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Changing the original MSIMessage parameter in kvm_irqchip_add_msi_route
into the vector number. Vector index provides more information than the
MSIMessage, we can retrieve the MSIMessage using the vector easily. This
will avoid fetching MSIMessage every time before adding MSI routes.
Meanwhile, the vector info will be used in the coming patches to further
enable gsi route update notifications.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In split irqchip mode, IOAPIC is working in user space, only update
kernel irq routes when entry changed. When IR is enabled, we directly
update the kernel with translated messages. It works just like a kernel
cache for the remapping entries.
Since KVM irqfd is using kernel gsi routes to deliver interrupts, as
long as we can support split irqchip, we will support irqfd as
well. Also, since kernel gsi routes will cache translated interrupts,
irqfd delivery will not suffer from any performance impact due to IR.
And, since we supported irqfd, vhost devices will be able to work
seamlessly with IR now. Logically this should contain both vhost-net and
vhost-user case.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[move trace-events lines into target-i386/trace-events]
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
First remove VCPU from exec loop and only then remove lapic.
Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
object_property_add_child() silently fails with error that it can't
create duplicate propery 'apic' as we already have 'apic' property
registered for 'apic' feature. As result generic device_realize puts
apic into unattached container.
As it's programming error, abort if name collision happens in future
and fix property name for apic_state to 'lapic', this way apic is
a child of cpu instance.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
These are both stored in CPUID[EAX=7,EBX=0].ECX. KVM is going to
be able to emulate both (albeit with a performance loss in the case
of RDPID, which therefore will be in KVM_GET_EMULATED_CPUID rather
than KVM_GET_SUPPORTED_CPUID).
It's also possible to implement both in TCG, but this is for 2.8.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
These properties will be used by as address where to plug
CPU with help -device/device_add commands.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Custom apic-id setter/getter doesn't do any property specific
checks anymore, so clean it up and use more compact static
property DEFINE_PROP_UINT32 instead.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Machine code knows about all possible APIC IDs so use that
instead of hack which does O(n^2) complexity duplicate
checks, interating over global CPUs list.
As result duplicate check is done only once with O(log n) complexity.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Add the host-phys-bits boolean property, if true, take phys-bits
from the hosts physical bits value, overriding either the default
or the user specified value.
We can also use the value we read from the host to check the users
explicitly set value and warn them if it doesn't match.
Note:
a) We only read the hosts value in KVM mode (because on non-x86
we get an abort if we try)
b) We don't warn about trying to use host-phys-bits in TCG mode,
we just fall back to the TCG default. This allows the machine
type to set the host-phys-bits flag if it wants and then to
work in both TCG and KVM.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Redo 9886e834 (target-i386: Require APIC ID to be explicitly set before
CPU realize) in another way that doesn't use int64_t to detect
if apic-id property has been set.
Use the fact that 0xFFFFFFFF is the broadcast
value that a CPU can't have and set default
uint32_t apic_id to it instead of using int64_t.
Later uint32_t apic_id will be used to drop custom
property setter/getter in favor of static property.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Fill the bits between 51..number-of-physical-address-bits in the
MTRR_PHYSMASKn variable range mtrr masks so that they're consistent
in the migration stream irrespective of the physical address space
of the source VM in a migration.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The CPU GPs if we try and set a bit in a variable MTRR mask above
the limit of physical address bits on the host. We hit this
when loading a migration from a host with a larger physical
address limit than our destination (e.g. a Xeon->i7 of same
generation) but previously used to get away with it
until 48e1a45 started checking that msr writes actually worked.
It seems in our case the GP probably comes from KVM emulating
that GP.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Currently QEMU sets the x86 number of physical address bits to the
magic number 40. This is only correct on some small AMD systems;
Intel systems tend to have 36, 39, 46 bits, and large AMD systems
tend to have 48.
Having the value different from your actual hardware is detectable
by the guest and in principal can cause problems;
The current limit of 40 stops TB VMs being created by those lucky
enough to have that much.
This patch lets you set the physical bits by a cpu property but
defaults to the same 40bits which matches TCGs setup.
I've removed the ancient warning about the 42 bit limit in exec.c;
I can't find that limit in there and no one else seems to know where
it is.
We use a magic value of 0 as the property default so that we can
later distinguish between the default and a user set value.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Provide a constant for the number of address bits supported under TCG.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
'HF_SOFTMMU_MASK' is only set when 'CONFIG_SOFTMMU' is defined. So
there's no need in this flag: test 'CONFIG_SOFTMMU' instead.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20160715175852.30749-6-sergey.fedorov@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Header guard symbols should match their file name to make guard
collisions less likely. Offenders found with
scripts/clean-header-guards.pl -vn.
Cleaned up with scripts/clean-header-guards.pl, followed by some
renaming of new guard symbols picked by the script to better ones.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Most of them use guard symbols like CPU_$target_H, but we also have
__MIPS_CPU_H__ and __TRICORE_CPU_H__. They all upset
scripts/clean-header-guards.pl.
The script dislikes CPU_$target_H because they don't match their file
name (they should, to make guard collisions less likely). The others
are reserved identifiers.
Clean them all up: use guard symbol $target_CPU_H for
target-$target/cpu.h.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
There are functions tlb_fill(), cpu_unaligned_access() and
do_unaligned_access() that are called with access type and mmu index
arguments. But these arguments are named 'is_write' and 'is_user' in their
declarations. The patches fix the arguments to avoid a confusion.
Signed-off-by: Sergey Sorokin <afarallax@yandex.ru>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Message-id: 1465907177-1399402-1-git-send-email-afarallax@yandex.ru
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
If -cpu host is used, LMCE will be automatically enabled when it's
supported by host.
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
It's a prerequisite that certain bits of MSR_IA32_FEATURE_CONTROL should
be set before some features (e.g. VMX and LMCE) can be used, which is
usually done by the firmware. This patch adds a fw_cfg file
"etc/msr_feature_control" which contains the advised value of
MSR_IA32_FEATURE_CONTROL and can be used by guest firmware (e.g. SeaBIOS).
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
This patch adds the support to inject SRAR and SRAO as LMCE, i.e. they
are injected to only one VCPU rather than broadcast to all VCPUs. As KVM
reports LMCE support on Intel platforms, this features is only available
on Intel platforms.
LMCE is disabled by default and can be enabled/disabled by cpu option
'lmce=on/off'.
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
[Haozhong: Enable LMCE only on Intel platforms
Disable LMCE by default and add a cpu option 'lmce'
Handle the error if LMCE is enabled w/o host support
Remove MCG_LMCE_P from MCE_CAP_DEF
Add migration support for LMCE
Minor code style changes]
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
This change adds hyperv feature words report through qom rpc.
When VM is configured with hyperv features enabled
libvirt will check that required feature words are set
in cpuid leaf 40000003 through qom request.
Currently qemu does not report hyperv feature words
which prevents windows guests from starting with libvirt.
To avoid conflicting with current hyperv properties all added feature
words cannot be set directly with -cpu +feature yet.
Signed-off-by: Evgeny Yakovlev <eyakovlev@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Richard Henderson <rth@twiddle.net>
CC: Eduardo Habkost <ehabkost@redhat.com>
CC: Marcelo Tosatti <mtosatti@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Improve the TSC frequency mismatch warning to show the host and
VM TSC frequencies.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Considering that features are converted to global properties and
global properties are automatically applied to every new instance
of created CPU (at object_new() time), there is no point in
parsing cpu_model string every time a CPU created. So move
parsing outside CPU creation loop and do it only once.
Parsing also should be done before any CPU is created so that
features would affect the first CPU a well.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Currently CPUClass->parse_features() is used to parse -cpu
features string and set properties on created CPU instances.
But considering that features specified by -cpu apply to every
created CPU instance, it doesn't make sense to parse the same
features string for every CPU created. It also makes every target
that cares about parsing features string explicitly call
CPUClass->parse_features() parser, which gets in a way if we
consider using generic device_add for CPU hotplug as device_add
has not a clue about CPU specific hooks.
Turns out we can use global properties mechanism to set
properties on every created CPU instance for a given type. That
way it's possible to convert CPU features into a set of global
properties for CPU type specified by -cpu cpu_model and common
Device.device_post_init() will apply them to CPU of given type
automatically regardless whether it's manually created CPU or CPU
created with help of device_add.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
x86_cpu_parse_featurestr has a "val = num;" assignment just before num
goes out of scope. Push num up to fix the issue.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
ERMS just says "rep movsb" and "rep stosb" are fast. It does not
imply any new instruction, so we can support it easily.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
This function needs to be converted to QOM hook and virtualised for
multi-arch. This rename interferes, as cpu-qom will not have access
to the renaming causing name divergence. This rename doesn't really do
anything anyway so just delete it.
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-Id: <69bd25a8678b8b31b91cd9760c777bed1aafb44e.1437212383.git.crosthwaite.peter@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Crosthwaite <crosthwaitepeter@gmail.com>
Version: GnuPG v1
iQEcBAABAgAGBQJXaFInAAoJEJykq7OBq3PI6VsH/0Sfgbdo1RksYuQwb/y92sCW
EN+lxUZ+OLfgrc8PYgNZwfSM3rsfYhznL0MAXOeEe7Ahabi07w7DhGR8WvwfAOlI
G96FRuvrIPfv5u6U6fwS4CvG3TIHVLxfHKCsTpPUmH8U5CNx/x/tpjNiWN1dj6t+
sXybSjYHfZfiZy2tI9MFIFWCdxnF/pl0QAPhbRqc8Y/RQTDrPKRjLpz+nitN/u96
5TS7KlELyQuP91YMmLceYSmIkHbxW703h+iE2n4hov0uZCP8Jil+2Jsd3ziQSRlL
j6LqexQ2ViBGdDSfiZGYES2VPlsHOCwb4G+IgWBStfZg1ppaXENvcDzPrgrB+L4=
=eUnF
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging
# gpg: Signature made Mon 20 Jun 2016 21:29:27 BST
# gpg: using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8
* remotes/stefanha/tags/tracing-pull-request: (42 commits)
trace: split out trace events for linux-user/ directory
trace: split out trace events for qom/ directory
trace: split out trace events for target-ppc/ directory
trace: split out trace events for target-s390x/ directory
trace: split out trace events for target-sparc/ directory
trace: split out trace events for net/ directory
trace: split out trace events for audio/ directory
trace: split out trace events for ui/ directory
trace: split out trace events for hw/alpha/ directory
trace: split out trace events for hw/arm/ directory
trace: split out trace events for hw/acpi/ directory
trace: split out trace events for hw/vfio/ directory
trace: split out trace events for hw/s390x/ directory
trace: split out trace events for hw/pci/ directory
trace: split out trace events for hw/ppc/ directory
trace: split out trace events for hw/9pfs/ directory
trace: split out trace events for hw/i386/ directory
trace: split out trace events for hw/isa/ directory
trace: split out trace events for hw/sd/ directory
trace: split out trace events for hw/sparc/ directory
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Use Coccinelle script to replace 'ret = E; return ret' with
'return E'. The script will do the substitution only when the
function return type and variable type are the same.
Manual fixups:
* audio/audio.c: coding style of "read (...)" and "write (...)"
* block/qcow2-cluster.c: wrap line to make it shorter
* block/qcow2-refcount.c: change indentation of wrapped line
* target-tricore/op_helper.c: fix coding style of
"remainder|quotient"
* target-mips/dsp_helper.c: reverted changes because I don't
want to argue about checkpatch.pl
* ui/qemu-pixman.c: fix line indentation
* block/rbd.c: restore blank line between declarations and
statements
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1465855078-19435-4-git-send-email-ehabkost@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Unused Coccinelle rule name dropped along with a redundant comment;
whitespace touched up in block/qcow2-cluster.c; stale commit message
paragraph deleted]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
This patch simplifies code that uses a local_err variable just to
immediately use it for an error_propagate() call.
Coccinelle patch used to perform the changes added to
scripts/coccinelle/remove_local_err.cocci.
Reviewed-by: Eric Blake <eblake@redhat.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1465855078-19435-3-git-send-email-ehabkost@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Blank line in s390-virtio-ccw.c restored]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Information is tracked inside the TCGContext structure, and later used
by tracing events with the 'tcg' and 'vcpu' properties.
The 'cpu' field is used to check tracing of translation-time
events ("*_trans"). The 'tcg_env' field is used to pass it to
execution-time events ("*_exec").
Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 146549350162.18437.3033661139638458143.stgit@fimbulvetr.bsc.es
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
KVM_GET_SUPPORTED_CPUID ioctl is called frequently when initializing
CPU. Depends on CPU features and CPU count, the number of calls can be
extremely high which slows down QEMU booting significantly. In our
testing, we saw 5922 calls with switches:
-cpu SandyBridge -smp 6,sockets=6,cores=1,threads=1
This ioctl takes more than 100ms, which is almost half of the total
QEMU startup time.
While for most cases the data returned from two different invocations
are not changed, that means, we can cache the data to avoid trapping
into kernel for the second time. To make sure the cache safe one
assumption is desirable: the ioctl is stateless. This is not true for
CPUID leaves in general (such as CPUID leaf 0xD, whose value depends
on guest XCR0 and IA32_XSS) but it is true of KVM_GET_SUPPORTED_CPUID,
which runs before there is a value for XCR0 and IA32_XSS.
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Message-Id: <1465784487-23482-1-git-send-email-chao.p.peng@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
qemu/osdep.h checks whether MAP_ANONYMOUS is defined, but this check
is bogus without a previous inclusion of sys/mman.h. Include it in
sysemu/os-posix.h and remove it from everywhere else.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Now cpu_x86_init() does nothing more or less
than duplicating cpu_generic_init() logic.
So simplify it by using cpu_generic_init().
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
It will allow to drop custom cpu_x86_init() and use
cpu_generic_init() instead, reducing cpu_x86_create()
to a simple 3-liner.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The code will be changed to allow creation of the CPU object and
report kvm_required errors only at realizefn, so we need to make
the instance_init function more flexible.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Making x86_cpu_parse_featurestr() a pure convertor
of legacy feature string into global properties, needs
it to be called before a CPU instance is created so
parser shouldn't modify CPUState directly or access
it at all. Hence move current hack that directly pokes
into CPUState, to set/unset +-feats, from parser to
CPU's realize method.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The "fixup will be removed in future versions" warnings are
present since QEMU 1.7.0, at least, so users should have fixed
their scripts and configurations, already.
In the case of libvirt users, libvirt doesn't use the "xlevel"
option, and already rejects HyperV spinlock retry count < 0xFFF.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
I looked at a dozen Intel CPU that have this CPUID and all of them
always had Core offset as 1 (a wasted bit when hyperthreading is
disabled) and Package offset at least 4 (wasted bits at <= 4 cores).
QEMU uses more compact IDs and it doesn't make much sense to change it
now. I keep the SMT and Core sub-leaves even if there is just one
thread/core; it makes the code simpler and there should be no harm.
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Introduce Skylake-Client cpu mode which inherits the features from
Broadwell and supports some additional features that are: MPX,
XSAVEC, and XGETBV1.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The exception_action() function in user-exec.c is just a call to
cpu_loop_exit() for every target CPU except i386. Since this
function is only called if the target's handle_mmu_fault() hook has
indicated an MMU fault, and that hook is only called from the
handle_cpu_signal() code path, we can simply move the x86-specific
setup into that hook, which allows us to remove the TARGET_I386
ifdef from user-exec.c.
Of the actions that were done by the call to raise_interrupt_err():
* cpu_svm_check_intercept_param() is a no-op in user mode
* check_exception() is a no-op since double faults are impossible
for user-mode
* assignments to cs->exception_index and env->error_code are no-ops
* assigning to env->exception_next_eip is unnecessary because it
is not used unless env->exception_is_int is true
* cpu_loop_exit_restore() is equivalent to cpu_loop_exit() since
pc is 0
which leaves just setting env_>exception_is_int as the action that
needs to be added to x86_cpu_handle_mmu_fault().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Riku Voipio <riku.voipio@linaro.org>
Message-id: 1463494687-25947-7-git-send-email-peter.maydell@linaro.org
Add a comment to do_interrupt_user() along the same lines as the
existing one for do_interrupt_all() noting that the next_eip
argument is not used unless is_int is true or intno is EXCP_SYSCALL.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Riku Voipio <riku.voipio@linaro.org>
Message-id: 1463494687-25947-6-git-send-email-peter.maydell@linaro.org
The function cpu_resume_from_signal() is now always called with a
NULL puc argument, and is rather misnamed since it is never called
from a signal handler. It is essentially forcing an exit to the
top level cpu loop but without raising any exception, so rename
it to cpu_loop_exit_noexc() and drop the useless unused argument.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Riku Voipio <riku.voipio@linaro.org>
Message-id: 1463494687-25947-4-git-send-email-peter.maydell@linaro.org
The arm target was handled by 06486077, but other targets
were ignored. This handles all the rest which actually support
disassembly (that is, skipping moxie and tilegx).
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Move the old qemu_ram_addr_from_host to memory_region_from_host and
make it return an offset within the region. For qemu_ram_addr_from_host
return the ram_addr_t directly, similar to what it was before
commit 1b5ec23 ("memory: return MemoryRegion from qemu_ram_addr_from_host",
2013-07-04).
Reviewed-by: Marc-André Lureau <marcandre.lureau@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Simplify kvm_put_tscdeadline_msr() and
kvm_put_msr_feature_control() using kvm_msr_buf and the
kvm_msr_entry_add() helper.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Add a helper function that appends new entries to the MSR buffer
and checks for the buffer size limit.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
We are dangerously close to the array limits in kvm_put_msrs()
and kvm_get_msrs(): with the default mcg_cap configuration, we
can set up to 148 MSRs in kvm_put_msrs(), and if we allow mcg_cap
to be changed, we can write up to 236 MSRs.
Use 4096 bytes for the buffer, that can hold 255 kvm_msr_entry
structs.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Instead of using 2400 bytes in the stack for 150 MSR entries in
kvm_get_msrs() and kvm_put_msrs(), allocate a buffer once for
each VCPU.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
QOM instance_init functions are not supposed to have any side-effects,
as new objects may be created at any moment for querying property
information (see qmp_device_list_properties()).
Calling cpu_exec_init() also affects QEMU's ability to handle errors
during CPU creation, as some actions done by cpu_exec_init() can't be
reverted.
Move cpu_exec_init() call to realize so a simple object_new() won't
trigger it, and so that it is called after some basic validation of CPU
parameters.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
QOM instance_init functions are not supposed to have any side-effects,
as new objects may be created at any moment for querying property
information (see qmp_device_list_properties()).
Move TCG initialization to realize time so it won't be called when just
doing object_new() on a X86CPU subclass.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Instead of requiring cpu.c to check if TCG was already initialized,
simply let the function be called multiple times.
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
x86_cpudef_init() doesn't do anything anymore, cpudef_init(),
cpudef_setup(), and x86_cpudef_init() can be finally removed.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Newer PC machines don't set hw_version, and older machines set
model-id on compat_props explicitly, so we don't need the
x86_cpudef_setup() code that sets model_id using
qemu_hw_version() anymore.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Instead of using offset macros and bit operations in a uint32_t
array, use the X86XSaveArea struct to perform the loading/saving
operations in kvm_put_xsave() and kvm_get_xsave().
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
This doesn't introduce any change in the code, as the offsets and
struct sizes match what was present in the table. This can be
validated by the QEMU_BUILD_BUG_ON lines on target-i386/cpu.h,
which ensures the struct sizes and offsets match the existing
values in ext_save_area.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Add structs that define the layout of the xsave areas used by
Intel processors. Add some QEMU_BUILD_BUG_ON lines to ensure the
structs match the XSAVE_* macros in target-i386/kvm.c and the
offsets and sizes at target-i386/cpu.c:ext_save_areas.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
exec-all.h contains TCG-specific definitions. It is not needed outside
TCG-specific files such as translate.c, exec.c or *helper.c.
One generic function had snuck into include/exec/exec-all.h; move it to
include/qom/cpu.h.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
All qdev definitions are available from other headers, user-mode
emulation does not need hw/hw.h.
By considering system emulation only, it is simpler to disentangle
hw/hw.h from NEED_CPU_H.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Make X86CPU an opaque type within cpu-qom.h, and move all definitions of
private methods, as well as all type definitions that require knowledge
of the layout to cpu.h. This helps making files independent of NEED_CPU_H
if they only need to pass around CPU pointers.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Make cpu-qom.h so that it is only included from cpu.h. Then there
is no need for it to include cpu.h again.
Later we will make cpu-qom.h target independent and we will _want_
to include it from elsewhere, but for now reduce the number of cases
to handle.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In user mode, there's only a static address translation, TBs are always
invalidated properly and direct jumps are reset when mapping change.
Thus the destination address is always valid for direct jumps and
there's no need to restrict it to the pages the TB resides in.
Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Cc: Riku Voipio <riku.voipio@iki.fi>
Cc: Blue Swirl <blauwirbel@gmail.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
We don't take care of direct jumps when address mapping changes. Thus we
must be sure to generate direct jumps so that they always keep valid
even if address mapping changes. Luckily, we can only allow to execute a
TB if it was generated from the pages which match with current mapping.
Document tcg_gen_goto_tb() declaration and note the reason for
destination PC limitations.
Some targets with variable length instructions allow TB to straddle a
page boundary. However, we make sure that both of TB pages match the
current address mapping when looking up TBs. So it is safe to do direct
jumps into the both pages. Correct the checks for some of those targets.
Given that, we can safely patch a TB which spans two pages. Remove the
unnecessary check in cpu_exec() and allow such TBs to be patched.
Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
We are inconsistent with the type of tb->flags: usage varies loosely
between int and uint64_t. Settle to uint32_t everywhere, which is
superior to both: at least one target (aarch64) uses the most significant
bit in the u32, and uint64_t is wasteful.
Compile-tested for all targets.
Suggested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Suggested-by: Richard Henderson <rth@twiddle.net>
Tested-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1460049562-23517-1-git-send-email-cota@braap.org>
All callers pass "false" keeping the old semantics. The windows
implementation doesn't distinguish the flag yet. On posix, it is passed
down to the underlying aio context.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
QEMU complains about -cpu host on an AMD machine:
warning: host doesn't support requested feature: CPUID.80000001H:EDX [bit 0]
For bits 0,1,3,4,5,6,7,8,9,12,13,14,15,16,17,23,24.
KVM_GET_SUPPORTED_CPUID and and x86_cpu_get_migratable_flags()
don't handle the AMD CPUID aliases bits, making
x86_cpu_filter_features() print warnings and clear those CPUID
bits incorrectly.
To avoid hacking x86_cpu_get_migratable_flags() to handle
CPUID_EXT2_AMD_ALIASES (just like the existing hack inside
kvm_arch_get_supported_cpuid()), simply move the
CPUID_EXT2_AMD_ALIASES code in x86_cpu_realizefn() after the
x86_cpu_filter_features() call.
This will probably make the CPUID_EXT2_AMD_ALIASES hack in
kvm_arch_get_supported_cpuid() unnecessary, too. The hack will be
removed in a follow-up patch after v2.6.0.
Reported-by: Radim Krčmář <rkrcmar@redhat.com>
Tested-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Xiao Guangrong ran kvm-unit-tests on an actual machine with PKU and
found that it fails:
test pte.p pte.user pde.p pde.user pde.a pde.pse pkru.wd pkey=1 user write efer.nx cr4.pke: FAIL: error code 27 expected 7
Dump mapping: address: 0x123400000000
------L4: 2ebe007
------L3: 2ebf007
------L2: 8000000020000a5
(All failures are combinations of "pde.user pde.p pkru.wd pkey=1",
plus either "pde.pse" or "pte.p pte.user", plus one of "user cr0.wp",
"cr0.wp" or "user", plus unimportant bits such as accessed/dirty or
efer.nx).
So PFEC.PKEY is set even if the ordinary check failed (which it did
because pde.w is zero). Adjust QEMU to match behavior of silicon.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
KVM does not let you read or write this MSR if the corresponding CPUID
bit is not set. This in turn causes MSRs that come after MSR_TSC_AUX
to be ignored by KVM_SET_MSRS.
One visible symptom is that s3.flat from kvm-unit-tests fails with
CPUs that do not have RDTSCP, because the SMBASE is not reset to
0x30000 after reset.
Fixes: c9b8f6b621
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move declarations out of qemu-common.h for functions declared in
utils/ files: e.g. include/qemu/path.h for utils/path.c.
Move inline functions out of qemu-common.h and into new files (e.g.
include/qemu/bcd.h)
Signed-off-by: Veronia Bahaa <veroniabahaa@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>