The initial CPU count number is not required, if any of the topology
options are given, since it can be computed.
Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Yanan Wang <wangyanan55@huawei.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The list of CPU topology options are presented in a fairly arbitrary
order currently. Re-arrange them so that they're ordered from largest to
smallest unit
Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Yanan Wang <wangyanan55@huawei.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Yanan Wang <wangyanan55@huawei.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The test aborts and error message as the following be throwed:
"No such file or directory: '/var/tmp/qemu-migrate-{pid}.migrate",
when the unix socket migration test nearly done. The reason is
qemu removes the unix socket file after migration before
guestperf.py script do it. So pre-check if the socket file exists
when removing it to prevent the guestperf program from aborting.
See also commit f9cc00346d ("tests/migration: fix unix socket batch
migration").
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Signed-off-by: Hyman <huangy81@chinatelecom.cn>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Recent GLibC calls sched_getaffinity in code paths related to malloc and
when QEMU blocks access, it sends it off into a bad codepath resulting
in stack exhaustion[1]. The GLibC bug is being fixed[2], but none the
less, GLibC has valid reasons to want to use sched_getaffinity.
It is not unreasonable for code to want to run many resource syscalls
for information gathering, so it is a bit too harsh for QEMU to block
them.
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1975693
[2] https://sourceware.org/pipermail/libc-alpha/2021-June/128271.html
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Acked-by: Eduardo Otubo <otubo@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The GDateTime APIs provided by GLib avoid portability pitfalls, such
as some platforms where 'struct timeval.tv_sec' field is still 'long'
instead of 'time_t'. When combined with automatic cleanup, GDateTime
often results in simpler code too.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The GDateTime APIs provided by GLib avoid portability pitfalls, such
as some platforms where 'struct timeval.tv_sec' field is still 'long'
instead of 'time_t'. When combined with automatic cleanup, GDateTime
often results in simpler code too.
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
If we have gnutls >= 3.6.13, then it has enough functionality
and performance that we can use it as the preferred crypto
backend.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This adds support for using gnutls as a provider of the crypto
pbkdf APIs.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This adds support for using gnutls as a provider of the crypto
hmac APIs.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This adds support for using gnutls as a provider of the crypto
hash APIs.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Add an implementation of the QEMU cipher APIs to the gnutls
crypto backend. XTS support is only available for gnutls
version >= 3.6.8. Since ECB mode is not exposed by gnutls
APIs, we can't use the private XTS code for compatibility.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This introduces the build logic needed to decide whether we can
use gnutls as a crypto driver backend. The actual implementations
will be introduced in following patches. We only wish to use
gnutls if it has version 3.6.14 or newer, because that is what
finally brings HW accelerated AES-XTS mode for x86_64.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Originally we preferred to use nettle over gcrypt because
gnutls already links to nettle and thus it minimizes the
dependencies. In retrospect this was the wrong criteria to
optimize for.
Currently shipping versions of gcrypt have cipher impls that
are massively faster than those in nettle and this is way
more important. The nettle library is also not capable of
enforcing FIPS compliance, since it considers that out of
scope. It merely aims to provide general purpose impls of
algorithms, and usage policy is left upto the layer above,
such as GNUTLS.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Currently the crypto layer exposes support for a 'des-rfb'
algorithm which is just normal single-DES, with the bits
in each key byte reversed. This special key munging is
required by the RFB protocol password authentication
mechanism.
Since the crypto layer is generic shared code, it makes
more sense to do the key byte munging in the VNC server
code, and expose normal single-DES support.
Replacing cipher 'des-rfb' by 'des' looks like an incompatible
interface change, but it doesn't matter. While the QMP schema
allows any QCryptoCipherAlgorithm for the 'cipher-alg' field
in QCryptoBlockCreateOptionsLUKS, the code restricts what can
be used at runtime. Thus the only effect is a change in error
message.
Original behaviour:
$ qemu-img create -f luks --object secret,id=sec0,data=123 -o cipher-alg=des-rfb,key-secret=sec0 demo.luks 1G
Formatting 'demo.luks', fmt=luks size=1073741824 key-secret=sec0 cipher-alg=des-rfb
qemu-img: demo.luks: Algorithm 'des-rfb' not supported
New behaviour:
$ qemu-img create -f luks --object secret,id=sec0,data=123 -o cipher-alg=des-rfb,key-secret=sec0 demo.luks 1G
Formatting 'demo.luks', fmt=luks size=1073741824 key-secret=sec0 cipher-alg=des-fish
qemu-img: demo.luks: Invalid parameter 'des-rfb'
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The built-in AES+XTS implementation is used for the LUKS encryption
When building system emulators it is reasonable to expect that an
external crypto library is being used instead. The performance of the
builtin XTS implementation is terrible as it has no CPU acceleration
support. It is thus not worth keeping a home grown XTS implementation
for the built-in cipher backend.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The built-in DES implementation is used for the VNC server password
authentication scheme. When building system emulators it is reasonable
to expect that an external crypto library is being used. It is thus
not worth keeping a home grown DES implementation in tree.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The GNUTLS crypto provider doesn't support DES-ECB, only DES-CBC.
We can use the latter to simulate the former, if we encrypt only
1 block (8 bytes) of data at a time, using an all-zeros IV. This
is a very inefficient way to use the QCryptoCipher APIs, but
since the VNC authentication challenge is only 16 bytes, this
is acceptable. No other part of QEMU should be using DES. This
test case demonstrates the equivalence of ECB and CBC for the
single-block case.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The XTS cipher mode was introduced in gcrypt 1.8.0, which
matches QEMU's current minimum version.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This is only required on gcrypt < 1.6.0, and is thus obsolete
since
commit b33a84632a
Author: Daniel P. Berrangé <berrange@redhat.com>
Date: Fri May 14 13:04:08 2021 +0100
crypto: bump min gcrypt to 1.8.0, dropping RHEL-7 support
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The min gcrypt was bumped:
commit b33a84632a
Author: Daniel P. Berrangé <berrange@redhat.com>
Date: Fri May 14 13:04:08 2021 +0100
crypto: bump min gcrypt to 1.8.0, dropping RHEL-7 support
but this was accidentally lost in conflict resolution for
commit 5761251138
Author: Paolo Bonzini <pbonzini@redhat.com>
Date: Thu Jun 3 11:15:26 2021 +0200
configure, meson: convert crypto detection to meson
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Using error_fatal provides better diagnostics when tests
failed, than using asserts, because we see the text of
the error message.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Since we now require gcrypt >= 1.8.0, there is no need
to exclude the pbkdf test case.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The main method checks whether the cipher choice is supported
at runtime, so there is no need for compile time conditions.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
I thought I'd sent the last PR before the 6.1 soft freeze, but
unfortunately I need one more. This last minute one puts in a SLOF
update, along with a couple of bugfixes.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEdfRlhq5hpmzETofcbDjKyiDZs5IFAmDs9VgACgkQbDjKyiDZ
s5JPHRAApamC4lmoHD9eENznNKPvggAQ10h0OMNEvATyq4boAQ6rZRdAkeBqYAGA
5DF0sWIyRy7/IZUSEcHHlHiw1KQqem3lBUYWQ1L78nt6AphcRQciTeQ73WMIsduA
ruFxYlCHxFQ/2wixUWmyCnGyKqFsinrbc9DrAfPFnuf3SMwr0gl4x+V+mwQbcvRZ
dn/rR8RXmOnqgX8dsViyftnmijqoyIUSWWPL7jk5WiaRdRcdCd8ly9pmkinPj6IX
k+Cgty3DSV0mn9d8zH+tDkqXwU8R/HHY8TWkmLSTtR1nXtbDBIphcmeLQ8j8Eugy
SNWxZb3ft2fmfPJICcCYOy0qcPyNekRRkmQhADqtoA4OVAdd5QQmVNXtmAV+jKp7
WX4Ozsbt4P1FXSuvhzyOTIumNsz9NxgtuGmnEl09suJ2WdzN4XOI1SzC9/JzPM/s
K/0dalIQf9NymyWQMpbVUFcPiAqGr+yuHXy5FZssTa/lgD76Odds5EVFmua95HMl
J1XRMRYmsKzRq/TCOZFr72cCzGOixzYY/Oe/yoa48oPX5HMchCsm5h5ljgqKgTh2
R7uAHmqNvsvJ0PuH9DWCPEMGr0f1f16m4ayIELysyvd1geSL/SQ9nuT/phaqmUKO
Myo0unIcuJbagf9JwG19j9fVp1dpNee/AhR38jlaNgMNX2sXI5M=
=i+2e
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/dg-gitlab/tags/ppc-for-6.1-20210713' into staging
ppc patch queue 2021-07-13
I thought I'd sent the last PR before the 6.1 soft freeze, but
unfortunately I need one more. This last minute one puts in a SLOF
update, along with a couple of bugfixes.
# gpg: Signature made Tue 13 Jul 2021 03:07:20 BST
# gpg: using RSA key 75F46586AE61A66CC44E87DC6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" [full]
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" [full]
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" [full]
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>" [unknown]
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dg-gitlab/tags/ppc-for-6.1-20210713:
mv64361: Remove extra break from a switch case
pseries: Update SLOF firmware image
ppc/pegasos2: Allow setprop in VOF
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Taking the mutex every time for each dirty bit to clear is too slow, especially
we'll take/release even if the dirty bit is cleared. So far it's only used to
sync with special cases with qemu_guest_free_page_hint() against migration
thread, nothing really that serious yet. Let's move the lock to be upper.
There're two callers of migration_bitmap_clear_dirty().
For migration, move it into ram_save_iterate(). With the help of MAX_WAIT
logic, we'll only run ram_save_iterate() for no more than 50ms-ish time, so
taking the lock once there at the entry. It also means any call sites to
qemu_guest_free_page_hint() can be delayed; but it should be very rare, only
during migration, and I don't see a problem with it.
For COLO, move it up to colo_flush_ram_cache(). I think COLO forgot to take
that lock even when calling ramblock_sync_dirty_bitmap(), where another example
is migration_bitmap_sync() who took it right. So let the mutex cover both the
ramblock_sync_dirty_bitmap() and migration_bitmap_clear_dirty() calls.
It's even possible to drop the lock so we use atomic operations upon rb->bmap
and the variable migration_dirty_pages. I didn't do it just to still be safe,
also not predictable whether the frequent atomic ops could bring overhead too
e.g. on huge vms when it happens very often. When that really comes, we can
keep a local counter and periodically call atomic ops. Keep it simple for now.
Cc: Wei Wang <wei.w.wang@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hailiang Zhang <zhang.zhanghailiang@huawei.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Juan Quintela <quintela@redhat.com>
Cc: Leonardo Bras Soares Passos <lsoaresp@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20210630200805.280905-1-peterx@redhat.com>
Reviewed-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
For each "migrate" command, remember to clear the s->error before going on.
For one reason, when there's a new error it'll be still remembered; see
migrate_set_error() who only sets the error if error==NULL. Meanwhile if a
failed migration completes (e.g., postcopy recovered and finished), we
shouldn't dump an error when calling migrate_fd_cleanup() at last.
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20210708190653.252961-4-peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Below process could crash qemu with postcopy recovery:
1. (hmp) migrate -d ..
2. (hmp) migrate_start_postcopy
3. [network down, postcopy paused]
4. (hmp) migrate -r $WRONG_PORT
when try the recover on an invalid $WRONG_PORT, cleanup_bh will be cleared
5. (hmp) migrate -r $RIGHT_PORT
[qemu crash on assert(cleanup_bh)]
The thing is we shouldn't cleanup if it's postcopy resume; the error is set
mostly because the channel is wrong, so we return directly waiting for the user
to retry.
migrate_fd_cleanup() should only be called when migration is cancelled or
completed.
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20210708190653.252961-3-peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
When postcopy pause triggered, we rely on the migration thread to cleanup the
to_dst_file handle, and the return path thread to cleanup the from_dst_file
handle (which is stored in the local variable "rp").
Within the process, from_dst_file cleanup (qemu_fclose) is postponed until it's
setup again due to a postcopy recovery.
It used to work before yank was born; after yank is introduced we rely on the
refcount of IOC to correctly unregister yank function in channel_close(). If
without the early and on-time release of from_dst_file handle the yank function
will be leftover during paused postcopy.
Without this patch, below steps (quoted from Xiaohui) could trigger qemu src
crash:
1.Boot vm on src host
2.Boot vm on dst host
3.Enable postcopy on src&dst host
4.Load stressapptest in vm and set postcopy speed to 50M
5.Start migration from src to dst host, change into postcopy mode when migration is active.
6.When postcopy is active, down the network card(do migration via this network) on dst host.
7.Wait untill postcopy is paused on src&dst host.
8.Before up network card, recover migration on dst host, will get error like following.
9.Ignore the error of step 8, go on recovering migration on src host:
After step 9, qemu on src host will core dump after some seconds:
qemu-kvm: ../util/yank.c:107: yank_unregister_instance: Assertion `QLIST_EMPTY(&entry->yankfns)' failed.
1.sh: line 38: 44662 Aborted (core dumped)
Reported-by: Li Xiaohui <xiaohli@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20210708190653.252961-2-peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
When the migration fails or is canceled we wait the end of the unplug
operation to be able to plug it back. But if the unplug operation
is never finished we stop to wait and QEMU emits a warning to inform
the user.
Based-on: 20210629155007.629086-1-lvivier@redhat.com
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20210701131458.112036-1-lvivier@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
backtrace:
'0x00007ffff5f44ec2 in __ibv_dereg_mr_1_1 (mr=0x7fff1007d390) at /home/lizhijian/rdma-core/libibverbs/verbs.c:478
478 void *addr = mr->addr;
(gdb) bt
#0 0x00007ffff5f44ec2 in __ibv_dereg_mr_1_1 (mr=0x7fff1007d390) at /home/lizhijian/rdma-core/libibverbs/verbs.c:478
#1 0x0000555555891fcc in rdma_delete_block (block=<optimized out>, rdma=0x7fff38176010) at ../migration/rdma.c:691
#2 qemu_rdma_cleanup (rdma=0x7fff38176010) at ../migration/rdma.c:2365
#3 0x00005555558925b0 in qio_channel_rdma_close_rcu (rcu=0x555556b8b6c0) at ../migration/rdma.c:3073
#4 0x0000555555d652a3 in call_rcu_thread (opaque=opaque@entry=0x0) at ../util/rcu.c:281
#5 0x0000555555d5edf9 in qemu_thread_start (args=0x7fffe88bb4d0) at ../util/qemu-thread-posix.c:541
#6 0x00007ffff54c73f9 in start_thread () at /lib64/libpthread.so.0
#7 0x00007ffff53f3b03 in clone () at /lib64/libc.so.6 '
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Message-Id: <20210708144521.1959614-1-lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
When parsing cpus= attribute of -numa object couple of checks
is performed, such as correct initiator setting (see the if()
statement at the end of for() loop in
machine_set_cpu_numa_node()).
However, with the current code cpus= attribute is parsed before
initiator= attribute and thus the check may fail even though it
is not obvious why. But since parsing the initiator= attribute
does not depend on the cpus= attribute we can swap the order of
the two.
It's fairly easy to reproduce with the following command line
(snippet of an actual cmd line):
-smp 4,sockets=4,cores=1,threads=1 \
-object '{"qom-type":"memory-backend-ram","id":"ram-node0","size":2147483648}' \
-numa node,nodeid=0,cpus=0-1,initiator=0,memdev=ram-node0 \
-object '{"qom-type":"memory-backend-ram","id":"ram-node1","size":2147483648}' \
-numa node,nodeid=1,cpus=2-3,initiator=1,memdev=ram-node1 \
-numa hmat-lb,initiator=0,target=0,hierarchy=memory,data-type=access-latency,latency=5 \
-numa hmat-lb,initiator=0,target=0,hierarchy=first-level,data-type=access-latency,latency=10 \
-numa hmat-lb,initiator=1,target=1,hierarchy=memory,data-type=access-latency,latency=5 \
-numa hmat-lb,initiator=1,target=1,hierarchy=first-level,data-type=access-latency,latency=10 \
-numa hmat-lb,initiator=0,target=0,hierarchy=memory,data-type=access-bandwidth,bandwidth=204800K \
-numa hmat-lb,initiator=0,target=0,hierarchy=first-level,data-type=access-bandwidth,bandwidth=208896K \
-numa hmat-lb,initiator=1,target=1,hierarchy=memory,data-type=access-bandwidth,bandwidth=204800K \
-numa hmat-lb,initiator=1,target=1,hierarchy=first-level,data-type=access-bandwidth,bandwidth=208896K \
-numa hmat-cache,node-id=0,size=10K,level=1,associativity=direct,policy=write-back,line=8 \
-numa hmat-cache,node-id=1,size=10K,level=1,associativity=direct,policy=write-back,line=8 \
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <b27a6a88986d63e3f610a728c845e01ff8d92e2e.1625662776.git.mprivozn@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
When setting up NUMA with HMAT enabled there's a check performed
in machine_set_cpu_numa_node() that reports an error when a NUMA
node has a CPU but the node's initiator is not itself. The error
message reported contains only the expected value and not the
actual value (which is different because an error is being
reported). Report both values in the error message.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com>
Message-Id: <ebdf871551ea995bafa7a858899a26aa9bc153d3.1625662776.git.mprivozn@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
A AMD server typically has cpuid level 0x10(test on Rome/Milan), it
should not be changed to 0x1f in multi-dies case.
* to maintain compatibility with older machine types, only implement
this change when the CPU's "x-vendor-cpuid-only" property is false
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: zhenwei pi <pizhenwei@bytedance.com>
Fixes: a94e142899 (target/i386: Add CPUID.1F generation support for multi-dies PCMachine)
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Message-Id: <20210708170641.49410-1-michael.roth@amd.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Currently all built-in CPUs report cache information via CPUID leaves 2
and 4, but these have never been defined for AMD. In the case of
SEV-SNP this can cause issues with CPUID enforcement. Address this by
allowing CPU types to suppress these via a new "x-vendor-cpuid-only"
CPU property, which is true by default, but switched off for older
machine types to maintain compatibility.
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: zhenwei pi <pizhenwei@bytedance.com>
Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Message-Id: <20210708003623.18665-1-michael.roth@amd.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
When Hyper-V SynIC is enabled, we may need to allow Windows guests to make
hypercalls (POST_MESSAGES/SIGNAL_EVENTS). No issue is currently observed
because KVM is very permissive, allowing these hypercalls regarding of
guest visible CPUid bits.
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20210608120817.1325125-9-vkuznets@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
According to TLFS, Hyper-V guest is supposed to check
HV_HYPERCALL_AVAILABLE privilege bit before accessing
HV_X64_MSR_GUEST_OS_ID/HV_X64_MSR_HYPERCALL MSRs but at least some
Windows versions ignore that. As KVM is very permissive and allows
accessing these MSRs unconditionally, no issue is observed. We may,
however, want to tighten the checks eventually. Conforming to the
spec is probably also a good idea.
Enable HV_HYPERCALL_AVAILABLE bit unconditionally.
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20210608120817.1325125-8-vkuznets@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
hv_cpuid_check_and_set() does too much:
- Checks if the feature is supported by KVM;
- Checks if all dependencies are enabled;
- Sets the feature bit in cpu->hyperv_features for 'passthrough' mode.
To reduce the complexity, move all the logic except for dependencies
check out of it. Also, in 'passthrough' mode we don't really need to
check dependencies because KVM is supposed to provide a consistent
set anyway.
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20210608120817.1325125-7-vkuznets@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
To make Hyper-V features appear in e.g. QMP query-cpu-model-expansion we
need to expand and set the corresponding CPUID leaves early. Modify
x86_cpu_get_supported_feature_word() to call newly intoduced Hyper-V
specific kvm_hv_get_supported_cpuid() instead of
kvm_arch_get_supported_cpuid(). We can't use kvm_arch_get_supported_cpuid()
as Hyper-V specific CPUID leaves intersect with KVM's.
Note, early expansion will only happen when KVM supports system wide
KVM_GET_SUPPORTED_HV_CPUID ioctl (KVM_CAP_SYS_HYPERV_CPUID).
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20210608120817.1325125-6-vkuznets@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Currently, the only eVMCS version, supported by KVM (and described in TLFS)
is '1'. When Enlightened VMCS feature is enabled, QEMU takes the supported
eVMCS version range (from KVM_CAP_HYPERV_ENLIGHTENED_VMCS enablement) and
puts it to guest visible CPUIDs. When (and if) eVMCS ver.2 appears a
problem on migration is expected: it doesn't seem to be possible to migrate
from a host supporting eVMCS ver.2 to a host, which only support eVMCS
ver.1.
Hardcode eVMCS ver.1 as the result of 'hv-evmcs' enablement for now. Newer
eVMCS versions will have to have their own enablement options (e.g.
'hv-evmcs=2').
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20210608120817.1325125-4-vkuznets@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Clarify the fact that 'hv-passthrough' only enables features which are
already known to QEMU and that it overrides all other 'hv-*' settings.
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20210608120817.1325125-3-vkuznets@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>