Vivek's support for new FUSE KILLPRIV_V2
and some smaller cleanups.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Merge remote-tracking branch 'remotes/dgilbert-gitlab/tags/pull-virtiofs-20210216' into staging
virtiofsd pull 2021-02-16
# gpg: Signature made Tue 16 Feb 2021 18:34:32 GMT
# gpg: using RSA key 45F5C71B4A0CB7FB977A9FA90516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>" [full]
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A 9FA9 0516 331E BC5B FDE7
* remotes/dgilbert-gitlab/tags/pull-virtiofs-20210216:
virtiofsd: Do not use a thread pool by default
virtiofsd: Add support for FUSE_HANDLE_KILLPRIV_V2
virtiofsd: Save error code early at the failure callsite
tools/virtiofsd: Replace the word 'whitelist'
virtiofsd: vu_dispatch locking should never fail
virtiofsd: Allow to build it without the tools
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Without that wireshark complains about invalid control setup data
for non-control transfers.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-Id: <20210216144939.841873-1-kraxel@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
In order to keep track of the alternate setting that should be used for
a given interface, the USBDevice struct keeps an array of alternate
setting values, which is indexed by the interface number. In
usb_host_set_interface, when this array is updated, usb_host_ep_update
is called as a result. However, when usb_host_ep_update accesses the
active libusb_config_descriptor, it indexes udev->altsetting with the
loop variable, rather than the interface number.
With the simple trace backend enabled, this behavior can be seen:
[...]
usb_xhci_xfer_start 0.440 pid=1215 xfer=0x5596a4b85930 slotid=0x1 epid=0x1 streamid=0x0
usb_packet_state_change 1.703 pid=1215 bus=0x1 port=b'1' ep=0x0 p=0x5596a4b85938 o=b'undef' n=b'setup'
usb_host_req_control 2.269 pid=1215 bus=0x1 addr=0x5 p=0x5596a4b85938 req=0x10b value=0x1 index=0xd
usb_host_set_interface 0.449 pid=1215 bus=0x1 addr=0x5 interface=0xd alt=0x1
usb_host_parse_config 2542.648 pid=1215 bus=0x1 addr=0x5 value=0x2 active=0x1
usb_host_parse_interface 1.804 pid=1215 bus=0x1 addr=0x5 num=0xc alt=0x0 active=0x1
usb_host_parse_endpoint 2.012 pid=1215 bus=0x1 addr=0x5 ep=0x2 dir=b'in' type=b'int' active=0x1
usb_host_parse_interface 1.598 pid=1215 bus=0x1 addr=0x5 num=0xd alt=0x0 active=0x1
usb_host_req_emulated 3.593 pid=1215 bus=0x1 addr=0x5 p=0x5596a4b85938 status=0x0
usb_packet_state_change 2.550 pid=1215 bus=0x1 port=b'1' ep=0x0 p=0x5596a4b85938 o=b'setup' n=b'complete'
usb_xhci_xfer_success 4.298 pid=1215 xfer=0x5596a4b85930 bytes=0x0
[...]
In particular, it is seen that although usb_host_set_interface sets the
alternate setting of interface 0xd to 0x1, usb_host_ep_update uses 0x0
as the alternate setting due to using the incorrect index to
udev->altsetting.
Fix this problem by getting the interface number from the active
libusb_config_descriptor, and then using that as the index to
udev->altsetting.
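The indexing mistake can be illustrated with a small sketch. This is not the QEMU code: the nested list below is a hypothetical stand-in for libusb_config_descriptor, and the dictionary stands in for udev->altsetting. The point is only that the position of an interface in the descriptor need not equal its interface number.

```python
# Hypothetical stand-in for a config descriptor. The list position is the
# loop index; the "number" field is the interface number.
config = [
    [{"number": 0xc, "alt": 0}],
    [{"number": 0xd, "alt": 0}, {"number": 0xd, "alt": 1}],
]

# Stand-in for udev->altsetting after the guest selects alt 1 on interface 0xd.
altsetting = {0xd: 1}


def active_alts_buggy(config, altsetting):
    """Index altsetting with the loop variable (the behaviour being fixed)."""
    return {intf[0]["number"]: altsetting.get(i, 0)
            for i, intf in enumerate(config)}


def active_alts_fixed(config, altsetting):
    """Index altsetting with the interface number read from the descriptor."""
    return {intf[0]["number"]: altsetting.get(intf[0]["number"], 0)
            for intf in config}
```

With the data above, the buggy variant reports alternate setting 0 for interface 0xd, which matches the trace, while the fixed variant reports 1.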
Signed-off-by: Nick Rosbrook <rosbrookn@ainfosec.com>
Message-Id: <20210201213021.500277-1-rosbrookn@ainfosec.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Currently we create a thread pool (with a maximum of 64 threads per pool) for
each virtqueue. We hoped that this would provide us with better scalability
and performance.
But in practice, we are getting better numbers in most of the cases
when we don't create a thread pool at all and a single thread per
virtqueue receives the request and processes it.
Hence, I am proposing that we switch to no thread pool by default
(equivalent to --thread-pool-size=0). This will give most users better
out-of-the-box performance. In fact, other users have confirmed that not
using a thread pool gives them better numbers, so let's make it the
default. It can be changed when somebody fixes the issues with thread
pool performance.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Message-Id: <20210210182744.27324-2-vgoyal@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
This patch adds basic support for FUSE_HANDLE_KILLPRIV_V2. virtiofsd
can enable/disable this by specifying option "-o killpriv_v2/no_killpriv_v2".
By default this is enabled as long as the client supports it.
Enabling this option helps with performance in the write path. Without this
option, every write is currently preceded by a getxattr() operation
to find out if security.capability is set. (Write is supposed to clear
security.capability.) With this option enabled, the server is signing up for
clearing security.capability on every WRITE and also clearing suid/sgid
subject to certain rules. This gets rid of the extra getxattr() call for every
WRITE and improves performance. This is true when virtiofsd is run with
option -o xattr.
What does enabling FUSE_HANDLE_KILLPRIV_V2 mean for a file server implementation?
It needs to adhere to the following rules. Thanks to Miklos for this summary.
- clear "security.capability" on write, truncate and chown unconditionally
- clear suid/sgid in the following cases. Note, sgid is cleared only if
  the group executable bit is set.
    o setattr has FATTR_SIZE and FATTR_KILL_SUIDGID set.
    o setattr has FATTR_UID or FATTR_GID set.
    o open has O_TRUNC and FUSE_OPEN_KILL_SUIDGID set.
    o create has O_TRUNC and FUSE_OPEN_KILL_SUIDGID set.
    o write has FUSE_WRITE_KILL_SUIDGID set.
From the Linux VFS client perspective, here are the requirements.
- caps are always cleared on chown/write/truncate
- suid is always cleared on chown, while for truncate/write it is cleared
only if caller does not have CAP_FSETID.
- sgid is always cleared on chown, while for truncate/write it is cleared
only if caller does not have CAP_FSETID as well as file has group execute
permission.
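The server-side conditions listed earlier can be sketched as a small decision function. This is only an illustration under stated assumptions: flags are modelled as a set of flag names rather than the real bit masks, and both function names are hypothetical, not part of virtiofsd.

```python
import stat


def should_kill_suidgid(op, flags):
    """Decide whether a request asks the server to clear suid/sgid.

    op is 'setattr', 'open', 'create' or 'write'; flags is a set of
    flag names (a modelling assumption, not the wire format).
    """
    if op == "setattr":
        return ({"FATTR_SIZE", "FATTR_KILL_SUIDGID"} <= flags
                or bool(flags & {"FATTR_UID", "FATTR_GID"}))
    if op in ("open", "create"):
        return {"O_TRUNC", "FUSE_OPEN_KILL_SUIDGID"} <= flags
    if op == "write":
        return "FUSE_WRITE_KILL_SUIDGID" in flags
    return False


def mode_after_kill(mode):
    """Clear suid; clear sgid only when the group-execute bit is set."""
    mode &= ~stat.S_ISUID
    if mode & stat.S_IXGRP:
        mode &= ~stat.S_ISGID
    return mode
```

For example, a truncating setattr without FATTR_KILL_SUIDGID leaves the bits alone, while a chown-style setattr (FATTR_UID or FATTR_GID) always triggers the clearing.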
The virtiofsd implementation has not changed much to adhere to the above rules. The
reason is that the current assumption is that we are running on Linux
and on top of filesystems like ext4/xfs which already follow the above rules.
On write, truncate and chown, security.capability is cleared. And virtiofsd
drops CAP_FSETID if need be, and that will lead to clearing of suid/sgid.
But if virtiofsd is running on top of a filesystem which breaks the above assumptions,
then it will have to take extra actions to emulate the above. That's a TODO
for later when the need arises.
Note: create normally is supposed to be called only when file does not
exist. So generally there should not be any question of clearing
setuid/setgid. But it is possible that after client checks that
file is not present, some other client creates file on server
and this race can trigger sending FUSE_CREATE. In that case, if
O_TRUNC is set, we should clear suid/sgid if FUSE_OPEN_KILL_SUIDGID
is also set.
v3:
- Resolved conflicts due to lo_inode_open() changes.
- Moved capability code in lo_do_open() so that both lo_open() and
lo_create() can benefit from common code.
- Dropped changes to kernel headers as these are part of qemu already.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20210208224024.43555-3-vgoyal@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Change error code handling slightly in lo_setattr(). Right now we seem
to jump to out_err and assume that "errno" is valid and use that to
send reply.
But if the caller has to do some other operations before jumping to out_err,
then it does the dance of first saving errno to saverr and then restoring
errno before jumping to out_err. This makes it more confusing.
I am about to make more changes where caller will have to do some
work after error before jumping to out_err. I found it easier to
change the convention a bit. That is caller saves error in "saverr"
before jumping to out_err. And out_err uses "saverr" to send error
back and does not rely on "errno" having actual error.
v3: Resolved conflicts in lo_setattr() due to lo_inode_open() changes.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20210208224024.43555-2-vgoyal@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Follow the inclusive terminology from the "Conscious Language in your
Open Source Projects" guidelines [*] and replace the word "whitelist"
appropriately.
[*] https://github.com/conscious-lang/conscious-lang-docs/blob/main/faq.md
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210205171817.2108907-3-philmd@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
pthread_rwlock_rdlock() and pthread_rwlock_wrlock() can fail if a
deadlock condition is detected or the current thread already owns
the lock. They can also fail, like pthread_rwlock_unlock(), if the
lock wasn't properly initialized. None of these are ever expected
to happen with fv_VuDev::vu_dispatch_rwlock.
Some users already check the return value and assert, some others
don't. Introduce rdlock/wrlock/unlock wrappers that just do the
former and use them everywhere for improved consistency and
robustness.
This is just cleanup. It doesn't fix any actual issue.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20210203182434.93870-1-groug@kaod.org>
Reviewed-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
This changes the Meson build script to allow virtiofsd to be built even
when the tools build is disabled, thus honoring the --enable-virtiofsd
option.
Fixes: cece116c93 (configure: add option for virtiofsd)
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Message-Id: <20210201211456.1133364-2-wainersm@redhat.com>
Reviewed-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Record/replay provides the REPLAY_CLOCK_LOCKED macro to access
the clock when vm_clock_seqlock is locked. This macro is
needed because replay internals operate on icount. In the locked case
replay uses icount_get_raw_locked for the icount request, which prevents
the excess locking that leads to deadlock. But previously only
the record code used the *_locked function and replay did not.
Therefore clock access sometimes led to deadlocks.
This patch fixes clock access for replay too and uses the *_locked
icount access function.
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>
Message-Id: <161347990483.1313189.8371838968343494161.stgit@pasha-ThinkPad-X280>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Otherwise the call to event_notifier_set() is a nop, which causes
the SLOF firmware on POWER to hang when booting from a virtio-scsi
device:
virtio_scsi_dataplane_start()
 virtio_scsi_vring_init()
  virtio_bus_set_host_notifier() <- assign == true
   event_notifier_init() <- active == 1
    event_notifier_set() <- fails right away if !e->initialized
Fixes: e34e47eb28 ("event_notifier: handle initialization failure better")
Cc: mlevitsk@redhat.com
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20210216120247.1293569-1-groug@kaod.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The CPUID function 1 has a bit called OSXSAVE which tells user space the
status of the CR4.OSXSAVE bit. Our generic CPUID function injects that bit
based on the status of CR4.
With Hypervisor.framework, we do not synchronize full CPU state often enough
for this function to see the CR4 update before guest user space asks for it.
To be on the safe side, let's just always synchronize it when we receive a
CPUID(1) request. That way we can set the bit with real confidence.
Reported-by: Asad Ali <asad@osaro.com>
Signed-off-by: Alexander Graf <agraf@csgraf.de>
Message-Id: <20210123004129.6364-1-agraf@csgraf.de>
[RB: resolved conflict with another CPUID change]
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some guests (e.g. Darwin-XNU) can attempt to read this MSR to retrieve and
validate the CPU topology, comparing it to the ACPI MADT content.
MSR description from Intel Manual:
35H: MSR_CORE_THREAD_COUNT: Configured State of Enabled Processor Core
     Count and Logical Processor Count
  Bits 15:0  THREAD_COUNT  The number of logical processors that are
                           currently enabled in the physical package
  Bits 31:16 Core_COUNT    The number of processor cores that are currently
                           enabled in the physical package
  Bits 63:32 Reserved
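The bit layout described above can be sketched as a pair of pack/unpack helpers. The function names are hypothetical and this is not the hvf code; it only mirrors the field positions quoted from the manual.

```python
MSR_CORE_THREAD_COUNT = 0x35


def pack_core_thread_count(cores, threads_per_core):
    """Build the MSR value: bits 15:0 logical processors, bits 31:16 cores."""
    logical = cores * threads_per_core
    if not (0 <= cores <= 0xFFFF and 0 <= logical <= 0xFFFF):
        raise ValueError("count does not fit in a 16-bit field")
    return (cores << 16) | logical  # bits 63:32 stay zero (reserved)


def unpack_core_thread_count(value):
    """Return (thread_count, core_count) from an MSR value."""
    return value & 0xFFFF, (value >> 16) & 0xFFFF
```

A package with 4 cores and 2 threads per core would report 8 in the low field and 4 in the high field.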
Signed-off-by: Vladislav Yaroshchuk <yaroshchuk2000@gmail.com>
Message-Id: <20210113205323.33310-1-yaroshchuk2000@gmail.com>
[RB: reordered MSR definition and dropped u suffix from shift offset]
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The hvf i386 has a few struct and cpp definitions that are never
used. Remove them.
Suggested-by: Roman Bolshakov <r.bolshakov@yadro.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Alexander Graf <agraf@csgraf.de>
Message-Id: <20210120224444.71840-3-agraf@csgraf.de>
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
For `-accel hvf` cpu_x86_cpuid() is wrapped with hvf_cpu_x86_cpuid() to
add paravirtualization cpuid leaf 0x40000010
https://lkml.org/lkml/2008/10/1/246
Leaf 0x40000010, Timing Information:
EAX: (Virtual) TSC frequency in kHz.
EBX: (Virtual) Bus (local apic timer) frequency in kHz.
ECX, EDX: RESERVED (Per above, reserved fields are set to zero).
On macOS, the TSC and APIC bus frequencies can be read via sysctl using the
names `machdep.tsc.frequency` and `hw.busfrequency`.
This option is required for a Darwin-XNU guest to be synchronized with
the host.
Leaf 0x40000000 does not expose HVF, leaving the hypervisor signature empty.
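As a rough sketch of how the timing leaf could be filled in, the helper below converts two frequencies to kHz and places them in EAX and EBX. It assumes the sysctl values are reported in Hz, which is an assumption not stated in this message; the function name is hypothetical.

```python
def timing_leaf(tsc_hz, bus_hz):
    """Return register values for leaf 0x40000010.

    Assumes both inputs are in Hz; the leaf reports kHz. Reserved
    registers (ECX, EDX) are set to zero.
    """
    return {
        "eax": (tsc_hz // 1000) & 0xFFFFFFFF,
        "ebx": (bus_hz // 1000) & 0xFFFFFFFF,
        "ecx": 0,
        "edx": 0,
    }
```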
Signed-off-by: Vladislav Yaroshchuk <yaroshchuk2000@gmail.com>
Message-Id: <20210122150518.3551-1-yaroshchuk2000@gmail.com>
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This prevents illegal instruction on cpus that do not support xgetbv.
Buglink: https://bugs.launchpad.net/qemu/+bug/1758819
Reviewed-by: Cameron Esfahani <dirty@apple.com>
Signed-off-by: Hill Ma <maahiuzeon@gmail.com>
Message-Id: <X/6OJ7qk0W6bHkHQ@Hills-Mac-Pro.local>
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When looking for the next directory component, a "." component is now skipped.
This fixes the path(s) used for firmware lookup for the prefix == bindir case
which is standard for QEMU on Windows and where the internally
used bindir value ends with "/.".
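The component-skipping behaviour can be sketched as follows. This is an illustration only, not the QEMU function; the helper names and the example path are hypothetical.

```python
def next_component(path, pos=0):
    """Return (component, next_pos), skipping empty and '.' components."""
    while pos < len(path):
        end = path.find("/", pos)
        if end == -1:
            end = len(path)
        comp = path[pos:end]
        pos = end + 1
        if comp not in ("", "."):
            return comp, pos
    return None, pos


def components(path):
    """Collect all meaningful components of a path."""
    out = []
    pos = 0
    while True:
        comp, pos = next_component(path, pos)
        if comp is None:
            return out
        out.append(comp)
```

A bindir value ending in "/." then yields the same components as one without the trailing ".", so the two compare equal when looking for a common prefix.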
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Message-Id: <20210208205752.2488774-1-sw@weilnetz.de>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If qtests are run in verbose mode (i.e. if --verbose CL argument
was provided) then print the assembled qemu command line for each
test.
Use qos_printf() instead of g_test_message() to avoid the latter
cluttering the output.
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Message-Id: <110bef3595cb841dfa1b86733c174ac9774eb37e.1611704181.git.qemu_oss@crudebyte.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If qtests are run in verbose mode (i.e. if --verbose CL argument
was provided) then print all environment variables to stdout
before running the individual tests.
It is common nowadays to be able to output all configuration
vectors in a build chain, including environment variables, especially
when investigating build and test issues on foreign/remote machines.
In the context of writing new test cases this is also useful for
finding out whether there are already existing options for common
questions like: Is there a preferred location for writing test files
to? Is there a maximum size for test data? Is there a deadline for
running tests?
Use qos_printf() instead of g_test_message() to avoid the latter
cluttering the output.
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Message-Id: <21d77b33c578d80b5bba1068e61fd3562958b3c2.1611704181.git.qemu_oss@crudebyte.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If qtests are run in verbose mode (i.e. if the --verbose CL argument was
provided) then dump the generated qos graph (all nodes and edges,
along with their current individual availability status) to stdout.
This helps to identify problems in the created qos graph, e.g. when
writing new qos tests.
See API doc comment on function qos_dump_graph() for details.
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Message-Id: <6bffb6e38589fb2c06a2c1b5deed33f3e710fed1.1611704181.git.qemu_oss@crudebyte.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These two are macros wrapping a regular printf() call. They are intended
to be used instead of calling printf() directly in order to avoid
breaking TAP output format.
TAP output format is enabled by using --tap command line argument.
Starting with glib 2.62 it is enabled by default.
Unfortunately there is currently no public glib API available to check
whether TAP output format is enabled. For that reason qos_printf()
simply always prepends a '#' character for now.
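The idea of a TAP-safe print wrapper can be sketched as below. This is not the macro itself: the function name and the explicit output stream are hypothetical, and the space after '#' is an assumption of this sketch. TAP consumers treat lines starting with '#' as diagnostics, so prefixed output does not disturb the test results.

```python
import io
import sys


def tap_printf(fmt, *args, out=sys.stdout):
    """Write a formatted message prefixed with '# ' so TAP parsers ignore it."""
    out.write("# " + (fmt % args))


# Usage example writing into a buffer instead of stdout.
buf = io.StringIO()
tap_printf("run QEMU with: %s\n", "-M none", out=buf)
```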
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <653a5ef61c5e7d160e4d6294e542c57ea324cee4.1611704181.git.qemu_oss@crudebyte.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
So far the qos subsystem of the qtest framework had the limitation
that only one instance of the same official QEMU (QMP) driver name
could be created for qtests. That's because a) the created qos
node names must always be unique, b) the node name must match the
official QEMU driver name being instantiated and c) all nodes are
in a global space shared by all tests.
This patch removes this limitation by introducing a new function
qos_node_create_driver_named() which allows test case authors to
specify a node name being different from the actual associated
QEMU driver name. It fills the new 'qemu_name' field of
QOSGraphNode for that purpose.
Adjust build_driver_cmd_line() and qos_graph_node_set_availability()
to correctly distinguish between the node name and the node's
qemu_name.
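The separation of node name and driver name can be sketched with a toy registry. The class, its methods and the node names are hypothetical and much simpler than the real qos graph; the sketch only shows why a separate qemu_name lets two nodes share one driver.

```python
class QosGraph:
    """Toy registry: node names are unique, qemu_name may repeat."""

    def __init__(self):
        self.nodes = {}

    def create_driver_named(self, name, qemu_name):
        if name in self.nodes:
            raise ValueError(f"node name {name!r} already exists")
        self.nodes[name] = {"qemu_name": qemu_name}

    def create_driver(self, name):
        # Old behaviour: the node name doubles as the QEMU driver name.
        self.create_driver_named(name, name)

    def device_arg(self, name):
        # The command line uses qemu_name, not the node name.
        return "-device " + self.nodes[name]["qemu_name"]


graph = QosGraph()
graph.create_driver("virtio-9p-pci")
graph.create_driver_named("virtio-9p-pci-second", "virtio-9p-pci")
```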
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Message-Id: <3be962ff38f3396f8040deaa5ffdab525c4e0b16.1611704181.git.qemu_oss@crudebyte.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Update the sev_es_enabled() function return value to be based on the SEV
policy that has been specified. SEV-ES is enabled if SEV is enabled and
the SEV-ES policy bit is set in the policy object.
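The condition can be sketched in a few lines. The mask value 0x4 (bit 2 of the guest policy) is an assumption of this sketch and should be checked against the SEV API specification; the function signature is hypothetical and does not mirror the QEMU one.

```python
SEV_POLICY_ES = 0x4  # assumed position of the SEV-ES bit in the policy


def sev_es_enabled(sev_enabled, policy):
    """SEV-ES is on only if SEV is on and the policy requests SEV-ES."""
    return sev_enabled and bool(policy & SEV_POLICY_ES)
```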
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>
Message-Id: <c69f81c6029f31fc4c52a9f35f1bd704362476a5.1611682609.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
SMM is not currently supported for an SEV-ES guest by KVM. Change the SMM
capability check from a KVM-wide check to a per-VM check in order to have
a finer-grained SMM capability check.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>
Message-Id: <f851903809e9d4e6a22d5dfd738dac8da991e28d.1611682609.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
An SEV-ES guest does not allow register state to be altered once it has
been measured. When an SEV-ES guest issues a reboot command, Qemu will
reset the vCPU state and resume the guest. This will cause failures under
SEV-ES. Prevent that from occurring by introducing an arch-specific
callback that returns a boolean indicating whether vCPUs are resettable.
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: Aleksandar Rikalo <aleksandar.rikalo@syrmia.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: David Hildenbrand <david@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>
Message-Id: <1ac39c441b9a3e970e9556e1cc29d0a0814de6fd.1611682609.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When SEV-ES is enabled, it is not possible to modify the guest's register
state after it has been initially created, encrypted and measured.
Normally, an INIT-SIPI-SIPI request is used to boot the AP. However, the
hypervisor cannot emulate this because it cannot update the AP register
state. For the very first boot by an AP, the reset vector CS segment
value and the EIP value must be programmed before the register state has
been encrypted and measured. Search the guest firmware for a
specific GUID that tells Qemu the value of the reset vector to use.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <22db2bfb4d6551aed661a9ae95b4fdbef613ca21.1611682609.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In prep for AP booting, require the use of in-kernel irqchip support. This
lessens the Qemu support burden required to boot APs.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>
Message-Id: <e9aec5941e613456f0757f5a73869cdc5deea105.1611682609.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Provide initial support for SEV-ES. This includes creating a function to
indicate the guest is an SEV-ES guest (which will return false until all
support is in place), performing the proper SEV initialization and
ensuring that the guest CPU state is measured as part of the launch.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Co-developed-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>
Message-Id: <2e6386cbc1ddeaf701547dd5677adf5ddab2b6bd.1611682609.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If the gpa isn't specified, its value is extracted from the OVMF
properties table located below the reset vector (and if this doesn't
exist, an error is returned). OVMF has defined the GUID for the SEV
secret area as 4c2eb361-7d9b-4cc3-8081-127c90d3d294 and the format of
the <data> is: <base>|<size> where both are uint32_t. We extract
<base> and use it as the gpa for the injection.
Note: it is expected that the injected secret will also be GUID
described but since qemu can't interpret it, the format is left
undefined here.
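Decoding the <base>|<size> data area can be sketched as below. Little-endian byte order is an assumption of this sketch (the message only says both fields are uint32_t), and the function name is hypothetical.

```python
import struct


def parse_secret_area(data):
    """Decode <base>|<size>, two uint32_t values, assumed little-endian."""
    if len(data) < 8:
        raise ValueError("secret area entry too short")
    base, size = struct.unpack_from("<II", data)
    return base, size
```

The returned base would then serve as the gpa for the injection when none is given explicitly.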
Signed-off-by: James Bottomley <jejb@linux.ibm.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20210204193939.16617-3-jejb@linux.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
OVMF is developing a mechanism for depositing a GUIDed table just
below the known location of the reset vector. The table goes
backwards in memory so all entries are of the form
<data>|<len>|<GUID>
where <data> is of arbitrary size and type, <len> is a uint16_t and
describes the entire length of the entry from the beginning of the
data to the end of the GUID.
The foot of the table is of this form and <len> for this case
describes the entire size of the table. The table foot GUID is
defined by OVMF as 96b582de-1fb2-45f7-baea-a366c55a082d and if the
table is present this GUID is just below the reset vector, 48 bytes
before the end of the firmware file.
Add a parser for the OVMF reset block. If the table foot GUID is found,
the parser takes a copy of the block minus the footer. Also add a function
for later traversal that returns the data area of any specified GUID.
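The backwards table walk can be sketched as follows. This is not the QEMU parser: the function names are hypothetical, and the sketch assumes that <len> is little-endian and that GUIDs are stored in the mixed-endian byte order that Python exposes as bytes_le.

```python
import struct
import uuid

FOOTER_GUID = uuid.UUID("96b582de-1fb2-45f7-baea-a366c55a082d")
ENTRY_OVERHEAD = 2 + 16  # uint16_t length plus 16-byte GUID


def extract_table(firmware):
    """Return a copy of the table without its footer, or None if absent."""
    guid_off = len(firmware) - 48
    if guid_off < 2:
        return None
    if firmware[guid_off:guid_off + 16] != FOOTER_GUID.bytes_le:
        return None
    (total,) = struct.unpack_from("<H", firmware, guid_off - 2)
    body = total - ENTRY_OVERHEAD
    start = guid_off - 2 - body
    if body < 0 or start < 0:
        return None
    return firmware[start:guid_off - 2]


def find_entry(table, guid):
    """Walk entries from the end of the table; return the data for guid."""
    end = len(table)
    while end >= ENTRY_OVERHEAD:
        entry_guid = table[end - 16:end]
        (length,) = struct.unpack_from("<H", table, end - ENTRY_OVERHEAD)
        if length < ENTRY_OVERHEAD or length > end:
            return None  # malformed entry
        if entry_guid == guid.bytes_le:
            return table[end - length:end - ENTRY_OVERHEAD]
        end -= length
    return None
```

Because each entry ends with its length and GUID, the walk starts at the high end of the copied block and steps down by <len> until the requested GUID is found or the block is exhausted.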
Signed-off-by: James Bottomley <jejb@linux.ibm.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20210204193939.16617-2-jejb@linux.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Along with the Acceptance Tests and Python libs improvements, an
improvement to the diff generation for Python code.
Merge remote-tracking branch 'remotes/cleber-gitlab/tags/python-next-pull-request' into staging
Acceptance Tests and Python libs improvements
# gpg: Signature made Tue 16 Feb 2021 04:55:45 GMT
# gpg: using RSA key 7ABB96EB8B46B94D5E0FE9BB657E8D33A5F209F3
# gpg: Good signature from "Cleber Rosa <crosa@redhat.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 7ABB 96EB 8B46 B94D 5E0F E9BB 657E 8D33 A5F2 09F3
* remotes/cleber-gitlab/tags/python-next-pull-request:
Acceptance Tests: set up existing ssh keys by default
Acceptance Tests: fix population of public key in cloudinit image
Acceptance Tests: introduce method for requiring an accelerator
Acceptance Tests: introduce LinuxTest base class
maint: Tell git that *.py files should use python diff hunks
tests/acceptance/virtio-gpu.py: preserve virtio-user-gpu log
Python: close the log file kept by QEMUMachine before reading it
virtiofs_submounts.py test: Note on vmlinuz param
Acceptance Tests: bump Avocado version requirement to 85.0
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Also add Damien as a reviewer.
Signed-off-by: Luc Michel <luc@lmichel.fr>
Acked-by: Damien Hedde <damien.hedde@greensocs.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20210211085318.2507-1-luc@lmichel.fr
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This patch implements the FIFO mode of the SMBus module. In FIFO mode, the
user transmits or receives at most 16 bytes at a time. The FIFO mode
allows the module to transmit large amounts of data faster than single
byte mode.
Since we only added the device in a patch that is only a few commits
away in the same patch set, we do not increase the VMstate version
number in this special case.
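The 16-byte limit means a larger transfer is split into several FIFO-sized bursts. A minimal sketch of that splitting, with hypothetical names and no relation to the device model's actual registers:

```python
FIFO_SIZE = 16  # at most 16 bytes per FIFO transfer


def fifo_chunks(data):
    """Split a payload into bursts of at most FIFO_SIZE bytes."""
    return [data[i:i + FIFO_SIZE] for i in range(0, len(data), FIFO_SIZE)]
```

A 40-byte payload thus needs three bursts instead of 40 single-byte transfers.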
Reviewed-by: Doug Evans <dje@google.com>
Reviewed-by: Tyrong Ting <kfting@nuvoton.com>
Signed-off-by: Hao Wu <wuhaotsh@google.com>
Reviewed-by: Corey Minyard <cminyard@mvista.com>
Message-id: 20210210220426.3577804-6-wuhaotsh@google.com
Acked-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This patch adds a QTest for NPCM7XX SMBus's single byte mode. It sends a
byte to a device in the evaluation board, and verifies that the retrieved
value is equal to the sent value.
Reviewed-by: Doug Evans <dje@google.com>
Reviewed-by: Tyrong Ting <kfting@nuvoton.com>
Signed-off-by: Hao Wu <wuhaotsh@google.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20210210220426.3577804-5-wuhaotsh@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This commit implements the single-byte mode of the SMBus.
Each Nuvoton SoC has 16 System Management Buses (SMBus). These buses
are compliant with the SMBus and I2C protocols.
In single-byte mode, the user sends or receives one byte at a time.
The SMBus device transmits it to the underlying i2c device and sends an
interrupt back to the QEMU guest.
Reviewed-by: Doug Evans <dje@google.com>
Reviewed-by: Tyrong Ting <kfting@nuvoton.com>
Signed-off-by: Hao Wu <wuhaotsh@google.com>
Reviewed-by: Corey Minyard <cminyard@mvista.com>
Message-id: 20210210220426.3577804-2-wuhaotsh@google.com
Acked-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210212184902.1251044-32-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210212184902.1251044-31-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Use the now-saved PAGE_ANON and PAGE_MTE bits,
and the per-page saved data.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210212184902.1251044-30-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The real kernel collects _TIF_MTE_ASYNC_FAULT into the current thread's
state on any kernel entry (interrupt, exception etc), and then delivers
the signal in advance of resuming the thread.
This means that while the signal won't be delivered immediately, it will
not be delayed forever -- at minimum it will be delivered after the next
clock interrupt.
We don't have a clock interrupt in linux-user, so we issue a cpu_kick
to signal a return to the main loop at the end of the current TB.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210212184902.1251044-29-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210212184902.1251044-28-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
A proper syndrome is required to fill in the proper si_code.
Use page_get_flags to determine permission vs translation for user-only.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210212184902.1251044-27-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Move everything related to syndromes to a new file,
which can be shared with linux-user.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20210212184902.1251044-26-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Remember the PROT_MTE bit as PAGE_MTE/PAGE_TARGET_2.
Otherwise this does not yet have any effect.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210212184902.1251044-25-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
These prctl fields are required for the function of MTE.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210212184902.1251044-24-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We were fudging TBI1 enabled to speed up the generated code.
Now that we've improved the code generation, remove this.
Also, tidy the comment to reflect the current code.
The pauth test was testing a kernel address (-1) and making
incorrect assumptions about TBI1; stick to userland addresses.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210212184902.1251044-23-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Use simple arithmetic instead of a conditional
move when tbi0 != tbi1.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210212184902.1251044-22-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>