Commit Graph

92789 Commits

Author SHA1 Message Date
David Hildenbrand
dba506788b util/oslib-posix: Introduce and use MemsetContext for touch_all_pages()
Let's minimize the number of global variables to prepare for
os_mem_prealloc() getting called concurrently and make the code a bit
easier to read.

The only consumer that really needs a global variable is the sigbus
handler, which will require protection via a mutex in the future either way
as we cannot concurrently mess with the SIGBUS handler.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20211217134611.31172-4-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 05:19:55 -05:00
David Hildenbrand
a384bfa32e util/oslib-posix: Support MADV_POPULATE_WRITE for os_mem_prealloc()
Let's sense support and use it for preallocation. MADV_POPULATE_WRITE
does not require a SIGBUS handler, doesn't actually touch page content,
and avoids context switches; it is, therefore, faster and easier to handle
than our current approach.

While MADV_POPULATE_WRITE is, in general, faster than manual
prefaulting, and especially faster with 4k pages, there is still value in
prefaulting using multiple threads to speed up preallocation.

More details on MADV_POPULATE_WRITE can be found in the Linux commits
4ca9b3859dac ("mm/madvise: introduce MADV_POPULATE_(READ|WRITE) to prefault
page tables") and eb2faa513c24 ("mm/madvise: report SIGBUS as -EFAULT for
MADV_POPULATE_(READ|WRITE)"), and in the man page proposal [1].

This resolves the TODO in do_touch_pages().

In the future, we might want to look into using fallocate(), eventually
combined with MADV_POPULATE_READ, when dealing with shared file/fd
mappings and not caring about memory bindings.

[1] https://lkml.kernel.org/r/20210816081922.5155-1-david@redhat.com

Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20211217134611.31172-3-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 05:19:55 -05:00
David Hildenbrand
6c427ab926 util/oslib-posix: Let touch_all_pages() return an error
Let's prepare touch_all_pages() for returning differing errors. Return
an error from the thread and report the last processed error.

Translate SIGBUS to -EFAULT, as a SIGBUS can mean all different kind of
things (memory error, read error, out of memory). When allocating memory
fails via the current SIGBUS-based mechanism, we'll get:
    os_mem_prealloc: preallocating memory failed: Bad address

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20211217134611.31172-2-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 05:19:55 -05:00
Andy Pei
0a963af3e3 hw/vhost-user-blk: turn on VIRTIO_BLK_F_SIZE_MAX feature for virtio blk device
Turn on pre-defined feature VIRTIO_BLK_F_SIZE_MAX for virtio blk device to
avoid guest DMA request sizes which are too large for hardware spec.

Signed-off-by: Andy Pei <andy.pei@intel.com>
Message-Id: <1641202092-149677-1-git-send-email-andy.pei@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
2022-01-07 05:19:55 -05:00
Eduardo Habkost
0e4edb3b3b hw/i386: expose a "smbios-entry-point-type" PC machine property
The i440fx and Q35 machine types are both hardcoded to use the
legacy SMBIOS 2.1 (32-bit) entry point. This is a sensible
conservative choice because SeaBIOS only supports SMBIOS 2.1

EDK2, however, can also support SMBIOS 3.0 (64-bit) entry points,
and QEMU already uses this on the ARM virt machine type.

This adds a property to allow the choice of SMBIOS entry point
versions For example to opt in to 64-bit SMBIOS entry point:

   $QEMU -machine q35,smbios-entry-point-type=64

Based on a patch submitted by Daniel Berrangé.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20211026151100.1691925-4-ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2022-01-07 05:19:55 -05:00
Eduardo Habkost
bdf54a9a7b hw/smbios: Use qapi for SmbiosEntryPointType
This prepares for exposing the SMBIOS entry point type as a
machine property on x86.

Based on a patch from Daniel P. Berrangé.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20211026151100.1691925-3-ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
2022-01-07 05:19:55 -05:00
Eduardo Habkost
10be11d0b4 smbios: Rename SMBIOS_ENTRY_POINT_* enums
Rename the enums to match the naming style used by QAPI, and to
use "32" and "64" instead of "20" and "31".  This will allow us
to more easily move the enum to the QAPI schema later.

About the naming choice: "SMBIOS 2.1 entry point"/"SMBIOS 3.0
entry point" and "32-bit entry point"/"64-bit entry point" are
synonymous in the SMBIOS specification.  However, the phrases
"32-bit entry point" and "64-bit entry point" are used more often.

The new names also avoid confusion between the entry point format
and the actual SMBIOS version reported in the entry point
structure.  For example: currently the 32-bit entry point
actually report SMBIOS 2.8 support, not 2.1.

Based on portions of a patch submitted by Daniel P. Berrangé.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20211026151100.1691925-2-ehabkost@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 05:19:55 -05:00
Frederic Barrat
20766514d6 pcie_aer: Don't trigger a LSI if none are defined
Skip triggering an LSI when the AER root error status is updated if no
LSI is defined for the device. We can have a root bridge with no LSI,
MSI and MSI-X defined, for example on POWER systems.

Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Message-Id: <20211116170133.724751-4-fbarrat@linux.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
2022-01-07 05:19:55 -05:00
Frederic Barrat
2fedf46e34 pci: Export the pci_intx() function
Move the pci_intx() definition to the PCI header file, so that it can
be called from other PCI files. It is used by the next patch.

Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Message-Id: <20211116170133.724751-3-fbarrat@linux.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
2022-01-07 05:19:55 -05:00
Roman Kagan
fb76785934 vhost-user-blk: propagate error return from generic vhost
Fix the only callsite that doesn't propagate the error code from the
generic vhost code.

Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru>
Message-Id: <20211111153354.18807-11-rvkagan@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
2022-01-07 05:19:55 -05:00
Roman Kagan
5d33ae4b7a vhost: stick to -errno error return convention
The generic vhost code expects that many of the VhostOps methods in the
respective backends set errno on errors.  However, none of the existing
backends actually bothers to do so.  In a number of those methods errno
from the failed call is clobbered by successful later calls to some
library functions; on a few code paths the generic vhost code then
negates and returns that errno, thus making failures look as successes
to the caller.

As a result, in certain scenarios (e.g. live migration) the device
doesn't notice the first failure and goes on through its state
transitions as if everything is ok, instead of taking recovery actions
(break and reestablish the vhost-user connection, cancel migration, etc)
before it's too late.

To fix this, consolidate on the convention to return negated errno on
failures throughout generic vhost, and use it for error propagation.

Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru>
Message-Id: <20211111153354.18807-10-rvkagan@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 05:19:55 -05:00
Roman Kagan
025faa872b vhost-user: stick to -errno error return convention
VhostOps methods in user_ops are not very consistent in their error
returns: some return negated errno while others just -1.

Make sure all of them consistently return negated errno.  This also
helps error propagation from the functions being called inside.
Besides, this synchronizes the error return convention with the other
two vhost backends, kernel and vdpa, and will therefore allow for
consistent error propagation in the generic vhost code (in a followup
patch).

Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru>
Message-Id: <20211111153354.18807-9-rvkagan@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 05:19:55 -05:00
Roman Kagan
3631151b3e vhost-vdpa: stick to -errno error return convention
Almost all VhostOps methods in vdpa_ops follow the convention of
returning negated errno on error.

Adjust the few that don't.  To that end, rework vhost_vdpa_add_status to
check if setting of the requested status bits has succeeded and return
the respective error code it hasn't, and propagate the error codes
wherever it's appropriate.

Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru>
Message-Id: <20211111153354.18807-8-rvkagan@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 05:19:55 -05:00
Roman Kagan
2d88d9c65c vhost-backend: stick to -errno error return convention
Almost all VhostOps methods in kernel_ops follow the convention of
returning negated errno on error.

Adjust the only one that doesn't.

Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru>
Message-Id: <20211111153354.18807-7-rvkagan@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2022-01-07 05:19:55 -05:00
Roman Kagan
6dcae534e8 vhost-backend: avoid overflow on memslots_limit
Fix the (hypothetical) potential problem when the value parsed out of
the vhost module parameter in sysfs overflows the return value from
vhost_kernel_memslots_limit.

Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru>
Message-Id: <20211111153354.18807-6-rvkagan@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 05:19:55 -05:00
Roman Kagan
666265036f chardev/char-socket: tcp_chr_sync_read: don't clobber errno
After the return from tcp_chr_recv, tcp_chr_sync_read calls into a
function which eventually makes a system call and may clobber errno.

Make a copy of errno right after tcp_chr_recv and restore the errno on
return from tcp_chr_sync_read.

Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru>
Message-Id: <20211111153354.18807-4-rvkagan@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2022-01-07 05:19:55 -05:00
Roman Kagan
e87975051e chardev/char-socket: tcp_chr_recv: don't clobber errno
tcp_chr_recv communicates the specific error condition to the caller via
errno.  However, after setting it, it may call into some system calls or
library functions which can clobber the errno.

Avoid this by moving the errno assignment to the end of the function.

Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru>
Message-Id: <20211111153354.18807-3-rvkagan@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2022-01-07 05:19:55 -05:00
Roman Kagan
b7107e758f vhost-user-blk: reconnect on any error during realize
vhost-user-blk realize only attempts to reconnect if the previous
connection attempt failed on "a problem with the connection and not an
error related to the content (which would fail again the same way in the
next attempt)".

However this distinction is very subtle, and may be inadvertently broken
if the code changes somewhere deep down the stack and a new error gets
propagated up to here.

OTOH now that the number of reconnection attempts is limited it seems
harmless to try reconnecting on any error.

So relax the condition of whether to retry connecting to check for any
error.

This patch amends a527e312b5 "vhost-user-blk: Implement reconnection
during realize".

Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru>
Message-Id: <20211111153354.18807-2-rvkagan@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
2022-01-07 05:19:55 -05:00
Laurent Vivier
deeb956c40 trace-events,pci: unify trace events format
Unify format used by trace_pci_update_mappings_del(),
trace_pci_update_mappings_add(), trace_pci_cfg_write() and
trace_pci_cfg_read() to print the device name and bus number,
slot number and function number.

For instance:

  pci_cfg_read virtio-net-pci 00:0 @0x20 -> 0xffffc00c
  pci_cfg_write virtio-net-pci 00:0 @0x20 <- 0xfea0000c
  pci_update_mappings_del d=0x555810b92330 01:00.0 4,0xffffc000+0x4000
  pci_update_mappings_add d=0x555810b92330 01:00.0 4,0xfea00000+0x4000

becomes

  pci_cfg_read virtio-net-pci 01:00.0 @0x20 -> 0xffffc00c
  pci_cfg_write virtio-net-pci 01:00.0 @0x20 <- 0xfea0000c
  pci_update_mappings_del virtio-net-pci 01:00.0 4,0xffffc000+0x4000
  pci_update_mappings_add virtio-net-pci 01:00.0 4,0xfea00000+0x4000

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20211105192541.655831-1-lvivier@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Yanan Wang <wangyanan55@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 05:19:55 -05:00
Cindy Lu
d5d24d859c virtio-pci: add support for configure interrupt
Add support for configure interrupt, The process is used kvm_irqfd_assign
to set the gsi to kernel. When the configure notifier was signal by
host, qemu will inject a msix interrupt to guest

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20211104164827.21911-11-lulu@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 05:19:55 -05:00
Cindy Lu
d48185f1a4 virtio-mmio: add support for configure interrupt
Add configure interrupt support for virtio-mmio bus. This
interrupt will be working while the backend is vhost-vdpa

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20211104164827.21911-10-lulu@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 05:19:55 -05:00
Cindy Lu
497679d510 virtio-net: add support for configure interrupt
Add functions to support configure interrupt in virtio_net
The functions are config_pending and config_mask, while
this input idx is VIRTIO_CONFIG_IRQ_IDX will check the
function of configure interrupt.

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20211104164827.21911-9-lulu@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 05:19:55 -05:00
Richard Henderson
41fb4c14ee linux-user pull request 20220106
update netlink entries
 nios2 fixes
 /proc/self/maps fixes
 set/getscheduler update
 prctl cleanup and fixes
 target_signal.h cleanup
 and some trivial fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEzS913cjjpNwuT1Fz8ww4vT8vvjwFAmHWx0MSHGxhdXJlbnRA
 dml2aWVyLmV1AAoJEPMMOL0/L748/OcP/jLSX6rPSMUC2RaPuVU7mF2r6tNO+tXi
 FYPxBkYg9oelkIqVjB+PMm0DREvKsu12EJDvNyVUwOtEKkJqtuDWHO5gAP4pnm5v
 amtsdsIhJuOJL446aS/acb2kzodWEuwJkpxZneFqTYDPhnkWGqHoWBJKwetH8RoZ
 zuXlPsJN9Qpp35llrrLpZsNxowGCPT4R54iamCG3tfgpeKKj0VQlNJRzXyCo+UpK
 ts4+akf0i7xxzxraTkV2cokzuP3ZGxUq3aSKAtTEzyGG/IkXsVEDAZ4Y22F2JcST
 4xKgeyk7BQ0EToyL44EirgDkAAxqV2kZGUeuYcHJsf6HXOY6beNEVN4iQuh+vod2
 zlldGtoWy2VCxQS8k+8z4irbQBYE3qXTQ71jZtQcv2fwQUh8lCKQhkSQK294pkSB
 y3gDPeowMj6Vb9jdoi3E/5YWdO0s/97i6OgKzoNE98xU4G4Gdle4/suiKiIahSOo
 qSKeBk5hk9JWuTuVTCsLFiq7lBe2TUYVRT9o6Lac0zu/glZVLA9F18mVQSJUHqqb
 77c45yDuC6wFJFNMmt/2SkBlS9kZn6yPAfMH9k3ICocibmwvjkJdu7fUDnTgR/wc
 wM4H3JtT6l+aMhvxhLWMu5Hv/8uMqF4+jY25xAVBEnXwhDDGrF2/T9wORj8ljk8d
 gAuXE/VLZvkm
 =OYdy
 -----END PGP SIGNATURE-----

Merge tag 'linux-user-for-7.0-pull-request' of https://gitlab.com/laurent_vivier/qemu into staging

linux-user pull request 20220106

update netlink entries
nios2 fixes
/proc/self/maps fixes
set/getscheduler update
prctl cleanup and fixes
target_signal.h cleanup
and some trivial fixes

# gpg: Signature made Thu 06 Jan 2022 02:41:07 AM PST
# gpg:                using RSA key CD2F75DDC8E3A4DC2E4F5173F30C38BD3F2FBE3C
# gpg:                issuer "laurent@vivier.eu"
# gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>" [undefined]
# gpg:                 aka "Laurent Vivier <laurent@vivier.eu>" [undefined]
# gpg:                 aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F  5173 F30C 38BD 3F2F BE3C

* tag 'linux-user-for-7.0-pull-request' of https://gitlab.com/laurent_vivier/qemu: (27 commits)
  linux-user: netlink: update IFLA_BRPORT entries
  linux-user: netlink: Add IFLA_VFINFO_LIST
  linux-user: netlink: update IFLA entries
  linux-user/syscall.c: malloc to g_try_malloc
  linux-user/nios2: Use set_sigmask in do_rt_sigreturn
  linux-user/nios2: Fix sigmask in setup_rt_frame
  linux-user/nios2: Fix EA vs PC confusion
  linux-user/nios2: Map a real kuser page
  linux-user/elfload: Rename ARM_COMMPAGE to HI_COMMPAGE
  linux-user/nios2: Fixes for signal frame setup
  linux-user/nios2: Properly emulate EXCP_TRAP
  linux-user/syscall.c: fix missed flag for shared memory in open_self_maps
  linux-user: call set/getscheduler set/getparam directly
  linux-user: add sched_getattr support
  linux-user/signal: Map exit signals in SIGCHLD siginfo_t
  target/sh4: Implement prctl_unalign_sigbus
  target/hppa: Implement prctl_unalign_sigbus
  target/alpha: Implement prctl_unalign_sigbus
  linux-user: Add code for PR_GET/SET_UNALIGN
  linux-user: Disable more prctl subcodes
  ...

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-01-06 11:22:42 -08:00
Cindy Lu
f7220a7ce2 vhost: add support for configure interrupt
Add functions to support configure interrupt.
The configure interrupt process will start in vhost_dev_start
and stop in vhost_dev_stop.

Also add the functions to support vhost_config_pending and
vhost_config_mask, for masked_config_notifier, we only
use the notifier saved in vq 0.

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20211104164827.21911-8-lulu@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-06 06:11:39 -05:00
Cindy Lu
081f864f56 virtio: add support for configure interrupt
Add the functions to support the configure interrupt in virtio
The function virtio_config_guest_notifier_read will notify the
guest if there is an configure interrupt.
The function virtio_config_set_guest_notifier_fd_handler is
to set the fd hander for the notifier

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20211104164827.21911-7-lulu@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-06 06:11:39 -05:00
Cindy Lu
634f7c89fb vhost-vdpa: add support for config interrupt
Add new call back function in vhost-vdpa, this function will
set the event fd to kernel. This function will be called
in the vhost_dev_start and vhost_dev_stop

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20211104164827.21911-6-lulu@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-06 06:11:39 -05:00
Cindy Lu
8806237234 vhost: introduce new VhostOps vhost_set_config_call
This patch introduces new VhostOps vhost_set_config_call. This function allows the
vhost to set the event fd to kernel

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20211104164827.21911-5-lulu@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-06 06:11:39 -05:00
Cindy Lu
316011b8a7 virtio-pci: decouple the single vector from the interrupt process
To reuse the interrupt process in configure interrupt
Need to decouple the single vector from the interrupt process. Add new function
kvm_virtio_pci_vector_use_one and _release_one. These functions are use
for the single vector, the whole process will finish in a loop for the vq number.

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20211104164827.21911-4-lulu@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-06 06:11:39 -05:00
Cindy Lu
e3480ef81f virtio-pci: decouple notifier from interrupt process
To reuse the notifier process in configure interrupt.
Use the virtio_pci_get_notifier function to get the notifier.
the INPUT of this function is the IDX, the OUTPUT is notifier and
the vector

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20211104164827.21911-3-lulu@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-06 06:11:39 -05:00
Cindy Lu
bf1d85c166 virtio: introduce macro IRTIO_CONFIG_IRQ_IDX
To support configure interrupt for vhost-vdpa
Introduce VIRTIO_CONFIG_IRQ_IDX -1 as configure interrupt's queue index,
Then we can reuse the functions guest_notifier_mask and guest_notifier_pending.
Add the check of queue index in these drivers, if the driver does not support
configure interrupt, the function will just return

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20211104164827.21911-2-lulu@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-06 06:11:39 -05:00
Michael S. Tsirkin
9bd6565cce acpi: validate hotplug selector on access
When bus is looked up on a pci write, we didn't
validate that the lookup succeeded.
Fuzzers thus can trigger QEMU crash by dereferencing the NULL
bus pointer.

Fixes: b32bd763a1 ("pci: introduce acpi-index property for PCI device")
Fixes: CVE-2021-4158
Cc: "Igor Mammedov" <imammedo@redhat.com>
Fixes: https://gitlab.com/qemu-project/qemu/-/issues/770
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Ani Sinha <ani@anisinha.ca>
2022-01-06 06:11:38 -05:00
Laurent Vivier
f0effdbc2a linux-user: netlink: update IFLA_BRPORT entries
add IFLA_BRPORT_MCAST_EHT_HOSTS_LIMIT and IFLA_BRPORT_MCAST_EHT_HOSTS_CNT

  # QEMU_LOG=unimp ip a
  Unknown QEMU_IFLA_BRPORT type 37
  Unknown QEMU_IFLA_BRPORT type 38

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211219154514.2165728-3-laurent@vivier.eu>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-01-06 11:40:53 +01:00
Laurent Vivier
a99478672c linux-user: netlink: Add IFLA_VFINFO_LIST
# QEMU_LOG=unimp ip a
  Unknown host QEMU_IFLA type: 22

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211219154514.2165728-2-laurent@vivier.eu>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-01-06 11:40:52 +01:00
Laurent Vivier
312aef98ae linux-user: netlink: update IFLA entries
Add IFLA_PHYS_PORT_ID, IFLA_PARENT_DEV_NAME, IFLA_PARENT_DEV_BUS_NAME

  # QEMU_LOG=unimp ip a
  Unknown host QEMU_IFLA type: 56
  Unknown host QEMU_IFLA type: 57
  Unknown host QEMU_IFLA type: 34

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211219154514.2165728-1-laurent@vivier.eu>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-01-06 11:40:52 +01:00
Ahmed Abouzied
7a5626a1d8 linux-user/syscall.c: malloc to g_try_malloc
Use g_try_malloc instead of malloc to alocate the target ifconfig.
Also replace the corresponding free with g_free.

Signed-off-by: Ahmed Abouzied <email@aabouzied.com>
Message-Id: <20220104143841.25116-1-email@aabouzied.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-01-06 11:40:52 +01:00
Richard Henderson
155fff93f8 linux-user/nios2: Use set_sigmask in do_rt_sigreturn
Using do_sigprocmask directly was incorrect, as it will
leave the signal blocked by the outer layers of linux-user.

Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211221025012.1057923-8-richard.henderson@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-01-06 11:40:52 +01:00
Richard Henderson
7a83cbb0b0 linux-user/nios2: Fix sigmask in setup_rt_frame
Do not cast the signal mask elements; trust __put_user.

Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211221025012.1057923-7-richard.henderson@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-01-06 11:40:52 +01:00
Richard Henderson
8222d8ba6f linux-user/nios2: Fix EA vs PC confusion
The real kernel will talk about the user PC as EA,
because that's where the hardware will have copied it,
and where it expects to put it to then use ERET.
But qemu does not emulate all of the exception stuff
while emulating user-only.  Manipulate PC directly.

This fixes signal entry and return, and eliminates
some slight confusion from target_cpu_copy_regs.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20211221025012.1057923-6-richard.henderson@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-01-06 11:40:52 +01:00
Richard Henderson
f5ef0e518d linux-user/nios2: Map a real kuser page
The first word of page1 is data, so the whole thing
can't be implemented with emulation of addresses.
Use init_guest_commpage for the allocation.

Hijack trap number 16 to implement cmpxchg.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20211221025012.1057923-5-richard.henderson@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-01-06 11:40:52 +01:00
Richard Henderson
66346faf32 linux-user/elfload: Rename ARM_COMMPAGE to HI_COMMPAGE
Arm will no longer be the only target requiring a commpage,
but it will continue to be the only target placing the page
at the high end of the address space.

Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211221025012.1057923-4-richard.henderson@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-01-06 11:40:52 +01:00
Richard Henderson
80c6e9d4ae linux-user/nios2: Fixes for signal frame setup
Do not confuse host and guest addresses.  Lock and unlock
the target_rt_sigframe structure in setup_rt_sigframe.

Since rt_setup_ucontext always returns 0, drop the return
value entirely.  This eliminates the only write to the err
variable in setup_rt_sigframe.

Always copy the siginfo structure.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211221025012.1057923-3-richard.henderson@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-01-06 11:40:52 +01:00
Richard Henderson
87d7bfdba1 linux-user/nios2: Properly emulate EXCP_TRAP
The real kernel has to load the instruction and extract
the imm5 field; for qemu, modify the translator to do this.

The use of R_AT for this in cpu_loop was a bug.  Handle
the other trap numbers as per the kernel's trap_table.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20211221025012.1057923-2-richard.henderson@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-01-06 11:40:52 +01:00
Andrey Kazmin
e13685a6e5 linux-user/syscall.c: fix missed flag for shared memory in open_self_maps
The possible variants for region type in /proc/self/maps are either
private "p" or shared "s". In the current implementation,
we mark shared regions as "-". It could break memory mapping parsers
such as included into ASan/HWASan sanitizers.

Fixes: 01ef6b9e4e ("linux-user: factor out reading of /proc/self/maps")
Signed-off-by: Andrey Kazmin <a.kazmin@partner.samsung.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Acked-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20211227125048.22610-1-a.kazmin@partner.samsung.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-01-06 11:40:52 +01:00
Tonis Tiigi
407a119bfd linux-user: call set/getscheduler set/getparam directly
There seems to be difference in syscall and libc definition of these
methods and therefore musl does not implement them (1e21e78bf7). Call
syscall directly to ensure the behavior of the libc of user application,
not the libc that was used to build QEMU.

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
Message-Id: <20220105041819.24160-3-tonistiigi@gmail.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-01-06 11:40:52 +01:00
Tonis Tiigi
45ad761c27 linux-user: add sched_getattr support
These syscalls are not exposed by glibc. The struct type need to be
redefined as it can't be included directly before
https://lkml.org/lkml/2020/5/28/810 .

sched_attr type can grow in future kernel versions. When client sends
values that QEMU does not understand it will return E2BIG with same
semantics as old kernel would so client can retry with smaller inputs.

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
Message-Id: <20220105041819.24160-2-tonistiigi@gmail.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-01-06 11:40:52 +01:00
Matthias Schiffer
139e5de7c8 linux-user/signal: Map exit signals in SIGCHLD siginfo_t
When converting a siginfo_t from waitid(), the interpretation of si_status
depends on the value of si_code: For CLD_EXITED, it is an exit code and
should be copied verbatim. For other codes, it is a signal number
(possibly with additional high bits from ptrace) that should be mapped.

This code was previously changed in commit 1c3dfb506e
("linux-user/signal: Decode waitid si_code"), but the fix was
incomplete.

Tested with the following test program:

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/wait.h>

    int main() {
    	pid_t pid = fork();
    	if (pid == 0) {
    		exit(12);
    	} else {
    		siginfo_t siginfo = {};
    		waitid(P_PID, pid, &siginfo, WEXITED);
    		printf("Code: %d, status: %d\n", (int)siginfo.si_code, (int)siginfo.si_status);
    	}

    	pid = fork();
    	if (pid == 0) {
    		raise(SIGUSR2);
    	} else {
    		siginfo_t siginfo = {};
    		waitid(P_PID, pid, &siginfo, WEXITED);
    		printf("Code: %d, status: %d\n", (int)siginfo.si_code, (int)siginfo.si_status);
    	}
    }

Output with an x86_64 host and mips64el target before 1c3dfb506e
(incorrect: exit code 12 is translated like a signal):

    Code: 1, status: 17
    Code: 2, status: 17

After 1c3dfb506e (incorrect: signal number is not translated):

    Code: 1, status: 12
    Code: 2, status: 12

With this patch:

    Code: 1, status: 12
    Code: 2, status: 17

Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <81534fde7cdfc6acea4889d886fbefdd606630fb.1635019124.git.mschiffer@universe-factory.net>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-01-06 11:40:52 +01:00
Richard Henderson
4da06fb306 target/sh4: Implement prctl_unalign_sigbus
Leave TARGET_ALIGNED_ONLY set, but use the new CPUState
flag to set MO_UNALN for the instructions that the kernel
handles in the unaligned trap.

The Linux kernel does not handle all memory operations: no
floating-point and no MAC.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20211227150127.2659293-7-richard.henderson@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-01-06 11:40:52 +01:00
Richard Henderson
217d1a5ef8 target/hppa: Implement prctl_unalign_sigbus
Leave TARGET_ALIGNED_ONLY set, but use the new CPUState
flag to set MO_UNALN for the instructions that the kernel
handles in the unaligned trap.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20211227150127.2659293-6-richard.henderson@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-01-06 11:40:52 +01:00
Richard Henderson
fed1424617 target/alpha: Implement prctl_unalign_sigbus
Leave TARGET_ALIGNED_ONLY set, but use the new CPUState
flag to set MO_UNALN for the instructions that the kernel
handles in the unaligned trap.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20211227150127.2659293-5-richard.henderson@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-01-06 11:40:52 +01:00
Richard Henderson
6e8dcacd08 linux-user: Add code for PR_GET/SET_UNALIGN
This requires extra work for each target, but adds the
common syscall code, and the necessary flag in CPUState.

Reviewed-by: Warner Losh <imp@bsdimp.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211227150127.2659293-4-richard.henderson@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-01-06 11:40:52 +01:00