Commit Graph

106826 Commits

Author SHA1 Message Date
Stefan Hajnoczi
13d6b16081 Unify implementation of carry-less multiply.
Accelerate carry-less multiply for 64x64->128.
 -----BEGIN PGP SIGNATURE-----
 
 iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmUEiPodHHJpY2hhcmQu
 aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV/akgf/XkiIeErWJr1YXSbS
 YPQtCsDAfIrqn3RiyQ2uwSn2eeuwVqTFFPGER04YegRDK8dyO874JBfvOwmBT70J
 I/aU8Z4BbRyNu9nfaCtFMlXQH9KArAKcAds1PnshfcnI5T2yBloZ1sAU97IuJFZk
 Uuz96H60+ohc4wzaUiPqPhXQStgZeSYwwAJB0s25DhCckdea0udRCAJ1tQTVpxkM
 wIFef1SHPoM6DtMzFKHLLUH6VivSlHjqx8GqFusa7pVqfQyDzNBfwvDl1F/bkE07
 yTocQEkV3QnZvIplhqUxAaZXIFZr9BNk7bDimMjHW6z3pNPN3T8zRn4trNjxbgPV
 jqzAtg==
 =8nnk
 -----END PGP SIGNATURE-----

Merge tag 'pull-crypto-20230915' of https://gitlab.com/rth7680/qemu into staging

Unify implementation of carry-less multiply.
Accelerate carry-less multiply for 64x64->128.

# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmUEiPodHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV/akgf/XkiIeErWJr1YXSbS
# YPQtCsDAfIrqn3RiyQ2uwSn2eeuwVqTFFPGER04YegRDK8dyO874JBfvOwmBT70J
# I/aU8Z4BbRyNu9nfaCtFMlXQH9KArAKcAds1PnshfcnI5T2yBloZ1sAU97IuJFZk
# Uuz96H60+ohc4wzaUiPqPhXQStgZeSYwwAJB0s25DhCckdea0udRCAJ1tQTVpxkM
# wIFef1SHPoM6DtMzFKHLLUH6VivSlHjqx8GqFusa7pVqfQyDzNBfwvDl1F/bkE07
# yTocQEkV3QnZvIplhqUxAaZXIFZr9BNk7bDimMjHW6z3pNPN3T8zRn4trNjxbgPV
# jqzAtg==
# =8nnk
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 15 Sep 2023 12:40:26 EDT
# gpg:                using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg:                issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [full]
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A  05C0 64DF 38E8 AF7E 215F

* tag 'pull-crypto-20230915' of https://gitlab.com/rth7680/qemu:
  host/include/aarch64: Implement clmul.h
  host/include/i386: Implement clmul.h
  target/ppc: Use clmul_64
  target/s390x: Use clmul_64
  target/i386: Use clmul_64
  target/arm: Use clmul_64
  crypto: Add generic 64-bit carry-less multiply routine
  target/ppc: Use clmul_32* routines
  target/s390x: Use clmul_32* routines
  target/arm: Use clmul_32* routines
  crypto: Add generic 32-bit carry-less multiply routines
  target/ppc: Use clmul_16* routines
  target/s390x: Use clmul_16* routines
  target/arm: Use clmul_16* routines
  crypto: Add generic 16-bit carry-less multiply routines
  target/ppc: Use clmul_8* routines
  target/s390x: Use clmul_8* routines
  target/arm: Use clmul_8* routines
  crypto: Add generic 8-bit carry-less multiply routines

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-09-18 11:04:21 -04:00
Gerd Hoffmann
0ec0767e59 tests/acpi: disallow virt/SSDT.memhp updates
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2023-09-18 15:27:27 +02:00
Gerd Hoffmann
5f88dd43d0 tests/acpi: update virt/SSDT.memhp
The edk2 update caused an address change:

 DefinitionBlock ("", "SSDT", 1, "BOCHS ", "NVDIMM", 0x00000001)
 {
     Scope (\_SB)
     {
         Device (NVDR)
         {
             Name (_HID, "ACPI0012" /* NVDIMM Root Device */)  // _HID: Hardware ID
             [ ... ]
         }
     }

-    Name (MEMA, 0x43D10000)
+    Name (MEMA, 0x43C90000)
 }

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2023-09-18 15:27:27 +02:00
Gerd Hoffmann
91e0127087 edk2: update binaries to edk2-stable202308
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2023-09-18 15:27:27 +02:00
Gerd Hoffmann
241f99399f edk2: update submodule to edk2-stable202308
New stable release was tagged in August 2023,
update the edk2 submodule to it.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2023-09-18 15:27:27 +02:00
Gerd Hoffmann
3bb5051009 edk2: workaround edk-stable202308 bug
Set PCD to workaround two fixes missing the release.
8b66f9df1b
020cc9e2e7

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2023-09-18 15:27:27 +02:00
Gerd Hoffmann
b0494f131e edk2: update build config
risc-v switched to use split code/vars images like the other archs.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2023-09-18 15:27:27 +02:00
Gerd Hoffmann
c28a2891f3 edk2: update build script
Sync with latest version from gitlab.com/kraxel/edk2-build-config

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2023-09-18 15:27:27 +02:00
Gerd Hoffmann
3808a058fc tests/acpi: allow virt/SSDT.memhp updates
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2023-09-18 15:27:27 +02:00
Cédric Le Goater
44fa20c928 spapr: Remove support for NVIDIA V100 GPU with NVLink2
NVLink2 support was removed from the PPC PowerNV platform and VFIO in
Linux 5.13 with commits :

  562d1e207d32 ("powerpc/powernv: remove the nvlink support")
  b392a1989170 ("vfio/pci: remove vfio_pci_nvlink2")

This was 2.5 years ago. Do the same in QEMU with a revert of commit
ec132efaa8 ("spapr: Support NVIDIA V100 GPU with NVLink2"). Some
adjustements are required on the NUMA part.

Cc: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Message-ID: <20230918091717.149950-1-clg@kaod.org>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2023-09-18 07:25:28 -03:00
Cédric Le Goater
527b238329 ppc/xive: Fix uint32_t overflow
As reported by Coverity, "idx << xive->pc_shift" is evaluated using
32-bit arithmetic, and then used in a context expecting a "uint64_t".
Add a uint64_t cast.

Fixes: Coverity CID 1519049
Fixes: b68147b7a5 ("ppc/xive: Add support for the PC MMIOs")
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
Message-ID: <20230914154650.222111-1-clg@kaod.org>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2023-09-18 07:25:24 -03:00
Daniel Henrique Barboza
0cbc34dc8e MAINTAINERS: Nick Piggin PPC maintainer, other PPC changes
Update all relevant PowerPC entries as follows:

- Nick Piggin is promoted to Maintainer in all qemu-ppc subsystems.
  Nick has  been a solid contributor for the last couple of years and
  has the required knowledge and motivation to drive the boat.

- Greg Kurz is being removed from all qemu-ppc entries. Greg has moved
  to other areas of interest and will retire from qemu-ppc.  Thanks Mr
  Kurz for all the years of service.

- David Gibson was removed as 'Reviewer' from PowerPC TCG CPUs and PPC
  KVM CPUs. Change done per his request.

- Daniel Barboza downgraded from 'Maintainer' to 'Reviewer' in sPAPR and
  PPC KVM CPUs. It has been a long since I last touched those areas and
  it's not justified to be kept as maintainer in them.

- Cedric Le Goater and Daniel Barboza removed as 'Reviewer' in VOF. We
  don't have the required knowledge to justify it.

- VOF support downgraded from 'Maintained' to 'Odd Fixes' since it
  better reflects the current state of the subsystem.

Acked-by: Cédric Le Goater <clg@kaod.org>
Acked-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Message-ID: <20230915110507.194762-1-danielhb413@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2023-09-18 07:25:01 -03:00
Peter Maydell
6d7a53e9f1 net/tap: Avoid variable-length array
Use a heap allocation instead of a variable length array in
tap_receive_iov().

The codebase has very few VLAs, and if we can get rid of them all we
can make the compiler error on new additions.  This is a defensive
measure against security bugs where an on-stack dynamic allocation
isn't correctly size-checked (e.g.  CVE-2021-3527).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-09-18 14:36:13 +08:00
Peter Maydell
c4cf68198e net/dump: Avoid variable length array
Use a g_autofree heap allocation instead of a variable length
array in dump_receive_iov().

The codebase has very few VLAs, and if we can get rid of them all we
can make the compiler error on new additions.  This is a defensive
measure against security bugs where an on-stack dynamic allocation
isn't correctly size-checked (e.g.  CVE-2021-3527).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-09-18 14:36:13 +08:00
Peter Maydell
1257065783 hw/net/rocker: Avoid variable length array
Replace an on-stack variable length array in of_dpa_ig() with
a g_autofree heap allocation.

The codebase has very few VLAs, and if we can get rid of them all we
can make the compiler error on new additions.  This is a defensive
measure against security bugs where an on-stack dynamic allocation
isn't correctly size-checked (e.g.  CVE-2021-3527).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-09-18 14:36:13 +08:00
Peter Maydell
2a6cb383e2 hw/net/fsl_etsec/rings.c: Avoid variable length array
In fill_rx_bd() we create a variable length array of size
etsec->rx_padding. In fact we know that this will never be
larger than 64 bytes, because rx_padding is set in rx_init_frame()
in a way that ensures it is only that large. Use a fixed sized
array and assert that it is big enough.

Since padd[] is now potentially rather larger than the actual
padding required, adjust the memset() we do on it to match the
size that we write with cpu_physical_memory_write(), rather than
clearing the entire array.

The codebase has very few VLAs, and if we can get rid of them all we
can make the compiler error on new additions.  This is a defensive
measure against security bugs where an on-stack dynamic allocation
isn't correctly size-checked (e.g.  CVE-2021-3527).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-09-18 14:36:13 +08:00
Ilya Maximets
cb039ef3d9 net: add initial support for AF_XDP network backend
AF_XDP is a network socket family that allows communication directly
with the network device driver in the kernel, bypassing most or all
of the kernel networking stack.  In the essence, the technology is
pretty similar to netmap.  But, unlike netmap, AF_XDP is Linux-native
and works with any network interfaces without driver modifications.
Unlike vhost-based backends (kernel, user, vdpa), AF_XDP doesn't
require access to character devices or unix sockets.  Only access to
the network interface itself is necessary.

This patch implements a network backend that communicates with the
kernel by creating an AF_XDP socket.  A chunk of userspace memory
is shared between QEMU and the host kernel.  4 ring buffers (Tx, Rx,
Fill and Completion) are placed in that memory along with a pool of
memory buffers for the packet data.  Data transmission is done by
allocating one of the buffers, copying packet data into it and
placing the pointer into Tx ring.  After transmission, device will
return the buffer via Completion ring.  On Rx, device will take
a buffer form a pre-populated Fill ring, write the packet data into
it and place the buffer into Rx ring.

AF_XDP network backend takes on the communication with the host
kernel and the network interface and forwards packets to/from the
peer device in QEMU.

Usage example:

  -device virtio-net-pci,netdev=guest1,mac=00:16:35:AF:AA:5C
  -netdev af-xdp,ifname=ens6f1np1,id=guest1,mode=native,queues=1

XDP program bridges the socket with a network interface.  It can be
attached to the interface in 2 different modes:

1. skb - this mode should work for any interface and doesn't require
         driver support.  With a caveat of lower performance.

2. native - this does require support from the driver and allows to
            bypass skb allocation in the kernel and potentially use
            zero-copy while getting packets in/out userspace.

By default, QEMU will try to use native mode and fall back to skb.
Mode can be forced via 'mode' option.  To force 'copy' even in native
mode, use 'force-copy=on' option.  This might be useful if there is
some issue with the driver.

Option 'queues=N' allows to specify how many device queues should
be open.  Note that all the queues that are not open are still
functional and can receive traffic, but it will not be delivered to
QEMU.  So, the number of device queues should generally match the
QEMU configuration, unless the device is shared with something
else and the traffic re-direction to appropriate queues is correctly
configured on a device level (e.g. with ethtool -N).
'start-queue=M' option can be used to specify from which queue id
QEMU should start configuring 'N' queues.  It might also be necessary
to use this option with certain NICs, e.g. MLX5 NICs.  See the docs
for examples.

In a general case QEMU will need CAP_NET_ADMIN and CAP_SYS_ADMIN
or CAP_BPF capabilities in order to load default XSK/XDP programs to
the network interface and configure BPF maps.  It is possible, however,
to run with no capabilities.  For that to work, an external process
with enough capabilities will need to pre-load default XSK program,
create AF_XDP sockets and pass their file descriptors to QEMU process
on startup via 'sock-fds' option.  Network backend will need to be
configured with 'inhibit=on' to avoid loading of the program.
QEMU will need 32 MB of locked memory (RLIMIT_MEMLOCK) per queue
or CAP_IPC_LOCK.

There are few performance challenges with the current network backends.

First is that they do not support IO threads.  This means that data
path is handled by the main thread in QEMU and may slow down other
work or may be slowed down by some other work.  This also means that
taking advantage of multi-queue is generally not possible today.

Another thing is that data path is going through the device emulation
code, which is not really optimized for performance.  The fastest
"frontend" device is virtio-net.  But it's not optimized for heavy
traffic either, because it expects such use-cases to be handled via
some implementation of vhost (user, kernel, vdpa).  In practice, we
have virtio notifications and rcu lock/unlock on a per-packet basis
and not very efficient accesses to the guest memory.  Communication
channels between backend and frontend devices do not allow passing
more than one packet at a time as well.

Some of these challenges can be avoided in the future by adding better
batching into device emulation or by implementing vhost-af-xdp variant.

There are also a few kernel limitations.  AF_XDP sockets do not
support any kinds of checksum or segmentation offloading.  Buffers
are limited to a page size (4K), i.e. MTU is limited.  Multi-buffer
support implementation for AF_XDP is in progress, but not ready yet.
Also, transmission in all non-zero-copy modes is synchronous, i.e.
done in a syscall.  That doesn't allow high packet rates on virtual
interfaces.

However, keeping in mind all of these challenges, current implementation
of the AF_XDP backend shows a decent performance while running on top
of a physical NIC with zero-copy support.

Test setup:

2 VMs running on 2 physical hosts connected via ConnectX6-Dx card.
Network backend is configured to open the NIC directly in native mode.
The driver supports zero-copy.  NIC is configured to use 1 queue.

Inside a VM - iperf3 for basic TCP performance testing and dpdk-testpmd
for PPS testing.

iperf3 result:
 TCP stream      : 19.1 Gbps

dpdk-testpmd (single queue, single CPU core, 64 B packets) results:
 Tx only         : 3.4 Mpps
 Rx only         : 2.0 Mpps
 L2 FWD Loopback : 1.5 Mpps

In skb mode the same setup shows much lower performance, similar to
the setup where pair of physical NICs is replaced with veth pair:

iperf3 result:
  TCP stream      : 9 Gbps

dpdk-testpmd (single queue, single CPU core, 64 B packets) results:
  Tx only         : 1.2 Mpps
  Rx only         : 1.0 Mpps
  L2 FWD Loopback : 0.7 Mpps

Results in skb mode or over the veth are close to results of a tap
backend with vhost=on and disabled segmentation offloading bridged
with a NIC.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> (docker/lcitool)
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-09-18 14:36:13 +08:00
Ilya Maximets
a6f376e9ba tests: bump libvirt-ci for libasan and libxdp
This pulls in the fixes for libasan version as well as support for
libxdp that will be used for af-xdp netdev in the next commits.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-09-18 14:36:13 +08:00
Tomasz Dzieciol
e710f9c470 e1000e: rename e1000e_ba_state and e1000e_write_hdr_to_rx_buffers
Rename e1000e_ba_state according and e1000e_write_hdr_to_rx_buffers for
consistency with IGB.

Signed-off-by: Tomasz Dzieciol <t.dzieciol@partner.samsung.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Tested-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-09-18 14:36:13 +08:00
Tomasz Dzieciol
560cf339b2 igb: packet-split descriptors support
Packet-split descriptors are used by Linux VF driver for MTU values from 2048

Signed-off-by: Tomasz Dzieciol <t.dzieciol@partner.samsung.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Tested-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-09-18 14:36:13 +08:00
Tomasz Dzieciol
1c4e67a5be igb: add IPv6 extended headers traffic detection
Signed-off-by: Tomasz Dzieciol <t.dzieciol@partner.samsung.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Tested-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-09-18 14:36:13 +08:00
Tomasz Dzieciol
17ccd01647 igb: RX payload guest writting refactoring
Refactoring is done in preparation for support of multiple advanced
descriptors RX modes, especially packet-split modes.

Signed-off-by: Tomasz Dzieciol <t.dzieciol@partner.samsung.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Tested-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-09-18 14:36:13 +08:00
Tomasz Dzieciol
ec82ad7c4d igb: RX descriptors guest writting refactoring
Refactoring is done in preparation for support of multiple advanced
descriptors RX modes, especially packet-split modes.

Signed-off-by: Tomasz Dzieciol <t.dzieciol@partner.samsung.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Tested-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-09-18 14:36:13 +08:00
Tomasz Dzieciol
a86aee7e95 igb: rename E1000E_RingInfo_st
Rename E1000E_RingInfo_st and E1000E_RingInfo according to qemu typdefs guide.

Signed-off-by: Tomasz Dzieciol <t.dzieciol@partner.samsung.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Tested-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-09-18 14:36:13 +08:00
Tomasz Dzieciol
2959c51dde igb: remove TCP ACK detection
TCP ACK detection is no longer present in igb.

Signed-off-by: Tomasz Dzieciol <t.dzieciol@partner.samsung.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Tested-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-09-18 14:36:13 +08:00
Yuri Benditovich
53da8b5a99 virtio-net: Add support for USO features
USO features of virtio-net device depend on kernel ability
to support them, for backward compatibility by default the
features are disabled on 8.0 and earlier.

Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com>
Signed-off-by: Andrew Melnychecnko <andrew@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-09-18 14:36:13 +08:00
Andrew Melnychenko
9da1684954 virtio-net: Add USO flags to vhost support.
New features are subject to check with vhost-user and vdpa.

Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com>
Signed-off-by: Andrew Melnychenko <andrew@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-09-18 14:36:13 +08:00
Yuri Benditovich
f03e0cf63b tap: Add check for USO features
Tap indicates support for USO features according to
capabilities of current kernel module.

Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com>
Signed-off-by: Andrew Melnychecnko <andrew@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-09-18 14:36:13 +08:00
Andrew Melnychenko
2ab0ec3121 tap: Add USO support to tap device.
Passing additional parameters (USOv4 and USOv6 offloads) when
setting TAP offloads

Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com>
Signed-off-by: Andrew Melnychenko <andrew@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-09-18 14:36:13 +08:00
Richard Henderson
a97a83753c tcg: Map code_gen_buffer with PROT_BTI
For linux aarch64 host supporting BTI, map the buffer
to require BTI instructions at branch landing pads.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-09-16 14:57:16 +00:00
Richard Henderson
5826a0dbf0 tcg/aarch64: Emit BTI insns at jump landing pads
The prologue is entered via "call"; the epilogue, each tb,
and each goto_tb continuation point are all reached via "jump".

As tcg_out_goto_long is only used by tcg_out_exit_tb, merge
the two functions.  Change the indirect register used to
TCG_REG_TMP1, aka X17, so that the BTI condition created
is "jump" instead of "jump or call".

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-09-16 14:57:16 +00:00
Richard Henderson
095859e5d9 util/cpuinfo-aarch64: Add CPUINFO_BTI
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-09-16 14:57:16 +00:00
Richard Henderson
9358fbbf6e tcg: Add tcg_out_tb_start backend hook
This hook may emit code at the beginning of the TB.

Suggested-by: Jordan Niethe <jniethe5@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-09-16 14:57:16 +00:00
Richard Henderson
722460652b fpu: Handle m68k extended precision denormals properly
Motorola treats denormals with explicit integer bit set as
having unbiased exponent 0, unlike Intel which treats it as
having unbiased exponent 1 (more like all other IEEE formats
that have no explicit integer bit).

Add a flag on FloatFmt to differentiate the behaviour.

Reported-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-09-16 14:57:16 +00:00
LIU Zhiwei
00f9ef8f3d fpu: Add conversions between bfloat16 and [u]int8
We missed these functions when upstreaming the bfloat16 support.

Signed-off-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Message-Id: <20230531065458.2082-1-zhiwei_liu@linux.alibaba.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-09-16 14:57:15 +00:00
Richard Henderson
1f9823cea2 accel/tcg: Introduce do_st16_mmio_leN
Split out int_st_mmio_leN, to be used by both do_st_mmio_leN
and do_st16_mmio_leN.  Move the locks down into the two
functions, since each one now covers all accesses to once page.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-09-16 14:57:15 +00:00
Richard Henderson
8bf6726741 accel/tcg: Introduce do_ld16_mmio_beN
Split out int_ld_mmio_beN, to be used by both do_ld_mmio_beN
and do_ld16_mmio_beN.  Move the locks down into the two
functions, since each one now covers all accesses to once page.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-09-16 14:57:15 +00:00
Richard Henderson
5646d6a70f accel/tcg: Merge io_writex into do_st_mmio_leN
Avoid multiple calls to io_prepare for unaligned acceses.
One call to do_st_mmio_leN will never cross pages.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-09-16 14:57:15 +00:00
Richard Henderson
13e617475d accel/tcg: Merge io_readx into do_ld_mmio_beN
Avoid multiple calls to io_prepare for unaligned acceses.
One call to do_ld_mmio_beN will never cross pages.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-09-16 14:57:15 +00:00
Richard Henderson
d89c64f6fd accel/tcg: Replace direct use of io_readx/io_writex in do_{ld,st}_1
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-09-16 14:57:15 +00:00
Richard Henderson
bef0c21678 accel/tcg: Merge cpu_transaction_failed into io_failed
Push computation down into the if statements to the point
the data is used.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-09-16 14:57:15 +00:00
Richard Henderson
405c02d85d plugin: Simplify struct qemu_plugin_hwaddr
Rather than saving MemoryRegionSection and offset,
save phys_addr and MemoryRegion.  This matches up
much closer with the plugin api.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-09-16 14:57:15 +00:00
Richard Henderson
0e1144400f accel/tcg: Use CPUTLBEntryFull.phys_addr in io_failed
Since the introduction of CPUTLBEntryFull, we can recover
the full cpu address space physical address without having
to examine the MemoryRegionSection.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-09-16 14:57:15 +00:00
Richard Henderson
fb3cb376e9 accel/tcg: Split out io_prepare and io_failed
These are common code from io_readx and io_writex.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-09-16 14:57:15 +00:00
Richard Henderson
da6aef48d9 accel/tcg: Simplify tlb_plugin_lookup
Now that we defer address space update and tlb_flush until
the next async_run_on_cpu, the plugin run at the end of the
instruction no longer has to contend with a flushed tlb.
Therefore, delete SavedIOTLB entirely.

Properly return false from tlb_plugin_lookup when we do
not have a tlb match.

Fixes a bug in which SavedIOTLB had stale data, because
there were multiple i/o accesses within a single insn.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-09-16 14:57:15 +00:00
Richard Henderson
e8967b6152 target/arm: Use tcg_gen_gvec_cmpi for compare vs 0
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20230831030904.1194667-3-richard.henderson@linaro.org>
2023-09-16 14:57:15 +00:00
Richard Henderson
9622c697d1 tcg: Add gvec compare with immediate and scalar operand
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Song Gao <gaosong@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20230831030904.1194667-2-richard.henderson@linaro.org>
2023-09-16 14:57:15 +00:00
Jiajie Chen
58f8961285 tcg/loongarch64: Implement 128-bit load & store
If LSX is available, use LSX instructions to implement 128-bit load &
store when MO_128 is required, otherwise use two 64-bit loads & stores.

Signed-off-by: Jiajie Chen <c@jia.je>
Message-Id: <20230908022302.180442-17-c@jia.je>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-09-16 14:57:10 +00:00
Helge Deller
a64b8842f1 target/hppa: Extract diagnose immediate value
Extract the immediate value given by the diagnose CPU instruction.
This is needed to distinguish the various diagnose calls.

Signed-off-by: Helge Deller <deller@gmx.de>
2023-09-15 17:34:38 +02:00
Helge Deller
fa824d99f9 target/hppa: Add BTLB support to hppa TLB functions
Change the TLB code to store the Block-TLBs at the beginning
of the TLB table. New 4k TLB entries which are added later
shall not overwrite any of the BTLB entries.

Make sure that when the TLB is cleared by the OS via the ptlbe
instruction, the Block-TLBs will not be dropped.

Signed-off-by: Helge Deller <deller@gmx.de>
2023-09-15 17:34:38 +02:00