Commit Graph

86214 Commits

Author SHA1 Message Date
Frank Chang
5e437d3ccd target/riscv: fix vs() to return proper error code
vs() should return -RISCV_EXCP_ILLEGAL_INST instead of -1 if rvv feature
is not enabled.

If -1 is returned, exception will be raised and cs->exception_index will
be set to the negative return value. The exception will then be treated
as an instruction access fault instead of illegal instruction fault.

Signed-off-by: Frank Chang <frank.chang@sifive.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20210223065935.20208-1-frank.chang@sifive.com
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2021-03-22 21:54:40 -04:00
Marian Postevca
d07b22863b acpi: Move setters/getters of oem fields to X86MachineState
The code that sets/gets oem fields is duplicated in both PC and MICROVM
variants. This commit moves it to X86MachineState so that all x86
variants can use it and duplication is removed.

Signed-off-by: Marian Postevca <posteuca@mutex.one>
Message-Id: <20210221001737.24499-2-posteuca@mutex.one>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-03-22 18:58:19 -04:00
David Hildenbrand
50337286b7 acpi: Set proper maximum size for "etc/acpi/rsdp" blob
Let's also set a maximum size for "etc/acpi/rsdp", so the maximum
size doesn't get implicitly set based on the initial table size. In my
experiments, the table size was in the range of 22 bytes, so a single
page (== what we used until now) seems to be good enough.

Now that we have defined maximum sizes for all currently used table types,
let's assert that we catch usage with new tables that need a proper maximum
size definition.

Also assert that our initial size does not exceed the maximum size; while
qemu_ram_alloc_internal() properly asserts that the initial RAMBlock size
is <= its maximum size, the result might differ when the host page size
is bigger than 4k.

Suggested-by: Laszlo Ersek <lersek@redhat.com>
Cc: Alistair Francis <alistair.francis@xilinx.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Shannon Zhao <shannon.zhaosl@gmail.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210304105554.121674-5-david@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-03-22 18:58:19 -04:00
David Hildenbrand
6930ba0d44 acpi: Move maximum size logic into acpi_add_rom_blob()
We want to have safety margins for all tables based on the table type.
Let's move the maximum size logic into acpi_add_rom_blob() and make it
dependent on the table name, so we don't have to replicate for each and
every instance that creates such tables.

Suggested-by: Laszlo Ersek <lersek@redhat.com>
Cc: Alistair Francis <alistair.francis@xilinx.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Shannon Zhao <shannon.zhaosl@gmail.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210304105554.121674-4-david@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-03-22 18:58:19 -04:00
David Hildenbrand
2a3bdc5cec microvm: Don't open-code "etc/table-loader"
Let's just reuse ACPI_BUILD_LOADER_FILE.

Cc: Alistair Francis <alistair.francis@xilinx.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Shannon Zhao <shannon.zhaosl@gmail.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210304105554.121674-3-david@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-03-22 18:58:19 -04:00
David Hildenbrand
6c2b24d1d2 acpi: Set proper maximum size for "etc/table-loader" blob
The resizeable memory region / RAMBlock that is created for the cmd blob
has a maximum size of whole host pages (e.g., 4k), because RAMBlocks
work on full host pages. In addition, in i386 ACPI code:
  acpi_align_size(tables->linker->cmd_blob, ACPI_BUILD_ALIGN_SIZE);
makes sure to align to multiples of 4k, padding with 0.

For example, if our cmd_blob is created with a size of 2k, the maximum
size is 4k - we cannot grow beyond that. Growing might be required
due to guest action when rebuilding the tables, but also on incoming
migration.

This automatic generation of the maximum size used to be sufficient,
however, there are cases where we cross host pages now when growing at
runtime: we exceed the maximum size of the RAMBlock and can crash QEMU when
trying to resize the resizeable memory region / RAMBlock:
  $ build/qemu-system-x86_64 --enable-kvm \
      -machine q35,nvdimm=on \
      -smp 1 \
      -cpu host \
      -m size=2G,slots=8,maxmem=4G \
      -object memory-backend-file,id=mem0,mem-path=/tmp/nvdimm,size=256M \
      -device nvdimm,label-size=131072,memdev=mem0,id=nvdimm0,slot=1 \
      -nodefaults \
      -device vmgenid \
      -device intel-iommu

Results in:
  Unexpected error in qemu_ram_resize() at ../softmmu/physmem.c:1850:
  qemu-system-x86_64: Size too large: /rom@etc/table-loader:
    0x2000 > 0x1000: Invalid argument

In this configuration, we consume exactly 4k (32 entries, 128 bytes each)
when creating the VM. However, once the guest boots up and maps the MCFG,
we also create the MCFG table and end up consuming 2 additional entries
(pointer + checksum) -- which is where we try resizing the memory region
/ RAMBlock, however, the maximum size does not allow for it.

Currently, we get the following maximum sizes for our different
mutable tables based on behavior of resizeable RAMBlock:

  hw       table                max_size
  -------  ---------------------------------------------------------

  virt     "etc/acpi/tables"    ACPI_BUILD_TABLE_MAX_SIZE (0x200000)
  virt     "etc/table-loader"   HOST_PAGE_ALIGN(initial_size)
  virt     "etc/acpi/rsdp"      HOST_PAGE_ALIGN(initial_size)

  i386     "etc/acpi/tables"    ACPI_BUILD_TABLE_MAX_SIZE (0x200000)
  i386     "etc/table-loader"   HOST_PAGE_ALIGN(initial_size)
  i386     "etc/acpi/rsdp"      HOST_PAGE_ALIGN(initial_size)

  microvm  "etc/acpi/tables"    ACPI_BUILD_TABLE_MAX_SIZE (0x200000)
  microvm  "etc/table-loader"   HOST_PAGE_ALIGN(initial_size)
  microvm  "etc/acpi/rsdp"      HOST_PAGE_ALIGN(initial_size)

Let's set the maximum table size for "etc/table-loader" to 64k, so we
can properly grow at runtime, which should be good enough for the future.

Migration is not concerned with the maximum size of a RAMBlock, only
with the used size - so existing setups are not affected. Of course, we
cannot migrate a VM that would have crash when started on older QEMU from
new QEMU to older QEMU without failing early on the destination when
synchronizing the RAM state:
    qemu-system-x86_64: Size too large: /rom@etc/table-loader: 0x2000 > 0x1000: Invalid argument
    qemu-system-x86_64: error while loading state for instance 0x0 of device 'ram'
    qemu-system-x86_64: load of migration failed: Invalid argument

We'll refactor the code next, to make sure we get rid of this implicit
behavior for "etc/acpi/rsdp" as well and to make the code easier to
grasp.

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Cc: Alistair Francis <alistair.francis@xilinx.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Shannon Zhao <shannon.zhaosl@gmail.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210304105554.121674-2-david@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-03-22 18:58:19 -04:00
Igor Mammedov
835fde4a78 tests: acpi: update expected blobs
expected changes are:
 * larger BNMR operation region
 * new PIDX field and method to fetch acpi-index
 * PDSM method that implements PCI device _DSM +
   per device _DSM that calls PDSM

@@ -221,10 +221,11 @@ DefinitionBlock ("", "DSDT", 1, "BOCHS ", "BXPC    ", 0x00000001)
             B0EJ,   32
         }

-        OperationRegion (BNMR, SystemIO, 0xAE10, 0x04)
+        OperationRegion (BNMR, SystemIO, 0xAE10, 0x08)
         Field (BNMR, DWordAcc, NoLock, WriteAsZeros)
         {
-            BNUM,   32
+            BNUM,   32,
+            PIDX,   32
         }

         Mutex (BLCK, 0x00)
@@ -236,6 +237,52 @@ DefinitionBlock ("", "DSDT", 1, "BOCHS ", "BXPC    ", 0x00000001)
             Release (BLCK)
             Return (Zero)
         }
+
+        Method (AIDX, 2, NotSerialized)
+        {
+            Acquire (BLCK, 0xFFFF)
+            BNUM = Arg0
+            PIDX = (One << Arg1)
+            Local0 = PIDX /* \_SB_.PCI0.PIDX */
+            Release (BLCK)
+            Return (Local0)
+        }
+
+        Method (PDSM, 6, Serialized)
+        {
+            If ((Arg0 == ToUUID ("e5c937d0-3553-4d7a-9117-ea4d19c3434d") /* Device Labeling Interface */))
+            {
+                Local0 = AIDX (Arg4, Arg5)
+                If ((Arg2 == Zero))
+                {
+                    If ((Arg1 == 0x02))
+                    {
+                        If (!((Local0 == Zero) | (Local0 == 0xFFFFFFFF)))
+                        {
+                            Return (Buffer (One)
+                            {
+                                 0x81                                             // .
+                            })
+                        }
+                    }
+
+                    Return (Buffer (One)
+                    {
+                         0x00                                             // .
+                    })
+                }
+                ElseIf ((Arg2 == 0x07))
+                {
+                    Local1 = Package (0x02)
+                        {
+                            Zero,
+                            ""
+                        }
+                    Local1 [Zero] = Local0
+                    Return (Local1)
+                }
+            }
+        }
     }

     Scope (_SB)
@@ -785,7 +832,7 @@ DefinitionBlock ("", "DSDT", 1, "BOCHS ", "BXPC    ", 0x00000001)
                     0xAE00,             // Range Minimum
                     0xAE00,             // Range Maximum
                     0x01,               // Alignment
-                    0x14,               // Length
+                    0x18,               // Length
                     )
             })
         }
@@ -842,11 +889,22 @@ DefinitionBlock ("", "DSDT", 1, "BOCHS ", "BXPC    ", 0x00000001)
             Device (S00)
             {
                 Name (_ADR, Zero)  // _ADR: Address
+                Name (_SUN, Zero)  // _SUN: Slot User Number
+                Method (_DSM, 4, Serialized)  // _DSM: Device-Specific Method
+                {
+                    Return (PDSM (Arg0, Arg1, Arg2, Arg3, BSEL, _SUN))
+                }
             }

             Device (S10)
             {
                 Name (_ADR, 0x00020000)  // _ADR: Address
+                Name (_SUN, 0x02)  // _SUN: Slot User Number
+                Method (_DSM, 4, Serialized)  // _DSM: Device-Specific Method
+                {
+                    Return (PDSM (Arg0, Arg1, Arg2, Arg3, BSEL, _SUN))
+                }
+
                 Method (_S1D, 0, NotSerialized)  // _S1D: S1 Device State
                 {
                     Return (Zero)
[...]

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20210315180102.3008391-7-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-03-22 18:58:19 -04:00
Igor Mammedov
b7f23f62e4 pci: acpi: add _DSM method to PCI devices
Implement _DSM according to:
    PCI Firmware Specification 3.1
    4.6.7.  DSM for Naming a PCI or PCI Express Device Under
            Operating Systems
and wire it up to cold and hot-plugged PCI devices.
Feature depends on ACPI hotplug being enabled (as that provides
PCI devices descriptions in ACPI and MMIO registers that are
reused to fetch acpi-index).

acpi-index should work for
  - cold plugged NICs:
      $QEMU -device e1000,acpi-index=100
         => 'eno100'
  - hot-plugged
      (monitor) device_add e1000,acpi-index=200,id=remove_me
         => 'eno200'
  - re-plugged
      (monitor) device_del remove_me
      (monitor) device_add e1000,acpi-index=1
         => 'eno1'

Windows also sees index under "PCI Label Id" field in properties
dialog but otherwise it doesn't seem to have any effect.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20210315180102.3008391-6-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-03-22 18:58:19 -04:00
Igor Mammedov
910e406971 acpi: add aml_to_decimalstring() and aml_call6() helpers
it will be used by follow up patches

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20210315180102.3008391-5-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-03-22 18:58:19 -04:00
Igor Mammedov
4fd7da4c03 pci: acpi: ensure that acpi-index is unique
it helps to avoid device naming conflicts when guest OS is
configured to use acpi-index for naming.
Spec ialso says so:

PCI Firmware Specification Revision 3.2
4.6.7.  _DSM for Naming a PCI or PCI Express Device Under Operating Systems
"
Instance number must be unique under \_SB scope. This instance number does not have to
be sequential in a given system configuration.
"

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20210315180102.3008391-4-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-03-22 18:58:19 -04:00
Igor Mammedov
b32bd763a1 pci: introduce acpi-index property for PCI device
In x86/ACPI world, linux distros are using predictable
network interface naming since systemd v197. Which on
QEMU based VMs results into path based naming scheme,
that names network interfaces based on PCI topology.

With itm on has to plug NIC in exactly the same bus/slot,
which was used when disk image was first provisioned/configured
or one risks to loose network configuration due to NIC being
renamed to actually used topology.
That also restricts freedom to reshape PCI configuration of
VM without need to reconfigure used guest image.

systemd also offers "onboard" naming scheme which is
preferred over PCI slot/topology one, provided that
firmware implements:
    "
    PCI Firmware Specification 3.1
    4.6.7.  DSM for Naming a PCI or PCI Express Device Under
            Operating Systems
    "
that allows to assign user defined index to PCI device,
which systemd will use to name NIC. For example, using
  -device e1000,acpi-index=100
guest will rename NIC to 'eno100', where 'eno' is default
prefix for "onboard" naming scheme. This doesn't require
any advance configuration on guest side to com in effect
at 'onboard' scheme takes priority over path based naming.

Hope is that 'acpi-index' it will be easier to consume by
management layer, compared to forcing specific PCI topology
and/or having several disk image templates for different
topologies and will help to simplify process of spawning
VM from the same template without need to reconfigure
guest NIC.

This patch adds, 'acpi-index'* property and wires up
a 32bit register on top of pci hotplug register block
to pass index value to AML code at runtime.
Following patch will add corresponding _DSM code and
wire it up to PCI devices described in ACPI.

*) name comes from linux kernel terminology

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20210315180102.3008391-3-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-03-22 18:58:19 -04:00
Igor Mammedov
79a2aca20c tests: acpi: temporary whitelist DSDT changes
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20210315180102.3008391-2-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-03-22 18:58:19 -04:00
Peter Maydell
5ca634afcf SD/MMC patches queue
- Fix build error when DEBUG_SD is on
 - Perform SD ERASE operation
 - SDHCI ADMA heap buffer overflow
   (CVE-2020-17380, CVE-2020-25085, CVE-2021-3409)
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmBY0EcACgkQ4+MsLN6t
 wN438w//TlfUYqwYFLQ5DXtjyWev2tTfRuZrVM7+9pbeZrIN9nG+EPd8eimgr/Mp
 fmJ/xnhjq7nCZlTk+clpr3uSkchFKwHYHugEAKA2J4+bByWHSqgkIHtt6aW5zuEj
 oFSfmjx/hfKuZoYZPLj3I035kXbolZn3q8NwtGL83u5Jw+9dFWhxEoFkDqFeWKtn
 lP7pHFXrW5+kvz33f6KILnlkjPe5ues7IUgi/QbtDwUd2gw6kgIo9OQa1nCjfzNu
 m/ITEL1g8Vcj1X0HUVUAJcOWNjbesq/KTjaf4ZitJAAvvk1l95G3wT1HiUZTKaf8
 iOoCtp5BqfCqV3koJO91kceD7bYf9TokueBcPdow7c2ekbbi30LL51uQEAeJRpHC
 sut3jfmmQWC5shNf6KKmUJsNSRlTHD5xQx5UTxwsNQuyb5KSNuF/LP3iFGAsCouh
 bu4DQfofAm5MCjzz1gkoGiJBGLrvJVnghbxF8anBvFL+jdYkUfpuuGXwyQm5fuJB
 l+k25wBLHbWzpf+9EUiDpOObXEhb8jyxB8ihTHE7sRTktBzRgnRkFQ3RCcybOypt
 dpb1ITN647XH9mGjgzxKjL+UuQb4loUl1Qih4kFuHRlVS7VncmulrJ2Z7D78WZMk
 f7ObM0kC5bakWOAP/QoMfytlRDPtorpL+bTFN6NEOpyTro+4Pcs=
 =gvrj
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/philmd/tags/sdmmc-20210322' into staging

SD/MMC patches queue

- Fix build error when DEBUG_SD is on
- Perform SD ERASE operation
- SDHCI ADMA heap buffer overflow
  (CVE-2020-17380, CVE-2020-25085, CVE-2021-3409)

# gpg: Signature made Mon 22 Mar 2021 17:13:43 GMT
# gpg:                using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [full]
# Primary key fingerprint: FAAB E75E 1291 7221 DCFD  6BB2 E3E3 2C2C DEAD C0DE

* remotes/philmd/tags/sdmmc-20210322:
  hw/sd: sdhci: Reset the data pointer of s->fifo_buffer[] when a different block size is programmed
  hw/sd: sdhci: Limit block size only when SDHC_BLKSIZE register is writable
  hw/sd: sdhci: Correctly set the controller status for ADMA
  hw/sd: sdhci: Don't write to SDHC_SYSAD register when transfer is in progress
  hw/sd: sdhci: Don't transfer any data when command time out
  hw/sd: sd: Actually perform the erase operation
  hw/sd: sd: Fix build error when DEBUG_SD is on

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-03-22 18:50:25 +00:00
Bin Meng
cffb446e8f hw/sd: sdhci: Reset the data pointer of s->fifo_buffer[] when a different block size is programmed
If the block size is programmed to a different value from the
previous one, reset the data pointer of s->fifo_buffer[] so that
s->fifo_buffer[] can be filled in using the new block size in
the next transfer.

With this fix, the following reproducer:

outl 0xcf8 0x80001010
outl 0xcfc 0xe0000000
outl 0xcf8 0x80001001
outl 0xcfc 0x06000000
write 0xe000002c 0x1 0x05
write 0xe0000005 0x1 0x02
write 0xe0000007 0x1 0x01
write 0xe0000028 0x1 0x10
write 0x0 0x1 0x23
write 0x2 0x1 0x08
write 0xe000000c 0x1 0x01
write 0xe000000e 0x1 0x20
write 0xe000000f 0x1 0x00
write 0xe000000c 0x1 0x32
write 0xe0000004 0x2 0x0200
write 0xe0000028 0x1 0x00
write 0xe0000003 0x1 0x40

cannot be reproduced with the following QEMU command line:

$ qemu-system-x86_64 -nographic -machine accel=qtest -m 512M \
      -nodefaults -device sdhci-pci,sd-spec-version=3 \
      -drive if=sd,index=0,file=null-co://,format=raw,id=mydrive \
      -device sd-card,drive=mydrive -qtest stdio

Cc: qemu-stable@nongnu.org
Fixes: CVE-2020-17380
Fixes: CVE-2020-25085
Fixes: CVE-2021-3409
Fixes: d7dfca0807 ("hw/sdhci: introduce standard SD host controller")
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Reported-by: Cornelius Aschermann (Ruhr-Universität Bochum)
Reported-by: Sergej Schumilo (Ruhr-Universität Bochum)
Reported-by: Simon Wörner (Ruhr-Universität Bochum)
Buglink: https://bugs.launchpad.net/qemu/+bug/1892960
Buglink: https://bugs.launchpad.net/qemu/+bug/1909418
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1928146
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
Message-Id: <20210303122639.20004-6-bmeng.cn@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2021-03-22 16:56:22 +01:00
Bin Meng
5cd7aa3451 hw/sd: sdhci: Limit block size only when SDHC_BLKSIZE register is writable
The codes to limit the maximum block size is only necessary when
SDHC_BLKSIZE register is writable.

Tested-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
Message-Id: <20210303122639.20004-5-bmeng.cn@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2021-03-22 16:56:19 +01:00
Bin Meng
bc6f28995f hw/sd: sdhci: Correctly set the controller status for ADMA
When an ADMA transfer is started, the codes forget to set the
controller status to indicate a transfer is in progress.

With this fix, the following 2 reproducers:

https://paste.debian.net/plain/1185136
https://paste.debian.net/plain/1185141

cannot be reproduced with the following QEMU command line:

$ qemu-system-x86_64 -nographic -machine accel=qtest -m 512M \
      -nodefaults -device sdhci-pci,sd-spec-version=3 \
      -drive if=sd,index=0,file=null-co://,format=raw,id=mydrive \
      -device sd-card,drive=mydrive -qtest stdio

Cc: qemu-stable@nongnu.org
Fixes: CVE-2020-17380
Fixes: CVE-2020-25085
Fixes: CVE-2021-3409
Fixes: d7dfca0807 ("hw/sdhci: introduce standard SD host controller")
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Reported-by: Cornelius Aschermann (Ruhr-Universität Bochum)
Reported-by: Sergej Schumilo (Ruhr-Universität Bochum)
Reported-by: Simon Wörner (Ruhr-Universität Bochum)
Buglink: https://bugs.launchpad.net/qemu/+bug/1892960
Buglink: https://bugs.launchpad.net/qemu/+bug/1909418
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1928146
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
Message-Id: <20210303122639.20004-4-bmeng.cn@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2021-03-22 16:56:13 +01:00
Bin Meng
8be45cc947 hw/sd: sdhci: Don't write to SDHC_SYSAD register when transfer is in progress
Per "SD Host Controller Standard Specification Version 7.00"
chapter 2.2.1 SDMA System Address Register:

This register can be accessed only if no transaction is executing
(i.e., after a transaction has stopped).

With this fix, the following reproducer:

outl 0xcf8 0x80001010
outl 0xcfc 0xfbefff00
outl 0xcf8 0x80001001
outl 0xcfc 0x06000000
write 0xfbefff2c 0x1 0x05
write 0xfbefff0f 0x1 0x37
write 0xfbefff0a 0x1 0x01
write 0xfbefff0f 0x1 0x29
write 0xfbefff0f 0x1 0x02
write 0xfbefff0f 0x1 0x03
write 0xfbefff04 0x1 0x01
write 0xfbefff05 0x1 0x01
write 0xfbefff07 0x1 0x02
write 0xfbefff0c 0x1 0x33
write 0xfbefff0e 0x1 0x20
write 0xfbefff0f 0x1 0x00
write 0xfbefff2a 0x1 0x01
write 0xfbefff0c 0x1 0x00
write 0xfbefff03 0x1 0x00
write 0xfbefff05 0x1 0x00
write 0xfbefff2a 0x1 0x02
write 0xfbefff0c 0x1 0x32
write 0xfbefff01 0x1 0x01
write 0xfbefff02 0x1 0x01
write 0xfbefff03 0x1 0x01

cannot be reproduced with the following QEMU command line:

$ qemu-system-x86_64 -nographic -machine accel=qtest -m 512M \
       -nodefaults -device sdhci-pci,sd-spec-version=3 \
       -drive if=sd,index=0,file=null-co://,format=raw,id=mydrive \
       -device sd-card,drive=mydrive -qtest stdio

Cc: qemu-stable@nongnu.org
Fixes: CVE-2020-17380
Fixes: CVE-2020-25085
Fixes: CVE-2021-3409
Fixes: d7dfca0807 ("hw/sdhci: introduce standard SD host controller")
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Reported-by: Cornelius Aschermann (Ruhr-Universität Bochum)
Reported-by: Sergej Schumilo (Ruhr-Universität Bochum)
Reported-by: Simon Wörner (Ruhr-Universität Bochum)
Buglink: https://bugs.launchpad.net/qemu/+bug/1892960
Buglink: https://bugs.launchpad.net/qemu/+bug/1909418
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1928146
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
Message-Id: <20210303122639.20004-3-bmeng.cn@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2021-03-22 16:56:07 +01:00
Bin Meng
b263d8f928 hw/sd: sdhci: Don't transfer any data when command time out
At the end of sdhci_send_command(), it starts a data transfer if the
command register indicates data is associated. But the data transfer
should only be initiated when the command execution has succeeded.

With this fix, the following reproducer:

outl 0xcf8 0x80001810
outl 0xcfc 0xe1068000
outl 0xcf8 0x80001804
outw 0xcfc 0x7
write 0xe106802c 0x1 0x0f
write 0xe1068004 0xc 0x2801d10101fffffbff28a384
write 0xe106800c 0x1f 0x9dacbbcad9e8f7061524334251606f7e8d9cabbac9d8e7f60514233241505f
write 0xe1068003 0x28 0x80d000251480d000252280d000253080d000253e80d000254c80d000255a80d000256880d0002576
write 0xe1068003 0x1 0xfe

cannot be reproduced with the following QEMU command line:

$ qemu-system-x86_64 -nographic -M pc-q35-5.0 \
      -device sdhci-pci,sd-spec-version=3 \
      -drive if=sd,index=0,file=null-co://,format=raw,id=mydrive \
      -device sd-card,drive=mydrive \
      -monitor none -serial none -qtest stdio

Cc: qemu-stable@nongnu.org
Fixes: CVE-2020-17380
Fixes: CVE-2020-25085
Fixes: CVE-2021-3409
Fixes: d7dfca0807 ("hw/sdhci: introduce standard SD host controller")
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Reported-by: Cornelius Aschermann (Ruhr-Universität Bochum)
Reported-by: Sergej Schumilo (Ruhr-Universität Bochum)
Reported-by: Simon Wörner (Ruhr-Universität Bochum)
Buglink: https://bugs.launchpad.net/qemu/+bug/1892960
Buglink: https://bugs.launchpad.net/qemu/+bug/1909418
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1928146
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
Message-Id: <20210303122639.20004-2-bmeng.cn@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2021-03-22 16:56:00 +01:00
Bin Meng
818a5cdcfc hw/sd: sd: Actually perform the erase operation
At present the sd_erase() does not erase the requested range of card
data to 0xFFs. Let's make the erase operation actually happen.

Signed-off-by: Bin Meng <bin.meng@windriver.com>
Message-Id: <1613811493-58815-1-git-send-email-bmeng.cn@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2021-03-22 15:34:29 +01:00
Bin Meng
a78d9f27b7 hw/sd: sd: Fix build error when DEBUG_SD is on
"qemu-common.h" should be included to provide the forward declaration
of qemu_hexdump() when DEBUG_SD is on.

Signed-off-by: Bin Meng <bin.meng@windriver.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210228050609.24779-1-bmeng.cn@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2021-03-22 15:34:29 +01:00
Peter Maydell
c95bd5ff16 MIPS patches queue
- Fix array overrun (Coverity CID 1450831)
 - Deprecate KVM TE (Trap-and-Emul)
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmBYpHgACgkQ4+MsLN6t
 wN6fKRAAka2tmVbZw+2FD+T1hBGykNp0ZdFRB0g4qh9gns/l7jWs2lv22Ny9koYE
 VuTpHHyMRrSweZeCBaRbX5lcE1gIPAvCmbl/eosVBRsNiqQs5SPAMGlbr6kbAu2Z
 TTMH7mUp9Q6I8lxiACInE+Q3K0gKPtFse/CjQ3oJ2tnUMvjk0GAn//oI4TZHv4jl
 lIvJ+d1Y/pO7KYFH6VHMeoaUhIyaZ+YpmUBF5yRU9iCxGxKdW7xZaYwcTMS2+jah
 g+C/zVXxanrbfiUQZLkDNqPjPOTsXkf55rSdKn8MK1bvP+xwpTm+mtS8OG5jr20G
 BGGKwUJ3WE+BKB3C/NfRnnVlSQnmpBRaMSm1o3PTFUV8Dt0BJ8Qfqaycq8cxHPsG
 2e+H1WrtmoswalSGXS8b5u4WQKTucDNekjPEdBS5PIOPaY1W3/fUFjlVXpZ5kphm
 pnEk9yXAFvA45pyMzf4I4fZqMFQ+GhRUOUwtQ2Zm209Fucnxtn5iIviH4QcJFt5v
 t9B6liQuuZ/cLY0bUSD/LysW0c9gDKyVCim/GISHoyOQkeQaMAht9JS9zOybmLNa
 ooXfN+rpG+5OqtGdzsfkCQ/rk8jL4G1NbizH/jPlcWQqXhSH80fOKtrdEi26dOMb
 mOujcGKc4ZE33BKdFG/wSBucqsv/OZmcUwFxk9FFvszMLdXfOyg=
 =5qRC
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/philmd/tags/mips-fixes-20210322' into staging

MIPS patches queue

- Fix array overrun (Coverity CID 1450831)
- Deprecate KVM TE (Trap-and-Emul)

# gpg: Signature made Mon 22 Mar 2021 14:06:48 GMT
# gpg:                using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [full]
# Primary key fingerprint: FAAB E75E 1291 7221 DCFD  6BB2 E3E3 2C2C DEAD C0DE

* remotes/philmd/tags/mips-fixes-20210322:
  target/mips: Deprecate Trap-and-Emul KVM support
  target/mips/mxu_translate.c: Fix array overrun for D16MIN/D16MAX

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-03-22 14:26:13 +00:00
Wang Liang
d2adda34a9 virtio-pmem: fix virtio_pmem_resp assign problem
ret in virtio_pmem_resp is a uint32_t variable, which should be assigned
using virtio_stl_p.

The kernel side driver does not guarantee virtio_pmem_resp to be initialized
to zero in advance, So sometimes the flush operation will fail.

Signed-off-by: Wang Liang <wangliangzz@inspur.com>
Message-Id: <20210317024145.271212-1-wangliangzz@126.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta@cloud.ionos.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-03-22 10:17:53 -04:00
Greg Kurz
db8a3772e3 vhost-user: Monitor slave channel in vhost_user_read()
Now that everything is in place, have the nested event loop to monitor
the slave channel. The source in the main event loop is destroyed and
recreated to ensure any pending even for the slave channel that was
previously detected is purged. This guarantees that the main loop
wont invoke slave_read() based on an event that was already handled
by the nested loop.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20210312092212.782255-7-groug@kaod.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-03-22 10:17:53 -04:00
Greg Kurz
a7f523c7d1 vhost-user: Introduce nested event loop in vhost_user_read()
A deadlock condition potentially exists if a vhost-user process needs
to request something to QEMU on the slave channel while processing a
vhost-user message.

This doesn't seem to affect any vhost-user implementation so far, but
this is currently biting the upcoming enablement of DAX with virtio-fs.
The issue is being observed when the guest does an emergency reboot while
a mapping still exits in the DAX window, which is very easy to get with
a busy enough workload (e.g. as simulated by blogbench [1]) :

- QEMU sends VHOST_USER_GET_VRING_BASE to virtiofsd.

- In order to complete the request, virtiofsd then asks QEMU to remove
  the mapping on the slave channel.

All these dialogs are synchronous, hence the deadlock.

As pointed out by Stefan Hajnoczi:

When QEMU's vhost-user master implementation sends a vhost-user protocol
message, vhost_user_read() does a "blocking" read during which slave_fd
is not monitored by QEMU.

The natural solution for this issue is an event loop. The main event
loop cannot be nested though since we have no guarantees that its
fd handlers are prepared for re-entrancy.

Introduce a new event loop that only monitors the chardev I/O for now
in vhost_user_read() and push the actual reading to a one-shot handler.
A subsequent patch will teach the loop to monitor and process messages
from the slave channel as well.

[1] https://github.com/jedisct1/Blogbench

Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20210312092212.782255-6-groug@kaod.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-03-22 10:17:53 -04:00
Greg Kurz
57dc02173c vhost-user: Convert slave channel to QIOChannelSocket
The slave channel is implemented with socketpair() : QEMU creates
the pair, passes one of the socket to virtiofsd and monitors the
other one with the main event loop using qemu_set_fd_handler().

In order to fix a potential deadlock between QEMU and a vhost-user
external process (e.g. virtiofsd with DAX), we want to be able to
monitor and service the slave channel while handling vhost-user
requests.

Prepare ground for this by converting the slave channel to be a
QIOChannelSocket. This will make monitoring of the slave channel
as simple as calling qio_channel_add_watch_source(). Since the
connection is already established between the two sockets, only
incoming I/O (G_IO_IN) and disconnect (G_IO_HUP) need to be
serviced.

This also allows to get rid of the ancillary data parsing since
QIOChannelSocket can do this for us. Note that the MSG_CTRUNC
check is dropped on the way because QIOChannelSocket ignores this
case. This isn't a problem since slave_read() provisions space for
8 file descriptors, but affected vhost-user slave protocol messages
generally only convey one. If for some reason a buggy implementation
passes more file descriptors, no need to break the connection, just
like we don't break it if some other type of ancillary data is
received : this isn't explicitely violating the protocol per-se so
it seems better to ignore it.

The current code errors out on short reads and writes. Use the
qio_channel_*_all() variants to address this on the way.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20210312092212.782255-5-groug@kaod.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-03-22 10:17:53 -04:00
Greg Kurz
de62e49460 vhost-user: Factor out duplicated slave_fd teardown code
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20210312092212.782255-4-groug@kaod.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-03-22 10:17:53 -04:00
Greg Kurz
9e06080bed vhost-user: Fix double-close on slave_read() error path
Some message types, e.g. VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG,
can convey file descriptors. These must be closed before returning
from slave_read() to avoid being leaked. This can currently be done
in two different places:

[1] just after the request has been processed

[2] on the error path, under the goto label err:

These path are supposed to be mutually exclusive but they are not
actually. If the VHOST_USER_NEED_REPLY_MASK flag was passed and the
sending of the reply fails, both [1] and [2] are performed with the
same descriptor values. This can potentially cause subtle bugs if one
of the descriptor was recycled by some other thread in the meantime.

This code duplication complicates rollback for no real good benefit.
Do the closing in a unique place, under a new fdcleanup: goto label
at the end of the function.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20210312092212.782255-3-groug@kaod.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-03-22 10:17:53 -04:00
Greg Kurz
a890557d5a vhost-user: Drop misleading EAGAIN checks in slave_read()
slave_read() checks EAGAIN when reading or writing to the socket
fails. This gives the impression that the slave channel is in
non-blocking mode, which is certainly not the case with the current
code base. And the rest of the code isn't actually ready to cope
with non-blocking I/O.

Just drop the checks everywhere in this function for the sake of
clarity.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20210312092212.782255-2-groug@kaod.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-03-22 10:17:53 -04:00
Laurent Vivier
0ab8c021c6 virtio: Fix virtio_mmio_read()/virtio_mmio_write()
Both functions don't check the personality of the interface (legacy or
modern) before accessing the configuration memory and always use
virtio_config_readX()/virtio_config_writeX().

With this patch, they now check the personality and in legacy mode
call virtio_config_readX()/virtio_config_writeX(), otherwise call
virtio_config_modern_readX()/virtio_config_modern_writeX().

This change has been tested with virtio-mmio guests (virt stretch/armhf and
virt sid/m68k) and virtio-pci guests (pseries RHEL-7.3/ppc64 and /ppc64le).

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210314200300.3259170-1-laurent@vivier.eu>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-03-22 10:17:53 -04:00
Jiaxun Yang
f071dc1f0c target/mips: Deprecate Trap-and-Emul KVM support
Upstream kernel had removed both host[1] and guest[2] support.

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/mips/linux.git/commit/?id=45c7e8af4a5e3f0bea4ac209eea34118dd57ac64
[2]: https://git.kernel.org/pub/scm/linux/kernel/git/mips/linux.git/commit/?id=a1515ec7204edca770c07929df8538fcdb03ad46

Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210317011235.7425-1-jiaxun.yang@flygoat.com>
[PMD: Specify mentioned kernel is Linux]
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2021-03-22 15:05:45 +01:00
Peter Maydell
20b42789aa target/mips/mxu_translate.c: Fix array overrun for D16MIN/D16MAX
Coverity reported (CID 1450831) an array overrun in
gen_mxu_D16MAX_D16MIN():

  1103     } else if (unlikely((XRb == 0) || (XRa == 0))) {
  ....
  1112         if (opc == OPC_MXU_D16MAX) {
  1113             tcg_gen_smax_i32(mxu_gpr[XRa - 1], t0, t1);
  1114         } else {
  1115             tcg_gen_smin_i32(mxu_gpr[XRa - 1], t0, t1);
  1116         }

>>> Overrunning array "mxu_gpr" of 15 8-byte elements at element
    index 4294967295 (byte offset 34359738367) using index "XRa - 1U"
    (which evaluates to 4294967295).

This happens because the code is confused about which of XRa, XRb and
XRc is the output, and which are the inputs.  XRa is the output, but
most of the conditions separating out different special cases are
written as if XRc is the output, with the result that we can end up
in the code path that assumes XRa is non-0 even when it is zero.

Fix the erroneous code, bringing it in to line with the structure
used in functions like gen_mxu_S32MAX_S32MIN() and
gen_mxu_Q8MAX_Q8MIN().

Fixes: CID 1450831
Fixes: bb84cbf385
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210316131353.4533-1-peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2021-03-22 15:05:41 +01:00
Peter Maydell
b184750926 -----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
 
 iQEcBAABAgAGBQJgWGTMAAoJEO8Ells5jWIRLUQIAJ6xxOKldqyo9n9N2vP+6O1J
 lLSC0NT2d0ACptl/8o+KKlf9z2n7o/SKykLUXwCjgLeT2xTKhJ+p8tPOXJ4gns3D
 U3GGc20gUkuGwNl0g2Rr2Yz4EJXlxx5Sl9LJz9quyx0Yp0SV8fYgQpj3mLgRye6T
 hbVSDnQiSyo1iSx+Q/Qc6+Gr4a+sV4pExhrVP1n61JFCB95DUMNqjBsHv+urowHh
 WCN9JbHuiZRHeIGIZujJxzMrh8jlP4pPFIrXvsdwHm/xJlVTgMXFvfIjfi3IrYWU
 NbI0PluSfPbeAB8V+aPb9386K4UFpEEHJipIm+4HSVpVsAuoRVnEVoIWGubXfWo=
 =9Tij
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging

# gpg: Signature made Mon 22 Mar 2021 09:35:08 GMT
# gpg:                using RSA key EF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F  3562 EF04 965B 398D 6211

* remotes/jasowang/tags/net-pull-request:
  net/eth: Add an assert() and invert if() statement to simplify code
  net/eth: Read ip6_ext_hdr_routing buffer before accessing it
  net/eth: Check iovec has enough data earlier
  net/eth: Check size earlier in _eth_get_rss_ex_dst_addr()
  net/eth: Better describe _eth_get_rss_ex_dst_addr's offset argument
  net/eth: Simplify _eth_get_rss_ex_dst_addr()
  net/eth: Use correct in6_address offset in _eth_get_rss_ex_dst_addr()
  net/colo-compare.c: Optimize removal of secondary packet
  net/colo-compare.c: Fix memory leak for non-tcp packet
  hw/net: virtio-net: Initialize nc->do_not_pad to true
  net: Pad short frames to minimum size before sending from SLiRP/TAP
  net: Add a 'do_not_pad" to NetClientState
  net: eth: Add a helper to pad a short Ethernet frame

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-03-22 11:24:55 +00:00
Peter Maydell
f0f20022a0 * Small fixes for the unit tests
* Compilation fixes for Illumos et al.
 * Update the FreeBSD VM to 12.2
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmBXeZ4RHHRodXRoQHJl
 ZGhhdC5jb20ACgkQLtnXdP5wLbXrtw//SWtVgXYlifD8JTJcF40CcXB4/hFzPPvs
 syFBtm7mFikgkiXa0oKPVpiHeEmvYvE7F9H30xbxEb3eqg194K+KBHOL8PBH03Mv
 3P7IjEs+MOIUvpYcCwoJWS1r9Dpw4IMfGSljdkYWW4iO9TJBt+xOOIQy68eJnXkw
 eEEZ9NjBq6cSKSEoIyaRiGIi1OsLMGatsXEgKhaKBiLaKFagNLEzTOxKafSyJkWt
 jRgESoLbUH3EbkmKrUZGYhVpKgFiXSJMz1j35ujqVhMY3FeCi69xjcvpiXRN1NMS
 kQO6CKkkVa5WpCPhqUPBI5pKzIAax180v7Qes+hlOgecNCjy76wVY2ZJVUf1kuxL
 aHdgs77Tmq6dwCjuDu2Jon78Rge89cKCIXCDI70BYQUv7zHxwxVyO6Fwn889k3D/
 7xIsze0oELbHYSperMm2wyOmswevyUOGCCBprtgrDUW/CtFMBQBHI5kvs4PV0dx9
 GssCl9xkzNtGqeH/pHy5gr8NkYxSfAQRdv4NQen4lrpZpGDltOa3eJ+EDuIYYLdR
 fsPMVuc5AKuw4lgNyIzPrIq7SAhbhlYxhsNLKFh+XpvEF3/d0OHkT02o10iXH9Py
 I2S4qdVbKp5ZYa8Fj9fTksxAU3qn0DR8zAMh5RIFkce0u6M05UZ/2KaOW4WQVEbJ
 NFRresAmcgw=
 =fkgL
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/thuth-gitlab/tags/pull-request-2021-03-21' into staging

* Small fixes for the unit tests
* Compilation fixes for Illumos et al.
* Update the FreeBSD VM to 12.2

# gpg: Signature made Sun 21 Mar 2021 16:51:42 GMT
# gpg:                using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg:                issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg:                 aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg:                 aka "Thomas Huth <huth@tuxfamily.org>" [full]
# gpg:                 aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3  EAB9 2ED9 D774 FE70 2DB5

* remotes/thuth-gitlab/tags/pull-request-2021-03-21:
  FreeBSD: Upgrade to 12.2 release
  contrib: ivshmem client and server build fix for SunOS.
  configure: fix for SunOS based systems
  tests/unit/test-block-iothread: fix maybe-uninitialized error on GCC 11
  docs/devel/testing.rst: Fix references to unit tests

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-03-22 10:05:45 +00:00
Philippe Mathieu-Daudé
c7274b5ef4 net/eth: Add an assert() and invert if() statement to simplify code
To simplify the function body, invert the if() statement, returning
earlier.
Since we already checked there is enough data in the iovec buffer,
simply add an assert() call to consume the bytes_read variable.

Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Miroslav Rezanina <mrezanin@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2021-03-22 17:34:31 +08:00
Philippe Mathieu-Daudé
7d6a4f123e net/eth: Read ip6_ext_hdr_routing buffer before accessing it
We can't know the caller read enough data in the memory pointed
by ext_hdr to cast it as a ip6_ext_hdr_routing.
Declare rt_hdr on the stack and fill it again from the iovec.

Since we already checked there is enough data in the iovec buffer,
simply add an assert() call to consume the bytes_read variable.

This fix a 2 bytes buffer overrun in eth_parse_ipv6_hdr() reported
by QEMU fuzzer:

  $ cat << EOF | ./qemu-system-i386 -M pc-q35-5.0 \
    -accel qtest -monitor none \
    -serial none -nographic -qtest stdio
  outl 0xcf8 0x80001010
  outl 0xcfc 0xe1020000
  outl 0xcf8 0x80001004
  outw 0xcfc 0x7
  write 0x25 0x1 0x86
  write 0x26 0x1 0xdd
  write 0x4f 0x1 0x2b
  write 0xe1020030 0x4 0x190002e1
  write 0xe102003a 0x2 0x0807
  write 0xe1020048 0x4 0x12077cdd
  write 0xe1020400 0x4 0xba077cdd
  write 0xe1020420 0x4 0x190002e1
  write 0xe1020428 0x4 0x3509d807
  write 0xe1020438 0x1 0xe2
  EOF
  =================================================================
  ==2859770==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffdef904902 at pc 0x561ceefa78de bp 0x7ffdef904820 sp 0x7ffdef904818
  READ of size 1 at 0x7ffdef904902 thread T0
      #0 0x561ceefa78dd in _eth_get_rss_ex_dst_addr net/eth.c:410:17
      #1 0x561ceefa41fb in eth_parse_ipv6_hdr net/eth.c:532:17
      #2 0x561cef7de639 in net_tx_pkt_parse_headers hw/net/net_tx_pkt.c:228:14
      #3 0x561cef7dbef4 in net_tx_pkt_parse hw/net/net_tx_pkt.c:273:9
      #4 0x561ceec29f22 in e1000e_process_tx_desc hw/net/e1000e_core.c:730:29
      #5 0x561ceec28eac in e1000e_start_xmit hw/net/e1000e_core.c:927:9
      #6 0x561ceec1baab in e1000e_set_tdt hw/net/e1000e_core.c:2444:9
      #7 0x561ceebf300e in e1000e_core_write hw/net/e1000e_core.c:3256:9
      #8 0x561cef3cd4cd in e1000e_mmio_write hw/net/e1000e.c:110:5

  Address 0x7ffdef904902 is located in stack of thread T0 at offset 34 in frame
      #0 0x561ceefa320f in eth_parse_ipv6_hdr net/eth.c:486

    This frame has 1 object(s):
      [32, 34) 'ext_hdr' (line 487) <== Memory access at offset 34 overflows this variable
  HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
        (longjmp and C++ exceptions *are* supported)
  SUMMARY: AddressSanitizer: stack-buffer-overflow net/eth.c:410:17 in _eth_get_rss_ex_dst_addr
  Shadow bytes around the buggy address:
    0x10003df188d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x10003df188e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x10003df188f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x10003df18900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x10003df18910: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
  =>0x10003df18920:[02]f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
    0x10003df18930: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x10003df18940: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x10003df18950: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x10003df18960: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x10003df18970: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  Shadow byte legend (one shadow byte represents 8 application bytes):
    Addressable:           00
    Partially addressable: 01 02 03 04 05 06 07
    Stack left redzone:      f1
    Stack right redzone:     f3
  ==2859770==ABORTING

Add the corresponding qtest case with the fuzzer reproducer.

FWIW GCC 11 similarly reported:

  net/eth.c: In function 'eth_parse_ipv6_hdr':
  net/eth.c:410:15: error: array subscript 'struct ip6_ext_hdr_routing[0]' is partly outside array bounds of 'struct ip6_ext_hdr[1]' [-Werror=array-bounds]
    410 |     if ((rthdr->rtype == 2) && (rthdr->segleft == 1)) {
        |          ~~~~~^~~~~~~
  net/eth.c:485:24: note: while referencing 'ext_hdr'
    485 |     struct ip6_ext_hdr ext_hdr;
        |                        ^~~~~~~
  net/eth.c:410:38: error: array subscript 'struct ip6_ext_hdr_routing[0]' is partly outside array bounds of 'struct ip6_ext_hdr[1]' [-Werror=array-bounds]
    410 |     if ((rthdr->rtype == 2) && (rthdr->segleft == 1)) {
        |                                 ~~~~~^~~~~~~~~
  net/eth.c:485:24: note: while referencing 'ext_hdr'
    485 |     struct ip6_ext_hdr ext_hdr;
        |                        ^~~~~~~

Cc: qemu-stable@nongnu.org
Buglink: https://bugs.launchpad.net/qemu/+bug/1879531
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Reported-by: Miroslav Rezanina <mrezanin@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Miroslav Rezanina <mrezanin@redhat.com>
Fixes: eb700029c7 ("net_pkt: Extend packet abstraction as required by e1000e functionality")
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2021-03-22 17:34:31 +08:00
Philippe Mathieu-Daudé
ef763586c9 net/eth: Check iovec has enough data earlier
We want to check fields from ip6_ext_hdr_routing structure
and if correct read the full in6_address. Let's directly check
if our iovec contains enough data for everything, else return
early.

Suggested-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Miroslav Rezanina <mrezanin@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2021-03-22 17:34:31 +08:00
Philippe Mathieu-Daudé
6f10f77dcd net/eth: Check size earlier in _eth_get_rss_ex_dst_addr()
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Miroslav Rezanina <mrezanin@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2021-03-22 17:34:31 +08:00
Philippe Mathieu-Daudé
dbd8d3f959 net/eth: Better describe _eth_get_rss_ex_dst_addr's offset argument
The 'offset' argument represents the offset to the ip6_ext_hdr
header, rename it as 'ext_hdr_offset'.

Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Miroslav Rezanina <mrezanin@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2021-03-22 17:34:31 +08:00
Philippe Mathieu-Daudé
38462440ca net/eth: Simplify _eth_get_rss_ex_dst_addr()
The length field is already contained in the ip6_ext_hdr structure.
Check it direcly in eth_parse_ipv6_hdr() before calling
_eth_get_rss_ex_dst_addr(), which gets a bit simplified.

Reviewed-by: Miroslav Rezanina <mrezanin@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2021-03-22 17:34:31 +08:00
Philippe Mathieu-Daudé
b565b44ec2 net/eth: Use correct in6_address offset in _eth_get_rss_ex_dst_addr()
The in6_address comes after the ip6_ext_hdr_routing header,
not after the ip6_ext_hdr one. Fix the offset.

Cc: qemu-stable@nongnu.org
Reported-by: Stefano Garzarella <sgarzare@redhat.com>
Fixes: eb700029c7 ("net_pkt: Extend packet abstraction as required by e1000e functionality")
Reviewed-by: Miroslav Rezanina <mrezanin@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2021-03-22 17:34:31 +08:00
Lukas Straub
739128e43b net/colo-compare.c: Optimize removal of secondary packet
g_queue_remove needs to look up the list entry first, but we
already have it as result and can remove it directly with
g_queue_delete_link.

Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Reviewed-by: Zhang Chen <chen.zhang@intel.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2021-03-22 17:34:31 +08:00
Lukas Straub
9162ed6649 net/colo-compare.c: Fix memory leak for non-tcp packet
Additional to removing the packet from the secondary queue,
we also need to free it.

Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Reviewed-by: Zhang Chen <chen.zhang@intel.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2021-03-22 17:34:31 +08:00
Bin Meng
d4c6293041 hw/net: virtio-net: Initialize nc->do_not_pad to true
For virtio-net, there is no need to pad the Ethernet frame size to
60 bytes before sending to it.

Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2021-03-22 17:34:31 +08:00
Bin Meng
969e50b61a net: Pad short frames to minimum size before sending from SLiRP/TAP
The minimum Ethernet frame length is 60 bytes. For short frames with
smaller length like ARP packets (only 42 bytes), on a real world NIC
it can choose either padding its length to the minimum required 60
bytes, or sending it out directly to the wire. Such behavior can be
hardcoded or controled by a register bit. Similarly on the receive
path, NICs can choose either dropping such short frames directly or
handing them over to software to handle.

On the other hand, for the network backends like SLiRP/TAP, they
don't expose a way to control the short frame behavior. As of today
they just send/receive data from/to the other end connected to them,
which means any sized packet is acceptable. So they can send and
receive short frames without any problem. It is observed that ARP
packets sent from SLiRP/TAP are 42 bytes, and SLiRP/TAP just send
these ARP packets to the other end which might be a NIC model that
does not allow short frames to pass through.

To provide better compatibility, for packets sent from QEMU network
backends like SLiRP/TAP, we change to pad short frames before sending
it out to the other end, if the other end does not forbid it via the
nc->do_not_pad flag. This ensures a backend as an Ethernet sender
does not violate the spec. But with this change, the behavior of
dropping short frames from SLiRP/TAP interfaces in the NIC model
cannot be emulated because it always receives a packet that is spec
complaint. The capability of sending short frames from NIC models is
still supported and short frames can still pass through SLiRP/TAP.

This commit should be able to fix the issue as reported with some
NIC models before, that ARP requests get dropped, preventing the
guest from becoming visible on the network. It was workarounded in
these NIC models on the receive path, that when a short frame is
received, it is padded up to 60 bytes.

The following 2 commits seem to be the one to workaround this issue
in e1000 and vmxenet3 before, and should probably be reverted.

  commit 78aeb23ede ("e1000: Pad short frames to minimum size (60 bytes)")
  commit 40a87c6c9b ("vmxnet3: Pad short frames to minimum size (60 bytes)")

Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2021-03-22 17:34:31 +08:00
Bin Meng
935344bed6 net: Add a 'do_not_pad" to NetClientState
This adds a flag in NetClientState, so that a net client can tell
its peer that the packets do not need to be padded to the minimum
size of an Ethernet frame (60 bytes) before sending to it.

Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2021-03-22 17:34:31 +08:00
Bin Meng
af774513f7 net: eth: Add a helper to pad a short Ethernet frame
Add a helper to pad a short Ethernet frame to the minimum required
length, which can be used by backends' code.

Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2021-03-22 17:34:31 +08:00
Warner Losh
262fd27392 FreeBSD: Upgrade to 12.2 release
FreeBSD 12.1 has reached end of life. Use 12.2 instead so that FreeBSD's
project's packages will work.  Update which timezone to pick. Work around a QEMU
bug that incorrectly raises an exception on a CRC32 instruction with the FPU
disabled.  The qemu bug is described here:
https://www.mail-archive.com/qemu-devel@nongnu.org/msg784158.html

Signed-off-by: Warner Losh <imp@bsdimp.com>
Message-Id: <20210307155654.993-2-imp@bsdimp.com>
[thuth: Disable gnutls to work-around a problem with libtasn1]
Signed-off-by: Thomas Huth <thuth@redhat.com>
2021-03-20 06:33:17 +01:00
David CARLIER
7e47061d93 contrib: ivshmem client and server build fix for SunOS.
sun is a macro on these systems, thus renaming the variables on the
client and server.

Signed-off-by: David Carlier <devnexen@gmail.com>
Message-Id: <CA+XhMqzHPG5ezqY-YxbA+tMoadA3VDFWBX8_LaLC5YsQOMrz+A@mail.gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2021-03-20 06:33:17 +01:00
David CARLIER
630d86b7f2 configure: fix for SunOS based systems
local directive make the configure fails on these systems.

Signed-off-by: David Carlier <devnexen@gmail.com>
Message-Id: <CA+XhMqy95D8X-QvBcEfL=a-Rkgy6aJtheskkqZwQkreru+T_Wg@mail.gmail.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2021-03-20 06:33:17 +01:00
Emanuele Giuseppe Esposito
d8b2e5639a tests/unit/test-block-iothread: fix maybe-uninitialized error on GCC 11
When building qemu with GCC 11, test-block-iothread produces the following
warning:

../tests/unit/test-block-iothread.c:148:11: error: ‘buf’ may be used
uninitialized [-Werror=maybe-uninitialized]

This is caused by buf[512] left uninitialized and passed to
bdrv_save_vmstate() that expects a const uint8_t *, so the compiler
assumes it will be read and expects the parameter to be initialized.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210319112218.49609-1-eesposit@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2021-03-20 06:17:09 +01:00