Commit Graph

1471 Commits

Author SHA1 Message Date
Akihiko Odaki
1c5618a267 vmxnet3: Reset packet state after emptying Tx queue
Keeping Tx packet state after the transmit queue is emptied but this
behavior is unreliable as the state can be reset anytime the migration
happens.

Always reset Tx packet state always after the queue is emptied.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23 15:20:15 +08:00
Akihiko Odaki
5c86734025 e1000e: Reset packet state after emptying Tx queue
Keeping Tx packet state after the transmit queue is emptied has some
problems:
- The datasheet says the descriptors can be reused after the transmit
  queue is emptied, but the Tx packet state may keep references to them.
- The Tx packet state cannot be migrated so it can be reset anytime the
  migration happens.

Always reset Tx packet state always after the queue is emptied.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23 15:20:15 +08:00
Akihiko Odaki
5d92e88a50 igb: Read DCMD.VLE of the first Tx descriptor
Section 7.2.2.3 Advanced Transmit Data Descriptor says:
> For frames that spans multiple descriptors, all fields apart from
> DCMD.EOP, DCMD.RS, DCMD.DEXT, DTALEN, Address and DTYP are valid only
> in the first descriptors and are ignored in the subsequent ones.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23 15:20:15 +08:00
Akihiko Odaki
a09cc21e80 igb: Remove goto
The goto is a bit confusing as it changes the control flow only if L4
protocol is not recognized. It is also different from e1000e, and
noisy when comparing e1000e and igb.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23 15:20:15 +08:00
Akihiko Odaki
d5241351bd igb: Always log status after building rx metadata
Without this change, the status flags may not be traced e.g. if checksum
offloading is disabled.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23 15:20:15 +08:00
Akihiko Odaki
8b876b99a1 e1000e: Always log status after building rx metadata
Without this change, the status flags may not be traced e.g. if checksum
offloading is disabled.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23 15:20:15 +08:00
Akihiko Odaki
5052fc9eb1 e1000x: Rename TcpIpv6 into TcpIpv6Ex
e1000e and igb employs NetPktRssIpV6TcpEx for RSS hash if TcpIpv6 MRQC
bit is set. Moreover, igb also has a MRQC bit for NetPktRssIpV6Tcp
though it is not implemented yet. Rename it to TcpIpv6Ex to avoid
confusion.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23 15:20:15 +08:00
Akihiko Odaki
7434951442 e1000x: Take CRC into consideration for size check
Section 13.7.15 Receive Length Error Count says:
>  Packets over 1522 bytes are oversized if LongPacketEnable is 0b
> (RCTL.LPE). If LongPacketEnable (LPE) is 1b, then an incoming packet
> is considered oversized if it exceeds 16384 bytes.

> These lengths are based on bytes in the received packet from
> <Destination Address> through <CRC>, inclusively.

As QEMU processes packets without CRC, the number of bytes for CRC
need to be subtracted. This change adds some size definitions to be used
to derive the new size thresholds to eth.h.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23 15:20:15 +08:00
Akihiko Odaki
e9e5b93069 e1000x: Share more Rx filtering logic
This saves some code and enables tracepoint for e1000's VLAN filtering.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23 15:20:15 +08:00
Akihiko Odaki
0b11783014 net/eth: Rename eth_setup_vlan_headers_ex
The old eth_setup_vlan_headers has no user so remove it and rename
eth_setup_vlan_headers_ex.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23 15:20:15 +08:00
Akihiko Odaki
f0f3ac41d5 hw/net/net_tx_pkt: Remove net_rx_pkt_get_l4_info
This function is not used.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23 15:20:15 +08:00
Akihiko Odaki
dc9ef1bf45 igb: Always copy ethernet header
igb_receive_internal() used to check the iov length to determine
copy the iovs to a contiguous buffer, but the check is flawed in two
ways:
- It does not ensure that iovcnt > 0.
- It does not take virtio-net header into consideration.

The size of this copy is just 22 octets, which can be even less than
the code size required for checks. This (wrong) optimization is probably
not worth so just remove it. Removing this also allows igb to assume
aligned accesses for the ethernet header.

Fixes: 3a977deebe ("Intrdocue igb device emulation")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23 15:20:15 +08:00
Akihiko Odaki
310a128eae e1000e: Always copy ethernet header
e1000e_receive_internal() used to check the iov length to determine
copy the iovs to a contiguous buffer, but the check is flawed in two
ways:
- It does not ensure that iovcnt > 0.
- It does not take virtio-net header into consideration.

The size of this copy is just 18 octets, which can be even less than
the code size required for checks. This (wrong) optimization is probably
not worth so just remove it.

Fixes: 6f3fbe4ed0 ("net: Introduce e1000e device emulation")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23 15:20:15 +08:00
Akihiko Odaki
2f0fa232b8 net/net_rx_pkt: Use iovec for net_rx_pkt_set_protocols()
igb does not properly ensure the buffer passed to
net_rx_pkt_set_protocols() is contiguous for the entire L2/L3/L4 header.
Allow it to pass scattered data to net_rx_pkt_set_protocols().

Fixes: 3a977deebe ("Intrdocue igb device emulation")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23 15:20:15 +08:00
Akihiko Odaki
f0b1df5c45 igb: Clear IMS bits when committing ICR access
The datasheet says contradicting statements regarding ICR accesses so it
is not reliable to determine the behavior of ICR accesses. However,
e1000e does clear IMS bits when reading ICR accesses and Linux also
expects ICR accesses will clear IMS bits according to:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/intel/igb/igb_main.c?h=v6.2#n8048

Fixes: 3a977deebe ("Intrdocue igb device emulation")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23 15:20:15 +08:00
Akihiko Odaki
e209716749 igb: Do not require CTRL.VME for tx VLAN tagging
While the datasheet of e1000e says it checks CTRL.VME for tx VLAN
tagging, igb's datasheet has no such statements. It also says for
"CTRL.VLE":
> This register only affects the VLAN Strip in Rx it does not have any
> influence in the Tx path in the 82576.
(Appendix A. Changes from the 82575)

There is no "CTRL.VLE" so it is more likely that it is a mistake of
CTRL.VME.

Fixes: fba7c3b788 ("igb: respect VMVIR and VMOLR for VLAN")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23 15:20:15 +08:00
Akihiko Odaki
ed447c60b3 igb: Fix Rx packet type encoding
igb's advanced descriptor uses a packet type encoding different from
one used in e1000e's extended descriptor. Fix the logic to encode
Rx packet type accordingly.

Fixes: 3a977deebe ("Intrdocue igb device emulation")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23 15:20:15 +08:00
Akihiko Odaki
f3f9b726af e1000x: Fix BPRC and MPRC
Before this change, e1000 and the common code updated BPRC and MPRC
depending on the matched filter, but e1000e and igb decided to update
those counters by deriving the packet type independently. This
inconsistency caused a multicast packet to be counted twice.

Updating BPRC and MPRC depending on are fundamentally flawed anyway as
a filter can be used for different types of packets. For example, it is
possible to filter broadcast packets with MTA.

Always determine what counters to update by inspecting the packets.

Fixes: 3b27430177 ("e1000: Implementing various counters")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23 15:20:15 +08:00
Akihiko Odaki
a51db58027 hw/net/net_tx_pkt: Decouple interface from PCI
This allows to use the network packet abstractions even if PCI is not
used.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23 15:20:15 +08:00
Akihiko Odaki
163246e1ce hw/net/net_tx_pkt: Decouple implementation from PCI
This is intended to be followed by another change for the interface.
It also fixes the leak of memory mapping when the specified memory is
partially mapped.

Fixes: e263cd49c7 ("Packet abstraction for VMWARE network devices")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23 15:20:15 +08:00
timothee.cocault@gmail.com
8d689f6aae e1000e: Fix tx/rx counters
The bytes and packets counter registers are cleared on read.

Copying the "total counter" registers to the "good counter" registers has
side effects.
If the "total" register is never read by the OS, it only gets incremented.
This leads to exponential growth of the "good" register.

This commit increments the counters individually to avoid this.

Signed-off-by: Timothée Cocault <timothee.cocault@gmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23 15:20:15 +08:00
Eugenio Pérez
1fac00f70b virtio-net: not enable vq reset feature unconditionally
The commit 93a97dc520 ("virtio-net: enable vq reset feature") enables
unconditionally vq reset feature as long as the device is emulated.
This makes impossible to actually disable the feature, and it causes
migration problems from qemu version previous than 7.2.

The entire final commit is unneeded as device system already enable or
disable the feature properly.

This reverts commit 93a97dc520.
Fixes: 93a97dc520 ("virtio-net: enable vq reset feature")
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>

Message-Id: <20230504101447.389398-1-eperezma@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-05-19 10:30:46 -04:00
Thomas Huth
a9ea0a9c93 hw/net: Move xilinx_ethlite.c to the target-independent source set
Now that the tswap() functions are available for target-independent
code, too, we can move xilinx_ethlite.c from specific_ss to softmmu_ss
to avoid that we have to compile this file multiple times.

Message-Id: <20230508120314.59274-1-thuth@redhat.com>
Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-05-16 09:14:18 +02:00
Peter Maydell
a4ae17e5ec hw/net/allwinner-sun8i-emac: Correctly byteswap descriptor fields
In allwinner-sun8i-emac we just read directly from guest memory into
a host FrameDescriptor struct and back.  This only works on
little-endian hosts.  Reading and writing of descriptors is already
abstracted into functions; make those functions also handle the
byte-swapping so that TransferDescriptor structs as seen by the rest
of the code are always in host-order, and fix two places that were
doing ad-hoc descriptor reading without using the functions.

Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20230424165053.1428857-3-peter.maydell@linaro.org
2023-05-02 15:47:41 +01:00
Peter Maydell
d565f58b38 hw/net/msf2-emac: Don't modify descriptor in-place in emac_store_desc()
The msf2-emac ethernet controller has functions emac_load_desc() and
emac_store_desc() which read and write the in-memory descriptor
blocks and handle conversion between guest and host endianness.

As currently written, emac_store_desc() does the endianness
conversion in-place; this means that it effectively consumes the
input EmacDesc struct, because on a big-endian host the fields will
be overwritten with the little-endian versions of their values.
Unfortunately, in all the callsites the code continues to access
fields in the EmacDesc struct after it has called emac_store_desc()
-- specifically, it looks at the d.next field.

The effect of this is that on a big-endian host networking doesn't
work because the address of the next descriptor is corrupted.

We could fix this by making the callsite avoid using the struct; but
it's more robust to have emac_store_desc() leave its input alone.

(emac_load_desc() also does an in-place conversion, but here this is
fine, because the function is supposed to be initializing the
struct.)

Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 20230424151919.1333299-1-peter.maydell@linaro.org
2023-05-02 15:47:40 +01:00
Patrick Venture
471896381a hw/net: npcm7xx_emc: set MAC in register space
The MAC address set from Qemu wasn't being saved into the register space.

Reviewed-by: Hao Wu <wuhaotsh@google.com>
Signed-off-by: Patrick Venture <venture@google.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: moved variable declaration to top of function]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2023-05-02 15:47:39 +01:00
Alexander Bulekov
f63192b054 hw: replace most qemu_bh_new calls with qemu_bh_new_guarded
This protects devices from bh->mmio reentrancy issues.

Thanks: Thomas Huth <thuth@redhat.com> for diagnosing OS X test failure.
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230427211013.2994127-5-alxndr@bu.edu>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-04-28 11:31:54 +02:00
Guenter Roeck
df3f5efe10 hw/net/imx_fec: Support two Ethernet interfaces connected to single MDIO bus
The SOC on i.MX6UL and i.MX7 has 2 Ethernet interfaces. The PHY on each may
be connected to separate MDIO busses, or both may be connected on the same
MDIO bus using different PHY addresses. Commit 461c51ad42 ("Add a phy-num
property to the i.MX FEC emulator") added support for specifying PHY
addresses, but it did not provide support for linking the second PHY on
a given MDIO bus to the other Ethernet interface.

To be able to support two PHY instances on a single MDIO bus, two properties
are needed: First, there needs to be a flag indicating if the MDIO bus on
a given Ethernet interface is connected. If not, attempts to read from this
bus must always return 0xffff. Implement this property as phy-connected.
Second, if the MDIO bus on an interface is active, it needs a link to the
consumer interface to be able to provide PHY access for it. Implement this
property as phy-consumer.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Message-id: 20230315145248.1639364-2-linux@roeck-us.net
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2023-04-20 10:25:43 +01:00
Sriram Yagnaraman
fba7c3b788 igb: respect VMVIR and VMOLR for VLAN
Add support for stripping/inserting VLAN for VFs.

Had to move CSUM calculation back into the for loop, since packet data
is pulled inside the loop based on strip VLAN decision for every VF.

net_rx_pkt_fix_l4_csum should be extended to accept a buffer instead for
igb. Work for a future patch.

Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-28 13:10:55 +08:00
Sriram Yagnaraman
7581baed88 igb: implement VF Tx and Rx stats
Please note that loopback counters for VM to VM traffic is not
implemented yet: VFGOTLBC, VFGPTLBC, VFGORLBC and VFGPRLBC.

Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-28 13:10:55 +08:00
Sriram Yagnaraman
3c2e0a6853 igb: respect E1000_VMOLR_RSSE
RSS for VFs is only enabled if VMOLR[n].RSSE is set.

Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-28 13:10:55 +08:00
Sriram Yagnaraman
5f12d70304 igb: check oversized packets for VMDq
Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-28 13:10:55 +08:00
Sriram Yagnaraman
3269ebb3e0 igb: implement VFRE and VFTE registers
Also introduce:
- Checks for RXDCTL/TXDCTL queue enable bits
- IGB_NUM_VM_POOLS enum (Sec 1.5: Table 1-7)

Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-28 13:10:55 +08:00
Sriram Yagnaraman
1c1e649761 igb: add ICR_RXDW
IGB uses RXDW ICR bit to indicate that rx descriptor has been written
back. This is the same as RXT0 bit in older HW.

Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-28 13:10:55 +08:00
Sriram Yagnaraman
2e68546a43 igb: handle PF/VF reset properly
Use PFRSTD to reset RSTI bit for VFs, and raise VFLRE interrupt when VF
is reset.

Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-28 13:10:55 +08:00
Akihiko Odaki
2a5f744ef2 hw/net/net_tx_pkt: Align l3_hdr
Align the l3_hdr member of NetTxPkt by defining it as a union of
ip_header, ip6_header, and an array of octets.

Fixes: e263cd49c7 ("Packet abstraction for VMWARE network devices")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1544
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-28 13:10:55 +08:00
Akihiko Odaki
4cf3a63849 hw/net/net_tx_pkt: Ignore ECN bit
No segmentation should be performed if gso type is
VIRTIO_NET_HDR_GSO_NONE even if ECN bit is set.

Fixes: e263cd49c7 ("Packet abstraction for VMWARE network devices")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1544
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-28 13:10:55 +08:00
Akihiko Odaki
f4fdaf009c igb: Fix DMA requester specification for Tx packet
igb used to specify the PF as DMA requester when reading Tx packets.
This made Tx requests from VFs to be performed on the address space of
the PF, defeating the purpose of SR-IOV. Add some logic to change the
requester depending on the queue, which can be assigned to a VF.

Fixes: 3a977deebe ("Intrdocue igb device emulation")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-28 13:10:55 +08:00
Akihiko Odaki
212f7b1dac igb: Save more Tx states
The current implementation of igb uses only part of a advanced Tx
context descriptor and first data descriptor because it misses some
features and sniffs the trait of the packet instead of respecting the
packet type specified in the descriptor. However, we will certainly
need the entire Tx context descriptor when we update igb to respect
these ignored fields. Save the entire context descriptor and first
data descriptor except the buffer address to prepare for such a change.

This also introduces the distinction of contexts with different
indexes, which was not present in e1000e but in igb.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-28 13:10:55 +08:00
Akihiko Odaki
3a977deebe Intrdocue igb device emulation
This change introduces emulation for the Intel 82576 adapter, AKA igb.
The details of the device will be provided by the documentation that
will follow this change.

This initial implementation of igb does not cover the full feature set,
but it selectively implements changes necessary to pass tests of Linut
Test Project, and Windows HLK. The below is the list of the implemented
changes; anything not listed here is not implemented:

New features:
- igb advanced descriptor handling
- Support of 16 queues
- SRRCTL.BSIZEPACKET register field
- SRRCTL.RDMTS register field
- Tx descriptor completion writeback
- Extended RA registers
- VMDq feature
    - MRQC "Multiple Receive Queues Enable" register field
    - DTXSWC.Loopback_en register field
    - VMOLR.ROMPE register field
    - VMOLR.AUPE register field
    - VLVF.VLAN_id register field
    - VLVF.VI_En register field
- VF
    - Mailbox
    - Reset
- Extended interrupt registers
- Default values for IGP01E1000 PHY registers

Removed features:
- e1000e extended descriptor
- e1000e packet split descriptor
- Legacy descriptor
- PHY register paging
- MAC Registers
    - Legacy interrupt timer registers
    - Legacy EEPROM registers
    - PBA/POEM registers
    - RSRPD register
    - RFCTL.ACKDIS
    - RCTL.DTYPE
- Copper PHY registers

Misc:
- VET register format
- ICR register format

Signed-off-by: Gal Hammer <gal.hammer@sap.com>
Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
[Jason: don't abort on msi(x)_init()]
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10 15:35:38 +08:00
Akihiko Odaki
c9653b77d5 e1000: Split header files
Some definitions in the header files are invalid for igb so extract
them to new header files to keep igb from referring to them.

Signed-off-by: Gal Hammer <gal.hammer@sap.com>
Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10 15:35:38 +08:00
Akihiko Odaki
65f474bbae net/eth: Introduce EthL4HdrProto
igb, a new network device emulation, will need SCTP checksum offloading.
Currently eth_get_protocols() has a bool parameter for each protocol
currently it supports, but there will be a bit too many parameters if
we add yet another protocol.

Introduce an enum type, EthL4HdrProto to represent all L4 protocols
eth_get_protocols() support with one parameter.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10 15:35:38 +08:00
Akihiko Odaki
5fb7d14995 e1000e: Implement system clock
The system clock is necessary to implement PTP features. While we are
not implementing PTP features for e1000e yet, we do have a plan to
implement them for igb, a new network device derived from e1000e,
so add system clock to the common base first.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10 15:35:38 +08:00
Akihiko Odaki
69ff5ef847 net/eth: Report if headers are actually present
The values returned by eth_get_protocols() are used to perform RSS,
checksumming and segmentation. Even when a packet signals the use of the
protocols which these operations can be applied to, the headers for them
may not be present because of too short packet or fragmentation, for
example. In such a case, the operations cannot be applied safely.

Report the presence of headers instead of whether the use of the
protocols are indicated with eth_get_protocols(). This also makes
corresponding changes to the callers of eth_get_protocols() to match
with its new signature and to remove redundant checks for fragmentation.

Fixes: 75020a7021 ("Common definitions for VMWARE devices")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10 15:35:38 +08:00
Akihiko Odaki
47399506dc e1000e: Count CRC in Tx statistics
The datasheet 8.19.29 "Good Packets Transmitted Count - GPTC (0x04080;
RC)" says:
> This register counts the number of good (no errors) packets
> transmitted. A good transmit packet is considered one that is 64 or
> more bytes in length (from <Destination Address> through <CRC>,
> inclusively) in length.

It also says similar for the other Tx statistics registers. Add the
number of bytes for CRC to those registers.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10 15:35:38 +08:00
Akihiko Odaki
c50b152485 e1000: Count CRC in Tx statistics
The Software Developer's Manual 13.7.4.5 "Packets Transmitted (64 Bytes)
Count" says:
> This register counts the number of packets transmitted that are
> exactly 64 bytes (from <Destination Address> through <CRC>,
> inclusively) in length.

It also says similar for the other Tx statistics registers. Add the
number of bytes for CRC to those registers.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10 15:35:38 +08:00
Akihiko Odaki
bf2a7212c2 e1000e: Combine rx traces
Whether a packet will be written back to the guest depends on the
remaining space of the queue. Therefore, e1000e_rx_written_to_guest and
e1000e_rx_not_written_to_guest should log the index of the queue instead
of generated interrupts. This also removes the need of
e1000e_rx_rss_dispatched_to_queue, which logs the queue index.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10 15:35:38 +08:00
Akihiko Odaki
0cbd6e5221 e1000e: Do not assert when MSI-X is disabled later
Assertions will fail if MSI-X gets disabled while a timer for MSI-X
interrupts is running so remove them to avoid abortions. Fortunately,
nothing bad happens even if the assertions won't trigger as
msix_notify(), called by timer handlers, does nothing when MSI-X is
disabled.

This bug was found by Alexander Bulekov when fuzzing igb, a new
device implementation derived from e1000e:
https://patchew.org/QEMU/20230129053316.1071513-1-alxndr@bu.edu/

The fixed test case is:
fuzz/crash_aea040166819193cf9fedb810c6d100221da721a

Fixes: 6f3fbe4ed0 ("net: Introduce e1000e device emulation")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10 15:35:38 +08:00
Akihiko Odaki
dd32b5ea7e hw/net/net_tx_pkt: Check the payload length
Check the payload length if checksumming to ensure the payload contains
the space for the resulting value.

This bug was found by Alexander Bulekov with the fuzzer:
https://patchew.org/QEMU/20230129053316.1071513-1-alxndr@bu.edu/

The fixed test case is:
fuzz/crash_6aeaa33e7211ecd603726c53e834df4c6d1e08bc

Fixes: e263cd49c7 ("Packet abstraction for VMWARE network devices")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10 15:35:38 +08:00
Akihiko Odaki
02ef5fdc09 hw/net/net_tx_pkt: Implement TCP segmentation
There was no proper implementation of TCP segmentation before this
change, and net_tx_pkt relied solely on IPv4 fragmentation. Not only
this is not aligned with the specification, but it also resulted in
corrupted IPv6 packets.

This is particularly problematic for the igb, a new proposed device
implementation; igb provides loopback feature for VMDq and the feature
relies on software segmentation.

Implement proper TCP segmentation in net_tx_pkt to fix such a scenario.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10 15:35:38 +08:00
Akihiko Odaki
ffbd2dbd8e e1000e: Perform software segmentation for loopback
e1000e didn't perform software segmentation for loopback if virtio-net
header is enabled, which is wrong.

To fix the problem, introduce net_tx_pkt_send_custom(), which allows the
caller to specify whether offloading should be assumed or not.

net_tx_pkt_send_custom() also allows the caller to provide a custom
sending function. Packets with virtio-net headers and ones without
virtio-net headers will be provided at the same time so the function
can choose the preferred version. In case of e1000e loopback, it prefers
to have virtio-net headers as they allows to skip the checksum
verification if VIRTIO_NET_HDR_F_DATA_VALID is set.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10 15:35:38 +08:00
Akihiko Odaki
aac8f89dba hw/net/net_rx_pkt: Remove net_rx_pkt_has_virt_hdr
When virtio-net header is not set, net_rx_pkt_get_vhdr() returns
zero-filled virtio_net_hdr, which is actually valid. In fact, tap device
uses zero-filled virtio_net_hdr when virtio-net header is not provided
by the peer. Therefore, we can just remove net_rx_pkt_has_virt_hdr() and
always assume NetTxPkt has a valid virtio-net header.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10 15:35:38 +08:00
Akihiko Odaki
55daf493f7 hw/net/net_tx_pkt: Automatically determine if virtio-net header is used
The new function qemu_get_using_vnet_hdr() allows to automatically
determine if virtio-net header is used.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10 15:35:38 +08:00
Akihiko Odaki
d921db0ae9 e1000x: Alter the signature of e1000x_is_vlan_packet
e1000x_is_vlan_packet() had a pointer to uint8_t as a parameter, but
it does not have to be uint8_t. Change the type to void *.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10 15:35:38 +08:00
Akihiko Odaki
f9a9eb16e2 net: Check L4 header size
net_tx_pkt_build_vheader() inspects TCP header but had no check for
the header size, resulting in an undefined behavior. Check the header
size and drop the packet if the header is too small.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10 15:35:38 +08:00
Akihiko Odaki
156dc1555d e1000e: Remove extra pointer indirection
e1000e_write_packet_to_guest() passes the reference of variable ba as a
pointer to an array, and that pointer indirection is just unnecessary;
all functions which uses the passed reference performs no pointer
operation on the pointer and they simply dereference the passed
pointer. Remove the extra pointer indirection.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10 15:35:38 +08:00
Akihiko Odaki
eb4d8e2553 e1000e: Set MII_ANER_NWAY
This keeps Windows driver 12.18.9.23 from generating an event with ID
30. The description of the event is as follows:
> Intel(R) 82574L Gigabit Network Connection
>  PROBLEM: The network adapter is configured for auto-negotiation but
> the link partner is not.  This may result in a duplex mismatch.
>  ACTION: Configure the link partner for auto-negotiation.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10 15:35:38 +08:00
Akihiko Odaki
235f2eee82 e1000e: Introduce e1000_rx_desc_union
Before this change, e1000e_write_packet_to_guest() allocated the
receive descriptor buffer as an array of uint8_t. This does not ensure
the buffer is sufficiently aligned.

Introduce e1000_rx_desc_union type, a union type of all receive
descriptor types to correct this.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10 15:35:38 +08:00
Akihiko Odaki
8a35c648ac e1000e: Configure ResettableClass
This is part of recent efforts of refactoring e1000 and e1000e.

DeviceClass's reset member is deprecated so migrate to ResettableClass.
There is no behavioral difference.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10 15:35:38 +08:00
Akihiko Odaki
9d46505368 e1000: Configure ResettableClass
This is part of recent efforts of refactoring e1000 and e1000e.

DeviceClass's reset member is deprecated so migrate to ResettableClass.
There is no behavioral difference.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10 15:35:38 +08:00
Akihiko Odaki
86343066ba e1000e: Improve software reset
This change makes e1000e reset more things when software reset was
triggered. Some registers are exempted from software reset in the
datasheet and this change also implements the behavior accordingly.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10 15:35:38 +08:00
Akihiko Odaki
31e3f318c8 e1000e: Remove pending interrupt flags
They are duplicate of running throttling timer flags and incomplete as
the flags are not cleared when the interrupts are fired or the device is
reset.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10 15:35:38 +08:00
Akihiko Odaki
a7539f9d74 e1000e: Use memcpy to intialize registers
Use memcpy instead of memmove to initialize registers. The initial
register templates and register table instances will never overlap.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10 15:35:38 +08:00
Akihiko Odaki
9eb525ee89 e1000: Use memcpy to intialize registers
Use memcpy instead of memmove to initialize registers. The initial
register templates and register table instances will never overlap.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10 15:35:38 +08:00
Akihiko Odaki
be7daa5904 e1000e: Use more constant definitions
The definitions of SW Semaphore Register were copied from:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/net/ethernet/intel/e1000e/defines.h?h=v6.0.9#n374

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10 15:35:38 +08:00
Akihiko Odaki
2fe63579d8 e1000: Use more constant definitions
The definitions for E1000_VFTA_ENTRY_SHIFT, E1000_VFTA_ENTRY_MASK, and
E1000_VFTA_ENTRY_BIT_SHIFT_MASK were copied from:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/net/ethernet/intel/e1000/e1000_hw.h?h=v6.0.9#n306

The definitions for E1000_NUM_UNICAST, E1000_MC_TBL_SIZE, and
E1000_VLAN_FILTER_TBL_SIZE were copied from:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/net/ethernet/intel/e1000/e1000_hw.h?h=v6.0.9#n707

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10 15:35:38 +08:00
Akihiko Odaki
c16bd68e16 e1000e: Mask registers when writing
When a register has effective bits fewer than their width, the old code
inconsistently masked when writing or reading. Make the code consistent
by always masking when writing, and remove some code duplication.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10 15:35:38 +08:00
Akihiko Odaki
3de66fe4d7 e1000e: Introduce E1000E_LOW_BITS_SET_FUNC
e1000e_set_16bit and e1000e_set_12bit look so similar so define a
generic macro.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10 15:35:38 +08:00
Akihiko Odaki
a9484b8a41 e1000: Mask registers when writing
When a register has effective bits fewer than their width, the old code
inconsistently masked when writing or reading. Make the code consistent
by always masking when writing, and remove some code duplication.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10 15:35:38 +08:00
Akihiko Odaki
b7728c9f62 e1000: Use hw/net/mii.h
hw/net/mii.h provides common definitions for MII.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10 15:35:38 +08:00
Akihiko Odaki
6684bef12e fsl_etsec: Use hw/net/mii.h
hw/net/mii.h provides common definitions for MII.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10 15:35:38 +08:00
Akihiko Odaki
0eadd56bf5 e1000e: Fix the code style
igb implementation first starts off by copying e1000e code. Correct the
code style before that.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10 15:35:38 +08:00
David Woodhouse
a9ae1418b3 hw/xen: Use XEN_PAGE_SIZE in PV backend drivers
XC_PAGE_SIZE comes from the actual Xen libraries, while XEN_PAGE_SIZE is
provided by QEMU itself in xen_backend_ops.h. For backends which may be
built for emulation mode, use the latter.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse
f80fad16af hw/xen: Pass grant ref to gnttab unmap operation
The previous commit introduced redirectable gnttab operations fairly
much like-for-like, with the exception of the extra arguments to the
->open() call which were always NULL/0 anyway.

This *changes* the arguments to the ->unmap() operation to include the
original ref# that was mapped. Under real Xen it isn't necessary; all we
need to do from QEMU is munmap(), then the kernel will release the grant,
and Xen does the tracking/refcounting for the guest.

When we have emulated grant tables though, we need to do all that for
ourselves. So let's have the back ends keep track of what they mapped
and pass it in to the ->unmap() method for us.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
Eugenio Pérez
cd69d47cdd virtio-net: clear guest_announce feature if no cvq backend
Since GUEST_ANNOUNCE is emulated the feature bit could be set without
backend support.  This happens in the vDPA case.

However, backend vDPA parent may not have CVQ support.  This causes an
incoherent feature set, and the driver may refuse to start.  This
happens in virtio-net Linux driver.

This may be solved differently in the future.  Qemu is able to emulate a
CVQ just for guest_announce purposes, helping guest to notify the new
location with vDPA devices that does not support it.  However, this is
left as a TODO as it is way more complex to backport.

Tested with vdpa_net_sim, toggling manually VIRTIO_NET_F_CTRL_VQ in the
driver and migrating it with x-svq=on.

Fixes: 980003debd ("vdpa: do not handle VIRTIO_NET_F_GUEST_ANNOUNCE in vhost-vdpa")
Reported-by: Dawar, Gautam <gautam.dawar@amd.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230124161159.2182117-1-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: David Edmondson <david.edmondson@oracle.com>
Reviewed-by: Gautam Dawar <gautam.dawar@amd.com>
Tested-by: Gautam Dawar <gautam.dawar@amd.com>
Tested-by: Lei Yang <leiyang@redhat.com>
2023-03-02 03:10:46 -05:00
Markus Armbruster
1178710247 rocker: Tweak stubbed out monitor commands' error messages
The QERR_ macros are leftovers from the days of "rich" error objects.
We've been trying to reduce their remaining use.

The stubbed out Rocker monitor commands are the last remaining users
of QERR_FEATURE_DISABLED.  They fail like this:

    (qemu) info rocker mumble
    Error: The feature 'rocker' is not enabled

The real rocker commands fail like this when the named object doesn't
exist:

    Error: rocker mumble not found

If that's good enough when Rocker is enabled, then it's good enough
when it's disabled, so replace QERR_FEATURE_DISABLED with that, and
drop the macro.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20230207075115.1525-13-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
2023-02-23 14:10:17 +01:00
Fiona Ebner
099a638281 hw/net/vmxnet3: allow VMXNET3_MAX_MTU itself as a value
Currently, VMXNET3_MAX_MTU itself (being 9000) is not considered a
valid value for the MTU, but a guest running ESXi 7.0 might try to
set it and fail the assert [0].

In the Linux kernel, dev->max_mtu itself is a valid value for the MTU
and for the vmxnet3 driver it's 9000, so a guest running Linux will
also fail the assert when trying to set an MTU of 9000.

VMXNET3_MAX_MTU and s->mtu don't seem to be used in relation to buffer
allocations/accesses, so allowing the upper limit itself as a value
should be fine.

[0]: https://forum.proxmox.com/threads/114011/

Fixes: d05dcd94ae ("net: vmxnet3: validate configuration values during activate (CVE-2021-20203)")
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-02-17 13:31:33 +08:00
Qiang Liu
44c94cdb21 hw/net/lan9118: log [read|write]b when mode_16bit is enabled rather than abort
This patch replaces hw_error to guest error log for [read|write]b
accesses when mode_16bit is enabled. This avoids aborting qemu.

Fixes: 1248f8d4cb ("hw/lan9118: Add basic 16-bit mode support.")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1433
Reported-by: Qiang Liu <cyruscyliu@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Qiang Liu <cyruscyliu@gmail.com>
Suggested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-02-17 13:31:33 +08:00
Markus Armbruster
a67dfa660b Drop duplicate #include
Tracked down with the help of scripts/clean-includes.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20230202133830.2152150-21-armbru@redhat.com>
2023-02-08 07:28:05 +01:00
Stephen Longfield
036e98e5c2 hw/net: Fix read of uninitialized memory in ftgmac100
With the `size += 4` before the call to `crc32`, the CRC calculation
would overrun the buffer. Size is used in the while loop starting on
line 1009 to determine how much data to write back, with the last
four bytes coming from `crc_ptr`, so do need to increase it, but should
do this after the computation.

I'm unsure why this use of uninitialized memory in the CRC doesn't
result in CRC errors, but it seems clear to me that it should not be
included in the calculation.

Signed-off-by: Stephen Longfield <slongfield@google.com>
Reviewed-by: Hao Wu <wuhaotsh@google.com>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Message-Id: <20221220221437.3303721-1-slongfield@google.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2023-02-07 09:02:04 +01:00
Markus Armbruster
52cafcea43 rocker: Move HMP commands from monitor to hw/net/rocker/
This moves these commands from MAINTAINERS section "Human
Monitor (HMP)" to "Rocker" and "Network devices".

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20230124121946.1139465-14-armbru@redhat.com>
2023-02-04 07:56:54 +01:00
Philippe Mathieu-Daudé
883f2c591f bulk: Rename TARGET_FMT_plx -> HWADDR_FMT_plx
The 'hwaddr' type is defined in "exec/hwaddr.h" as:

    hwaddr is the type of a physical address
   (its size can be different from 'target_ulong').

All definitions use the 'HWADDR_' prefix, except TARGET_FMT_plx:

 $ fgrep define include/exec/hwaddr.h
 #define HWADDR_H
 #define HWADDR_BITS 64
 #define HWADDR_MAX UINT64_MAX
 #define TARGET_FMT_plx "%016" PRIx64
         ^^^^^^
 #define HWADDR_PRId PRId64
 #define HWADDR_PRIi PRIi64
 #define HWADDR_PRIo PRIo64
 #define HWADDR_PRIu PRIu64
 #define HWADDR_PRIx PRIx64
 #define HWADDR_PRIX PRIX64

Since hwaddr's size can be *different* from target_ulong, it is
very confusing to read one of its format using the 'TARGET_FMT_'
prefix, normally used for the target_long / target_ulong types:

$ fgrep TARGET_FMT_ include/exec/cpu-defs.h
 #define TARGET_FMT_lx "%08x"
 #define TARGET_FMT_ld "%d"
 #define TARGET_FMT_lu "%u"
 #define TARGET_FMT_lx "%016" PRIx64
 #define TARGET_FMT_ld "%" PRId64
 #define TARGET_FMT_lu "%" PRIu64

Apparently this format was missed during commit a8170e5e97
("Rename target_phys_addr_t to hwaddr"), so complete it by
doing a bulk-rename with:

 $ sed -i -e s/TARGET_FMT_plx/HWADDR_FMT_plx/g $(git grep -l TARGET_FMT_plx)

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230110212947.34557-1-philmd@linaro.org>
[thuth: Fix some warnings from checkpatch.pl along the way]
Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-01-18 11:14:34 +01:00
Markus Armbruster
edf5ca5dbe include/hw/pci: Split pci_device.h off pci.h
PCIDeviceClass and PCIDevice are defined in pci.h.  Many users of the
header don't actually need them.  Similar structs live in their own
headers: PCIBusClass and PCIBus in pci_bus.h, PCIBridge in
pci_bridge.h, PCIHostBridgeClass and PCIHostState in pci_host.h,
PCIExpressHost in pcie_host.h, and PCIERootPortClass, PCIEPort, and
PCIESlot in pcie_port.h.

Move PCIDeviceClass and PCIDeviceClass to new pci_device.h, along with
the code that needs them.  Adjust include directives.

This also enables the next commit.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20221222100330.380143-6-armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-01-08 01:54:22 -05:00
Cindy Lu
8aab0d1dbe virtio-net: add support for configure interrupt
Add functions to support configure interrupt in virtio_net
Add the functions to support vhost_net_config_pending
and vhost_net_config_mask.

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20221222070451.936503-9-lulu@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-01-08 01:54:22 -05:00
Cindy Lu
544f0278af virtio: introduce macro VIRTIO_CONFIG_IRQ_IDX
To support configure interrupt for vhost-vdpa
Introduce VIRTIO_CONFIG_IRQ_IDX -1 as configure interrupt's queue index,
Then we can reuse the functions guest_notifier_mask and guest_notifier_pending.
Add the check of queue index in these drivers, if the driver does not support
configure interrupt, the function will just return

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20221222070451.936503-2-lulu@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-01-08 01:54:22 -05:00
Hyman Huang(黄勇)
c9bdc449f9 vhost-user: Fix the virtio features negotiation flaw
This patch aims to fix unexpected negotiation features for
vhost-user netdev interface.

When openvswitch reconnect Qemu after an unexpected disconnection
and Qemu therefore start the vhost_dev, acked_features field in
vhost_dev is initialized with value fetched from acked_features
field in NetVhostUserState, which should be up-to-date at that
moment but Qemu could not make it actually during the time window
of virtio features negotiation.

So we save the acked_features right after being configured by
guest virtio driver so it can be used to restore acked_features
field in vhost_dev correctly.

Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
Signed-off-by: Guoyi Tu <tugy@chinatelecom.cn>
Signed-off-by: Liuxiangdong <liuxiangdong5@huawei.com>
Message-Id: <b9f8cf5561a79ea65ea38960e5a5e6d3707eef0a.1671627406.git.huangy81@chinatelecom.cn>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-01-08 01:54:22 -05:00
Eugenio Pérez
4f93aafc8f virtio_net: copy VIRTIO_NET_S_ANNOUNCE if device model has it
Status part of the emulated feature. It will follow device model, so we
must copy it as long as NIC device model has it set.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20221221115015.1400889-3-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2023-01-08 01:54:21 -05:00
Eugenio Pérez
ebc141a625 virtio_net: Modify virtio_net_get_config to early return
Next patches introduce more code on vhost-vdpa branch, with already have
too much indentation.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20221221115015.1400889-2-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-01-08 01:54:21 -05:00
Stephen Longfield
93c9678de9 hw/net: Fix read of uninitialized memory in imx_fec.
Size is used at lines 1088/1188 for the loop, which reads the last 4
bytes from the crc_ptr so it does need to get increased, however it
shouldn't be increased before the buffer is passed to CRC computation,
or the crc32 function will access uninitialized memory.

This was pointed out to me by clg@kaod.org during the code review of
a similar patch to hw/net/ftgmac100.c

Change-Id: Ib0464303b191af1e28abeb2f5105eb25aadb5e9b
Signed-off-by: Stephen Longfield <slongfield@google.com>
Reviewed-by: Patrick Venture <venture@google.com>
Message-id: 20221221183202.3788132-1-slongfield@google.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2023-01-05 15:33:00 +00:00
Thomas Huth
c57e0ea6b4 hw: Include the VMWare devices only in the x86 targets
It seems a little bit weird that the para-virtualized x86 VMWare
devices "vmware-svga" and "vmxnet3" also show up in non-x86 targets.
They are likely pretty useless there (since the guest OSes likely
do not have any drivers for those enabled), so let's change this and
only enable those devices by default for the classical x86 targets.

Message-Id: <20221213095144.42355-1-thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2022-12-15 15:19:24 +01:00
Peter Maydell
48804eebd4 Miscellaneous patches for 2022-12-14
-----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEENUvIs9frKmtoZ05fOHC0AOuRhlMFAmOZ6lYSHGFybWJydUBy
 ZWRoYXQuY29tAAoJEDhwtADrkYZT6VEQAKynjWh3AIZ4/qOgrVqsP0oRspevLmfH
 BbuGoldjYpEE7RbwuCaZalZ7iy7TcSySxnPfUDVsFHd7NWffJVjwKHifGC0D/Ez0
 +Ggyb1CBebN+mS7t+BNFUHdMM+wxFIlHwg4f4aTFbn2o0HKgj2a8tcNzNRonZbfa
 xURnvbD4G4u0VZEc3Jak+x193xbOJFsuuWq0BZnDuNk+XqjyW2RwfpXLPJVk+82a
 4uy/YgYuqXUqBeULwcJj+shBL4SXR9GyajTFMS64przSUle0ADUmXkPtaS2agV7e
 Pym/UQuAcxvNyw34fJsiMZxx6rZI9YU30jQUMRLoYcPRR/Q/aiPeiiHtiD6Kaid7
 IfOeH/EArXaQRFpD89xj4YcaTnRLQOEj0NXgXvAbQf6eD8JYyao/S/0lCsPZEoA2
 nibLqEQ25ncDNXoSomuwtfjVff3w68lODFbhwqfA0gf3cPtCgVZ6xQ8P/McNY6K6
 wqFHXMWTDHk1LOCTucjYz1z2TGzTnSG4iWi5Yt6FSxAc958AO+v5ALn/1pcYun+E
 azM/MF0AInKj2aJCT530zT0tpCs/Jo07YKC8k6ubi77S0ZdmGS1XLeXkRXfk1+yI
 OhuUgiVlSTHxD69DagT2vbnx1mDMM9X+OBIMvEi5nwvD9A/ghaCgkDeGFvbA1ud0
 t0mxPBZJ+tiZ
 =JJjG
 -----END PGP SIGNATURE-----

Merge tag 'pull-misc-2022-12-14' of https://repo.or.cz/qemu/armbru into staging

Miscellaneous patches for 2022-12-14

# gpg: Signature made Wed 14 Dec 2022 15:23:02 GMT
# gpg:                using RSA key 354BC8B3D7EB2A6B68674E5F3870B400EB918653
# gpg:                issuer "armbru@redhat.com"
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [full]
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>" [full]
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* tag 'pull-misc-2022-12-14' of https://repo.or.cz/qemu/armbru:
  ppc4xx_sdram: Simplify sdram_ddr_size() to return
  block/vmdk: Simplify vmdk_co_create() to return directly
  cleanup: Tweak and re-run return_directly.cocci
  io: Tidy up fat-fingered parameter name
  qapi: Use returned bool to check for failure (again)
  sockets: Use ERRP_GUARD() where obviously appropriate
  qemu-config: Use ERRP_GUARD() where obviously appropriate
  qemu-config: Make config_parse_qdict() return bool
  monitor: Use ERRP_GUARD() in monitor_init()
  monitor: Simplify monitor_fd_param()'s error handling
  error: Move ERRP_GUARD() to the beginning of the function
  error: Drop a few superfluous ERRP_GUARD()
  error: Drop some obviously superfluous error_propagate()
  Drop more useless casts from void * to pointer

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-12-15 10:13:46 +00:00
Markus Armbruster
05e0748860 qapi rocker: Elide redundant has_FOO in generated C
The has_FOO for pointer-valued FOO are redundant, except for arrays.
They are also a nuisance to work with.  Recent commit "qapi: Start to
elide redundant has_FOO in generated C" provided the means to elide
them step by step.  This is the step for qapi/rocker.json.

Said commit explains the transformation in more detail.  The invariant
violations mentioned there do not occur here.

Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20221104160712.3005652-23-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2022-12-14 20:05:07 +01:00
Markus Armbruster
7480874a69 qapi net: Elide redundant has_FOO in generated C
The has_FOO for pointer-valued FOO are redundant, except for arrays.
They are also a nuisance to work with.  Recent commit "qapi: Start to
elide redundant has_FOO in generated C" provided the means to elide
them step by step.  This is the step for qapi/net.json.

Said commit explains the transformation in more detail.  The invariant
violations mentioned there do not occur here.

Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20221104160712.3005652-19-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
[Fixes for MacOS squashed in]
2022-12-14 20:04:47 +01:00
Markus Armbruster
3d558330ad Drop more useless casts from void * to pointer
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20221123133811.1398562-1-armbru@redhat.com>
2022-12-14 16:19:35 +01:00
Stefan Hajnoczi
42f3253c34 virtio: regression fix
Fixes regression with migration and vsock, as fixing that
 exposes some known issues in vhost user cleanup, this attempts
 to fix those as well. More work on vhost user is needed :)
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmOIWaEPHG1zdEByZWRo
 YXQuY29tAAoJECgfDbjSjVRp+RQH/2PVAjD/GA3zF5F3Z07vH51c55T6tluZ85c3
 4u66SSkF5JR1hATCujYCtrt9V0mnqhmhhm4gJH5xcsynFjjyIXd2dDrTFRpCtRgn
 icXOmYCc9pCu8XsluJnWvY/5r/KEDxqmGVE8Kyhz551QjvsBkezhI9x9vhJZJLCn
 Xn1XQ/3jpUcQLwasu8AxZb0IDW8WdCtonbke6xIyMzOYGR2bnRdXlDXVVG1zJ/SZ
 eS3HUad71VekhfzWq0fx8yEJnfvbes9vo007y8rOGdHOcMneWGAie52W1dOBhclh
 Zt56zID55t1USEwlPxkZSj7UXNbVl7Uz/XU5ElN0yTesttP4Iq0=
 =ZkaX
 -----END PGP SIGNATURE-----

Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging

virtio: regression fix

Fixes regression with migration and vsock, as fixing that
exposes some known issues in vhost user cleanup, this attempts
to fix those as well. More work on vhost user is needed :)

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmOIWaEPHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRp+RQH/2PVAjD/GA3zF5F3Z07vH51c55T6tluZ85c3
# 4u66SSkF5JR1hATCujYCtrt9V0mnqhmhhm4gJH5xcsynFjjyIXd2dDrTFRpCtRgn
# icXOmYCc9pCu8XsluJnWvY/5r/KEDxqmGVE8Kyhz551QjvsBkezhI9x9vhJZJLCn
# Xn1XQ/3jpUcQLwasu8AxZb0IDW8WdCtonbke6xIyMzOYGR2bnRdXlDXVVG1zJ/SZ
# eS3HUad71VekhfzWq0fx8yEJnfvbes9vo007y8rOGdHOcMneWGAie52W1dOBhclh
# Zt56zID55t1USEwlPxkZSj7UXNbVl7Uz/XU5ElN0yTesttP4Iq0=
# =ZkaX
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 01 Dec 2022 02:37:05 EST
# gpg:                using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg:                issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu:
  include/hw: VM state takes precedence in virtio_device_should_start
  hw/virtio: generalise CHR_EVENT_CLOSED handling
  hw/virtio: add started_vu status field to vhost-user-gpio
  vhost: enable vrings in vhost_dev_start() for vhost-user devices
  tests/qtests: override "force-legacy" for gpio virtio-mmio tests

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-12-04 11:00:26 -05:00
Stefano Garzarella
4daa5054c5 vhost: enable vrings in vhost_dev_start() for vhost-user devices
Commit 02b61f38d3 ("hw/virtio: incorporate backend features in features")
properly negotiates VHOST_USER_F_PROTOCOL_FEATURES with the vhost-user
backend, but we forgot to enable vrings as specified in
docs/interop/vhost-user.rst:

    If ``VHOST_USER_F_PROTOCOL_FEATURES`` has not been negotiated, the
    ring starts directly in the enabled state.

    If ``VHOST_USER_F_PROTOCOL_FEATURES`` has been negotiated, the ring is
    initialized in a disabled state and is enabled by
    ``VHOST_USER_SET_VRING_ENABLE`` with parameter 1.

Some vhost-user front-ends already did this by calling
vhost_ops.vhost_set_vring_enable() directly:
- backends/cryptodev-vhost.c
- hw/net/virtio-net.c
- hw/virtio/vhost-user-gpio.c

But most didn't do that, so we would leave the vrings disabled and some
backends would not work. We observed this issue with the rust version of
virtiofsd [1], which uses the event loop [2] provided by the
vhost-user-backend crate where requests are not processed if vring is
not enabled.

Let's fix this issue by enabling the vrings in vhost_dev_start() for
vhost-user front-ends that don't already do this directly. Same thing
also in vhost_dev_stop() where we disable vrings.

[1] https://gitlab.com/virtio-fs/virtiofsd
[2] https://github.com/rust-vmm/vhost/blob/240fc2966/crates/vhost-user-backend/src/event_loop.rs#L217

Fixes: 02b61f38d3 ("hw/virtio: incorporate backend features in features")
Reported-by: German Maglione <gmaglione@redhat.com>
Tested-by: German Maglione <gmaglione@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Message-Id: <20221123131630.52020-1-sgarzare@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20221130112439.2527228-3-alex.bennee@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-12-01 02:30:04 -05:00
Stefan Hajnoczi
ccb8d593e0 pc,virtio: regression, test fixes
fixes regressions:
     virtio error message triggered by seabios
     failure in vhost due to VIRTIO_F_RING_RESET
     broken keyboard under seabios
 
 some biosbits test fixes
 
 there's still a known regression with migration and vsock,
 not fixed yet.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmN8os4PHG1zdEByZWRo
 YXQuY29tAAoJECgfDbjSjVRpw+UH/2JOoxqzkkq4QyV5PzmjsiyyMbrBqFSOTro2
 98k9MwxNWHIk6fi3FVIz+LX1EdwDbMl44IHQ88xo7k39V7ThvVvLfjaQySSeoCfE
 04++AvkOQ/O8W7sob17r2iPWIHKPIndq8uB652T89o1I/nhmFMsUkdRWmVQ3XsDk
 DnPZxjBQnKY0oSvBd7SmSI3C+eA4lF1tht50ZxmOshLx4OSjK/maFVpIyLsUGTMn
 6OMxKffDjfvw5z94+pGN1RQm9xu7OwtFZyUnH16xoQUmWSXjzcYXUZ3buTY+eAyJ
 I5j9ox27Cqn1wGcypUiTCKLIExEMBfvNY4ovyRa1xvW+MXluVyE=
 =bF2U
 -----END PGP SIGNATURE-----

Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging

pc,virtio: regression, test fixes

fixes regressions:
    virtio error message triggered by seabios
    failure in vhost due to VIRTIO_F_RING_RESET
    broken keyboard under seabios

some biosbits test fixes

there's still a known regression with migration and vsock,
not fixed yet.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmN8os4PHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRpw+UH/2JOoxqzkkq4QyV5PzmjsiyyMbrBqFSOTro2
# 98k9MwxNWHIk6fi3FVIz+LX1EdwDbMl44IHQ88xo7k39V7ThvVvLfjaQySSeoCfE
# 04++AvkOQ/O8W7sob17r2iPWIHKPIndq8uB652T89o1I/nhmFMsUkdRWmVQ3XsDk
# DnPZxjBQnKY0oSvBd7SmSI3C+eA4lF1tht50ZxmOshLx4OSjK/maFVpIyLsUGTMn
# 6OMxKffDjfvw5z94+pGN1RQm9xu7OwtFZyUnH16xoQUmWSXjzcYXUZ3buTY+eAyJ
# I5j9ox27Cqn1wGcypUiTCKLIExEMBfvNY4ovyRa1xvW+MXluVyE=
# =bF2U
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 22 Nov 2022 05:22:06 EST
# gpg:                using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg:                issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu:
  virtio: disable error for out of spec queue-enable
  acpi/tests/avocado/bits: keep the work directory when BITS_DEBUG is set in env
  tests/avocado: configure acpi-bits to use avocado timeout
  MAINTAINERS: add mst to list of biosbits maintainers
  tests: acpi: x86: update expected DSDT after moving PRQx fields in _SB scope
  acpi: x86: move RPQx field back to _SB scope
  tests: acpi: whitelist DSDT before moving PRQx to _SB scope
  vhost: mask VIRTIO_F_RING_RESET for vhost and vhost-user devices

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-11-22 09:59:27 -05:00
Stefano Garzarella
562a7d23bf vhost: mask VIRTIO_F_RING_RESET for vhost and vhost-user devices
Commit 69e1c14aa2 ("virtio: core: vq reset feature negotation support")
enabled VIRTIO_F_RING_RESET by default for all virtio devices.

This feature is not currently emulated by QEMU, so for vhost and
vhost-user devices we need to make sure it is supported by the offloaded
device emulation (in-kernel or in another process).
To do this we need to add VIRTIO_F_RING_RESET to the features bitmap
passed to vhost_get_features(). This way it will be masked if the device
does not support it.

This issue was initially discovered with vhost-vsock and vhost-user-vsock,
and then also tested with vhost-user-rng which confirmed the same issue.
They fail when sending features through VHOST_SET_FEATURES ioctl or
VHOST_USER_SET_FEATURES message, since VIRTIO_F_RING_RESET is negotiated
by the guest (Linux >= v6.0), but not supported by the device.

Fixes: 69e1c14aa2 ("virtio: core: vq reset feature negotation support")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1318
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20221121101101.29400-1-sgarzare@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2022-11-22 05:19:00 -05:00
Stefan Hajnoczi
6d71357a3b rtl8139: honor large send MSS value
The Large-Send Task Offload Tx Descriptor (9.2.1 Transmit) has a
Large-Send MSS value where the driver specifies the MSS. See the
datasheet here:
http://realtek.info/pdf/rtl8139cp.pdf

The code ignores this value and uses a hardcoded MSS of 1500 bytes
instead. When the MTU is less than 1500 bytes the hardcoded value
results in IP fragmentation and poor performance.

Use the Large-Send MSS value to correctly size Large-Send packets.

Jason Wang <jasowang@redhat.com> noticed that the Large-Send MSS value
mask was incorrect so it is adjusted to match the datasheet and Linux
8139cp driver.

This issue was discussed in the past here:
https://lore.kernel.org/all/20161114162505.GD26664@stefanha-x1.localdomain/

Reported-by: Russell King - ARM Linux <linux@armlinux.org.uk>
Reported-by: Tobias Fiebig <tobias+git@fiebig.nl>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1312
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20221117165554.1773409-4-stefanha@redhat.com>
2022-11-21 09:28:43 -05:00
Stefan Hajnoczi
c74831a02c rtl8139: keep Tx command mode 0 and 1 separate
There are two Tx Descriptor formats called mode 0 and mode 1. The mode
is determined by the Large Send bit.

CP_TX_IPCS (bit 18) is defined in mode 1 but the code checks the bit
unconditionally. In mode 0 bit 18 is part of the Large Send MSS value.

Explicitly check the Large Send bit to distinguish Tx command modes.
This avoids bugs where modes are confused. Note that I didn't find any
actual bugs aside from needlessly computing the IP checksum when the
Large Send bit is enabled.

Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20221117165554.1773409-3-stefanha@redhat.com>
2022-11-21 09:28:43 -05:00