Some guests will use the standard MII status register
to verify link state. They will not notice link changes
unless this register is updated.
Verified with Linux 3.0 and Windows XP guests.
Without this patch, ethtool will report speed and duplex as
unknown when the link is down, but still report the link as
up. This is because the Linux e1000 driver checks the
mac_reg[STATUS] register link state before it checks speed
and duplex, but uses the phy_reg[PHY_STATUS] register for
the actual link state check. Fix by updating both registers
on link state changes.
Linux guest before:
(qemu) set_link e1000.0 off
kvm-sid:~# ethtool eth0
Settings for eth0:
Supported ports: [ TP ]
Supported link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
Supports auto-negotiation: Yes
Advertised link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
Advertised pause frame use: No
Advertised auto-negotiation: Yes
Speed: Unknown!
Duplex: Unknown! (255)
Port: Twisted Pair
PHYAD: 0
Transceiver: internal
Auto-negotiation: on
MDI-X: Unknown
Supports Wake-on: umbg
Wake-on: d
Current message level: 0x00000007 (7)
drv probe link
Link detected: yes
(qemu) set_link e1000.0 on
Linux guest after:
(qemu) set_link e1000.0 off
[ 63.384221] e1000: eth0 NIC Link is Down
kvm-sid:~# ethtool eth0
Settings for eth0:
Supported ports: [ TP ]
Supported link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
Supports auto-negotiation: Yes
Advertised link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
Advertised pause frame use: No
Advertised auto-negotiation: Yes
Speed: Unknown!
Duplex: Unknown! (255)
Port: Twisted Pair
PHYAD: 0
Transceiver: internal
Auto-negotiation: on
MDI-X: Unknown
Supports Wake-on: umbg
Wake-on: d
Current message level: 0x00000007 (7)
drv probe link
Link detected: no
(qemu) set_link e1000.0 on
[ 84.304582] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Rx and Tx descriptors are 16 byte aligned, so the lower bits are
ignored by real hardware. In fact, they always read back as zero on real
hardware, but probably nobody relies on that.
Signed-off-by: Kevin Wolf <mail@kevin-wolf.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Reduce spurious packet drops on RX ring empty
by verifying that we have at least 1 buffer
ahead of the time.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The PCI/PCI-X Family of Gigabit Ethernet Controllers Software
Developer’s Manual states the following about the POPTS field:
Provides a number of options which control the handling of this
packet. This field is ignored except on the first data descriptor of
a packet.
The current implementation always loads the field and its checksum
offload flags. This patch uses only the first descriptor's POPTS field
in order to comply with the specification.
When Solaris sends multi-descriptor packets it fills in POPTS for the
first descriptor only. Therefore this patch is necessary in order to
perform checksum offload correctly for multi-descriptor packets.
Reported-by: Daniel Pecka <dpecka@techniservit.cz>
Reported-by: Gabriele A. Trombetti <gabriele.trombetti@itb.cnr.it>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
The spec says: Any descriptor with a non-zero status byte has been
processed by the hardware, and is ready to be handled by the software.
Thus, once we change a descriptor status to non-zero we should
never move the head backwards and try to reuse this
descriptor from hardware.
This actually happened with a multibuffer packet
that arrives when we don't have enough buffers.
Fix by checking that we have enough buffers upfront
so we never need to discard the packet midway through.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
The e1000 spec says: if software statically allocates
buffers, and uses memory read to check for completed descriptors, it
simply has to zero the status byte in the descriptor to make it ready
for reuse by hardware. This is not a hardware requirement (moving the
hardware tail pointer is), but is necessary for performing an in–memory
scan.
Thus the guest does not have to clear the status byte. In case it
doesn't we need to clear EOP for all descriptors
except the last. While I don't know of any such guests,
it's probably a good idea to stick to the spec.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reported-by: Juan Quintela <quintela@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
e1000 supports multi-buffer packets larger than rxbuf_size.
This fixes the following (on linux):
- in guest: ifconfig eth1 mtu 16110
- in host: ifconfig tap0 mtu 16110
ping -s 16082 <guest-ip>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
If bootindex is specified on command line a string that describes device
in firmware readable way is added into sorted list. Later this list will
be passed into firmware to control boot order.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
The e1000 has compatibility code to handle big endianness which makes it
mandatory to be recompiled on different targets.
With the generic mmio endianness solution, there's no need for that anymore.
We just declare all mmio to be little endian and call it a day.
Because we don't depend on the target endianness anymore, we can also
move the driver over to Makefile.objs.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
As stated before, devices can be little, big or native endian. The
target endianness is not of their concern, so we need to push things
down a level.
This patch adds a parameter to cpu_register_io_memory that allows a
device to choose its endianness. For now, all devices simply choose
native endian, because that's the same behavior as before.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
There is no need for these type casts (as other existing
code shows). So re-write the first argument without
type cast (and remove a related TODO comment).
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch adds missing braces around if/else statements that call
macros which are likely to result in errors if the macro is
changed. It also makes the code comply better with CODING_STYLE.
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
When adding the length to the pseudo header, we're not properly
accounting for overflow.
From: Mark Wu <dwu@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
When adding the length to the pseudo header, we're not properly
accounting for overflow.
From: Mark Wu <dwu@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The OpenIndiana (Solaris) e1000g driver drops frames that are too long
or too short. It expects to receive frames of at least the Ethernet
minimum size. ARP requests in particular are small and will be dropped
if they are not padded appropriately, preventing a Solaris VM from
becoming visible on the network.
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Patch b0b900070c made
TOR valuer incorrect: the spec says it should always
include the CRC field.
No one seems to use this field, but better to stick to spec.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This change fixes initialization of e1000's microwire EEPROM internal
state values so that qemu's e1000 emulation works on NetBSD,
which doesn't use Intel's em driver but has its own wm driver
for the Intel i8254x Gigabit Ethernet.
Previously set_eecd() function in e1000.c clears EEPROM internal state
values on SK rising edge during CS==L, but according to FM93C06 EEPROM
(which is MicroWire compatible) data sheet, EEPROM internal status
should be cleared on CS rise edge regardless of SK input:
"... a rising edge on this (CS) signal is required to reset the internal
state-machine to accept a new cycle .."
and nothing should be changed during CS (chip select) is inactive.
Intel's em driver seems to explicitly raise SK output after CS is negated
in em_standby_eeprom() so many other OSes that use Intel's driver
don't have this problem even on the previous e1000.c implementation,
but I can't find any articles that say the MICROWIRE or EEPROM spec
requires such sequence, and actually hardware works fine without it
(i.e. real i82540EM has been working on NetBSD).
This fix also changes initialization to clear each state value in
struct eecd_state individually rather than using memset() against
the whole structre. The old_eecd member stores the last SK and CS
signal levels and it should be preserved even after reset of internal
EEPROM state to detect next signal edges for proper EEPROM emulation.
Signed-off-by: Izumi Tsutsui <tsutsui@ceres.dti.ne.jp>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
We do range check for size, and get size as buffer,
but copy size + 4 bytes (4 is for FCS).
Let's copy size bytes but put size + 4 in length.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Change #define DEBUG to #define E1000_DEBUG in hw/e1000.c to make
it possible to build QEMU with -DDEBUG
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
There was a pointer cast warning on Ubuntu since _FORTIFY_SOURCE has been reenabled.
_FORTIFY_SOURCE had been disabled by 4a24470497
and reenabled by 849583050d.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Link to data sheet at intel.com so people can find it.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
No functional changes. I verified that the generated
object binary does not change.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Juan Quintela <quintela@redhat.com>
Acked-by: Glauber Costa <glommer@gmail.com>
Otherwise, the driver does not work in Linux after the INT_DISABLE changes in
PCI.
Michael Tsirkin had a patch to do this, I'm not sure what happened to it.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patch adds a romfile property to the pci bus. It allows to specify
a romfile to load into the rom bar of the pci device. The default value
comes from a new field in PCIDeviceInfo. The property allows to change
the file and also to disable the rom loading using an empty string.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Currently, we preload option roms into the option rom space in memory. This
prevents DDIM from functioning correctly which severely limits the number
of roms we can support.
This patch introduces a pci_add_option_rom() which registers the
PCI_ROM_ADDRESS bar which points to our option rom. It also converts over
the cirrus vga adapter, the rtl8139, virtio, and the e1000 to use this
new mechanism.
The result is that PXE boot functions even with three unique types of cards.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
A code review run by Steve Grubb complained about code in e1000.c:
In hw/e1000.c at line 89, vlan is declared to be 4 bytes.
At line 382 is an attempt to do a memmove over it with a size of 12.
This was fixed by splitting the memmove in two calls and
adding a comment to the declaration of vlan and data.
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
While writing working on an e1000 driver for my university's OS I
noticed that some registers aren't readable in QEMU, but they should
be readable as stated in Intels Driver Developer Manual (and also
verified on real hardware).
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patch is preliminary for 64bit BAR.
Later pcibus_t will be changed from uint32_t to uint64_t.
Introduce FMT_PCIBUS for printf format for pcibus_t.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patch is preliminary for 64 bit BAR support.
Introduce dedicated type, pcibus_t, to represent pci bus address/size
instead of uint32_t.
Later this type will be changed to uint64_t.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
make constants for pci base address match pci_regs.h by
renaming PCI_ADDRESS_SPACE_xxx to PCI_BASE_ADDRESS_SPACE_xxx.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
There is absolutely no need to call reset functions when initializing
devices. Since we are already registering them, calling qemu_system_reset()
should suffice. Actually, it is what happens when we reboot the machine,
and using the same process instead of a special case semantics will even
allow us to find bugs easier.
Furthermore, the fact that we initialize things like the cpu quite early,
leads to the need to introduce synchronization stuff like qemu_system_cond.
This patch removes it entirely. All we need to do is call qemu_system_reset()
only when we're already sure the system is up and running
I tested it with qemu (with and without io-thread) and qemu-kvm, and it
seems to be doing okay - although qemu-kvm uses a slightly different patch.
[ v2: user mode still needs cpu_reset, so put it in ifdef. ]
[ v3: leave qemu_system_cond for now. ]
Signed-off-by: Glauber Costa <glommer@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
In the very least, a change like this requires discussion on the list.
The naming convention is goofy and it causes a massive merge problem. Something
like this _must_ be presented on the list first so people can provide input
and cope with it.
This reverts commit 99a0949b72.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>