Swing the tx.props out via a temporary structure, so in future patches
we can select what we're going to send.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Wire the new subsection from the previous commit to a property
so we can turn it off easily.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Old QEMUs only had one set of offload data; when we only receive
one lot, dupe the received data - that should give us about the
same bug level as the old version.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
A bunch of new TSO fields were introduced by d62644b4 and this bumped
the VMState version; however it's easier for those trying to keep
backwards migration compatibility if these fields are added in a
subsection instead.
Move the new fields to a subsection.
Since this was added after 2.11, this change will only affect
compatbility with 2.12-rc0.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Headers like "hw/loader.h" and "qemu/sockets.h" are not needed in
the hw/net/*.c files. And Some other headers are included via other
headers already, so we can drop them, too.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The device is supposed to maintain two distinct contexts for transmit
offloads: one has parameters for both segmentation and checksum
offload, the other only for checksum offload. The guest driver can
send two context descriptors, one for each context (the TSE flag
specifies which). Then the guest can refer to one or the other context
in subsequent transmit data descriptors, depending on what offloads it
wants applied to each packet.
Currently the e1000 device stores just one context, and misinterprets
the TSE flags in the context and data descriptors. This is often okay:
Linux happens to send a fresh context descriptor before every data
descriptor, so forgetting the other context doesn't matter. Windows
does rely on separate contexts for TSO vs. non-TSO packets, but for
mostly-TCP traffic the two contexts have identical TCP-specific
offload parameters so confusing them doesn't matter.
One case where this confusion matters is when a Windows guest sets up
a TSO context for TCP and a non-TSO context for UDP, and then
transmits both TCP and UDP traffic in parallel. The e1000 device
sometimes ends up using TCP-specific parameters while doing checksum
offload on a UDP datagram: it writes the checksum to offset 16 (the
correct location for a TCP checksum), stomping on two bytes of UDP
data, and leaving the wrong value in the actual UDP checksum field at
offset 6. (Even worse, the host network stack may then recompute the
UDP checksum, "correcting" it to match the corrupt data before sending
it out a physical interface.)
Correct this by tracking the TSO context independently of the non-TSO
context, and selecting the appropriate context based on the TSE flag
in each transmit data descriptor.
Signed-off-by: Ed Swierk <eswierk@skyportsystems.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
sum_needed and cptse flags are received from the guest within each
transmit data descriptor. They are not part of the offload context;
instead, they determine how to apply a previously received context to
the packet being transmitted:
- If cptse is set, perform both segmentation and checksum offload
using the parameters in the TSO context; otherwise just do checksum
offload. (Currently the e1000 device incorrectly stores only one
context, which will be fixed in a subsequent patch.)
- Depending on the bits set in sum_needed, possibly perform L4
checksum offload and/or IP checksum offload, using the parameters in
the appropriate context.
Move these flags out of struct e1000x_txd_props, which is otherwise
dedicated to storing values from a context descriptor, and into the
per-packet TX struct.
Signed-off-by: Ed Swierk <eswierk@skyportsystems.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The checksum algorithm used by IPv4, TCP and UDP allows a zero value
to be represented by either 0x0000 and 0xFFFF. But per RFC 768, a zero
UDP checksum must be transmitted as 0xFFFF because 0x0000 is a special
value meaning no checksum.
Substitute 0xFFFF whenever a checksum is computed as zero when
modifying a UDP datagram header. Doing this on IPv4 and TCP checksums
is unnecessary but legal. Add a wrapper for net_checksum_finish() that
makes the substitution.
(We can't just change net_checksum_finish(), as that function is also
used by receivers to verify checksums, and in that case the expected
value is always 0x0000.)
Signed-off-by: Ed Swierk <eswierk@skyportsystems.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Add INTERFACE_CONVENTIONAL_PCI_DEVICE to all direct subtypes of
TYPE_PCI_DEVICE, except:
1) The ones that already have INTERFACE_PCIE_DEVICE set:
* base-xhci
* e1000e
* nvme
* pvscsi
* vfio-pci
* virtio-pci
* vmxnet3
2) base-pci-bridge
Not all PCI bridges are Conventional PCI devices, so
INTERFACE_CONVENTIONAL_PCI_DEVICE is added only to the subtypes
that are actually Conventional PCI:
* dec-21154-p2p-bridge
* i82801b11-bridge
* pbm-bridge
* pci-bridge
The direct subtypes of base-pci-bridge not touched by this patch
are:
* xilinx-pcie-root: Already marked as PCIe-only.
* pcie-pci-bridge: Already marked as PCIe-only.
* pcie-port: all non-abstract subtypes of pcie-port are already
marked as PCIe-only devices.
3) megasas-base
Not all megasas devices are Conventional PCI devices, so the
interface names are added to the subclasses registered by
megasas_register_types(), according to information in the
megasas_devices[] array.
"megasas-gen2" already implements INTERFACE_PCIE_DEVICE, so add
INTERFACE_CONVENTIONAL_PCI_DEVICE only to "megasas".
Acked-by: Alberto Garcia <berto@igalia.com>
Acked-by: John Snow <jsnow@redhat.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Modify the pre_save method on VMStateDescription to return an int
rather than void so that it potentially can fail.
Changed zillions of devices to make them return 0; the only
case I've made it return non-0 is hw/intc/s390_flic_kvm.c that already
had an error_report/return case.
Note: If you add an error exit in your pre_save you must emit
an error_report to say why.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20170925112917.21340-2-dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
SunOS defines SEC in <sys/time.h> as 1 (commonly used time symbols).
This fixes build on SmartOS (Joyent).
Patch cherry-picked from pkgsrc by jperkin (Joyent).
Signed-off-by: Kamil Rytarowski <n54@gmx.com>
Reviewed-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Disable debug output by default, the information were not needed for
release.
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Stefan Hajnoczi <stefanha@gmail.com>
Cc: Leonid Bloch <leonid.bloch@ravellosystems.com>
Cc: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
This is a mostly-mechanical conversion that creates a new flat
union 'Netdev' QAPI type that covers all the branches of the
former 'NetClientOptions' simple union, where the branches are
now listed in a new 'NetClientDriver' enum rather than generated
from the simple union. The existence of a flat union has no
change to the command line syntax accepted for new code, and
will make it possible for a future patch to switch the QMP
command to parse a boxed union for no change to valid QMP; but
it does have some ripple effect on the C code when dealing with
the new types.
While making the conversion, note that the 'NetLegacy' type
remains unchanged: it applies only to legacy command line options,
and will not be ported to QMP, so it should remain a wrapper
around a simple union; to avoid confusion, the type named
'NetClientOptions' is now gone, and we introduce 'NetLegacyOptions'
in its place. Then, in the C code, we convert from NetLegacy to
Netdev as soon as possible, so that the bulk of the net stack
only has to deal with one QAPI type, not two. Note that since
the old legacy code always rejected 'hubport', we can just omit
that branch from the new 'NetLegacyOptions' simple union.
Based on an idea originally by Zoltán Kővágó <DirtY.iCE.hu@gmail.com>:
Message-Id: <01a527fbf1a5de880091f98cf011616a78adeeee.1441627176.git.DirtY.iCE.hu@gmail.com>
although the sed script in that patch no longer applies due to
other changes in the tree since then, and I also did some manual
cleanups (such as fixing whitespace to keep checkpatch happy).
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1468468228-27827-13-git-send-email-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Fixup from Eric squashed in]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Don't use *_to_cpup() to do byte-swapped loads; instead use
ld*_p() which correctly handle misaligned accesses.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Dmitry Fleytman <dmitry@daynix.com <mailto:dmitry@daynix.com>>
Message-id: 1466097446-981-6-git-send-email-peter.maydell@linaro.org
Since mit_delay can never be 0 this if statement is
superfluous.
Signed-off-by: Sameeh Jubran <sameeh@daynix.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Code that will be shared moved to a separate files.
Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
This reverts commit 9596ef7c7b.
This workaround in order to fix endless interrupts is no
longer needed because it was superseded by the previous patch
(e1000: Fixing interrupt pace).
Signed-off-by: Sameeh Jubran <sameeh@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
This patch introduces an upper bound for number of interrupts
per second. Without this bound an interrupt storm can occur as
it has been observed on Windows 10 when disabling the device.
According to the SPEC - Intel PCI/PCI-X Family of Gigabit
Ethernet Controllers Software Developer's Manual, section
13.4.18 - the Ethernet controller guarantees a maximum
observable interrupt rate of 7813 interrupts/sec. If there is
no upper bound this could lead to an interrupt storm by e1000
(when mit_delay < 500) causing interrupts to fire at a very high
pace.
Thus if mit_delay < 500 then the delay should be set to the
minimum delay possible which is 500. This can be calculated
easily as follows:
Interval = 10^9 / (7813 * 256) = 500.
Signed-off-by: Sameeh Jubran <sameeh@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The start_xmit() and e1000_receive_iov() functions implement DMA transfers
iterating over a set of descriptors that the guest's e1000 driver
prepares:
- the TDLEN and RDLEN registers store the total size of the descriptor
area,
- while the TDH and RDH registers store the offset (in whole tx / rx
descriptors) into the area where the transfer is supposed to start.
Each time a descriptor is processed, the TDH and RDH register is bumped
(as appropriate for the transfer direction).
QEMU already contains logic to deal with bogus transfers submitted by the
guest:
- Normally, the transmit case wants to increase TDH from its initial value
to TDT. (TDT is allowed to be numerically smaller than the initial TDH
value; wrapping at or above TDLEN bytes to zero is normal.) The failsafe
that QEMU currently has here is a check against reaching the original
TDH value again -- a complete wraparound, which should never happen.
- In the receive case RDH is increased from its initial value until
"total_size" bytes have been received; preferably in a single step, or
in "s->rxbuf_size" byte steps, if the latter is smaller. However, null
RX descriptors are skipped without receiving data, while RDH is
incremented just the same. QEMU tries to prevent an infinite loop
(processing only null RX descriptors) by detecting whether RDH assumes
its original value during the loop. (Again, wrapping from RDLEN to 0 is
normal.)
What both directions miss is that the guest could program TDLEN and RDLEN
so low, and the initial TDH and RDH so high, that these registers will
immediately be truncated to zero, and then never reassume their initial
values in the loop -- a full wraparound will never occur.
The condition that expresses this is:
xdh_start >= s->mac_reg[XDLEN] / sizeof(desc)
i.e., TDH or RDH start out after the last whole rx or tx descriptor that
fits into the TDLEN or RDLEN sized area.
This condition could be checked before we enter the loops, but
pci_dma_read() / pci_dma_write() knows how to fill in buffers safely for
bogus DMA addresses, so we just extend the existing failsafes with the
above condition.
This is CVE-2016-1981.
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Petr Matousek <pmatouse@redhat.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Prasad Pandit <ppandit@redhat.com>
Cc: Michael Roth <mdroth@linux.vnet.ibm.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: qemu-stable@nongnu.org
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1296044
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.
This commit was created with scripts/clean-includes.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1453832250-766-19-git-send-email-peter.maydell@linaro.org
Version: GnuPG v1
iQEcBAABAgAGBQJWZZJPAAoJEO8Ells5jWIRmp0H/26aFXVEgZykkUVNbqq05r7w
AI7podQlFOAESJHqZtR8FMaH8TAZ5GhphP4pn0PsWp54VjwcYZbdoME+dhZ4Elyc
WDanRHIweLv/zVg6+M8oHhw5GMaxtFLoLWrf0oanbUW9IZZmmM3COz/Y31hSVrR2
EzEJi1VZZhpMj3ibeOJns4MrugYrne8MtOdvusE/Uw2rJBTiStnWw1eTk8RmkNcg
5un1mQZxFU2AcNzmWdmWJmjY0rCnR3HhtTdZOwjM6uZGIJ9hbsItGzqiGadBfozI
fUtIa2HZahioe0VIzoB0snXnAuhV1jA0Uy18i04dPvgQOmiVSRjQNE2/lwQflyE=
=Pad3
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging
# gpg: Signature made Mon 07 Dec 2015 14:06:07 GMT using RSA key ID 398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211
* remotes/jasowang/tags/net-pull-request:
lan9118: log and ignore access to invalid registers, rather than aborting
lan9118: fix emulation of MAC address loaded bit in E2P_CMD register
vmxnet3: silence warning
pcnet: fix rx buffer overflow(CVE-2015-7512)
net: pcnet: add check to validate receive data size(CVE-2015-7504)
e1000: fix hang of win2k12 shutdown with flood ping
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
e1000 driver in Win2k12 is really well rotten. It 100% hangs on shutdown
of UP VM under flood ping. The guest checks card state and reinjects
itself interrupt in a loop. This is fatal for UP machine.
There is no good way to fix this misbehavior but to kludge it. The
emulation has interrupt throttling register aka ITR which limits
interrupt rate and allows the guest to proceed this phase.
There is no problem with this kludge for Linux guests - it adjust the
value of it itself.
On the other hand according to the initial research in
commit e9845f0985
Author: Vincenzo Maffione <v.maffione@gmail.com>
Date: Fri Aug 2 18:30:52 2013 +0200
e1000: add interrupt mitigation support
...
Interrupt mitigation boosts performance when the guest suffers from
an high interrupt rate (i.e. receiving short UDP packets at high packet
rate). For some numerical results see the following link
http://info.iet.unipi.it/~luigi/papers/20130520-rizzo-vm.pdf
this should also boost performance a bit.
See https://bugzilla.redhat.com/show_bug.cgi?id=874406 for additional
details.
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Vincenzo Maffione <v.maffione@gmail.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
This follows the previous patches, where support for migrating the
entire MAC registers' array, and some new MAC registers were introduced.
This patch introduces the e1000-specific boolean parameter
"extra_mac_registers", which is on by default. Setting it to off will
enable migration to older versions of QEMU, but will disable the read
and write access to the new registers, that were introduced since adding
the ability to migrate the entire MAC array.
Example for usage to enable backward compatibility and to disable the
new MAC registers:
qemu-system-x86_64 -device e1000,extra_mac_registers=off,... ...
As mentioned above, the default value is "on".
Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
This implements the following Statistic registers (various counters)
according to Intel's specs:
TSCTC GOTCL GOTCH GORCL GORCH MPRC BPRC RUC ROC
BPTC MPTC PTC... PRC...
PLEASE NOTE: these registers will not be active, nor will migrate, until
a compatibility flag will be set (in the next patch in this series).
Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Previously, if promiscuous unicast was enabled, a packet was received
straight away, even if it was a multicast or a broadcast packet. This
patch fixes that behavior, while making the filtering procedure a bit
more human-readable.
Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Previously, these 64-bit registers did not stick at their maximal
values when (and if) they reached them, as they should do, according to
the specs.
This patch introduces a function that takes care of such registers,
avoiding code duplication, making the relevant parts more compatible
with the QEMU coding style, while ensuring that in the unlikely case
of reaching the maximal value, the counter will stick there, as it
supposed to.
Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
According to Intel's specs, these counters (as the other Statistic
registers) stick at 0xffffffff when this maximal value is reached.
Previously, they would reset after the max. value.
Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
These registers appear in Intel's specs, but were not implemented.
These registers are now implemented trivially, i.e. they are initiated
with zero values, and if they are RW, they can be written or read by the
driver, or read only if they are R (essentially retaining their zero
values). For these registers no other procedures are performed.
For the trivially implemented Diagnostic registers, a debug warning is
produced on read/write attempts.
PLEASE NOTE: these registers will not be active, nor will migrate, until
a compatibility flag will be set (in a later patch in this series).
The registers implemented here are:
Transmit:
RW: AIT
Management:
RW: WUC WUS IPAV IP6AT* IP4AT* FFLT* WUPM* FFMT* FFVT*
Diagnostic:
RW: RDFH RDFT RDFHS RDFTS RDFPC PBM* TDFH TDFT TDFHS
TDFTS TDFPC
Statistic:
RW: FCRUC
R: RNBC TSCTFC MGTPRC MGTPDC MGTPTC RFC RJC SCC ECOL
LATECOL MCC COLC DC TNCRS SEC CEXTERR RLEC XONRXC
XONTXC XOFFRXC XOFFTXC
Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The array of uint8_t's which is introduced here, contains access metadata
about the MAC registers: if a register is accessible, but partly implemented,
or if a register requires a certain compatibility flag in order to be
accessed. Currently, 6 hypothetical flags are supported (3 exist for e1000
so far) but in the future, if more than 6 flags will be needed, the datatype
of this array can simply be swapped for a larger one.
This patch is intended to solve the following current problems:
1) In a scenario of migration between different versions of QEMU, which
differ by the MAC registers implemented in them, some registers need not to
be active if a compatibility flag is set, in order to preserve the machine's
state perfectly for the older version. Checking this for each register
individually, would create a lot of clutter in the code.
2) Some registers are (or may be) only partly implemented (e.g.
placeholders that allow reading and writing, but lack other functions).
In such cases it is better to print a debug warning on read/write attempts.
As above, dealing with this functionality on a per-register level, would
require longer and more messy code.
Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
This patch makes the migration of the entire array of MAC registers
possible during live migration. The entire array is just 128 KB long, so
practically no penalty should be felt when transmitting it, additionally
to the previously transmitted individual registers. The advantage here is
eliminating the need to introduce new vmstate subsections in the future,
when additional MAC registers will be implemented.
Backward compatibility is preserved by introducing a e1000-specific
boolean parameter (in a later patch), which will be on by default.
Setting it to off would enable migration to older versions of QEMU.
Additionally, this parameter will be used to control the access to the
extra MAC registers in the future.
Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
This fixes some alignment and cosmetic issues. The changes are made
in order that the following patches in this series will look like
integral parts of the code surrounding them, while conforming to the
coding style. Although some changes in unrelated areas are also made.
Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Instead of duplicating the "e1000-82540em" device model as "e1000",
make the latter an alias for the former.
Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com
Reviewed-by: Markus Armbruster <armbru@redhat.com>
While processing transmit descriptors, it could lead to an infinite
loop if 'bytes' was to become zero; Add a check to avoid it.
[The guest can force 'bytes' to 0 by setting the hdr_len and mss
descriptor fields to 0.
--Stefan]
Signed-off-by: P J P <pjp@fedoraproject.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 1441383666-6590-1-git-send-email-stefanha@redhat.com
e1000_can_receive() checks the link up status register bit. If the bit
is clear, packets will be queued and the peer may disable receive to
avoid wasting CPU reading packets that cannot be delivered. The queue
must be flushed once the link comes back up again.
This patch fixes broken e1000 receive with Mac OS X Snow Leopard guests
and tap networking. Flushing the queue invokes the async send callback,
which re-enables tap fd read.
Reported-by: Jonathan Liu <net147@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1435223885-12745-1-git-send-email-stefanha@redhat.com
We create optional sections with this patch. But we already have
optional subsections. Instead of having two mechanism that do the
same, we can just generalize it.
For subsections we just change:
- Add a needed function to VMStateDescription
- Remove VMStateSubsection (after removal of the needed function
it is just a VMStateDescription)
- Adjust the whole tree, moving the needed function to the corresponding
VMStateDescription
Signed-off-by: Juan Quintela <quintela@redhat.com>
It's detected by coverity.In is_vlan_packet s->mac_reg[VET] is
unsigned int but is dereferenced as a narrower unsigned short.
This may lead to unexpected results depending on machine
endianness.
Signed-off-by: Shannon Zhao <zhaoshenglong@huawei.com>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Message-id: 1426224119-8352-1-git-send-email-zhaoshenglong@huawei.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Convert the device models where initialization obviously can't fail.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
All NICs have a cleanup function that, in most cases, zeroes the pointer
to the NICState. In some cases, it frees data belonging to the NIC.
However, this function is never called except when exiting from QEMU.
It is not necessary to NULL pointers and free data here; the right place
to do that would be in the device's unrealize function, after calling
qemu_del_nic. Zeroing the NIC multiple times is also wrong for multiqueue
devices.
This cleanup function gets in the way of making the NetClientStates for
the NIC hold an object_ref reference to the object, so get rid of it.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Some guests seem to set BM for e1000 after
enabling RX.
If packets arrive in the window, device is wedged.
Probably works by luck on real hardware, work around
this by making can_receive depend on BM.
Tested-by: Gabriel Somlo <somlo@cmu.edu>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
On this way, we can assure the new bootindex take effect
during vm rebooting.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Add a qom property with the same name 'bootindex',
when we remove it form qdev property, things will
continue to work just fine, and we can use qom features
which are not supported by qdev property.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Memory changes for QOMification and automatic tracking of MR lifetime.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJT8et9AAoJEBvWZb6bTYbyIJAQAI3AlLSe27xWoUGfQUgWH30z
Rt/pShHz3BJMfQpD79JfTH8u6uBpkQmKtflerNT7FhXN9ULDzNq+b/jRtke8nkuy
ctCt05FhhK00rfWpUoRue4XiCuvbizBU7MK0DI3yCyNdXQyYnFvgnvsJtlqox8Zh
J5HZcBJEmdCiWBxq7UPk0qBitp4PqNoy7jlD/Ex3m7fJN5WK2cyspQIT9zmhehVn
B8Nwp+RitDDbXbwm0r18col5rFr/6Nj6+dW1gr+7sVJDLNsmJEqC2l3Kgk0wbPkG
Uqwbih29me9PC9/L1VLGHY0ApKDQ8JGE0GrYgEg162hbhoxEHkjjoHMhDUfV6Pj8
NkqcjjWl11UUhgkNqrGafayXbBVnOiEglxy8uXCeq14y9Xd/gjK9Fz6MQvRSOjms
PFmaKknhdmpxh0DuZmTix7WBmKim8zOiCE0/vrAPvwx5L+d1bn5xh6yQvtVjBMpU
Sru3Mhdm9bL9dUDBgOM/G6WCxSTVLBlExOblcYkQh03MfabD7bfplcrKYPXt5ull
Y8YLjqkoIfoy5t0ErvtlpdBJjeEz99JXU+wLQ6NYHnzwzTV+oUtSaEph14mAFOcY
XkFKdoPDI9PnyEfvy4193du8z/dSbhu7sWgHWbTCQyrcaNnSaVhlH43NUC+p23YN
8vfEsVLd1X7MFkDBUmWp
=M+/m
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
SCSI changes that enable sending vendor-specific commands via virtio-scsi.
Memory changes for QOMification and automatic tracking of MR lifetime.
# gpg: Signature made Mon 18 Aug 2014 13:03:09 BST using RSA key ID 9B4D86F2
# gpg: Good signature from "Paolo Bonzini <pbonzini@redhat.com>"
# gpg: aka "Paolo Bonzini <bonzini@gnu.org>"
* remotes/bonzini/tags/for-upstream:
mtree: remove write-only field
memory: Use canonical path component as the name
memory: Use memory_region_name for name access
memory: constify memory_region_name
exec: Abstract away ref to memory region names
loader: Abstract away ref to memory region names
tpm_tis: remove instance_finalize callback
memory: remove memory_region_destroy
memory: convert memory_region_destroy to object_unparent
ioport: split deletion and destruction
nic: do not destroy memory regions in cleanup functions
vga: do not dynamically allocate chain4_alias
sysbus: remove unused function sysbus_del_io
qom: object: move unparenting to the child property's release callback
qom: object: delete properties before calling instance_finalize
virtio-scsi: implement parse_cdb
scsi-block, scsi-generic: implement parse_cdb
scsi-block: extract scsi_block_is_passthrough
scsi-bus: introduce parse_cdb in SCSIDeviceClass and SCSIBusInfo
scsi-bus: prepare scsi_req_new for introduction of parse_cdb
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The function is empty after the previous patch, so remove it.
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Make phyreg_writeops responsible for actually writing their
respective phy registers, rather than rely on set_mdic() to
do it on their behalf.
The only current instance of phyreg_writeops is set_phy_ctrl();
modify it to write the register on its own, while also correctly
handling reserved and self-clearing bits.
have_autoneg() does not need to check for MII_CR_RESTART_AUTO_NEG,
since the only time the flag comes into play is during set_phy_ctrl(),
and, following this patch, never actually gets written to the phy
control register.
Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Also fix minor indentation issues in the surrounding code.
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Enable calling set_ics() from within e1000_autoneg_timer() without
the need for a forward declaration.
This patch contains no functional changes.
Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Generate a link status change interrupt once link auto-netotiation
is successfully completed. This does not affect Linux and Windows
(XP and 7 tested) in any way, but is needed by the stock OS X driver
(AppleIntel8254XEthernet.kext), which would otherwise fail to notice
the link status change event.
Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Using mii-tool (on F20-live), the following output is produced:
SIOCGMIIREG on ens3 failed: Input/output error
ens3: no autonegotiation, 1000baseT-FD flow-control, link ok
The first line (SIOCGMIIREG error) is due to mii-tool's inability
to read the PHY auto-negotiation expansion register.
On the second line, "no autonegotiation" is wrong, and caused by
the absence of a flag in the link partner ability register which
would indicate that our link partner has acked us. This flag is
listed as "reserved" in the Intel e1000 manual, but mii-tool uses
it as LPA_LPACK from /usr/include/linux/mii.h.
This patch adds read access to PHY_AUTONEG_EXP and defines the
link partner ack flag, allowing mii-tool to generate output as
normally expected:
ens3: negotiated 1000baseT-FD flow-control, link ok
Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch emulates auto-negotiation when the network link status
is modified externally (i.e. via "set_link <id> off/on").
Also, a couple of cleanup items:
- unset PHY status reg. AUTONEG_COMPLETE during link_down()
- set PHY status reg. AUTONEG_COMPLETE during autoneg_timer() only
if we actually brought the link up.
- group all checks for "can we, and should we autonegotiate?"
together for more clarity.
Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>