Commit Graph

134 Commits

Author SHA1 Message Date
Jason Wang
f6b26cf257 net: reduce the unnecessary memory allocation of multiqueue
Edivaldo reports a problem that the array of NetClientState in NICState is too
large - MAX_QUEUE_NUM(1024) which will wastes memory even if multiqueue is not
used.

Instead of static arrays, solving this issue by allocating the queues on demand
for both the NetClientState array in NICState and VirtIONetQueue array in
VirtIONet.

Tested by myself, with single virtio-net-pci device. The memory allocation is
almost the same as when multiqueue is not merged.

Cc: Edivaldo de Araujo Pereira <edivaldoapereira@yahoo.com.br>
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-02-27 16:10:47 +01:00
Jesse Larrew
14f9b664b3 hw/virtio-net.c: set config size using host features
Currently, the config size for virtio devices is hard coded. When a new
feature is added that changes the config size, drivers that assume a static
config size will break. For purposes of backward compatibility, there needs
to be a way to inform drivers of the config size needed to accommodate the
set of features enabled.

aliguori: merged in
 - hw/virtio-net: use existing macros to implement endof
 - hw/virtio-net: fix config_size data type

Signed-off-by: Jesse Larrew <jlarrew@linux.vnet.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-02-08 11:13:44 -06:00
Anthony Liguori
1e89ad5b00 virtio-net: pass host features to virtio_net_init
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-02-08 07:37:24 -06:00
Jason Wang
5f80080183 virtio-net: migration support for multiqueue
This patch add migration support for multiqueue virtio-net. Instead of bumping
the version, we conditionally send the info of multiqueue only when the device
support more than one queue to maintain the backward compatibility.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-02-01 11:03:03 -06:00
Jason Wang
fed699f9ca virtio-net: multiqueue support
This patch implements both userspace and vhost support for multiple queue
virtio-net (VIRTIO_NET_F_MQ). This is done by introducing an array of
VirtIONetQueue to VirtIONet.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-02-01 11:03:03 -06:00
Jason Wang
0c87e93e31 virtio-net: separate virtqueue from VirtIONet
To support multiqueue virtio-net, the first step is to separate the virtqueue
related fields from VirtIONet to a new structure VirtIONetQueue. The following
patches will add an array of VirtIONetQueue to VirtIONet based on this patch.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-02-01 11:03:03 -06:00
Jason Wang
a9f98bb5eb vhost: multiqueue support
This patch lets vhost support multiqueue. The idea is simple, just launching
multiple threads of vhost and let each of vhost thread processing a subset of
the virtqueues of the device. After this change each emulated device can have
multiple vhost threads as its backend.

To do this, a virtqueue index were introduced to record to first virtqueue that
will be handled by this vhost_net device. Based on this and nvqs, vhost could
calculate its relative index to setup vhost_net device.

Since we may have many vhost/net devices for a virtio-net device. The setting of
guest notifiers were moved out of the starting/stopping of a specific vhost
thread. The vhost_net_{start|stop}() were renamed to
vhost_net_{start|stop}_one(), and a new vhost_net_{start|stop}() were introduced
to configure the guest notifiers and start/stop all vhost/vhost_net devices.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-02-01 11:03:02 -06:00
Jason Wang
948ecf219c net: intorduce qemu_del_nic()
To support multiqueue nic, this patch separate the nic destructor from
qemu_del_net_client() to a new helper qemu_del_nic() since the mapping bettween
NiCState and NetClientState were not 1:1 in multiqueue. The following patches
would refactor this function to support multiqueue nic.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-02-01 11:03:01 -06:00
Jason Wang
cc1f0f4542 net: introduce qemu_get_nic()
To support multiqueue, this patch introduces a helper qemu_get_nic() to get
NICState from a NetClientState. The following patches would refactor this helper
to support multiqueue.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-02-01 11:03:00 -06:00
Jason Wang
b356f76de3 net: introduce qemu_get_queue()
To support multiqueue, the patch introduce a helper qemu_get_queue()
which is used to get the NetClientState of a device. The following patches would
refactor this helper to support multiqueue.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-02-01 11:02:55 -06:00
Jason Wang
ec45f08313 net: tap: using bool instead of bitfield
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-02-01 10:50:59 -06:00
Amos Kong
dd23454ba2 virtio-net: rename ctrl rx commands
This patch makes rx commands consistent with specification.

Signed-off-by: Amos Kong <akong@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-01-30 01:31:09 +02:00
Amos Kong
c1943a3f37 virtio-net: introduce a new macaddr control
In virtio-net guest driver, currently we write MAC address to
pci config space byte by byte, this means that we have an
intermediate step where mac is wrong. This patch introduced
a new control command to set MAC address, it's atomic.

VIRTIO_NET_F_CTRL_MAC_ADDR is a new feature bit for compatibility.

"mac" field will be set to read-only when VIRTIO_NET_F_CTRL_MAC_ADDR
is acked.

Signed-off-by: Amos Kong <akong@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-01-30 01:31:09 +02:00
Michael S. Tsirkin
921ac5d0f3 virtio-net: remove layout assumptions for ctrl vq
Virtio-net code makes assumption about virtqueue descriptor layout
(e.g. sg[0] is the header, sg[1] is the data buffer).

This patch makes code not rely on the layout of descriptors.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Amos Kong <akong@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-01-30 01:31:09 +02:00
Michael S. Tsirkin
41dc8a67c7 virtio-net: revert mac on reset
Once guest overrides virtio net primary mac,
it retains the value set until qemu exit.
This is inconsistent with standard nic behaviour.
To fix, revert the mac to the original value on reset.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-01-30 01:31:08 +02:00
Michael S. Tsirkin
f56a12475f vhost: backend masking support
Support backend guest notifier masking in vhost-net:
create eventfd at device init, when masked,
make vhost use that as eventfd instead of
sending an interrupt.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-01-07 19:42:23 +02:00
Michael S. Tsirkin
1830b80ff2 virtio-net: set/clear vhost_started in reverse order
As vhost started is cleared last thing on stop,
set it first things on start. This makes it
possible to use vhost_started while start is in
progress which is used by follow-up patches.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-01-07 19:42:23 +02:00
Paolo Bonzini
1de7afc984 misc: move include files to include/qemu/
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19 08:32:39 +01:00
Paolo Bonzini
1422e32db5 net: reorganize headers
Move public headers to include/net, and leave private headers in net/.
Put the virtio headers in include/net/tap.h, removing the multiple copies
that existed.  Leave include/net/tap.h as the interface for NICs, and
net/tap_int.h as the interface for OS-specific parts of the tap backend.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19 08:31:29 +01:00
Michael S. Tsirkin
ff3a8066e6 virtio-net: enable mrg buf header in tap on linux
Modern linux supports arbitrary header size,
which makes it possible to pass mrg buf header
to tap directly without iovec mangling.
Use this capability when it is there.

This removes the need to deal with it in
vhost-net as we do now.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-10-29 18:25:23 +02:00
Michael S. Tsirkin
6e371ab867 virtio-net: test peer header support at init time
There's no reason to query header support at random
times: at load or feature query.
Driver also might not query functions.
Cleaner to do it at device init.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-10-29 18:25:23 +02:00
Michael S. Tsirkin
e043ebc6f9 virtio-net: minor code simplification
During packet filtering, we can now use host hdr len
to offset incoming buffer unconditionally.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-10-29 18:25:23 +02:00
Michael S. Tsirkin
7b80d08efc virtio-net: simplify rx code
Remove code duplication using guest header length that we track.
Drop specific layout requirement for rx buffers: things work
using generic iovec functions in any case.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-10-29 18:25:23 +02:00
Michael S. Tsirkin
14761f9cf7 virtio-net: switch tx to safe iov functions
Avoid mangling iovec manually: use safe iov_*
functions.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-10-29 18:25:23 +02:00
Michael S. Tsirkin
c8d28e7e33 virtio-net: first s/g is always at start of buf
We know offset is 0, assert that.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-10-29 18:25:23 +02:00
Michael S. Tsirkin
280598b7a5 virtio-net: refactor receive_hdr
Now that we know host hdr length, we don't need to
duplicate the logic in receive_hdr: caller can
figure out the offset itself.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-10-29 18:25:23 +02:00
Michael S. Tsirkin
63c5872873 virtio-net: use safe iov operations for rx
Avoid magling iov manually: use safe iov operations
for processing packets incoming to guest.
This also removes the requirement for virtio header to
fit the first s/g entry exactly.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-10-29 18:25:23 +02:00
Michael S. Tsirkin
22cc84db6e virtio-net: avoid sg copy
Avoid tweaking iovec during receive. This removes
the need to copy the vector.
Note: we currently have an evil cast in work_around_broken_dhclient
and unfortunately this patch does not fix it - just
pushes the evil cast to another place.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-10-29 18:25:23 +02:00
Michael S. Tsirkin
e35e23f655 virtio-net: track host/guest header length
Tracking these in device state instead of
re-calculating on each packet. No functional
changes.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-10-29 18:25:22 +02:00
Juan Quintela
e398d61b47 virtio-net: use qemu_get_buffer() in a temp buffer
qemu_fseek() is known to be wrong.  Would be removed on the next
commit.  This code should never been used (value has been
MAC_TABLE_ENTRIES since 2009).

Signed-off-by: Juan Quintela <quintela@redhat.com>

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2012-10-17 18:34:59 +02:00
Amos Kong
9899148110 virtio-net: update nc.link_down in virtio_net_load()
nc.link_down could not be migrated, this patch updates link_down in
virtio_post_load() to keep it coincident with real link status.

Signed-off-by: Amos Kong <akong@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
2012-10-08 13:59:40 +02:00
Michael S. Tsirkin
40bad8f3de virtio-net: fix used len for tx
There is no out sg for TX, so used buf length for tx
should always be 0.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-09-28 12:16:27 +02:00
Paolo Bonzini
987a9b4800 net: notify iothread after flushing queue
virtio-net has code to flush the queue and notify the iothread
whenever new receive buffers are added by the guest.  That is
fine, and indeed we need to do the same in all other drivers.
However, notifying the iothread should be work for the network
subsystem.  And since we are at it we can add a little smartness:
if some of the queued packets already could not be delivered,
there is no need to notify the iothread.

Reported-by: Luigi Rizzo <rizzo@iet.unipi.it>
Cc: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Cc: Jan Kiszka <jan.kiszka@siemens.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Amos Kong <akong@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2012-09-14 08:40:31 +01:00
Stefan Hajnoczi
b20c6b9e47 net: Rename qemu_del_vlan_client() to qemu_del_net_client()
Another step in moving the vlan feature out of net core.  Users only
deal with NetClientState and therefore qemu_del_vlan_client() should be
named qemu_del_net_client().

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2012-08-01 13:32:10 +01:00
Stefan Hajnoczi
4e68f7a081 net: Rename VLANClientState to NetClientState
The vlan feature is no longer part of net core.  Rename VLANClientState
to NetClientState because net clients are not explicitly associated with
a vlan at all, instead they have a peer net client to which they are
connected.

This patch is a mechanical search-and-replace except for a few
whitespace fixups where changing VLANClientState to NetClientState
misaligned whitespace.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2012-08-01 13:32:10 +01:00
Laszlo Ersek
2be64a68ed hw, net: "net_client_type" -> "NetClientOptionsKind" (qapi-generated)
NET_CLIENT_TYPE_ -> NET_CLIENT_OPTIONS_KIND_

Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2012-07-23 11:55:18 +01:00
Michael Tokarev
dcf6f5e15e change iov_* function prototypes to be more appropriate
Reorder arguments to be more natural, readable and
consistent with other iov_* functions, and change
argument names, from:
 iov_from_buf(iov, iov_cnt, buf, iov_off, size)
to
 iov_from_buf(iov, iov_cnt, offset, buf, bytes)

The result becomes natural English:

 copy data to this `iov' vector with `iov_cnt'
 elements starting at byte offset `offset'
 from memory buffer `buf', processing `bytes'
 bytes max.

(Try to read the original prototype this way).

Also change iov_clear() to more general iov_memset()
(it uses memset() internally anyway).

While at it, add comments to the header file
describing what the routines actually does.

The patch only renames argumens in the header, but
keeps old names in the implementation.  The next
patch will touch actual code to match.

Now, it might look wrong to pay so much attention
to so small things.  But we've so many badly designed
interfaces already so the whole thing becomes rather
confusing or error prone.  One example of this is
previous commit and small discussion which emerged
from it, with an outcome that the utility functions
like these aren't well-understdandable, leading to
strange usage cases.  That's why I paid quite some
attention to this set of functions and a few
others in subsequent patches.

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2012-06-07 20:43:38 +04:00
Orit Wassermann
2a633c461e virtio: check virtio_load return code
Otherwise we crash on error.

Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Orit Wassermann <owasserm@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-05-21 15:40:50 -05:00
Anthony Liguori
f79f2bfc6a qdev: don't access name through info
We already have a QOM interface for this so let's use it.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-01-27 10:50:39 -06:00
Anthony Liguori
30fbb9fc7c qdev: move qdev->info to class
Right now, DeviceInfo acts as the class for qdev.  In order to switch to a
proper ObjectClass derivative, we need to ween all of the callers off of
interacting directly with the info pointer.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-01-27 10:50:34 -06:00
Anthony Liguori
7267c0947d Use glib memory allocation and free functions
qemu_malloc/qemu_free no longer exist after this commit.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-08-20 23:01:08 -05:00
Anthony Liguori
81e34a2401 Merge remote-tracking branch 'mst/for_anthony' into staging 2011-08-04 17:15:22 -05:00
Amit Shah
b52dfd71f3 virtio-net: don't use vdev after virtio_cleanup
virtio_cleanup() will be changed by the following patch to remove the
VirtIONet struct that gets allocated via virtio_common_init().  Ensure
we don't dereference the structure after calling the cleanup function.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2011-07-27 20:18:47 +03:00
Hannes Reinecke
348e7b8dcd iov: Update parameter usage in iov_(to|from)_buf()
iov_to_buf() has an 'offset' parameter, iov_from_buf() hasn't.
This patch adds the missing parameter to iov_from_buf().
It also renames the 'offset' parameter to 'iov_off' to
emphasize it's the offset into the iovec and not the buffer.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-07-18 16:06:27 +02:00
Paolo Bonzini
7447545544 change all other clock references to use nanosecond resolution accessors
This was done with:

    sed -i 's/qemu_get_clock\>/qemu_get_clock_ns/' \
        $(git grep -l 'qemu_get_clock\>' )
    sed -i 's/qemu_new_timer\>/qemu_new_timer_ns/' \
        $(git grep -l 'qemu_new_timer\>' )

after checking that get_clock and new_timer never occur twice
on the same line.  There were no missed occurrences; however, even
if there had been, they would have been caught by the compiler.

There was exactly one false positive in qemu_run_timers:

     -    current_time = qemu_get_clock (clock);
     +    current_time = qemu_get_clock_ns (clock);

which is of course not in this patch.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-03-21 09:23:23 +01:00
Stefan Hajnoczi
b46d97f2d2 virtio-net: Fix lduw_p() pointer argument of wrong size
A pointer to a size_t variable was passed as the void * pointer to
lduw_p() in virtio_net_receive().  Instead of acting on the 16-bit value
this caused failure on big-endian hosts.

Avoid this issue in the future by using stw_p() instead.  In general we
should use ld*_p() for loading from target memory and st*_p() for
storing to target memory anyway, not the other way around.

Also tighten up a correct use of lduw_p() when stw_p() should be used
instead in virtio_net_get_config().

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2011-03-03 23:33:26 +01:00
Michael S. Tsirkin
3299369856 vhost: disable on tap link down
qemu makes it possible to disable link at tap which is not communicated
to the guest but causes all packets to be dropped.

When vhost-net is enabled, vhost needs to be aware of both the virtio
link_down and the peer link_down. we switch to userspace emulation when
either is down.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reported-by: pradeep <psuriset@linux.vnet.ibm.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2011-02-20 18:06:21 +01:00
mst@redhat.com
5430a28fe4 vhost: force vhost off for non-MSI guests
When MSI is off, each interrupt needs to be bounced through the io
thread when it's set/cleared, so vhost-net causes more context switches and
higher CPU utilization than userspace virtio which handles networking in
the same thread.

We'll need to fix this by adding level irq support in kvm irqfd,
for now disable vhost-net in these configurations.

Added a vhostforce flag to force vhost-net back on.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-02-01 16:50:44 -06:00
Aurelien Jarno
44b15bc5c6 virtio-net: fix cross-endianness support
virtio-net used to work on cross-endianness configurations, but doesn't
anymore with recent guest kernels, as the new features don't handle
endianness correctly.

This patch fixes wrong conversion, and add missing ones to make
virtio-net working. Tested on the following configurations:
- i386 guest on x86_64 host
- ppc guest on x86_64 host
- i386 guest on mips host
- ppc guest on mips host

Cc: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2011-01-29 15:07:56 +01:00
Michael S. Tsirkin
85cf2a8d74 virtio: move vmstate change tracking to core
Move tracking vmstate change from virtio-net to virtio.c
as it is going to be used by virito-blk and virtio-pci
for the ioeventfd support.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2011-01-10 14:44:07 +02:00