The tap_has_vnet_hdr() and tap_has_vnet_hdr_len() functions used
to return int, even though they only return true/false values.
This patch changes the prototypes to return bool.
Signed-off-by: Vincenzo Maffione <v.maffione@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
error_is_set(&var) is the same as var != NULL, but it takes
whole-program analysis to figure that out. Unnecessarily hard for
optimizers, static checkers, and human readers. Dumb it down to
obvious.
Gets rid of several dozen Coverity false positives.
Note that the obvious form is already used in many places.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
assign_name() in net/net.c is using snprintf + g_strdup to get the same
result as g_strdup_printf.
Signed-off-by: Hani Benhabiles <kroosec@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This improves readability and simplifies the code.
Cc: Anthony Liguori <aliguori@amazon.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
When a link change occurs on a backend (like tap), we currently do
not propage such change to the nic. As a result, when someone turns
off a link on a tap device, for instance, then a guest doesn't see
that change and continues to try to send traffic or run DHCP even
though the lower-layer is disconnected. This is OK when the network
is set up as a HUB since the the guest may be connected to other HUB
ports too, but when it's set up as a netdev, it makes thinkgs worse.
The patch addresses this by setting the peers link down only when the
peer is not a HUBPORT device. With this patch, in the following config
-netdev tap,id=net0 -device e1000,mac=XXXXX,netdev=net0
when net0 link is turned off, the guest e1000 shows lower-layer link
down. This allows guests to boot much faster in such configurations.
With windows guest, it also allows the network to recover properly
since windows will not configure the link-local IPv4 address, and
when the link is turned on, the proper address address is configured.
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This patch adds support for a network backend based on netmap.
netmap is a framework for high speed packet I/O. You can use it
to build extremely fast traffic generators, monitors, software
switches or network middleboxes. Its companion software switch
VALE lets you interconnect virtual machines.
netmap and VALE are implemented as a non-intrusive kernel module,
support NICs from multiple vendors, are part of standard FreeBSD
distributions and available in source format for Linux too.
To compile QEMU with netmap support, use the following configure
options:
./configure [...] --enable-netmap --extra-cflags=-I/path/to/netmap/sys
where "/path/to/netmap" contains the netmap source code, available at
http://info.iet.unipi.it/~luigi/netmap/
The same webpage contains more information about the netmap project
(together with papers and presentations).
Signed-off-by: Vincenzo Maffione <v.maffione@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Do not return after net_hub_flush(). Always flush callee network client
incoming queue.
Signed-off-by: Sergey Fedorov <s.fedorov@samsung.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
[Assigning a multicast MAC address to a NIC leads to confusing behavior.
Reject multicast MAC addresses so users are alerted to their error
straight away.
The "net/eth.h" in6_addr rename prevents a name collision with
<netinet/in.h> on Linux.
-- Stefan]
Signed-off-by: Dmitry V. Krivenok <krivenok.dmitry@gmail.com>
Reviewed-by: Amos Kong <kongjianjun@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
SO_REUSEADDR should be avoided on Windows but is desired on other operating
systems. So instead of setting it we call socket_set_fast_reuse that will result
in the appropriate behaviour on all operating systems.
An exception to this rule are multicast sockets where it is sensible to have
multiple sockets listen on the same ip and port and we should set SO_REUSEADDR
on windows.
Signed-off-by: Sebastian Ottlik <ottlik@fzi.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Each networking client has a queue for packets that could not yet be
delivered to that client. Calling this queue "send_queue" is highly
confusing as it has nothing to to with packets send from this client but
to it. Avoid this confusing by renaming it to "incoming_queue".
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The following patch simplifies the *BSD tap/tun code and makes use of numbered
tap/tun interfaces on all *BSD OS's. NetBSD has a patch in their pkgsrc tree
to make use of this feature and DragonFly also supports this as well.
Signed-off-by: Brad Smith <brad@comstyle.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This is an autogenerated patch using scripts/switch-timer-api.
Switch the entire code base to using the new timer API.
Note this patch may introduce some line length issues.
Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
include/qemu/timer.h has no need to include main-loop.h and
doing so causes an issue for the next patch. Unfortunately
various files assume including timers.h will pull in main-loop.h.
Untangle this mess.
Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The macro g_assert_not_reached is a better self documenting replacement
for assert(0) or assert(false).
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Currently macvtap based macvlan device is working in promiscuous
mode, we want to implement mac-programming over macvtap through
Libvirt for better performance.
Design:
QEMU notifies Libvirt when rx-filter config is changed in guest,
then Libvirt query the rx-filter information by a monitor command,
and sync the change to macvtap device. Related rx-filter config
of the nic contains main mac, rx-mode items and vlan table.
This patch adds a QMP event to notify management of rx-filter change,
and adds a monitor command for management to query rx-filter
information.
Test:
If we repeatedly add/remove vlan, and change macaddr of vlan
interfaces in guest by a loop script.
Result:
The events will flood the QMP client(management), management takes
too much resource to process the events.
Event_throttle API (set rate to 1 ms) can avoid the events to flood
QMP client, but it could cause an unexpected delay (~1ms), guests
guests normally expect rx-filter updates immediately.
So we use a flag for each nic to avoid events flooding, the event
is emitted once until the query command is executed. The flag
implementation could not introduce unexpected delay.
There maybe exist an uncontrollable delay if we let Libvirt do the
real change, guests normally expect rx-filter updates immediately.
But it's another separate issue, we can investigate it when the
work in Libvirt side is done.
Michael S. Tsirkin: tweaked to enable events on start
Michael S. Tsirkin: fixed not to crash when no id
Michael S. Tsirkin: fold in patch:
"additional fixes for mac-programming feature"
Amos Kong: always notify QMP client if mactable is changed
Amos Kong: return NULL list if no net client supports rx-filter query
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Amos Kong <akong@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
reorder slirp config options. first check the dns-server-address,
then check the first-dhcp-address. the original code was comparing
the first-dhcp-address with the default dns-server-address, not
the configured dns-server-address.
Signed-off-by: Bas van Sisseren <bas@quarantainenet.nl>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
This patch forbid the following invalid parameters to tap:
1) fd and vhostfds were specified but vhostfd were not specified
2) vhostfds were specified but fds were not specified
3) fds and vhostfd were specified
For 1 and 2, net_init_tap_one() will still pass NULL as vhostfdname to
monitor_handle_fd_param(), which may crash the qemu.
Also remove the unnecessary has_fd check.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Stefan Hajnoczi <shajnocz@redhat.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Only tap->vhostfd were checked net_init_tap_one(), but tap->vhostfds were
forgot, this will lead qemu to ignore all fds passed by management through
vhostfds, and tries to create vhost_net device itself. Fix by adding this check
also.
Reportyed-by: Michal Privoznik <mprivozn@redhat.com>
Cc: Michal Privoznik <mprivozn@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
assign_name() creates a name MODEL.NUM, where MODEL is the client's model,
and NUM is the number of MODELs that already exist.
Markus added NIC naming for non-VLAN clients in commit 53e51d85.
commit d33d93b2 incorrectly added a judgement of net-hub. It caused
net clients created with -netdev get same names.
eg:
# qemu-upstream -device virtio-net-pci,netdev=h1 -netdev tap,id=h1 \
-device virtio-net-pci,netdev=h2 -netdev tap,id=h2 ..
(qemu) info network
virtio-net-pci.0: index=0,type=nic,model=virtio-net-pci,macaddr=52:54:00:12:34:56
\ h1: index=0,type=tap,ifname=tap0,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown
virtio-net-pci.0: index=0,type=nic,model=virtio-net-pci,macaddr=52:54:00:12:34:57
\ h2: index=0,type=tap,ifname=tap1,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown
This patch removed the check of nic-hub, and created unique names for
all net clients that have same model.
v2: update commitlog & comments
Signed-off-by: Amos Kong <akong@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Many of these should be cleaned up with proper qdev-/QOM-ification.
Right now there are many catch-all headers in include/hw/ARCH depending
on cpu.h, and this makes it necessary to compile these files per-target.
However, fixing this does not belong in these patches.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
chardev-frontends need to explictly check, increase and decrement the
avail_connections "property" of the chardev when they are not using a
qdev-chardev-property for the chardev.
This fixes things like:
qemu-kvm -chardev stdio,id=foo -device isa-serial,chardev=foo \
-mon chardev=foo
Working, where they should fail. Most of the changes here are due to
old hardware emulation code which is using serial_hds directly rather then
a qdev-chardev-property.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Message-id: 1364412581-3672-3-git-send-email-hdegoede@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
There are several code paths in net_init_socket() depending on how the
socket is created: file descriptor passing, UDP multicast, TCP, or UDP.
Some of these support both listen and connect.
Not all code paths set the socket to non-blocking. This patch addresses
the file descriptor passing and UDP cases which were missing
socket_set_nonblock(fd) calls.
I considered moving socket_set_nonblock(fd) to a central location but it
turns out the code paths are different enough to require non-blocking at
different places.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
The fcntl(fd, F_SETFL, O_NONBLOCK) flag is not specific to sockets.
Rename to qemu_set_nonblock() just like qemu_set_cloexec().
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Socket buffer sizes were hard-coded to 4K for VDE and socket netdevs. Bump this
up to 68K (ala tap netdev) to handle maximum GSO packet size (64k) plus plenty
of room for the ethernet and virtio_net headers.
Originally, ran into this limitation when using -netdev UDP sockets to connect
VM-to-VM, where VM interface is configure with MTU=9000. (Using virtio_net
NIC model). Test is simple: ping -M do -s 8500 <target>. This test will
attempt to ping with unfragmented packet of given size. Without patch, size
is limited to < 4K (minus protocol hdrs). With patch, ping test works with pkt
size up to 9000 (again, minus protocol hdrs).
v2: per Stefan, increase buf size to (4096+65536) as done in tap and apply
to vde and socket netdevs.
v1: increase buf size to 12K just for -netdev UDP sockets
Signed-off-by: Scott Feldman <sfeldma@cumulusnetworks.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
net_checksum_add_cont()
checksum calculation for scattered data with odd chunk sizes
net_raw_checksum()
checksum calculation for a buffer
Signed-off-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Yan Vugenfirer <yan@daynix.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reduce -netdev socket latency by disabling the Nagle algorithm on
SOCK_STREAM sockets in net/socket.c. Since we are tunelling Ethernet
over TCP we shouldn't artificially delay outgoing packets, let the guest
decide packet scheduling.
I already get sub-millisecond -netdev socket ping times on localhost, so
there was no measurable difference in my testing. This won't hurt
though and may improve remote socket performance.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Fix various typos and misspellings. The bulk of these were found with
codespell.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Instead of adding missing type casts which are needed by MinGW for the
4th argument, the patch uses qemu_setsockopt which was invented for this
purpose.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Edivaldo reports a problem that the array of NetClientState in NICState is too
large - MAX_QUEUE_NUM(1024) which will wastes memory even if multiqueue is not
used.
Instead of static arrays, solving this issue by allocating the queues on demand
for both the NetClientState array in NICState and VirtIONetQueue array in
VirtIONet.
Tested by myself, with single virtio-net-pci device. The memory allocation is
almost the same as when multiqueue is not merged.
Cc: Edivaldo de Araujo Pereira <edivaldoapereira@yahoo.com.br>
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
historically the kernel queues packets two times. once
at the device and second in qdisc. this is believed to cause
interface stalls if one of these queues overruns.
setting IFF_ONE_QUEUE is the default in kernels >= 3.8. the
flag is ignored since then. see kernel commit
5d097109257c03a71845729f8db6b5770c4bbedc
Signed-off-by: Peter Lieven <pl@kamp.de>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Obviously, hub does not support multiqueue tap. So this patch forbids creating
multiple queue tap when hub is used to prevent the crash when command line such
as "-net tap,queues=2" is used.
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
In the current implementation of qemu, running without a network
backend will cause the queue to grow unbounded when the guest is
transmitting traffic.
This patch fixes the problem by implementing bounded size NetQueue,
used with an arbitrary limit of 10000 packets, and dropping packets
when the queue is full _and_ the sender does not pass a callback.
The second condition makes sure that we never drop packets that
contains a callback (which would be tricky, because the producer
expects the callback to be run when all previous packets have been
consumed; so we cannot run it when the packet is dropped).
If documentation is correct, producers that submit a callback should
stop sending when their packet is queued, so there is no real risk
that the queue exceeds the max size by large values.
Signed-off-by: Luigi Rizzo <rizzo@iet.unipi.it>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
When frontend and backend are connected through a hub as below
(showing only one direction), and the frontend (or in general, all
output ports of the hub) cannot accept more traffic, the backend
queues packets in queue-A.
When the frontend (or in general, one output port) becomes ready again,
quemu tries to flush packets from queue-B, which is unfortunately empty.
e1000.0 <--[queue B]-- hub0port0(hub)hub0port1 <--[queue A]-- tap.0
To fix this i propose to introduce a new function net_hub_flush()
which is called when trying to flush a queue connected to a hub.
Signed-off-by: Luigi Rizzo <rizzo@iet.unipi.it>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The pSeries machine and some other devices don't supply a cleanup
callback. Revert part of 1ceef9f273 that
started calling it unconditionally.
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Message-id: 1360707366-9271-1-git-send-email-afaerber@suse.de
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
1ceef9f273 added handling for cleaning
up multiple queues in qemu_del_nic() for cases where multiqueue is in
use. To determine the number of queues it looks at nic->conf->queues,
then iterates through all the queues to cleanup the associated
NetClientStates. If no queues are found, no NetClientStates are deleted.
However, nic->conf->queues is only set when a peer is created via
-netdev or netdev_add, and is otherwise 0. This causes us to spin in
net_cleanup() if we attempt to shut down qemu before adding a host
device.
Since qemu_new_nic() unconditionally creates at least 1
queue/NetClientState at queue idx 0, make qemu_del_nic() always attempt
to clean it up.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
The multiqueue patch series broke -netdev tap,fd=X which manifests
as libvirt not being able to start a guest. This was because it
passed NULL for the netdev name which results in an anonymous netdev
device regardless of what the user specified.
Cc: Jason Wang <jasowang@redhat.com>
Cc: Bruce Rogers <brogers@suse.com>
Reported-by: Bruce Rogers <brogers@suse.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Recently, linux support multiqueue tap which could let userspace call TUNSETIFF
for a signle device many times to create multiple file descriptors as
independent queues. User could also enable/disabe a specific queue through
TUNSETQUEUE.
The patch adds the generic infrastructure to create multiqueue taps. To achieve
this a new parameter "queues" were introduced to specify how many queues were
expected to be created for tap by qemu itself. Alternatively, management could
also pass multiple pre-created tap file descriptors separated with ':' through a
new parameter fds like -netdev tap,id=hn0,fds="X:Y:..:Z". Multiple vhost file
descriptors could also be passed in this way.
Each TAPState were still associated to a tap fd, which mean multiple TAPStates
were created when user needs multiqueue taps. Since each TAPState contains one
NetClientState, with the multiqueue nic support, an N peers of NetClientState
were built up.
A new parameter, mq_required were introduce in tap_open() to create multiqueue
tap fds.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patch introduces a helper tap_get_ifname() to get the device name of tap
device. This is needed when ifname is unspecified in the command line and qemu
were asked to create tap device by itself. In this situation, the name were
allocated by kernel, so if multiqueue is asked, we need to fetch its name after
creating the first queue.
Only linux has this support since it's the only platform that supports
multiqueue tap.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patch introduce a new bit - enabled in TAPState which tracks whether a
specific queue/fd is enabled. The tap/fd is enabled during initialization and
could be enabled/disabled by tap_enalbe() and tap_disable() which calls platform
specific helpers to do the real work. Polling of a tap fd can only done when
the tap was enabled.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patch add basic multiqueue support for Linux. When multiqueue is needed, we
will first check whether kernel support multiqueue tap before creating more
queues. Two new functions tap_fd_enable() and tap_fd_disable() were introduced
to enable and disable a specific queue. Since the multiqueue is only supported
in Linux, return error on other platforms.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patch factors out the common initialization of tap into a new helper
net_init_tap_one(). This will be used by multiqueue tap patches.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Import multiqueue constants from if_tun.h from 3.8-rc3. A new ifr flag
IFF_MULTI_QUEUE were introduced to create a multiqueue backend by calling
TUNSETIFF with the this flag and with the same interface name many times.
A new ioctl TUNSETQUEUE were introduced. When doing this ioctl with
IFF_DETACH_QUEUE, the queue were disabled in the linux kernel. When doing this
ioctl with IFF_ATTACH_QUEUE, the queue were enabled in the linux kernel.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patch adds basic multiqueue support for qemu. The idea is simple, an array
of NetClientStates were introduced in NICState, parse_netdev() were extended to
find and match all NetClientStates belongs to the backend and place their
pointers in NICConf. Then qemu_new_nic can setup a N:N mapping between NICStates
that belongs to a nic and NICStates belongs to the netdev. And a queue_index
were introduced in NetClientState to track its index. After this, each peers of
a NICState were abstracted as a queue.
After this change, all NetClientState that belongs to the same backend/nic has
the same id. When use want to change the link status, all NetClientStates that
belongs to the same backend/nic will be also changed. When user want to delete
a device or netdev, all NetClientStates that belongs to the same backend/nic
will be deleted also. Changing or deleting an specific queue is not allowed.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
To allow allocating an array of NetClientState and free it once, this patch
introduces destructor of NetClientState. Which could do type specific free,
which could be used by multiqueue to free the array once.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patch separates the setup of NetClientState from its allocation, this will
allow allocating an arrays of NetClientState and does the initialization one by
one which is what multiqueue needs.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
In multiqueue, all NetClientState that belongs to the same netdev or nic has the
same id. So this patches introduces an helper qemu_find_net_clients_except()
which finds all NetClientState with the same id. This will be used by multiqueue
networking.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
To support multiqueue nic, this patch separate the nic destructor from
qemu_del_net_client() to a new helper qemu_del_nic() since the mapping bettween
NiCState and NetClientState were not 1:1 in multiqueue. The following patches
would refactor this function to support multiqueue nic.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
To support multiqueue, this patch introduces a helper qemu_get_nic() to get
NICState from a NetClientState. The following patches would refactor this helper
to support multiqueue.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
To support multiqueue, the patch introduce a helper qemu_get_queue()
which is used to get the NetClientState of a device. The following patches would
refactor this helper to support multiqueue.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patch change all info call back function to take
additional QDict * parameter, which allow those command
take parameter. Now it is set to NULL at default case.
Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
To fix building error:
CC net/vde.o
net/vde.c: In function ‘vde_cleanup’:
net/vde.c:65:5: error: implicit declaration of function ‘qemu_set_fd_handler’ [-Werror=implicit-function-declaration]
net/vde.c:65:5: error: nested extern declaration of ‘qemu_set_fd_handler’ [-Werror=nested-externs]
cc1: all warnings being treated as errors
Signed-off-by: Liming Wang <walimisdev@gmail.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
These and some more compiler warnings were caused by a recent commit:
net/tap-win32.c:724: warning: no previous prototype for ‘tap_has_ufo’
net/tap-win32.c:729: warning: no previous prototype for ‘tap_has_vnet_hdr’
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
* bonzini/header-dirs: (45 commits)
janitor: move remaining public headers to include/
hw: move executable format header files to hw/
fpu: move public header file to include/fpu
softmmu: move remaining include files to include/ subdirectories
softmmu: move include files to include/sysemu/
misc: move include files to include/qemu/
qom: move include files to include/qom/
migration: move include files to include/migration/
monitor: move include files to include/monitor/
exec: move include files to include/exec/
block: move include files to include/block/
qapi: move include files to include/qobject/
janitor: add guards to headers
qapi: make struct Visitor opaque
qapi: remove qapi/qapi-types-core.h
qapi: move inclusions of qemu-common.h from headers to .c files
ui: move files to ui/ and include/ui/
qemu-ga: move qemu-ga files to qga/
net: reorganize headers
net: move net.c to net/
...
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Move public headers to include/net, and leave private headers in net/.
Put the virtio headers in include/net/tap.h, removing the multiple copies
that existed. Leave include/net/tap.h as the interface for NICs, and
net/tap_int.h as the interface for OS-specific parts of the tap backend.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Touching char/char.h basically causes the whole of QEMU to
be rebuilt. Avoid this, it is usually unnecessary.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Remove some redundant blanks in the comments of
net_hub_id_for_client().
Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
For tap, we currently assume the vnet header size is 10
(the default value) but that might not be the case
if tap is persistent and has been used by qemu previously.
To fix, set vnet header size correctly on open.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
For tap, we currently assume the vnet header size is 10
(the default value) but that might not be the case
if tap is persistent and has been used by qemu previously.
To fix, set host header size in tap device on open.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This patch will allow the user to include the domain-search option in
replies from the built-in DHCP server. The domain suffixes can be
specified by adding dnssearch= entries to the "-net user" parameter.
[Jan: tiny style adjustments]
Signed-off-by: Klaus Stengel <Klaus.Stengel@asamnet.de>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Fix the problem that can not delete the udp socket.
It's caused by passing "udp" model to net_socket_udp_init,
but we do not have "udp" model in our model list.
Pass the right model "socket" to init function.
https://bugs.launchpad.net/qemu/+bug/1073585?comments=all
Signed-off-by: Lei Li <lilei@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Add missing stubs to win32 to fix link failure.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
The include file for net_init_tap was missing:
net/tap-win32.c:703:
warning: no previous prototype for ‘net_init_tap’
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This patch doesn't seem much useful alone, I must admit. However,
it makes sense as part of the upcoming directory reorganization,
where I want to have include/net/tap.h as the net<->hw interface
for tap. Then having both net/tap.h and include/net/tap.h does
not work. "Fixed" by moving all the init functions to a single
header file net/clients.h.
The patch also adopts a uniform style for including net/*.h files
from net/*.c, without the net/ path.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
Commit 213fd5087e removed a type cast
which is needed for MinGW:
net/socket.c:136: warning:
pointer targets in passing argument 2 of ‘sendto’ differ in signedness
/usr/lib/gcc/amd64-mingw32msvc/4.4.4/../../../../amd64-mingw32msvc/include/winsock2.h:1313: note:
expected ‘const char *’ but argument is of type ‘const uint8_t *’
Add a 'qemu_sendto' macro which provides that type cast where needed
and use the new macro instead of 'sendto'.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
Replace spinning send_all() with a proper non-blocking send. When the
socket write buffer limit is reached, we should stop trying to send and
wait for the socket to become writable again.
Non-blocking TCP sockets can return in two different ways when the write
buffer limit is reached:
1. ret = -1 and errno = EAGAIN/EWOULDBLOCK. No data has been written.
2. ret < total_size. Short write, only part of the message was
transmitted.
Handle both cases and keep track of how many bytes have been written in
s->send_index. (This includes the 'length' header before the actual
payload buffer.)
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Implement asynchronous send for UDP (or other SOCK_DGRAM) sockets. If
send fails with EAGAIN we wait for the socket to become writable again.
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
The net/socket.c net client is not truly asynchronous. This patch
borrows the qemu_set_fd_handler2() code from net/tap.c as the basis for
proper asynchronous send/receive.
Only read packets from the socket when the peer is able to receive.
This avoids needless queuing.
Later patches implement asynchronous send.
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
In commit 60c07d933c ("net: fix
qemu_can_send_packet logic") the "VLAN" broadcast behavior was changed
to queue packets if any net client cannot receive. It turns out that
this was not actually the right fix and just hides the real bug that
hw/usb/dev-network.c:usbnet_receive() clobbers its receive buffer when
called multiple times in a row. The commit also introduced a new bug
that "VLAN" packets would not be sent if one of multiple net clients was
down.
The hw/usb/dev-network.c bug has since been fixed, so this patch reverts
broadcast behavior to send packets as long as one net client can
receive. Packets simply get queued for the net clients that are
temporarily unable to receive.
Reported-by: Roy.Li <rongqing.li@windriver.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Net send functions have a return value where 0 means the packet has not
been sent and will be queued. A non-zero value means the packet was
sent or an error caused the packet to be dropped.
This patch fixes two instances where packets are queued but we return
their size. This causes callers to believe the packets were sent. When
the caller uses the async send interface this creates a real problem
because the callback will be invoked for a packet that the caller
believed to be already sent. This bug can cause double-frees in the
caller.
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
virtio-net has code to flush the queue and notify the iothread
whenever new receive buffers are added by the guest. That is
fine, and indeed we need to do the same in all other drivers.
However, notifying the iothread should be work for the network
subsystem. And since we are at it we can add a little smartness:
if some of the queued packets already could not be delivered,
there is no need to notify the iothread.
Reported-by: Luigi Rizzo <rizzo@iet.unipi.it>
Cc: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Cc: Jan Kiszka <jan.kiszka@siemens.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Amos Kong <akong@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
This patch renames+moves the net_handle_fd_param() caller used to
obtain a file descriptor from either qemu_parse_fd() (the normal case)
or from monitor_get_fd() (migration case) into a generically prefixed
monitor_handle_fd_param() to be used by vhost-scsi code.
Also update net/[socket,tap].c consumers to use the new prefix.
Reported-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Clang compiler complained about use of reserved word 'restrict' in SLIRP
and QAPI.
Prefix C keywords with "q_", adjust SLIRP accordingly.
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
The -net socket,listen option does not work with the newer -netdev
syntax:
http://lists.gnu.org/archive/html/qemu-devel/2011-11/msg01508.html
This patch makes it work now.
For the case where one vlan has multiple listenning sockets,
the patch will also provide the support.
Supported syntax:
1.) -net socket,listen=127.0.0.1:1234,vlan=0
2.) -net socket,listen=127.0.0.1:1234,vlan=0 -net socket,listen=127.0.0.1:1235,vlan=0
3.) -netdev socket,listen=127.0.0.1:1234,id=socket0
Drop the NetSocketListenState struct and add a listen_fd field
to NetSocketState. When a -netdev socket,listen= instance is created
there will be a NetSocketState with fd=-1 and a valid listen_fd. The
net_socket_accept() handler waits for listen_fd to become readable and
then accepts the connection. When this state transition happens, we no
longer monitor listen_fd for incoming connections...until the client
disconnects again.
Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Only when all other hub port's *peer* .can_receive() all return 1,
the source hub port .can_receive() return 1.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Another step in moving the vlan feature out of net core. Users only
deal with NetClientState and therefore qemu_del_vlan_client() should be
named qemu_del_net_client().
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Now that VLANClientState has been renamed to NetClientState all 'vc'
local variables should be 'nc'. Much of the code already used 'nc' but
there are places where 'vc' needs to be renamed.
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
The vlan feature is no longer part of net core. Rename VLANClientState
to NetClientState because net clients are not explicitly associated with
a vlan at all, instead they have a peer net client to which they are
connected.
This patch is a mechanical search-and-replace except for a few
whitespace fixups where changing VLANClientState to NetClientState
misaligned whitespace.
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Instead of using VLANState use net/hub.h to support the vlan qdev
property. The vlan qdev property becomes an alias for the peer qdev
property but is represented as a VLAN ID number. When a VLAN ID is
selected the device will really peer with a hub port.
Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Since hubs are now used to implement the 'vlan' feature and the vlan
argument is always NULL, remove the argument entirely and update all net
clients that use qemu_new_net_client().
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Checks can be performed to make sure that hubs have at least one NIC and
one host device, warning the user if this is not the case.
Configurations which do not meet this rule tend to be broken but just
emit a warning. This patch preserves compatibility with the checks
performed by net core on vlans.
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Stop using the special-case vlan code in net.c. Instead use the hub net
client to implement the vlan feature. The next patch will remove vlan
code from net.c completely.
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
The vlan feature can be implemented in terms of hubs. By introducing a
hub net client it becomes possible to remove the special case vlan code
from net.c and push the vlan feature out of generic networking code.
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
v1->v2:
- NetdevVdeOptions::port and ::mode are of type uint16. Remove superfluous
range checks.
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
I "reverse engineered" the following permissions between the -socket
sub-options:
fd listen connect mcast udp | localaddr
fd x . . . . | .
listen . x . . . | .
connect . . x . . | .
mcast . . . x . | x
udp . . . . x | x
-------------------------------------------+
localaddr . . . x x x
I transformed the code accordingly. The real fix would be to embed "fd",
"listen", "connect", "mcast" and "udp" in a separate union. However
OptsVisitor's enum parser only supports the type=XXX QemuOpt instance as
union discriminator.
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
v1->v2:
- NetdevDumpOptions::len is of type 'size', whose C type was changed to
uint64_t. Adapt the printf() format specifier macro.
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
The net_client_init() prototype is kept intact.
Based on "is_netdev", the QemuOpts-rooted QemuOpt-list is parsed as a
Netdev or a NetLegacy. The original meat of net_client_init() is moved to
and simplified in net_client_init1():
Fields not common between -net and -netdev are clearly separated. Getting
the name for the init functions is cleaner: Netdev::id is mandatory, and
all init functions handle a NULL NetLegacy::name. NetLegacy::vlan
explicitly depends on -net (see below).
Verifying the "type=" option for -netdev can be turned into a switch.
Format validation with qemu_opts_validate() can be removed because the
visitor covers it. Relatedly, the "net_client_types" array is reduced to
an array of init functions that can be directly indexed by opts->kind.
(Help text is available in the schema JSON.)
The outermost negation in the condition around qemu_find_vlan() was
flattened, because it expresses the dependent code's requirements more
clearly.
VLAN lookup is avoided if there's no init function to pass the VLAN to.
Whenever the value of type=... is needed, we substitute
NetClientOptionsKind_lookup[kind].
The individual init functions are not converted yet, thus the original
QemuOpts instance is passed transparently.
v1->v2:
- NetLegacy::name is optional. Tracked it through all init functions: they
all handle a NULL name. Updated commit message accordingly.
v2->v3:
- NetLegacy::id is allowed and takes precedence over NetLegacy::name.
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Users may pass the following parameters to qemu:
$ qemu-kvm -net nic -net user,smb= ...
$ qemu-kvm -net nic -net user,smb ...
$ qemu-kvm -net nic -net user,smb=bad_directory ...
In these cases, qemu started successfully while samba server
failed to start. Users will confuse since samba server
failed silently without any indication of what it did wrong.
To avoid it, we check whether the shared directory exist and
if users have permission to access this directory when QEMU's
"built-in" SMB server is enabled.
Signed-off-by: Dunrong Huang <riegamaths@gmail.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
When using guestfwd=, Qemu only connects the virtual server's TCP port
to a single chardev. This is useless in most cases, as we usually want
to have more than a single connection from the guest to the outside world.
This patch adds a new cmd: target to guestfwd= that allows for execution
of a command on every TCP connection. This leverages the same code as
the -smb parameter, just that here the command is user defined.
Reported-by: Sascha Wilde <wilde@intevation.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Windows 7 (and possibly other versions) cannot connect to the samba
share if the exported host directory is not world-readable. This can be
resolved by forcing the username used for access checks to the one
under which QEMU and smbd are running.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
This is needed to get file descriptors from SCM_RIGHTS.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
<libutil.h> and <util.h> on *BSD (some have one, some another)
were #included just for openpty() declaration. The only file
where this function is actually used is qemu-char.c.
In vl.c and net/tap-bsd.c, none of functions declared in libutil.h
(login logout logwtmp timdomain openpty forkpty uu_lock realhostname
fparseln and a few others depending on version) are used.
Initially the code which is currently in qemu-char.c was in vl.c,
it has been removed into separate file in commit 0e82f34d07
Fri Oct 31 18:44:40 2008, but the #includes were left in vl.c.
So with vl.c, we just remove includes - libutil.h, util.h and
pty.h (which declares only openpty() and forkpty()) from there.
The code in net/tap-bsd.c, which come from net/tap.c, had this
commit 5281d757ef
Author: Mark McLoughlin <markmc@redhat.com>
Date: Thu Oct 22 17:49:07 2009 +0100
net: split all the tap code out into net/tap.c
Note this commit not only moved stuff out of net.c to net/tap.c,
but also rewrote large portions of the tap code, and added these
completely unnecessary #includes -- as usual, I question why such
a misleading commit messages are allowed.
Again, no functions defined in libutil.h or util.h on *BSD are
used by neither net/tap.c nor net/tap-bsd.c. Removing them.
And finally, the only real user for these #includes, qemu-char.c,
which actually uses openpty(). There, the #ifdef logic is wrong.
A GLIBC-based system has <pty.h>, even if it is a variant of *BSD.
So __GLIBC__ should be checked first, and instead of trying to
include <libutil.h> or <util.h>, we include <pty.h>. If it is not
GLIBC-based, we check for variations between <*util.h> as before.
This patch fixes build of qemu 1.1 on Debian/kFreebsd (well, one
of the two problems): it is a distribution with a FreeBSD kernel,
so it #defines at least __FreeBSD_kernel__, but since it is based
on GLIBC, it has <pty.h>, but current version does not have neither
<util.h> nor <libutil.h>, which the code tries to include 3 times
but uses only once.
Signed-off-By: Michael Tokarev <mjt@tls.msk.ru>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
The only backend that really uses it is the socket one, which calls
monitor_get_fd(). But it can use 'cur_mon' instead.
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-By: Laszlo Ersek <lersek@redhat.com>
The smb.conf generated by the userspace networking does not include a state directory
directive. Samba therefore falls back to the default value. Since the user generally
does not have write access to this path, smbd immediately crashes.
The "state directory" option was added in Samba 3.4.0 (commit
http://gitweb.samba.org/?p=samba.git;a=commit;h=7b02e05eb64f3ffd7aa1cf027d10a7343c0da757).
This patch adds the missing option.
Signed-off-by: Nikolaus Rath <Nikolaus@rath.org>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
The "smb ports = 0" option causes recent samba versions to crash. It was
introduced in commit 157777ef3e with log message "Samba 3 support".
However, a value of 0 has never been officially supported by smb and is
also not necessary: if stdin is a socket, smb does not try to listen on
any ports and uses just stdin. This is necessary to support inetd based
operation (otherwise smbd would always fail when called from inetd,
because inetd already listens on the SMB port). Since samba has
supported inetd operation since pre-3.x, it should be safe to rely on
this feature. I have tested it with Samba 3.6.4 -- communication works
fine, and smbd is not listening on any ports.
I suspect the "smb ports = 0" hack may have been introduced when someone
tested the qemu generated samba config from the command line with "smbd
-i" and found it to fail (because then stdin isn't a socket).
Signed-off-by: Nikolaus Rath <Nikolaus@rath.org>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
This file only contains code from Red Hat, so it can use GPLv2+.
Tested with `git blame -M -C net/checksum.c`.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
The most common use of -net tap is to connect a tap device to a bridge. This
requires the use of a script and running qemu as root in order to allocate a
tap device to pass to the script.
This model is great for portability and flexibility but it's incredibly
difficult to eliminate the need to run qemu as root. The only really viable
mechanism is to use tunctl to create a tap device, attach it to a bridge as
root, and then hand that tap device to qemu. The problem with this mechanism
is that it requires administrator intervention whenever a user wants to create
a guest.
By essentially writing a helper that implements the most common qemu-ifup
script that can be safely given cap_net_admin, we can dramatically simplify
things for non-privileged users. We still support existing -net tap options
as a mechanism for advanced users and backwards compatibility.
Currently, this is very Linux centric but there's really no reason why it
couldn't be extended for other Unixes.
A typical invocation would be similar to one of the following:
qemu linux.img -net bridge -net nic,model=virtio
qemu linux.img -net tap,helper="/usr/local/libexec/qemu-bridge-helper"
-net nic,model=virtio
qemu linux.img -netdev bridge,id=hn0
-device virtio-net-pci,netdev=hn0,id=nic1
qemu linux.img -netdev tap,helper="/usr/local/libexec/qemu-bridge-helper",id=hn0
-device virtio-net-pci,netdev=hn0,id=nic1
The default bridge that we attach to is br0. The thinking is that a distro
could preconfigure such an interface to allow out-of-the-box bridged networking.
Alternatively, if a user wants to use a different bridge, a typical invocation
would be simliar to one of the following:
qemu linux.img -net bridge,br=qemubr0 -net nic,model=virtio
qemu linux.img -net tap,helper="/usr/local/libexec/qemu-bridge-helper --br=qemubr0"
-net nic,model=virtio
qemu linux.img -netdev bridge,br=qemubr0,id=hn0
-device virtio-net-pci,netdev=hn0,id=nic1
qemu linux.img -netdev tap,helper="/usr/local/libexec/qemu-bridge-helper --br=qemubr0",id=hn0
-device virtio-net-pci,netdev=hn0,id=nic1
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Richa Marwaha <rmarwah@linux.vnet.ibm.com>
Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
All files under GPLv2 will get GPLv2+ changes starting tomorrow.
event_notifier.c and exec-obsolete.h were only ever touched by Red Hat
employees and can be relicensed now.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Fix a leak of a file descriptor due to missing closesocket() calls
in error paths in net_socket_listen_init().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Stored dates are no more 1970-01-01 (+ run time), but have a real meaning.
If someone wants to have comparable timestamps accross boots, it is
possible to start qemu with -rtc to give the startup date.
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This prevents data of a previous run to be seen in the new dump file.
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Report an error when err is nonzero, not when it is zero.
Signed-off-by: Geoffrey Thomas <geofft@ldpreload.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
This patch fixes a bug where child processes of launch_script() can
misbehave due to SIGCHLD being blocked. In the case of `sudo`, this
causes a permanent hang.
Previously a SIGCHLD handler was added to reap fork_exec()'d zombie
processes by calling waitpid(-1, ...). This required other
fork()/waitpid() callers to temporarilly block SIGCHILD to avoid
having the final wait status being intercepted by the SIGCHLD
handler:
7c3370d4fe
Since then, the qemu_add_child_watch() interface was added to allow
registration of such processes and reap only from that specific set
of PIDs:
4d54ec7898
As a result, we can now avoid blocking SIGCHLD in launch_script(), so
drop that behavior.
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Today net/socket.c has no consistent policy for closing the socket file
descriptor when initialization fails. This means we leak the file
descriptor in some cases or we could also try to close it twice.
Make error paths consistent by taking ownership of the file descriptor
and closing it on error.
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
In order to make later patches sane, expand the tab characters and
conform to QEMU coding style now.
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Double semicolons should be single.
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
get_str_sep() can fail, but net_slirp_hostfwd_remove() doesn't check.
Works, because it initializes buf[] to "", which get_str_sep() doesn't
touch when it fails. Coverity doesn't like it, and neither do I.
Change it to work exactly like slirp_hostfwd().
Acked-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
I'm getting:
could not configure /dev/net/tun (tap%d): Operation not permitted
When the ioctl() fails, ifr.ifr_name will most likely not be overwritten.
So we better only use it when ifname contains a string.
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Allow overriding the location of Samba's smbd.
Pretty much every OS I look at has some means of
changing this path (patching) so lets just make
it easier for OS developers creating packages
and/or end users to override the location.
Signed-off-by: Brad Smith <brad@comstyle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Fix network interface tap backend work on NetBSD.
It uses an ioctl to get the tap name.
Signed-off-by: Christoph Egger<Christoph.Egger@amd.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Avoid warnings like these by wrapping recv():
CC slirp/ip_icmp.o
/src/qemu/slirp/ip_icmp.c: In function 'icmp_receive':
/src/qemu/slirp/ip_icmp.c:418:5: error: passing argument 2 of 'recv' from incompatible pointer type [-Werror]
/usr/local/lib/gcc/i686-mingw32msvc/4.6.0/../../../../i686-mingw32msvc/include/winsock2.h:547:32: note: expected 'char *' but argument is of type 'struct icmp *'
Remove also casts used to avoid warnings.
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Position entries of net_client_types according to the corresponding
values of NET_CLIENT_TYPE_*. The array size is now defined by
NET_CLIENT_TYPE_MAX. This will allow to obtain entries based on type
value in later patches.
At this chance rename NET_CLIENT_TYPE_SLIRP to NET_CLIENT_TYPE_USER for
the sake of consistency.
CC: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
All other boolean arguments accept on|off - except for slirp's restrict.
Fix that while still accepting the formerly allowed yes|y|no|n, but
reject everything else. This avoids accidentally allowing external
connections because syntax errors were so far interpreted as
'restrict=no'.
CC: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
error_report() prepends location, and appends a newline. The message
constructed from the arguments should not contain a newline. Fix the
obvious offenders.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
When using -net user,guestfwd=... Qemu immediately complains about the id
being in invalid format. This is because we pass in an id that contains a
colon, while the id restrictions don't allow colons.
This patch changes the colon into a dot, making guestfwd work again.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
This patch removes all references to signal.h when qemu-common.h is included
as they become redundant.
Signed-off-by: Alexandre Raymond <cerbere@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
This was done with:
sed -i 's/qemu_get_clock\>/qemu_get_clock_ns/' \
$(git grep -l 'qemu_get_clock\>' )
sed -i 's/qemu_new_timer\>/qemu_new_timer_ns/' \
$(git grep -l 'qemu_new_timer\>' )
after checking that get_clock and new_timer never occur twice
on the same line. There were no missed occurrences; however, even
if there had been, they would have been caught by the compiler.
There was exactly one false positive in qemu_run_timers:
- current_time = qemu_get_clock (clock);
+ current_time = qemu_get_clock_ns (clock);
which is of course not in this patch.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This fix allows connection of internal VLAN to the external TAP interface.
If tap_win32_write function always returns 0, the TAP network interface
in QEMU is disabled.
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@gmail.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
MSDN includes the following in WSAEALREADY error description for connect()
function: "To preserve backward compatibility, this error is reported as
WSAEINVAL to Winsock applications that link to either Winsock.dll or
Wsock32.dll". So check of this error code was added to allow network
connections through the sockets in Windows.
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@gmail.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
When MSI is off, each interrupt needs to be bounced through the io
thread when it's set/cleared, so vhost-net causes more context switches and
higher CPU utilization than userspace virtio which handles networking in
the same thread.
We'll need to fix this by adding level irq support in kvm irqfd,
for now disable vhost-net in these configurations.
Added a vhostforce flag to force vhost-net back on.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
With current sndbuf default value, a blocked
target guest can prevent another guest from
transmitting any packets. While current
sndbuf value (1M) is reported to help some
UDP based workloads, the default should
be safe (0).
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Avoid this warning like other uses of setsockopt:
/src/qemu/net/socket.c: In function 'net_socket_mcast_create':
/src/qemu/net/socket.c:210: warning: passing argument 4 of 'setsockopt' from incompatible pointer type
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Add an option to specify the host IP to send multicast packets from,
when using a multicast socket for networking. The option takes an IP
address and sets the IP_MULTICAST_IF socket option, which causes the
packets to use that IP's interface as an egress.
This is useful if the host machine has several interfaces with several
virtual networks across disparate interfaces.
Signed-off-by: Mike Ryan <mikeryan@ISI.EDU>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
virtio-net expects set_offload to succeed after
peer cleanup.
Since we don't have an open fd anymore, make it so.
Fixes warning about the failure of offload setting.
Reported-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Frontends calling tap_get_vhost_net get an invalid pointer after the
peer backend has been deleted. Jason Wang <jasowang@redhat.com> reports
this leading to a crash in ack_features when we remove the vhost-net
bakend of a virtio nic.
The fix is simply to clear the backend pointer.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Compiling with GCC 4.6.0 20100925 produced warnings like:
/src/qemu/net/tap-win32.c: In function 'tap_win32_open':
/src/qemu/net/tap-win32.c:582:12: error: variable 'hThread' set but not used [-Werror=unused-but-set-variable]
Fix by removing the unused variables.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
If neither of __FreeBSD__, __FreeBSD_kernel__ and __DragonFly__ is defined,
util.h is included from tap-bsd.c.
Don't include it again if __OpenBSD__ is defined.
Cc: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Make host vnet header length a structure field in
preparation for using this support in linux kernel.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add a missing #include statement to avoid a warning:
/src/qemu/net/tap-solaris.c: In function 'tap_open':
/src/qemu/net/tap-solaris.c:189: warning: implicit declaration of function 'error_report'
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
In net/tap-linux.c, when manipulation of /dev/net/tun fails, it prints
(with fprintf) something like this:
warning: could not open /dev/net/tun: no virtual network emulation
this has 2 issues:
1) it is not a warning really, it's a fatal error (kvm exits after
that),
2) there's no indication as of what's actually wrong: printing errno there
is helpful.
The patch below removes the "warning" prefix, uses %m (since it's linux,
%m is available as format modifier), and changes fprintf() to %qemu_error().
Now it prints something like this instead:
could not configure /dev/net/tun: Device or resource busy
(there are 2 messages like that in the same function)
This fixes Debian bug #578154, see
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=578154
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
will be used by virtio-net for vhost net support
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This adds vhost binary option to tap, to enable vhost net accelerator.
Default is off for now, we'll be able to make default on long term
when we know it's stable.
vhostfd option can be used by management, to pass in the fd. Assigning
vhostfd implies vhost=on.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Will be used by vhost to attach/detach to backend.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
CC net/tap-bsd.o
/src/qemu/net/tap-bsd.c: In function `tap_open':
/src/qemu/net/tap-bsd.c:93: warning: implicit declaration of function `error_report'
CC sparc-softmmu/../net/tap-win32.o
cc1: warnings being treated as errors
/src/qemu/target-sparc/../net/tap-win32.c: In function 'net_init_tap':
/src/qemu/target-sparc/../net/tap-win32.c:709: warning: implicit declaration of function 'error_report'
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Right now, downscript is not invoked reliably. If you execute 'quit' from the
monitor, it won't be invoked.
This fixes that by converting tap to use an exit_notifier to execute the
downscript. In this case, allowing an exit notifier to include state is
critically important for the conversion.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
error_report() terminates the message with a newline. Strip it it
from its arguments.
This fixes a few error messages lacking a newline:
net_handle_fd_param()'s "No file descriptor named %s found", and
tap_open()'s "vnet_hdr=1 requested, but no kernel support for
IFF_VNET_HDR available" (all three versions).
There's one place that passes arguments without newlines
intentionally: load_vmstate(). Fix it up.
we shouldn't call W*() macros until we check that fork worked.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
net_check_clients() prints this when an VLAN has host devices, but no
guest devices. It uses VLANState members nb_guest_devs and
nb_host_devs to keep track of these devices. However, -device does
not update nb_guest_devs, only net_init_nic() does that, for -net nic.
Check the VLAN clients directly, and remove the counters.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
CC net/slirp.o
cc1: warnings being treated as errors
net/slirp.c: In function 'slirp_smb_cleanup':
net/slirp.c:470: error: ignoring return value of 'system', declared with attribute warn_unused_result
make: *** [net/slirp.o] Error 1
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
When vhost is bound to a backend device, we need to stop polling it when
vhost is started, and restart polling when vhost is stopped.
Add an API for that for use by vhost, and implement in tap backend.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Avoid an unresolved symbol error for TFR,
which is defined in sysemu.h.
Based on patch by Palle Lyckegaard.
Signed-off-by: Andreas Färber <afaerber@opensolaris.org>
Cc: Palle Lyckegaard <palle@lyckegaard.dk>
Cc: Ben Taylor <bentaylor.solx86@gmail.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
We're leaking file descriptors to child processes. Set FD_CLOEXEC on file
descriptors that don't need to be passed to children to stop this misbehaviour.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Print an error if the user specifies vnet_hdr=1 on the cmdline.
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
net_tap_init() always sets vnet_hdr using qemu_opt_get_bool(), but
initialize it in net_init_tap() just to reduce confusion.
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This avoids the "TUNSETOFFLOAD ioctl() failed: Invalid argument" message
on kernels without TUNSETOFFLOAD support.
Signed-off-by: Pierre Riteau <Pierre.Riteau@irisa.fr>
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
vnet_hdr is initialized at 1 by default. We need to reset it to 0 if
the kernel doesn't support IFF_VNET_HDR.
Signed-off-by: Pierre Riteau <Pierre.Riteau@irisa.fr>
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Add a NetClientInfo pointer to VLANClientState and use that
for the typecode and function pointers.
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Handle ifname on FreeBSD hosts; if no ifname is given, always start
the search from tap0. (Simplified/cleaned up version of what has been
in the FreeBSD ports for a long time.)
Signed-off-by: Juergen Lock <nox@jelal.kn-bremen.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
net/tap-bsd.c was assuming IFF_VNET_HDR was always available, which
I think isn't true on any BSD.
Signed-off-by: Juergen Lock <nox@jelal.kn-bremen.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Okay, let's try re-enabling the drain-entire-queue behaviour, with a
difference - before each subsequent packet, use qemu_can_send_packet()
to check that we can send it. This is similar to how we check before
polling the tap fd and avoids having to drop a packet if the receiver
cannot handle it.
This patch should be a performance improvement since we no longer have
to go through the mainloop for each packet.
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Now that we disable any receiver whose queue is full, we do not require
senders to handle a zero return by supplying a sent callback.
This is a second step towards allowing can_receive() handlers to return
true even if no buffer space is available.
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
If qemu_send_packet_async() returns zero, it means the packet has been
queued and the sent callback will be invoked once it has been flushed.
This is only possible where the NIC's receive() handler returns zero
and promises to notify the networking core that room is available in its
queue again.
In the case where the receive handler does not have this capability
(and its queue fills up) it returns -1 and the networking core does not
queue up the packet. This condition is indicated by a -1 return from
qemu_send_packet_async().
Currently, tap handles this condition simply by dropping the packet. It
should do its best to avoid getting into this situation by checking such
NIC's have room for a packet before copying the packet from the tap
interface.
tap_send() used to achieve this by only reading a single packet before
returning to the mainloop. That way, tap_can_send() is called before
reading each packet.
tap_send() was changed to completely drain the tap interface queue
without taking into account the situation where the NIC returns an
error and the packet is not queued. Let's start fixing this by
reverting to the previous behaviour of reading one packet at a time.
Reported-by: Scott Tsai <scottt.tw@gmail.com>
Tested-by: Sven Rudolph <Sven_Rudolph@drewag.de>
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
commit 71f4effce7
Author: Alexander Graf <agraf@suse.de>
Date: Fri Oct 30 22:27:00 2009 +0100
Unbreak tap compilation on OS X
Broke the build on Linux due to a bad #if guard
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Currently compiling the tap sources breaks on Mac OS X. This is because of:
1) tap-linux.h requiring Linux includes
2) typos
3) missing #includes
This patch adds what's necessary to compile tap happily on Mac OS X.
I haven't tested if using tap actually works, but I don't think that's a
major issue as that code was probably seriously untested before already.
I didn't split the patch, because it's only a few lines of code and
splitting is probably not worth the effort here.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Okay, this makes the tap options available on AIX even though there's
no support, but if we want to do it right we should have not compile
the tap code at all on AIX using e.g. CONFIG_TAP.
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>