Commit Graph

134 Commits

Author SHA1 Message Date
Stefan Hajnoczi
7157e2e23e virtio: guard against negative vq notifies
The virtio_queue_notify() function checks that the virtqueue number is
less than the maximum number of virtqueues.  A signed comparison is used
but the virtqueue number could be negative if a buggy or malicious guest
is run.  This results in memory accesses outside of the virtqueue array.

It is risky doing input validation in common code instead of at the
guest<->host boundary.  Note that virtio_queue_set_addr(),
virtio_queue_get_addr(), virtio_queue_get_num(), and many other virtio
functions do *not* validate the virtqueue number argument.

Instead of fixing the comparison in virtio_queue_notify(), move the
comparison to the virtio bindings (just like VIRTIO_PCI_QUEUE_SEL) where
we have a uint32_t value and can avoid ever calling into common virtio
code if the virtqueue number is invalid.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2011-06-12 10:33:38 +03:00
Isaku Yamahata
e75ccf2c03 virtio-pci.c: convert to PCIDEviceInfo to initialize ids
use PCIDeviceInfo to initialize ids.

Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2011-06-12 10:33:38 +03:00
Aneesh Kumar K.V
9fe1ebebd0 virtio-9p: Move 9p device registration into virtio-9p.c
This patch move the 9p device registration into its own file

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri (JV) <jvrao@linux.vnet.ibm.com>
2011-06-01 10:23:58 -07:00
Alex Williamson
5ee8ad71e1 PXE: Use consistent naming for PXE ROMs
And add missing ROMs to tarbin build target.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2011-04-18 11:46:01 -06:00
Alexander Graf
29f82b37e5 virtio: use generic name when possible
We have two different virtio buses: pci and s390. The abstraction path
taken in qemu is to have generic aliases for each device type in the
architecture specific qdev devices.

So let's make use of these aliases whenever we can and define them
whenever we can.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2011-04-04 00:34:09 +02:00
Michael S. Tsirkin
89c473fd82 virtio-pci: fix bus master work around on load
Commit c81131db15
detects old guests by comparing virtio and
PCI status. It attempts to do this on load,
as well, but load_config callback in a binding
is invoked too early and so the virtio status
isn't set yet.

We could add yet another callback to the
binding, to invoke after load, but it
seems easier to reuse the existing vmstate
callback.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Alexander Graf <agraf@suse.de>
2011-03-28 18:34:23 +02:00
Amit Shah
32059220d0 virtio-serial: Enable ioeventfd
Enable ioeventfd for virtio-serial devices by default.  Commit
25db9ebe15 lists the benefits of using
ioeventfd.

Copying a file from guest to host over a virtio-serial channel didn't
show much difference in time or io_exit rate.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
2011-03-21 16:55:12 +05:30
Amit Shah
6b331efb73 virtio-serial: Use a struct to pass config information from proxy
Instead of using a single variable to pass to the virtio_serial_init
function, use a struct so that expanding the number of variables to be
passed on later is easier.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
2011-03-21 16:55:11 +05:30
mst@redhat.com
5430a28fe4 vhost: force vhost off for non-MSI guests
When MSI is off, each interrupt needs to be bounced through the io
thread when it's set/cleared, so vhost-net causes more context switches and
higher CPU utilization than userspace virtio which handles networking in
the same thread.

We'll need to fix this by adding level irq support in kvm irqfd,
for now disable vhost-net in these configurations.

Added a vhostforce flag to force vhost-net back on.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-02-01 16:50:44 -06:00
Michael S. Tsirkin
b36e391441 ioeventfd: error handling cleanup
- Don't return status from start/stop functions where it's ignored
- report errors to make debugging easier
- assert on unexpected failures
- don't disable notifiers on error so that we'll
  retry when guest driver restarts

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-01-11 17:47:48 +02:00
Stefan Hajnoczi
25db9ebe15 virtio-pci: Use ioeventfd for virtqueue notify
Virtqueue notify is currently handled synchronously in userspace virtio.  This
prevents the vcpu from executing guest code while hardware emulation code
handles the notify.

On systems that support KVM, the ioeventfd mechanism can be used to make
virtqueue notify a lightweight exit by deferring hardware emulation to the
iothread and allowing the VM to continue execution.  This model is similar to
how vhost receives virtqueue notifies.

The result of this change is improved performance for userspace virtio devices.
Virtio-blk throughput increases especially for multithreaded scenarios and
virtio-net transmit throughput increases substantially.

Some virtio devices are known to have guest drivers which expect a notify to be
processed synchronously and spin waiting for completion.
For virtio-net, this also seems to interact with the guest stack in strange
ways so that TCP throughput for small message sizes (~200bytes)
is harmed. Only enable ioeventfd for virtio-blk for now.

Care must be taken not to interfere with vhost-net, which uses host
notifiers.  If the set_host_notifier() API is used by a device
virtio-pci will disable virtio-ioeventfd and let the device deal with
host notifiers as it wishes.

Finally, there used to be a limit of 6 KVM io bus devices inside the
kernel.  On such a kernel, don't use ioeventfd for virtqueue host
notification since the limit is reached too easily.  This ensures that
existing vhost-net setups (which always use ioeventfd) have ioeventfds
available so they can continue to work.

After migration and on VM change state (running/paused) virtio-ioeventfd
will enable/disable itself.

 * VIRTIO_CONFIG_S_DRIVER_OK -> enable virtio-ioeventfd
 * !VIRTIO_CONFIG_S_DRIVER_OK -> disable virtio-ioeventfd
 * virtio_pci_set_host_notifier() -> disable virtio-ioeventfd
 * vm_change_state(running=0) -> disable virtio-ioeventfd
 * vm_change_state(running=1) -> enable virtio-ioeventfd

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2011-01-10 14:44:16 +02:00
Stefan Hajnoczi
3dbca8e6a7 virtio-pci: Rename bugs field to flags
The VirtIOPCIProxy bugs field is currently used to enable workarounds
for older guests.  Rename it to flags so that other per-device behavior
can be tracked.

A later patch uses the flags field to remember whether ioeventfd should
be used for virtqueue host notification.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2011-01-10 13:47:41 +02:00
Gleb Natapov
779206de67 Introduce fw_name field to DeviceInfo structure.
Add "fw_name" to DeviceInfo to use in device path building. In
contrast to "name" "fw_name" should refer to functionality device
provides instead of particular device model like "name" does.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-12-11 21:27:44 +00:00
Stefan Hajnoczi
4e02d460dd virtio-pci: Convert fprintf() to error_report()
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-11-21 09:16:58 -06:00
Gerd Hoffmann
9dbcca5aa1 virtfs: enable MSI-X
This patch enables MSI-X for virtfs-9p-pci.  It also adds a
compat property to pc-0.13 which turns it of there to stay
compatible to 0.13-stable.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-11-21 09:16:57 -06:00
Michael S. Tsirkin
54dd932128 virtio: change set guest notifier to per-device
When using irqfd with vhost-net to inject interrupts,
a single evenfd might inject multiple interrupts.
Implementing this is much easier with a single
per-device callback to set guest notifiers.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2010-10-07 12:19:47 +02:00
Alex Williamson
a697a334b3 virtio-net: Introduce a new bottom half packet TX
Based on a patch from Mark McLoughlin, this patch introduces a new
bottom half packet transmitter that avoids the latency imposed by
the tx_timer approach.  Rather than scheduling a timer when a TX
packet comes in, schedule a bottom half to be run from the iothread.
The bottom half handler first attempts to flush the queue with
notification disabled (this is where we could race with a guest
without txburst).  If we flush a full burst, reschedule immediately.
If we send short of a full burst, try to re-enable notification.
To avoid a race with TXs that may have occurred, we must then
flush again.  If we find some packets to send, the guest it probably
active, so we can reschedule again.

tx_timer and tx_bh are mutually exclusive, so we can re-use the
tx_waiting flag to indicate one or the other needs to be setup.
This allows us to seamlessly migrate between timer and bh TX
handling.

The bottom half handler becomes the new default and we add a new
tx= option to virtio-net-pci.  Usage:

-device virtio-net-pci,tx=timer # select timer mitigation vs "bh"

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2010-09-07 20:29:29 +03:00
Alex Williamson
e3f30488e5 virtio-net: Limit number of packets sent per TX flush
If virtio_net_flush_tx() is called with notification disabled, we can
race with the guest, processing packets at the same rate as they
get produced.  The trouble is that this means we have no guaranteed
exit condition from the function and can spend minutes in there.
Currently flush_tx is only called with notification on, which seems
to limit us to one pass through the queue per call.  An upcoming
patch changes this.

Also add an option to set this value on the command line as different
workloads may wish to use different values.  We can't necessarily
support any random value, so this is a developer option: x-txburst=
Usage:

-device virtio-net-pci,x-txburst=64 # 64 packets per tx flush

One pass through the queue (256) seems to be a good default value
for this, balancing latency with throughput.  We use a signed int
for x-txburst because 2^31 packets in a burst would take many, many
minutes to process and it allows us to easily return a negative
value value from virtio_net_flush_tx() to indicate a back-off
or error condition.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2010-09-07 20:29:26 +03:00
Alex Williamson
f0c07c7c7b virtio-net: Make tx_timer timeout configurable
Add an option to make the TX mitigation timer adjustable as a device
option.  The 150us hard coded default used currently is reasonable,
but may not be suitable for all workloads, this gives us a way to
adjust it using a single binary.  We can't support any random option
though, so use the "x-" prefix to indicate this is a developer
option.  Usage:

-device virtio-net-pci,x-txtimer=500000,... # .5ms timeout

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2010-09-07 20:29:24 +03:00
Blue Swirl
2446333cd5 Rearrange block headers
Changing block.h or blockdev.h resulted in recompiling most objects.

Move DriveInfo typedef and BlockInterfaceType enum definitions
to qemu-common.h and rearrange blockdev.h use to decrease churn.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-08-24 15:22:24 +00:00
Amit Shah
8b53a86577 virtio-serial: Cleanup on device hot-unplug
Free malloc'ed memory, unregister from savevm and clean up virtio-common
bits on device hot-unplug.

This was found performing a migration after device hot-unplug.

Reported-by: <lihuang@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-08-22 16:19:00 -05:00
Alex Williamson
9d0d313859 virtio-blk: Create exit function to unregister savevm
Otherwise we can't migrate after we've removed a virtio block device.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-07-26 13:39:39 +02:00
Markus Armbruster
d75d25e34e virtio-blk: Fix virtio-blk-s390 to require drive
Move the check from virtio_blk_init_pci(), where it protects only
virtio-blk-pci, to virtio_blk_init().  Without that, virtio-blk-s390
initializes without a drive.  I figure that can lead to null pointer
dereferences.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-07-13 17:48:17 +02:00
Markus Armbruster
ac0c14d71b virtio-pci: Check for virtio_blk_init() failure
It can't actually fail now, but the next commit will change that.

s390_virtio_blk_init() already checks for failure, but
virtio_blk_init_pci() doesn't.  Fix that.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-07-13 17:48:17 +02:00
Isaku Yamahata
b80d4a9887 pci: don't overwrite multi functio bit in pci header type.
Don't overwrite pci header type.
Otherwise, multi function bit which pci_init_header_type() sets
appropriately is lost.
Anyway PCI_HEADER_TYPE_NORMAL is zero, so it is unnecessary to zero
which is already zero cleared.

how to test:
run qemu and issue info pci to see whether a device in question is
normal device, not pci-to-pci bridge.
This is handy because guest os isn't required.

tested changes:
The following files are covered by using following commands.
sparc64-softmmu
  apb_pci.c, vga-pci.c, cmd646.c, ne2k_pci.c, sun4u.c
ppc-softmmu
  grackle_pci.c, cmd646.c, ne2k_pci.c, vga-pci.c, macio.c
ppc-softmmu -M mac99
  unin_pci.c(uni-north, uni-north-agp)
ppc64-softmmu
  pci-ohci, ne2k_pci, vga-pci, unin_pci.c(u3-agp)
x86_64-softmmu
  acpi_piix4.c, ide/piix.c, piix_pci.c
  -vga vmware vmware_vga.c
  -watchdog i6300esb wdt_i6300esb.c
  -usb usb-uhci.c
  -sound ac97 ac97.c
  -nic model=rtl8139 rtl8139.c
  -nic model=pcnet pcnet.c
  -balloon virtio virtio-pci.c:

untested changes:
The following changes aren't tested.
prep_pci.c: ppc-softmmu -M prep should cover, but core dumped.
unin_pci.c(uni-north-pci): the caller is commented out.
openpic.c: the caller is commented out in ppc_prep.c

Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-07-11 20:00:56 +03:00
Markus Armbruster
f8b6cc0070 qdev: Decouple qdev_prop_drive from DriveInfo
Make the property point to BlockDriverState, cutting out the DriveInfo
middleman.  This prepares the ground for block devices that don't have
a DriveInfo.

Currently all user-defined ones have a DriveInfo, because the only way
to define one is -drive & friends (they go through drive_init()).
DriveInfo is closely tied to -drive, and like -drive, it mixes
information about host and guest part of the block device.  I'm
working towards a new way to define block devices, with clean
host/guest separation, and I need to get DriveInfo out of the way for
that.

Fortunately, the device models are perfectly happy with
BlockDriverState, except for two places: ide_drive_initfn() and
scsi_disk_initfn() need to check the DriveInfo for a serial number set
with legacy -drive serial=...  Use drive_get_by_blockdev() there.

Device model code should now use DriveInfo only when explicitly
dealing with drives defined the old way, i.e. without -device.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-07-02 13:18:02 +02:00
Markus Armbruster
14bafc5407 blockdev: Clean up automatic drive deletion
We automatically delete blockdev host parts on unplug of the guest
device.  Too much magic, but we can't change that now.

The delete happens early in the guest device teardown, before the
connection to the host part is severed.  Thus, the guest part's
pointer to the host part dangles for a brief time.  No actual harm
comes from this, but we'll catch such dangling pointers a few commits
down the road.  Clean up the dangling pointers by delaying the
automatic deletion until the guest part's pointer is gone.

Device usb-storage deliberately makes two qdev properties refer to the
same drive, because it automatically creates a second device.  Again,
too much magic we can't change now.  Multiple references worked okay
before, but now free_drive() dies for the second one.  Zap the extra
reference.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-07-02 13:18:01 +02:00
Alex Williamson
8a91110738 virtio-pci: fix bus master bug setting on load
The comment suggests we're checking for the driver in the ready
state and bus master disabled, but the code is checking that it's
not in the ready state.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Found-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-06-22 15:15:51 -05:00
Venkateswararao Jujjuri (JV)
758e8e38eb virtio-9p: Make infrastructure for the new security model.
This patch adds required infrastructure for the new security model.

- A new configure option for attr/xattr.
- if CONFIG_VIRTFS will be defined if both CONFIG_LINUX and CONFIG_ATTR defined.
- Defines routines related to both security models.

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-06-22 15:15:50 -05:00
Markus Armbruster
666daa6823 blockdev: Collect block device code in new blockdev.c
Anything that moves hundreds of lines out of vl.c can't be all bad.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-06-04 15:20:47 +02:00
Anthony Liguori
9f10751365 virtio-9p: Add a virtio 9p device to qemu
This patch doesn't implement the 9p protocol handling
code. It adds a simple device which dump the protocol data.

[jvrao@linux.vnet.ibm.com: Little-Endian to host format conversion]
[aneesh.kumar@linux.vnet.ibm.com: Multiple-mounts support]

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-05-03 12:17:37 -05:00
Michael S. Tsirkin
ade80dc845 virtio-pci: fill in notifier support
Support host/guest notifiers in virtio-pci.
The last one only with kvm, that's okay
because vhost relies on kvm anyway.

Note on kvm usage: kvm ioeventfd API
is implemented on non-kvm systems as well,
this is the reason we don't need if (kvm_enabled())
around it.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-04-01 13:56:43 -05:00
Michael S. Tsirkin
3e607cb503 virtio: add set_status callback
vhost net backend needs to be notified when
frontend status changes. Add a callback,
similar to set_features.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-04-01 13:56:43 -05:00
Alexander Graf
c81131db15 Don't check for bus master for old guests
Older Linux guests don't activate the bus master enable bit. So for those we
can just try to be clever and track if they set the DEVICE_OK bit even though
bus mastering is still disabled.

Under that condition we can disable the windows safety check. With that logic
in place both guests should work just fine. Without PCI hotplug breaks
virtio-net in Linux < 2.6.34 guests.

Signed-off-by: Alexander Graf <agraf@suse.de>
CC: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-03-31 11:16:52 -05:00
Markus Armbruster
1ecda02b24 error: Replace qemu_error() by error_report()
error_report() terminates the message with a newline.  Strip it it
from its arguments.

This fixes a few error messages lacking a newline:
net_handle_fd_param()'s "No file descriptor named %s found", and
tap_open()'s "vnet_hdr=1 requested, but no kernel support for
IFF_VNET_HDR available" (all three versions).

There's one place that passes arguments without newlines
intentionally: load_vmstate().  Fix it up.
2010-03-16 16:58:32 +01:00
Markus Armbruster
2f7920166d error: Move qemu_error & friends into their own header 2010-03-16 16:55:05 +01:00
Amit Shah
573fb60c97 virtio-pci: Use DEV_NVECTORS_UNSPECIFIED instead of -1 for virtio-serial
Use the named constant instead of -1.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Reported-by: "Michael S. Tsirkin" <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-03-08 11:30:09 -06:00
Amit Shah
7b665b668a virtio-serial: pci: Allow MSI to be disabled
Michael noted we don't allow disabling of MSI for the virtio-serial-pci
device. Fix that.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
CC: "Michael S. Tsirkin" <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-02-19 15:32:16 -06:00
Christoph Hellwig
428c149b0b block: add topology qdev properties
Add three new qdev properties to export block topology information to
the guest.  This is needed to get optimal I/O alignment for RAID arrays
or SSDs.

The options are:

 - physical_block_size to specify the physical block size of the device,
   this is going to increase from 512 bytes to 4096 kilobytes for many
   modern storage devices
 - min_io_size to specify the minimal I/O size without performance impact,
   this is typically set to the RAID chunk size for arrays.
 - opt_io_size to specify the optimal sustained I/O size, this is
   typically the RAID stripe width for arrays.

I decided to not auto-probe these values from blkid which might easily
be possible as I don't know how to deal with these issues on migration.

Note that we specificly only set the physical_block_size, and not the
logial one which is the unit all I/O is described in.  The reason for
that is that IDE does not support increasing the logical block size and
at last for now I want to stick to one meachnisms in queue and allow
for easy switching of transports for a given backing image which would
not be possible if scsi and virtio use real 4k sectors, while ide only
uses the physical block exponent.

To make this more common for the different block drivers introduce a
new BlockConf structure holding all common block properties and a
DEFINE_BLOCK_PROPERTIES macro to add them all together, mirroring
what is done for network drivers.  Also switch over all block drivers
to use it, except for the floppy driver which has weird driveA/driveB
properties and probably won't require any advanced block options ever.

Example usage for a virtio device with 4k physical block size and
8k optimal I/O size:

  -drive file=scratch.img,media=disk,cache=none,id=scratch \
  -device virtio-blk-pci,drive=scratch,physical_block_size=4096,opt_io_size=8192

aliguori: updated patch to take into account BLOCK events

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-02-10 16:53:25 -06:00
Amit Shah
a1829205a5 virtio-serial: Use MSI vectors for port virtqueues
This commit enables the use of MSI interrupts for virtqueue
notifications for ports. We use nr_ports + 1 (for control channel) msi
entries for the ports, as only the in_vq operations need an interrupt on
the guest.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-20 08:25:23 -06:00
Amit Shah
98b19252cf virtio-console: qdev conversion, new virtio-serial-bus
This commit converts the virtio-console device to create a new
virtio-serial bus that can host console and generic serial ports. The
file hosting this code is now called virtio-serial-bus.c.

The virtio console is now a very simple qdev device that sits on the
virtio-serial-bus and communicates between the bus and qemu's chardevs.

This commit also includes a few changes to the virtio backing code for
pci and s390 to spawn the virtio-serial bus.

As a result of the qdev conversion, we get rid of a lot of legacy code.
The old-style way of instantiating a virtio console using

    -virtioconsole ...

is maintained, but the new, preferred way is to use

    -device virtio-serial -device virtconsole,chardev=...

With this commit, multiple devices as well as multiple ports with a
single device can be supported.

For multiple ports support, each port gets an IO vq pair. Since the
guest needs to know in advance how many vqs a particular device will
need, we have to set this number as a property of the virtio-serial
device and also as a config option.

In addition, we also spawn a pair of control IO vqs. This is an internal
channel meant for guest-host communication for things like port
open/close, sending port properties over to the guest, etc.

This commit is a part of a series of other commits to get the full
implementation of multiport support. Future commits will add other
support as well as ride on the savevm version that we bump up here.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-20 08:25:23 -06:00
Michael S. Tsirkin
49e75cf388 virtio-pci: thinko fix
Since patch ed757e140c0ada220f213036e4497315d24ca8bct, virtio will
sometimes clear all status registers on bus master disable, which loses
information such as VIRTIO_CONFIG_S_FAILED bit.  This is a result of
a patch being misapplied: code uses !  instead of ~ for bit
operations as in Yan's original patch.  This obviously does not make
sense.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-11 13:41:00 -06:00
Michael S. Tsirkin
8172539d21 virtio: add features as qdev properties
Add feature bits as properties to virtio. This makes it possible to e.g. define
machine without indirect buffer support, which is required for 0.10
compatibility, or without hardware checksum support, which is required for 0.11
compatibility.  Since default values for optional features are now set by qdev,
get_features callback has been modified: it sets non-optional bits, and clears
bits not supported by host.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-11 13:40:59 -06:00
Michael S. Tsirkin
704a76fcd2 virtio: rename features -> guest_features
Rename features->guest_features. This is
what they are, avoid confusion with
host features which we also need to keep around.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-11 13:40:59 -06:00
Gerd Hoffmann
8c52c8f320 pci romfiles: add property, add default to PCIDeviceInfo
This patch adds a romfile property to the pci bus.  It allows to specify
a romfile to load into the rom bar of the pci device.  The default value
comes from a new field in PCIDeviceInfo.  The property allows to change
the file and also to disable the rom loading using an empty string.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-12-18 11:26:34 -06:00
Anthony Liguori
c2039bd0ff Support PCI based option rom loading
Currently, we preload option roms into the option rom space in memory.  This
prevents DDIM from functioning correctly which severely limits the number
of roms we can support.

This patch introduces a pci_add_option_rom() which registers the
PCI_ROM_ADDRESS bar which points to our option rom.  It also converts over
the cirrus vga adapter, the rtl8139, virtio, and the e1000 to use this
new mechanism.

The result is that PXE boot functions even with three unique types of cards.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-12-18 11:26:34 -06:00
Michael S. Tsirkin
6d74ca5aa8 virtio: verify features on load
migrating between hosts which have different features
might break silently, if the migration destination
does not support some features supported by source.

Prevent this from happening by comparing acked feature
bits with the mask supported by the device.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-12-12 07:59:38 -06:00
Michael S. Tsirkin
1b8e9b2742 virtio: do not reset msix state on soft reset
msix state is managed by OS, not the
driver, so it's wrong to touch it
on io from driver.
Mark all vectors unused instead.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2009-12-01 17:58:48 +02:00
Isaku Yamahata
6e355d901b pci: introduce pcibus_t to represent pci bus address/size instead of uint32_t
This patch is preliminary for 64 bit BAR support.
Introduce dedicated type, pcibus_t, to represent pci bus address/size
instead of uint32_t.
Later this type will be changed to uint64_t.

Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-11-09 08:43:08 -06:00
Isaku Yamahata
0392a017ae pci: s/PCI_ADDRESS_SPACE_/PCI_BASE_ADDRESS_SPACE_/ to match pci_regs.h
make constants for pci base address match pci_regs.h by
renaming PCI_ADDRESS_SPACE_xxx to PCI_BASE_ADDRESS_SPACE_xxx.

Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-11-09 08:43:07 -06:00
Gerd Hoffmann
97b156213e virtio: use qdev properties for configuration.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-10-27 12:28:40 -05:00
Michael S. Tsirkin
0f457d91c4 qemu/pci: make pci not depend on msix
Making pci device cleanup msix automatically makes pci.c depend on
msix.c, which is IMO messy.  Since devices do msix_init it's easy and
natural for them to also do msix_uninit.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-10-06 14:36:13 -05:00
Amit Shah
25fe365483 virtio-pci: return error if virtio_console_init fails
Currently only one virtio_console device is supported. Trying to add
multiple devices fails and such failure should be reported back to the
qdev init functions.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-10-05 09:32:53 -05:00
Michael S. Tsirkin
85352471ce qemu/virtio-pci: remove unnecessary check
it's safe to call msix_write_config if msix
is disabled, so call it unconditionally on
pci config write.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-10-05 09:32:52 -05:00
Michael S. Tsirkin
5a1fc5e852 qemu: clean up target page usage in msix
Since cpu_register_phys_memory does not require size to be a multiple of
target page size, simply make msix page size 0x1000.  Do this in msix,
reverting part of 5e520a7d50, as we no
longer have to pass target page around.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-10-05 09:32:52 -05:00
Michael S. Tsirkin
e489030df2 qemu/virtio: fix reset with device removal
virtio pci registers its own reset handler, but fails to unregister it,
which will lead to crashes after device removal.  Solve this problem by
switching to qdev reset handler, which is automatically unregistered.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-10-05 09:32:50 -05:00
Gerd Hoffmann
56a1493880 drive cleanup fixes.
Changes:
  * drive_uninit() wants a DriveInfo now.
  * drive_uninit() also calls bdrv_delete(),
    so callers don't need to do that.
  * drive_uninit() calls are moved over to the ->exit()
    callbacks, destroy_bdrvs() is zapped.
  * setting bdrv->private is not needed any more as the
    only user (destroy_bdrvs) is gone.
  * usb-storage needs no drive_uninit, scsi-disk will
    handle that.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-10-05 09:32:49 -05:00
Gerd Hoffmann
e3936fa574 pci: move unregister from PCIDevice to PCIDeviceInfo
One more cleanup while being at it ;)

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-10-05 09:32:48 -05:00
Anthony Liguori
c227f0995e Revert "Get rid of _t suffix"
In the very least, a change like this requires discussion on the list.

The naming convention is goofy and it causes a massive merge problem.  Something
like this _must_ be presented on the list first so people can provide input
and cope with it.

This reverts commit 99a0949b72.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-10-01 16:12:16 -05:00
malc
99a0949b72 Get rid of _t suffix
Some not so obvious bits, slirp and Xen were left alone for the time
being.

Signed-off-by: malc <av1474@comtv.ru>
2009-10-01 22:45:02 +04:00
Blue Swirl
5e520a7d50 Compile msix only once
Get page size in device init.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2009-09-20 15:35:55 +00:00
Yan Vugenfirer
ed757e140c VirtIO: Fix QEMU crash during Windows PNP tests
Hello,

In some cases bus driver can deassert "bus master" bit in PCI command
register. The driver will no longer be able to update related registers in
the device. Eventually it will cause QEMU to exit in "virtqueue_num_heads"
function.

Attached path that fixes the described issue.

Best regards,
Yan Vugenfirer.

>From 3fdafbdfad676ec8479dc073cff70bf356868bfe Mon Sep 17 00:00:00 2001
From: Yan Vugenfirer <yvugenfi@redhat.com>
Date: Tue, 8 Sep 2009 10:08:14 -0400
Subject: [PATCH] VirtIO: Fix QEMU crash during Windows PNP tests

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-09-11 10:19:47 -05:00
Gerd Hoffmann
84fc5589f8 virtio-pci error logging
Use the new qemu_error() function for virtio-blk-pci.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-08-27 20:43:33 -05:00
Gerd Hoffmann
81a322d4a1 qdev: add return value to init() callbacks.
Sorry folks, but it has to be.  One more of these invasive qdev patches.

We have a serious design bug in the qdev interface:  device init
callbacks can't signal failure because the init() callback has no
return value.  This patch fixes it.

We have already one case in-tree where this is needed:
Try -device virtio-blk-pci (without drive= specified) and watch qemu
segfault.  This patch fixes it.

With usb+scsi being converted to qdev we'll get more devices where the
init callback can fail for various reasons.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-08-27 20:43:28 -05:00
Gerd Hoffmann
177539e06d virtio-blk: add msi support.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-08-24 08:46:48 -05:00
Gerd Hoffmann
72c61d0bf4 qdev/prop: convert virtio-pci.c to helper macros.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Message-Id:
2009-08-10 13:11:26 -05:00
Gerd Hoffmann
d176c495b6 qdev-ify virtio-blk.
First user of the new drive property.  With this patch applied host
and guest config can be specified separately, like this:

  -drive if=none,id=disk1,file=/path/to/disk.img
  -device virtio-blk-pci,drive=disk1

You can set any property for virtio-blk-pci now.  You can set the pci
address via addr=.  You can switch the device into 0.10 compat mode
using class=0x0180.  As this is per device you can have one 0.10 and one
0.11 virtio block device in a single virtual machine.

Old syntax continues to work.  Internally it does the same as the two
lines above though.  One side effect this has is a different
initialization order, which might result in a different pci address
being assigned by default.

Long term plan here is to have this working for all block devices, i.e.
once all scsi is properly qdev-ified you will be able to do something
like this:

  -drive if=none,id=sda,file=/path/to/disk.img
  -device lsi,id=lsi,addr=<pciaddr>
  -device scsi-disk,drive=sda,bus=lsi.0,lun=<n>

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Message-Id:
2009-08-10 13:05:28 -05:00
Mark McLoughlin
85c2c7359b Remove the virtio-{blk, console}-pci-0-10 device types
These are now unused.

However, perhaps the idea is that when we add -device, they will be
useful? In that case, we should add virtio-net-pci-0-10 too.

Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-07-30 09:50:37 -05:00
Gerd Hoffmann
a1e0fea587 qdev/compat: virtio-net-pci 0.10 compatibility.
Add vectors property, allowing to turn off msi by setting vectors=0.
Add compat property to pc-0.10 disabling msi.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-07-16 17:28:53 -05:00
Gerd Hoffmann
d6beee9938 qdev/compat: virtio-console-pci 0.10 compatibility.
Add class property to virtio-console-pci allowing to specify the PCI class.
Add compat property to pc-0.10 to set the old PCI class.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-07-16 17:28:53 -05:00
Gerd Hoffmann
ab73ff29ce qdev/compat: virtio-blk-pci 0.10 compatibility.
Add class property to virtio-blk-pci allowing to specify the PCI class.
Add compat property to pc-0.10 to set the old PCI class.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-07-16 17:28:53 -05:00
Gerd Hoffmann
15239b2e52 cleanup: drop unused struct elements from VirtIOPCIProxy.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-07-16 17:28:51 -05:00
Mark McLoughlin
21d58b575e Change default PCI class of virtio-console to PCI_CLASS_SERIAL_OTHER
We're using PCI_CLASS_DISPLAY_OTHER now, but qemu-kvm.git is using
PCI_CLASS_OTHERS because:

  "As a PCI_CLASS_DISPLAY_OTHER, it reduces primary display somehow on
   Windows XP (possibly Windows disables acceleration since it fails
   to find a driver)."

While this is valid, many versions of X will get confused by it.
Class major number of 0 gets treated as a possibly prehistoric VGA
device, and then the autoconfig logic gets confused trying to figure
out whether the virtio console or the pv vga device are the real VGA.

We should really set a proper class ID. 0x0780 (serial / other) seems
most appropriate. This shouldn't require any kernel changes, the
modalias for virtio looks like:

  alias:          pci:v00001AF4d*sv*sd*bc*sc*i*

so won't care what the base class or subclass are.

It shows up in the guest as:

  00:05.0 Communication controller: Qumranet, Inc. Virtio console

A new qdev type is introduced to allow devices using the old class
to be created for compatibility with qemu-0.10.x.

Reported-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-07-16 08:28:10 -05:00
Mark McLoughlin
5c634ef30d Change default PCI class of virtio-blk to PCI_CLASS_STORAGE_SCSI
Windows virtio driver cannot pass DTM (certification) tests while the
storage class is PCI_CLASS_STORAGE_UNKNOWN.

A new qdev type is introduced to allow devices using the old class
to be created for compatibility with qemu-0.10.x.

Reported-by: Dor Laor <dlaor@redhat.com>
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-07-16 08:28:10 -05:00
Michael S. Tsirkin
e6da768000 qemu/virtio: mark msi vectors used on load
Usage of msi vectors is controlled by the guest and so needs to be
restored on load. Do this for msi vectors used by the virtio device.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-07-10 13:44:30 -05:00
Gerd Hoffmann
0aab0d3a4a qdev: update pci device registration.
Makes pci_qdev_register take a PCIDeviceInfo struct instead of a bunch
of parameters.  Also adds config_read and config_write callbacks to
PCIDeviceInfo, so drivers needing these can be converted to the qdev
device API too.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2009-07-09 13:07:02 +01:00
Jan Kiszka
a08d43677f Revert "Introduce reset notifier order"
This reverts commit 8217606e6e (and
updates later added users of qemu_register_reset), we solved the
problem it originally addressed less invasively.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-06-29 14:18:08 -05:00
Michael S. Tsirkin
ff24bd589c qemu/virtio: virtio save/load bindings
Implement bindings for virtio save/load. Use them in virtio pci.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-06-24 09:09:15 -05:00
Michael S. Tsirkin
aba800a3ff qemu/virtio: MSI-X support in virtio PCI
This enables actual support for MSI-X in virtio PCI.
First user will be virtio-net.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-06-24 09:09:14 -05:00
Michael S. Tsirkin
7055e687cd qemu/virtio: virtio support for many interrupt vectors
Extend virtio to support many interrupt vectors, and rearrange code in
preparation for multi-vector support (mostly move reset out to bindings,
because we will have to reset the vectors in transport-specific code).
Actual bindings in pci, and use in net, to follow.
Load and save are not connected to bindings yet, so they are left
stubbed out for now.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-06-24 09:09:14 -05:00
Mark McLoughlin
efeea6d048 virtio: add support for indirect ring entries
Support a new feature flag for indirect ring entries. These are ring
entries which point to a table of buffer descriptors.

The idea here is to increase the ring capacity by allowing a larger
effective ring size whereby the ring size dictates the number of
requests that may be outstanding, rather than the size of those
requests.

This should be most effective in the case of block I/O where we can
potentially benefit by concurrently dispatching a large number of
large requests. Even in the simple case of single segment block
requests, this results in a threefold increase in ring capacity.

Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-06-22 10:10:50 -05:00
Avi Kivity
28c2c26495 Rename pci_register_io_region() to pci_register_bar()
This function is used to manage a PCI BAR, so make the more generic
pci_register_io_region() available to other uses.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-06-16 15:18:38 -05:00
Paul Brook
1ad2134f91 Hardware convenience library
The only target dependency for most hardware is sizeof(target_phys_addr_t).
Build these files into a convenience library, and use that instead of
building for every target.

Remove and poison various target specific macros to avoid bogus target
dependencies creeping back in.

Big/Little endian is not handled because devices should not know or care
about this to start with.

Signed-off-by: Paul Brook <paul@codesourcery.com>
2009-05-19 16:17:58 +01:00
Paul Brook
53c25cea7d Separate virtio PCI code
Split the PCI host bindings from the VRing transport implementation.

Signed-off-by: Paul Brook <paul@codesourcery.com>
2009-05-18 18:26:33 +01:00