We automatically delete blockdev host parts on unplug of the guest
device. Too much magic, but we can't change that now.
The delete happens early in the guest device teardown, before the
connection to the host part is severed. Thus, the guest part's
pointer to the host part dangles for a brief time. No actual harm
comes from this, but we'll catch such dangling pointers a few commits
down the road. Clean up the dangling pointers by delaying the
automatic deletion until the guest part's pointer is gone.
Device usb-storage deliberately makes two qdev properties refer to the
same drive, because it automatically creates a second device. Again,
too much magic we can't change now. Multiple references worked okay
before, but now free_drive() dies for the second one. Zap the extra
reference.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The comment suggests we're checking for the driver in the ready
state and bus master disabled, but the code is checking that it's
not in the ready state.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Found-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patch adds required infrastructure for the new security model.
- A new configure option for attr/xattr.
- if CONFIG_VIRTFS will be defined if both CONFIG_LINUX and CONFIG_ATTR defined.
- Defines routines related to both security models.
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Anything that moves hundreds of lines out of vl.c can't be all bad.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This patch doesn't implement the 9p protocol handling
code. It adds a simple device which dump the protocol data.
[jvrao@linux.vnet.ibm.com: Little-Endian to host format conversion]
[aneesh.kumar@linux.vnet.ibm.com: Multiple-mounts support]
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Support host/guest notifiers in virtio-pci.
The last one only with kvm, that's okay
because vhost relies on kvm anyway.
Note on kvm usage: kvm ioeventfd API
is implemented on non-kvm systems as well,
this is the reason we don't need if (kvm_enabled())
around it.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
vhost net backend needs to be notified when
frontend status changes. Add a callback,
similar to set_features.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Older Linux guests don't activate the bus master enable bit. So for those we
can just try to be clever and track if they set the DEVICE_OK bit even though
bus mastering is still disabled.
Under that condition we can disable the windows safety check. With that logic
in place both guests should work just fine. Without PCI hotplug breaks
virtio-net in Linux < 2.6.34 guests.
Signed-off-by: Alexander Graf <agraf@suse.de>
CC: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
error_report() terminates the message with a newline. Strip it it
from its arguments.
This fixes a few error messages lacking a newline:
net_handle_fd_param()'s "No file descriptor named %s found", and
tap_open()'s "vnet_hdr=1 requested, but no kernel support for
IFF_VNET_HDR available" (all three versions).
There's one place that passes arguments without newlines
intentionally: load_vmstate(). Fix it up.
Use the named constant instead of -1.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Reported-by: "Michael S. Tsirkin" <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Michael noted we don't allow disabling of MSI for the virtio-serial-pci
device. Fix that.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
CC: "Michael S. Tsirkin" <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Add three new qdev properties to export block topology information to
the guest. This is needed to get optimal I/O alignment for RAID arrays
or SSDs.
The options are:
- physical_block_size to specify the physical block size of the device,
this is going to increase from 512 bytes to 4096 kilobytes for many
modern storage devices
- min_io_size to specify the minimal I/O size without performance impact,
this is typically set to the RAID chunk size for arrays.
- opt_io_size to specify the optimal sustained I/O size, this is
typically the RAID stripe width for arrays.
I decided to not auto-probe these values from blkid which might easily
be possible as I don't know how to deal with these issues on migration.
Note that we specificly only set the physical_block_size, and not the
logial one which is the unit all I/O is described in. The reason for
that is that IDE does not support increasing the logical block size and
at last for now I want to stick to one meachnisms in queue and allow
for easy switching of transports for a given backing image which would
not be possible if scsi and virtio use real 4k sectors, while ide only
uses the physical block exponent.
To make this more common for the different block drivers introduce a
new BlockConf structure holding all common block properties and a
DEFINE_BLOCK_PROPERTIES macro to add them all together, mirroring
what is done for network drivers. Also switch over all block drivers
to use it, except for the floppy driver which has weird driveA/driveB
properties and probably won't require any advanced block options ever.
Example usage for a virtio device with 4k physical block size and
8k optimal I/O size:
-drive file=scratch.img,media=disk,cache=none,id=scratch \
-device virtio-blk-pci,drive=scratch,physical_block_size=4096,opt_io_size=8192
aliguori: updated patch to take into account BLOCK events
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This commit enables the use of MSI interrupts for virtqueue
notifications for ports. We use nr_ports + 1 (for control channel) msi
entries for the ports, as only the in_vq operations need an interrupt on
the guest.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This commit converts the virtio-console device to create a new
virtio-serial bus that can host console and generic serial ports. The
file hosting this code is now called virtio-serial-bus.c.
The virtio console is now a very simple qdev device that sits on the
virtio-serial-bus and communicates between the bus and qemu's chardevs.
This commit also includes a few changes to the virtio backing code for
pci and s390 to spawn the virtio-serial bus.
As a result of the qdev conversion, we get rid of a lot of legacy code.
The old-style way of instantiating a virtio console using
-virtioconsole ...
is maintained, but the new, preferred way is to use
-device virtio-serial -device virtconsole,chardev=...
With this commit, multiple devices as well as multiple ports with a
single device can be supported.
For multiple ports support, each port gets an IO vq pair. Since the
guest needs to know in advance how many vqs a particular device will
need, we have to set this number as a property of the virtio-serial
device and also as a config option.
In addition, we also spawn a pair of control IO vqs. This is an internal
channel meant for guest-host communication for things like port
open/close, sending port properties over to the guest, etc.
This commit is a part of a series of other commits to get the full
implementation of multiport support. Future commits will add other
support as well as ride on the savevm version that we bump up here.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Since patch ed757e140c0ada220f213036e4497315d24ca8bct, virtio will
sometimes clear all status registers on bus master disable, which loses
information such as VIRTIO_CONFIG_S_FAILED bit. This is a result of
a patch being misapplied: code uses ! instead of ~ for bit
operations as in Yan's original patch. This obviously does not make
sense.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Add feature bits as properties to virtio. This makes it possible to e.g. define
machine without indirect buffer support, which is required for 0.10
compatibility, or without hardware checksum support, which is required for 0.11
compatibility. Since default values for optional features are now set by qdev,
get_features callback has been modified: it sets non-optional bits, and clears
bits not supported by host.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Rename features->guest_features. This is
what they are, avoid confusion with
host features which we also need to keep around.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patch adds a romfile property to the pci bus. It allows to specify
a romfile to load into the rom bar of the pci device. The default value
comes from a new field in PCIDeviceInfo. The property allows to change
the file and also to disable the rom loading using an empty string.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Currently, we preload option roms into the option rom space in memory. This
prevents DDIM from functioning correctly which severely limits the number
of roms we can support.
This patch introduces a pci_add_option_rom() which registers the
PCI_ROM_ADDRESS bar which points to our option rom. It also converts over
the cirrus vga adapter, the rtl8139, virtio, and the e1000 to use this
new mechanism.
The result is that PXE boot functions even with three unique types of cards.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
migrating between hosts which have different features
might break silently, if the migration destination
does not support some features supported by source.
Prevent this from happening by comparing acked feature
bits with the mask supported by the device.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
msix state is managed by OS, not the
driver, so it's wrong to touch it
on io from driver.
Mark all vectors unused instead.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch is preliminary for 64 bit BAR support.
Introduce dedicated type, pcibus_t, to represent pci bus address/size
instead of uint32_t.
Later this type will be changed to uint64_t.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
make constants for pci base address match pci_regs.h by
renaming PCI_ADDRESS_SPACE_xxx to PCI_BASE_ADDRESS_SPACE_xxx.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Making pci device cleanup msix automatically makes pci.c depend on
msix.c, which is IMO messy. Since devices do msix_init it's easy and
natural for them to also do msix_uninit.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Currently only one virtio_console device is supported. Trying to add
multiple devices fails and such failure should be reported back to the
qdev init functions.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
it's safe to call msix_write_config if msix
is disabled, so call it unconditionally on
pci config write.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Since cpu_register_phys_memory does not require size to be a multiple of
target page size, simply make msix page size 0x1000. Do this in msix,
reverting part of 5e520a7d50, as we no
longer have to pass target page around.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
virtio pci registers its own reset handler, but fails to unregister it,
which will lead to crashes after device removal. Solve this problem by
switching to qdev reset handler, which is automatically unregistered.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Changes:
* drive_uninit() wants a DriveInfo now.
* drive_uninit() also calls bdrv_delete(),
so callers don't need to do that.
* drive_uninit() calls are moved over to the ->exit()
callbacks, destroy_bdrvs() is zapped.
* setting bdrv->private is not needed any more as the
only user (destroy_bdrvs) is gone.
* usb-storage needs no drive_uninit, scsi-disk will
handle that.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
In the very least, a change like this requires discussion on the list.
The naming convention is goofy and it causes a massive merge problem. Something
like this _must_ be presented on the list first so people can provide input
and cope with it.
This reverts commit 99a0949b72.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Hello,
In some cases bus driver can deassert "bus master" bit in PCI command
register. The driver will no longer be able to update related registers in
the device. Eventually it will cause QEMU to exit in "virtqueue_num_heads"
function.
Attached path that fixes the described issue.
Best regards,
Yan Vugenfirer.
>From 3fdafbdfad676ec8479dc073cff70bf356868bfe Mon Sep 17 00:00:00 2001
From: Yan Vugenfirer <yvugenfi@redhat.com>
Date: Tue, 8 Sep 2009 10:08:14 -0400
Subject: [PATCH] VirtIO: Fix QEMU crash during Windows PNP tests
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Use the new qemu_error() function for virtio-blk-pci.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Sorry folks, but it has to be. One more of these invasive qdev patches.
We have a serious design bug in the qdev interface: device init
callbacks can't signal failure because the init() callback has no
return value. This patch fixes it.
We have already one case in-tree where this is needed:
Try -device virtio-blk-pci (without drive= specified) and watch qemu
segfault. This patch fixes it.
With usb+scsi being converted to qdev we'll get more devices where the
init callback can fail for various reasons.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
First user of the new drive property. With this patch applied host
and guest config can be specified separately, like this:
-drive if=none,id=disk1,file=/path/to/disk.img
-device virtio-blk-pci,drive=disk1
You can set any property for virtio-blk-pci now. You can set the pci
address via addr=. You can switch the device into 0.10 compat mode
using class=0x0180. As this is per device you can have one 0.10 and one
0.11 virtio block device in a single virtual machine.
Old syntax continues to work. Internally it does the same as the two
lines above though. One side effect this has is a different
initialization order, which might result in a different pci address
being assigned by default.
Long term plan here is to have this working for all block devices, i.e.
once all scsi is properly qdev-ified you will be able to do something
like this:
-drive if=none,id=sda,file=/path/to/disk.img
-device lsi,id=lsi,addr=<pciaddr>
-device scsi-disk,drive=sda,bus=lsi.0,lun=<n>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Message-Id:
These are now unused.
However, perhaps the idea is that when we add -device, they will be
useful? In that case, we should add virtio-net-pci-0-10 too.
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Add class property to virtio-console-pci allowing to specify the PCI class.
Add compat property to pc-0.10 to set the old PCI class.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Add class property to virtio-blk-pci allowing to specify the PCI class.
Add compat property to pc-0.10 to set the old PCI class.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
We're using PCI_CLASS_DISPLAY_OTHER now, but qemu-kvm.git is using
PCI_CLASS_OTHERS because:
"As a PCI_CLASS_DISPLAY_OTHER, it reduces primary display somehow on
Windows XP (possibly Windows disables acceleration since it fails
to find a driver)."
While this is valid, many versions of X will get confused by it.
Class major number of 0 gets treated as a possibly prehistoric VGA
device, and then the autoconfig logic gets confused trying to figure
out whether the virtio console or the pv vga device are the real VGA.
We should really set a proper class ID. 0x0780 (serial / other) seems
most appropriate. This shouldn't require any kernel changes, the
modalias for virtio looks like:
alias: pci:v00001AF4d*sv*sd*bc*sc*i*
so won't care what the base class or subclass are.
It shows up in the guest as:
00:05.0 Communication controller: Qumranet, Inc. Virtio console
A new qdev type is introduced to allow devices using the old class
to be created for compatibility with qemu-0.10.x.
Reported-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Windows virtio driver cannot pass DTM (certification) tests while the
storage class is PCI_CLASS_STORAGE_UNKNOWN.
A new qdev type is introduced to allow devices using the old class
to be created for compatibility with qemu-0.10.x.
Reported-by: Dor Laor <dlaor@redhat.com>
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Usage of msi vectors is controlled by the guest and so needs to be
restored on load. Do this for msi vectors used by the virtio device.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Makes pci_qdev_register take a PCIDeviceInfo struct instead of a bunch
of parameters. Also adds config_read and config_write callbacks to
PCIDeviceInfo, so drivers needing these can be converted to the qdev
device API too.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
This reverts commit 8217606e6e (and
updates later added users of qemu_register_reset), we solved the
problem it originally addressed less invasively.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Implement bindings for virtio save/load. Use them in virtio pci.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This enables actual support for MSI-X in virtio PCI.
First user will be virtio-net.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Extend virtio to support many interrupt vectors, and rearrange code in
preparation for multi-vector support (mostly move reset out to bindings,
because we will have to reset the vectors in transport-specific code).
Actual bindings in pci, and use in net, to follow.
Load and save are not connected to bindings yet, so they are left
stubbed out for now.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Support a new feature flag for indirect ring entries. These are ring
entries which point to a table of buffer descriptors.
The idea here is to increase the ring capacity by allowing a larger
effective ring size whereby the ring size dictates the number of
requests that may be outstanding, rather than the size of those
requests.
This should be most effective in the case of block I/O where we can
potentially benefit by concurrently dispatching a large number of
large requests. Even in the simple case of single segment block
requests, this results in a threefold increase in ring capacity.
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This function is used to manage a PCI BAR, so make the more generic
pci_register_io_region() available to other uses.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
The only target dependency for most hardware is sizeof(target_phys_addr_t).
Build these files into a convenience library, and use that instead of
building for every target.
Remove and poison various target specific macros to avoid bogus target
dependencies creeping back in.
Big/Little endian is not handled because devices should not know or care
about this to start with.
Signed-off-by: Paul Brook <paul@codesourcery.com>