Commit Graph

7196 Commits

Author SHA1 Message Date
Jan Kiszka
0ae1625177 pci: Add INTx routing notifier
This per-device notifier shall be triggered by any interrupt router
along the path of a device's legacy interrupt signal on routing changes.
For simplicity reasons and as this is a slow path anyway, no further
details on the routing changes are provided. Instead, the callback is
expected to use pci_device_route_intx_to_irq to check the effect of the
change.

Will be used by KVM PCI device assignment and VFIO.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-07-19 17:15:42 +03:00
Michael S. Tsirkin
3afa9bb488 pci: Add pci_device_route_intx_to_irq
Device assigned on KVM needs to know the mode
(enabled/inverted/disabled) and the IRQ number that a given device
triggers in the attached interrupt controller.

Add a PCI IRQ path discovery function that walks from a given device to
the host bridge, and gets this information.  For
this purpose, a host bridge callback function is introduced:
route_intx_to_irq. It is so far only implemented by the PIIX3, other
host bridges can be added later on as required.

Will be used for KVM PCI device assignment and VFIO.

Based on patch by Jan Kiszka, with minor tweaks.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-07-19 17:11:47 +03:00
Alex Williamson
7cf1b0fd95 pci: Unregister BARs before device exit
BARs are registered in init functions from memory regions created
by the drivers.  Exit functions destroy those memory regions.
By unregistering the io regions after exit(), we're calling
memory_region_del_subregion on freed memory.  Don't do that.  The
option rom comes along for the ride because it's more symmetric
to how it's created.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-07-04 15:55:07 +03:00
Alex Williamson
f90c2bcdbc pci: convert PCIUnregisterFunc to void
Not a single driver has any possibility of failure on their
exit function, let's keep it that way.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-07-04 15:52:55 +03:00
Alex Williamson
572992eefa msix: Switch msix_uninit to return void
It can't fail.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-06-18 10:21:12 +03:00
Alex Williamson
5a2c202981 msix: Allow full specification of MSIX layout
Finally, complete the fully specified interface.  msix_add_config()
gets folded into msix_init() because we now have quite a few parameters
to pass and rolling it in let's us error earlier, avoiding the ugly
unwind exit path.  msix_mmio_setup() also gets rolled in, just because
it's redundant to rediscover offsets when we already have them for
such a tiny function.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-06-18 10:21:11 +03:00
Alex Williamson
d35e428c84 msix: Split PBA into it's own MemoryRegion
These don't have to be contiguous.  Size them to only what
they need and use separate MemoryRegions for the vector
table and PBA.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-06-18 10:21:11 +03:00
Alex Williamson
2cf62ad742 msix: Note endian TODO item
MSIX, like PCI, is little endian.  Specifying native is wrong here,
but we need to check the rest of the file to determine if it's
as simple as flipping this macro.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-06-18 10:21:11 +03:00
Alex Williamson
eebcb0a76a msix: Move msix_mmio_read
What's this doing so far from msix_mmio_ops?

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-06-18 10:21:11 +03:00
Alex Williamson
b2357c484d virtio: Convert to msix_init_exclusive_bar() interface
Simple conversion.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-06-18 10:21:10 +03:00
Alex Williamson
1116b53921 ivshmem: Convert to msix_init_exclusive_bar() interface
Trivial conversion, failed to have an uninit before and after.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-06-18 10:21:05 +03:00
Alex Williamson
53f949254a msix: Add simple BAR allocation MSIX setup functions
msi_init() takes over a BAR without really specifying or allowing
specification of how it does so.  Instead, let's split it into
two interfaces, one fully specified, and one trivially easy.  This
implements the latter.  msix_init_exclusive_bar() takes over
allocating and filling a PCI BAR _exclusively_ for the use of MSIX.
When used, the matching msi_uninit_exclusive_bar() should be used
to tear it down.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-06-18 10:21:05 +03:00
Alex Williamson
118f2c2b48 msix: fix PCIDevice naming inconsistency
msix.h calls the PCIDevice * parameter "dev" almost everywhere except
the msix_write_config declaration. Fix the inconsistency.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-06-18 10:21:04 +03:00
Jan Kiszka
393a98924e msix: drop unused msix_bar_size, require valid bar_size
No user in sight for msix_bar_size.
bar_size for all users is aligned, let's simply
require this instead of trying to fix up invalid input.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-06-18 10:21:04 +03:00
Jason Baron
80aa796bf3 pci_bridge_dev: fix error path in pci_bridge_dev_initfn()
Currently, we do not properly cleanup, if pci_bridge_dev_initfn
fails to initialize properly. Make sure to call pci_bridge_exitfn()
in the error path.

Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-06-11 22:55:13 +03:00
Jason Baron
266ca11a04 qdev: release parent properties on dc->init failure
While looking into hot-plugging bridges, I can create a qemu segfault via:

$ device_add pci-bridge

Bridge chassis not specified. Each bridge is required to be assigned a unique chassis id > 0.
**
ERROR:qom/object.c:389:object_delete: assertion failed: (obj->ref == 0)

I'm proposing to fix this by adding a call to 'object_unparent()', before the
call to qdev_free(). I see there is already a precedent for this usage pattern as
seen in qdev_simple_unplug_cb():

/* can be used as ->unplug() callback for the simple cases */
int qdev_simple_unplug_cb(DeviceState *dev)
{
    /* just zap it */
    object_unparent(OBJECT(dev));
    qdev_free(dev);
    return 0;
}

Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-06-11 22:55:13 +03:00
Jan Kiszka
44701ab71a msi: Use msi/msix_present more consistently
Replace some open-coded msi/msix_present checks.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-06-07 17:19:01 +03:00
Jan Kiszka
95d6580024 msi: Invoke msi/msix_write_config from PCI core
Also this functions is better invoked by the core than by each and every
device. This allows to drop the config_write callbacks from ich and
intel-hda.

CC: Alexander Graf <agraf@suse.de>
CC: Gerd Hoffmann <kraxel@redhat.com>
CC: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-06-07 17:19:00 +03:00
Jan Kiszka
7c9958b043 msi: Guard msi/msix_write_config with msi_present
Terminate msi/msix_write_config early if support is not enabled. This
allows to remove checks at the caller site if MSI is optional.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-06-07 17:19:00 +03:00
Jan Kiszka
cbd2d4342b msi: Invoke msi/msix_reset from PCI core
There is no point in pushing this burden to the devices, they tend to
forget to call them (like intel-hda, ahci, xhci did). Instead, reset
functions are now called from pci_device_reset. They do nothing if
MSI/MSI-X is not in use.

CC: Alexander Graf <agraf@suse.de>
CC: Gerd Hoffmann <kraxel@redhat.com>
CC: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-06-07 17:19:00 +03:00
Jan Kiszka
520064c8b1 msi: Guard msi_reset with msi_present
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-06-07 17:18:59 +03:00
Jan Kiszka
8ab60a0703 ahci: Clean up reset functions
Properly register reset functions via the device class.

CC: Alexander Graf <agraf@suse.de>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-06-07 17:18:59 +03:00
Jan Kiszka
8e729e3b52 intel-hda: Fix reset of MSI function
Call msi_reset on device reset as still required by the core.

CC: Gerd Hoffmann <kraxel@redhat.com>
CC: qemu-stable@nongnu.org
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-06-07 17:18:59 +03:00
Jan Kiszka
868a1a5226 ahci: Fix reset of MSI function
Call msi_reset on device reset as still required by the core.

CC: Alexander Graf <agraf@suse.de>
CC: qemu-stable@nongnu.org
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-06-07 17:18:59 +03:00
Fernando Luis Vazquez Cao
fee9d348ff rtl8139: honor RxOverflow flag in can_receive method
Some drivers (Linux' 8139too among them) rely on the NIC
injecting an interrupt in the event of a receive buffer overflow
and, accordingly, set the RxOverflow bit in the interrupt
mask. Unfortunately rtl8139's can_receive method ignores the
RxOverflow flag, which may lead to a situation where rtl8139
stops receiving packets (can_receive returns 0) when the receive
buffer becomes full.

If the driver eventually read from the receive buffer or reset
the card the emulator could recover from this situation. However
some implementations only do this upon receiving an interrupt
with either RxOK or RxOverflow set in the ISR; interrupt that
will never come because QEMU's flow control mechanisms would
prevent rtl8139 from receiving any packet.

Letting packets go through when the overflow interrupt is enabled
makes the QEMU emulator compliant to the spec and solves the
problem.

This patch should fix a relatively common (in our experience)
network stall observed when running enterprise distros with
rtl8139 as the NIC; in some cases the 8139too device driver gets
loaded and when under heavy load the network eventually stops
working.

Reported-by: Hayato Kakuta <kakuta.hayato@oss.ntt.co.jp>
Tested-by: Hayato Kakuta <kakuta.hayato@oss.ntt.co.jp>
Acked-by: Igor Kovalenko <igor.v.kovalenko@gmail.com>
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-06-07 17:18:58 +03:00
Michael S. Tsirkin
e9adf2605d shpc: unparent device before free
Recent core change removed unparent
so we need to do this in all callers now.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-06-07 17:18:58 +03:00
Jason Wang
9c92bf7f6c Revert "rtl8139: do the network/host communication only in normal operating mode"
This reverts commit ff71f2e8ca. This is because
the linux 8139cp driver would leave the card in "Config Register Write Enable"
mode after the eeprom were read or write ( which is unexpected in the spec
). Also a physical 8139 card can still DMA into host memory in modes other than
Normal mode, so we need revert this commit to align with the behavior of
physical card.

The issue of 8139cp driver should be fixed in linux seperately.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-06-04 12:58:36 +08:00
Anthony Liguori
74f4d2279b Merge remote-tracking branch 'qemu-kvm/uq/master' into staging
* qemu-kvm/uq/master:
  virtio/vhost: Add support for KVM in-kernel MSI injection
  msix: Add msix_nr_vectors_allocated
  kvm: Enable use of kvm_irqchip_in_kernel in hwlib code
  kvm: Introduce kvm_irqchip_add/remove_irqfd
  kvm: Make kvm_irqchip_commit_routes an internal service
  kvm: Publicize kvm_irqchip_release_virq
  kvm: Introduce kvm_irqchip_add_msi_route
  kvm: Rename kvm_irqchip_add_route to kvm_irqchip_add_irq_route
  msix: Introduce vector notifiers
  msix: Invoke msix_handle_mask_update on msix_mask_all
  msix: Factor out msix_get_message
  kvm: update vmxcap for EPT A/D, INVPCID, RDRAND, VMFUNC
  kvm: Enable in-kernel irqchip support by default
  kvm: Add support for direct MSI injections
  kvm: Update kernel headers
  kvm: x86: Wire up MSI support for in-kernel irqchip
  pc: Enable MSI support at APIC level
  kvm: Introduce basic MSI support for in-kernel irqchips
  Introduce MSIMessage structure
  kvm: Refactor KVMState::max_gsi to gsi_count
2012-06-03 07:56:23 +08:00
Daniel Verkamp
4bb9c939a5 ahci: SATA FIS is 20 bytes, not 0x20
As in the SATA and AHCI specifications, a FIS is 5 Dwords of 4 bytes
each, which comes to 20 bytes (decimal), not 0x20.

Signed-off-by: Daniel Verkamp <daniel@drv.nu>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-30 14:51:09 +02:00
Christian Borntraeger
136be99e6e virtio-blk: Fix geometry sector calculation
Currently the sector value for the geometry is masked, even if the
user usesa command line parameter that explicitely gives a number.
This breaks dasd devices on s390. A dasd device can have
a physical block size of 4096 (== same for logical block size)
and a typcial geometry of 15 heads and 12 sectors per cyl.
The ibm partition detection relies on a correct geometry
reported by the device. Unfortunately the current code changes
12 to 8. This would be necessary if the total size is
not a multiple of logical sector size,  but for dasd this
is not the case.

This patch checks the device size and only applies sector
mask if necessary.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
CC: Christoph Hellwig <hch@lst.de>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-30 14:51:04 +02:00
Stefan Weil
47ce9ef7f8 virtio: Fix compiler warning for non Linux hosts
The local variables ret, i are only used if __linux__ is defined.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-30 09:49:49 +02:00
Amos Kong
a6de8ed80e pci: call object_unparent() before free_qdev()
Start VM with 8 multiple-function block devs, hot-removing
those block devs by 'device_del ...' would cause qemu abort.

| (qemu) device_del virti0-0-0
| (qemu) **
|ERROR:qom/object.c:389:object_delete: assertion failed: (obj->ref == 0)

It's a regression introduced by commit 57c9fafe

The whole PCI slot should be removed once. Currently only one func
is cleaned in pci_unplug_device(), if you try to remove a single
func by monitor cmd.

free_qdev() are called for all functions in slot,
but unparent_delete() is only called for one
function.

Signed-off-by: XXXX
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-05-29 20:19:24 -05:00
Scott Moser
9c3a596a03 fix multiboot loading if load_end_addr == 0
The previous multiboot load code did not treat the case where
load_end_addr was 0 specially.  The multiboot specification says the
following:
 * load_end_addr
   Contains the physical address of the end of the data segment.
   (load_end_addr - load_addr) specifies how much data to load. This
   implies that the text and data segments must be consecutive in the
   OS image; this is true for existing a.out executable formats. If
   this field is zero, the boot loader assumes that the text and data
   segments occupy the whole OS image file.

Signed-off-by: Scott Moser <smoser@ubuntu.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-05-29 20:19:24 -05:00
Avi Kivity
8294a64d7f vga: fix vram double-mapping with -vga std and -M pc-0.12
With pc-0.12, we map the video RAM both through the PCI BAR (the guest does
this) and through a fixed mapping at 0xe0000000.  The memory API doesn't allow
this double map, and aborts.

Fix by using an alias.

Reported-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-05-29 20:19:24 -05:00
Anthony Liguori
dd86df756e Merge remote-tracking branch 'sstabellini/for_1.1_rc3' into staging
* sstabellini/for_1.1_rc3:
  Call xc_domain_shutdown with the reboot flag when the guest requests a reboot.
  xen: Fix PV-on-HVM
  xen_disk: properly update stats in ioreq_release()
  xen_disk: use bdrv_aio_flush instead of bdrv_flush
  xen_disk: remove syncwrite option
  xen: disable rtc_clock
  xen: do not initialize the interval timer and PCSPK emulator
2012-05-29 04:32:13 -05:00
Anthony Liguori
306761537f Merge remote-tracking branch 'kwolf/for-anthony' into staging
* kwolf/for-anthony:
  fdc-test: introduced qtest no_media_on_start and cmos qtest for floppy
  fdc: fix media detection
  fdc: floppy drive should be visible after start without media
  qemu-iotests: mark 035 qcow2-only
  qcow2: Check qcow2_alloc_clusters_at() return value
  sheepdog: use heap instead of stack for BDRVSheepdogState
  sheepdog: return -errno on error
  sheepdog: mark image as snapshot when tag is specified
  qemu-img: Explain how rebase operation can be used to perform a 'diff' operation.
  qcow2: don't leak buffer for unexpected qcow_version in header
2012-05-29 04:30:49 -05:00
Pavel Hrdina
cfb08fbafc fdc: fix media detection
We have to set up 'media_changed' after guest start so floppy driver
could detect that there is no media in drive. For this purpose we call
'fdctrl_change_cb' instead of 'fd_revalidate' in 'fdctrl_connect_drives'.
'fd_revalidate' is called inside 'fdctrl_change_cb'.

We still have to set default drive geometry in 'fd_revalidate' even
if there is no media in drive. When you try to open (windows) or mount (linux)
floppy the driver tries to seek on track 1. Linux guest stuck in loop then
kernel crashes and windows guest prints error message.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-25 18:21:12 +02:00
Pavel Hrdina
9ecd394753 fdc: floppy drive should be visible after start without media
If you start guest with floppy drive but without media inserted, guest
still should see floppy drive pressent.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-25 18:18:53 +02:00
Jim Meyering
12badfc238 scsi: declare vmstate_info_scsi_requests to be static
Signed-off-by: Jim Meyering <meyering@redhat.com>
2012-05-25 13:00:27 +02:00
Stefan Weil
f8687bab91 es1370: Fix debug code
When DEBUG_ES1370 is defined, the compiler shows these warnings:

hw/es1370.c: In function ?es1370_update_voices?:
hw/es1370.c:414: warning: format ?%d? expects type ?int?, but argument 3 has type ?size_t?
hw/es1370.c: In function ?es1370_writel?:
hw/es1370.c:582: warning: format ?%d? expects type ?int?, but argument 3 has type ?long int?
hw/es1370.c:592: warning: format ?%d? expects type ?int?, but argument 3 has type ?long int?
hw/es1370.c:609: warning: format ?%d? expects type ?int?, but argument 3 has type ?long int?
hw/es1370.c: In function ?es1370_readl?:
hw/es1370.c:751: warning: suggest braces around empty body in an ?if? statement

Fix the format strings and add the missing braces.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: malc <av1474@comtv.ru>
2012-05-24 02:03:30 +04:00
Anthony PERARD
4accd107d0 xen: Fix PV-on-HVM
In the context of PV-on-HVM under Xen, the emulated nics are supposed to be
unplug before the guest drivers are initialized, when the guest write to a
specific IO port.

Without this patch, the guest end up with two nics with the same MAC, the
emulated nic and the PV nic.

Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-05-21 15:40:51 -05:00
dunrong huang
a340046614 qdev: Fix memory leak
The str allocated in visit_type_str was not freed.

The visit_type_str function is an input visitor(<QMP/String/etc>-to-native)
here, it will allocate memory for caller, so the caller is responsible for
freeing the memory.

Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: dunrong huang <riegamaths@gmail.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-05-21 15:40:51 -05:00
Orit Wassermann
2a633c461e virtio: check virtio_load return code
Otherwise we crash on error.

Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Orit Wassermann <owasserm@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-05-21 15:40:50 -05:00
Paolo Bonzini
a6c5c84ae2 virtio-blk: always enable VIRTIO_BLK_F_SCSI
VIRTIO_BLK_F_SCSI is supposed to mean whether the host can *parse*
SCSI requests, not *execute* them.  You could run QEMU with scsi=on
and a file-backed disk, and QEMU would fail all SCSI requests even
though it advertises VIRTIO_BLK_F_SCSI.

Because we need to do this to fix a migration compatibility problem
related to how QEMU is invoked by management, we must do this
unconditionally even on older machine types.  This more or less assumes
that no one ever invoked QEMU with scsi=off.

Here is how testing goes:

- old QEMU, scsi=on -> new QEMU, scsi=on
- new QEMU, scsi=on -> old QEMU, scsi=on
- old QEMU, scsi=off -> new QEMU, scsi=on
- new QEMU, scsi=off -> old QEMU, scsi=on
        ok (new QEMU has VIRTIO_BLK_F_SCSI, adding host features is fine)

- old QEMU, scsi=off -> new QEMU, scsi=off
        ok (new QEMU has VIRTIO_BLK_F_SCSI, adding host features is fine)

- old QEMU, scsi=on -> new QEMU, scsi=off
        ok, bug fixed

- new QEMU, scsi=on -> old QEMU, scsi=off
        doesn't work (same as: old QEMU, scsi=on -> old QEMU, scsi=off)

- new QEMU, scsi=off -> old QEMU, scsi=off
        broken by the patch

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-05-21 15:40:50 -05:00
Paolo Bonzini
12c5674b84 virtio-blk: define VirtIOBlkConf
We will have to add another field to the virtio-blk configuration in
the next patch.  Avoid a proliferation of arguments to virtio_blk_init.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-05-21 15:40:50 -05:00
Paolo Bonzini
0e47931b88 virtio-blk: blockdev_mark_auto_del is transport-independent
Move it from virtio_blk_exit_pci to virtio_blk_exit.

This is included here because the next patch removes proxy->block.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-05-21 15:40:50 -05:00
Paolo Bonzini
f34e73cd69 virtio-blk: report non-zero status when failing SG_IO requests
Linux really looks only at scsi->errors for SG_IO requests; it does
not look at the virtio request status at all.  Because of this, when
a SG_IO request is failed early with virtio_blk_req_complete(req,
VIRTIO_BLK_S_UNSUPP), without writing hdr.status, it will look like
a success to the guest.

This is their bug, but we can make it safe for older guests now by
forcing scsi->errors to have a non-zero value whenever a request
has to be failed.

But if we fix the bug in the guest driver, we will have another problem
because QEMU returns VIRTIO_BLK_S_IOERR if the status is non-zero, and
Linux translates that to -EIO.  Rather, the guest should succeed the
request and pass the non-zero status via the userspace-provided SG_IO
structure.  So, remove the case where virtio_blk_handle_scsi can
return VIRTIO_BLK_S_IOERR.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-05-21 15:40:50 -05:00
Mark Langsdorf
80a2ba3d3c use an uint64_t for the max_sz parameter in load_image_targphys
Allow load_image_targphys to load files on systems with more than 2G of
emulated memory by changing the max_sz parameter from an int to an
uint64_t.

Reviewed-by: Andreas F=E4rber <afaerber@suse.de>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Mark Langsdorf <mark.langsdorf@calxeda.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-05-21 15:40:50 -05:00
Jan Kiszka
7d37d351df virtio/vhost: Add support for KVM in-kernel MSI injection
Make use of the new vector notifier to track changes of the MSI-X
configuration of virtio PCI devices. On enabling events, we establish
the required virtual IRQ to MSI-X message route and link the signaling
eventfd file descriptor to this vIRQ line. That way, vhost-generated
interrupts can be directly delivered to an in-kernel MSI-X consumer like
the x86 APIC.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-05-21 19:22:50 +03:00
Jan Kiszka
cb697aaab9 msix: Add msix_nr_vectors_allocated
Analogously to msi_nr_vectors_allocated, add a service for MSI-X. Will
be used by the virtio-pci layer.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-05-21 19:22:50 +03:00