Old kvm.ko versions only supported a tiny number of ioeventfds so
virtio-pci avoids ioeventfds when kvm_has_many_ioeventfds() returns 0.
Do not check kvm_has_many_ioeventfds() when KVM is disabled since it
always returns 0. Since commit 8c56c1a592
("memory: emulate ioeventfd") it has been possible to use ioeventfds in
qtest or TCG mode.
This patch makes -device virtio-blk-pci,iothread=iothread0 work even
when KVM is disabled.
I have tested that virtio-blk-pci works under TCG both with and without
iothread.
This patch fixes qemu-iotests 068, which was accidentally merged early
despite the dependency on ioeventfd.
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Tested-by: Eric Blake <eblake@redhat.com>
Tested-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 20170628184724.21378-7-stefanha@redhat.com
Message-id: 20170615163813.7255-2-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This commit introduces a vhost-user device for SCSI. This is based
on the existing vhost-scsi implementation, but done over vhost-user
instead. It also uses a chardev to connect to the backend. Unlike
vhost-scsi (today), VMs using vhost-user-scsi can be live migrated.
To use it, start Qemu with a command line equivalent to:
qemu-system-x86_64 \
-chardev socket,id=vus0,path=/tmp/vus.sock \
-device vhost-user-scsi-pci,chardev=vus0,bus=pci.0,addr=...
A separate commit presents a sample application linked with libiscsi to
provide a backend for vhost-user-scsi.
Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
Message-Id: <1488479153-21203-4-git-send-email-felipe@nutanix.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Make Power Management State flag writable to conform
with the PCI Express spec.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Make several Link Control Register flags writable to conform
with the PCI Express spec.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
When the virtio devices are PCI Express, make error-enabling flags
writable to respect the PCIe spec.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We don't reset proxy->vqs[].{num|desc[]|avail[]|used[]}. This means if
a driver enable the vq without setting vq address after reset. The old
addresses were leaked. Fixing this by resetting modern vq meta data
during device reset.
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This reverts commit
96a8821d21. Previous patch is a better
solution which does not require a strict order between virtio and IOMMU.
CC: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Commit c611c76417 ("virtio: add MemoryListener to cache ring
translations") registers a memory listener to dma_as. This may not
work when IOMMU is enabled: dma_as(bus_master_as) were initialized in
pcibus_machine_done() after virtio_realize(). This will cause a
segfault. Fixing this by using pci_device_iommu_address_space()
instead to make sure address space were initialized at this time.
With this fix, IOMMU device were required to be initialized before any
virtio-pci devices.
Fixes: c611c76417 ("virtio: add MemoryListener to cache ring translations")
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
msix_init() reports errors with error_report(), which is wrong when
it's used in realize(). The same issue was fixed for msi_init() in
commit 1108b2f. In order to make the API change as small as possible,
leave the return value check to later patch.
For some devices(like e1000e, vmxnet3, nvme) who won't fail because of
msix_init's failure, suppress the error report by passing NULL error
object.
Bonus: add comment for msix_init.
CC: Jiri Pirko <jiri@resnulli.us>
CC: Gerd Hoffmann <kraxel@redhat.com>
CC: Dmitry Fleytman <dmitry@daynix.com>
CC: Jason Wang <jasowang@redhat.com>
CC: Michael S. Tsirkin <mst@redhat.com>
CC: Hannes Reinecke <hare@suse.de>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Alex Williamson <alex.williamson@redhat.com>
CC: Markus Armbruster <armbru@redhat.com>
CC: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Current migration code cannot handle some data structures such as
QTAILQ in qemu/queue.h. Here we extend the signatures of put/get
in VMStateInfo so that customized handling is supported. put now
will return int type.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Jianjun Duan <duanj@linux.vnet.ibm.com>
Message-Id: <1484852453-12728-2-git-send-email-duanj@linux.vnet.ibm.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
virtio core has code to revert queue number
to maximum on reset. Drop TODO to add that.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
virtio-net-pci does not enable ioeventfd for historical reasons (and
nobody ever checked whether it should be revisited). Note that other
backends do enable ioeventfd for virtio-net.
However, it has a major effect on performance. On Windows, throughput is
_multiplied_ by 2 or 3 on TCP_STREAM (on small packets it is "only" a 30%
improvement) and a little less so on TCP_MAERTS albeit still very much
statistically significant. Latency also has a single digit improvement.
This is not visible when using vhost, which forces ioeventfd=on, but it
is substantial without vhost. In addition, also on Windows and with the
RHEL 7.3 kernel, APICv seems to slow down virtio-net performance a bit,
but the penalty with this patch goes from -25% to -7%.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patches enable the Address Translation Service support for virtio
pci devices. This is needed for a guest visible Device IOTLB
implementation and will be required by vhost device IOTLB API
implementation for intel IOMMU.
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently, all virtio devices bypass IOMMU completely. This is because
address_space_memory is assumed and used during DMA emulation. This
patch converts the virtio core API to use DMA API. This idea is
- introducing a new transport specific helper to query the dma address
space. (only pci version is implemented).
- query and use this address space during virtio device guest memory
accessing when iommu platform (VIRTIO_F_IOMMU_PLATFORM) was enabled
for this device.
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Amit Shah <amit.shah@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-block@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch fixes a cross-version migration regression introduced
by commit d1b4259f ("virtio-bus: Plug devices after features are
negotiated").
The problem is encountered when host's vhost backend does not support
VIRTIO_F_VERSION_1, and migration is initiated from a v2.7 or prior
machine with virtio-pci modern capabilities enabled to a v2.8 machine.
In this case, modern capabilities get exposed to the guest by the source,
whereas the target will detect version 1 is not supported so will only
expose legacy capabilities.
The problem is fixed by introducing a new "x-ignore-backend-features"
property, which is set in v2.7 and prior compatibility modes. Doing this,
v2.7 machine keeps its broken behaviour (enabling modern while version
is not supported), and newer machines will behave correctly.
Reported-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Message-id: 20161214163035.3297-1-maxime.coquelin@redhat.com
Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This will be needed once dataplane will be able to set it outside
the big QEMU lock.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Farhan Ali <alifm@linux.vnet.ibm.com>
Tested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Legacy features are those that transitional devices only
expose on the legacy interface.
Allow different ones per device class.
Cc: qemu-stable@nongnu.org # dependency for the next patch
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
This will be used to forbid iothread configuration when the
proxy does not allow using ioeventfd. To simplify the implementation,
change the direction of the ioeventfd_disabled callback too.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This simplifies the code and removes the ioeventfd_started
and ioeventfd_set_started callback. The only difference is
in how virtio-ccw handles an error---it doesn't disable
ioeventfd forever anymore. It was the only backend to do
so, and if desired this behavior should be implemented in
virtio-bus.c.
Instead of ioeventfd_started, the ioeventfd_assign callback now
determines whether the virtio bus supports host notifiers.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This simplifies the code and removes the ioeventfd_set_disabled
callback.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Avoid "tricking" virtio-blk-dataplane into thinking that ioeventfd will be
available when it is not. This bug has always been there, but it will break
TCG+ioeventfd=on once the dataplane code will be always used when ioeventfd=on.
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
the bar index names are much similar to the bar memory regions,
distinguish them to improve the code readability.
Signed-off-by: Chen Fan <fan.chen@easystack.cn>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Currently, devices are plugged before features are negotiated.
If the backend doesn't support VIRTIO_F_VERSION_1, the transport
needs to rewind some settings.
This is the case for CCW, for which a post_plugged callback had
been introduced, where max_rev field is just updated if
VIRTIO_F_VERSION_1 is not supported by the backend.
For PCI, implementing post_plugged would be much more
complicated, so it needs to know whether the backend supports
VIRTIO_F_VERSION_1 at plug time.
Currently, nothing is done for PCI. Modern capabilities get
exposed to the guest even if VIRTIO_F_VERSION_1 is not supported
by the backend, which confuses the guest.
This patch replaces existing post_plugged solution with an
approach that fits with both transports.
Features negotiation is performed before ->device_plugged() call.
A pre_plugged callback is introduced so that the transports can
set their supported features.
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: qemu-stable@nongnu.org
Tested-by: Cornelia Huck <cornelia.huck@de.ibm.com> [ccw]
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Implement the new virtio sockets device for host<->guest communication
using the Sockets API. Most of the work is done in a vhost kernel
driver so that virtio-vsock can hook into the AF_VSOCK address family.
The QEMU vhost-vsock device handles configuration and live migration
while the rx/tx happens in the vhost_vsock.ko Linux kernel driver.
The vsock device must be given a CID (host-wide unique address):
# qemu -device vhost-vsock-pci,id=vhost-vsock-pci0,guest-cid=3 ...
For more information see:
http://qemu-project.org/Features/VirtioVsock
[Endianness fixes and virtio-ccw support by Claudio Imbrenda
<imbrenda@linux.vnet.ibm.com>]
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
[mst: rebase to master]
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
!legacy && !modern is shorter than !(legacy || modern).
I also perfer this (less ()s) as a matter of taste.
Cc: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Without presuming if we got there because of a user mistake or some
more subtle bug in the tooling, it really does not make sense to
implement a non-functional device.
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently each VQ Notification Virtio Capability is allocated
on a different page. The idea is to enable split drivers within
guests, however there are no known plans to do that.
The allocation will result in a 8MB BAR, more than various
guest firmwares pre-allocates for PCI Bridges hotplug process.
Reserve 4 bytes per VQ by default and add a new parameter
"page-per-vq" to be used with split drivers.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
PIO MR registration should use size from the correct notify struct.
Doesn't affect any visible behaviour because the field values are the
same (both are 4).
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Enable transitional virtio devices by default.
Enable virtio-1.0 for devices plugged into
PCIe ports (Root ports or Downstream ports).
Using the virtio-1 mode will remove the limitation
of the number of devices that can be attached to a machine
by removing the need for the IO BAR.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
In the past, we are doing gsi route commit for each irqchip route
update. This is not efficient if we are updating lots of routes in the
same time. This patch removes the committing phase in
kvm_irqchip_update_msi_route(). Instead, we do explicit commit after all
routes updated.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Changing the original MSIMessage parameter in kvm_irqchip_add_msi_route
into the vector number. Vector index provides more information than the
MSIMessage, we can retrieve the MSIMessage using the vector easily. This
will avoid fetching MSIMessage every time before adding MSI routes.
Meanwhile, the vector info will be used in the coming patches to further
enable gsi route update notifications.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Convert to new interface.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Use Coccinelle script to replace 'ret = E; return ret' with
'return E'. The script will do the substitution only when the
function return type and variable type are the same.
Manual fixups:
* audio/audio.c: coding style of "read (...)" and "write (...)"
* block/qcow2-cluster.c: wrap line to make it shorter
* block/qcow2-refcount.c: change indentation of wrapped line
* target-tricore/op_helper.c: fix coding style of
"remainder|quotient"
* target-mips/dsp_helper.c: reverted changes because I don't
want to argue about checkpatch.pl
* ui/qemu-pixman.c: fix line indentation
* block/rbd.c: restore blank line between declarations and
statements
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1465855078-19435-4-git-send-email-ehabkost@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Unused Coccinelle rule name dropped along with a redundant comment;
whitespace touched up in block/qcow2-cluster.c; stale commit message
paragraph deleted]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Commit 57cb38b included qapi/error.h into qemu/osdep.h to get the
Error typedef. Since then, we've moved to include qemu/osdep.h
everywhere. Its file comment explains: "To avoid getting into
possible circular include dependencies, this file should not include
any other QEMU headers, with the exceptions of config-host.h,
compiler.h, os-posix.h and os-win32.h, all of which are doing a
similar job to this file and are under similar constraints."
qapi/error.h doesn't do a similar job, and it doesn't adhere to
similar constraints: it includes qapi-types.h. That's in excess of
100KiB of crap most .c files don't actually need.
Add the typedef to qemu/typedefs.h, and include that instead of
qapi/error.h. Include qapi/error.h in .c files that need it and don't
get it now. Include qapi-types.h in qom/object.h for uint16List.
Update scripts/clean-includes accordingly. Update it further to match
reality: replace config.h by config-target.h, add sysemu/os-posix.h,
sysemu/os-win32.h. Update the list of includes in the qemu/osdep.h
comment quoted above similarly.
This reduces the number of objects depending on qapi/error.h from "all
of them" to less than a third. Unfortunately, the number depending on
qapi-types.h shrinks only a little. More work is needed for that one.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
[Fix compilation without the spice devel packages. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Actually fixes linux not finding virtio 1.0 device virtqueues after
reboot. Which is new I think, any chance linux kernel virtio code
became more strict in 4.3?
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Tested-by: Fam Zheng <famz@redhat.com>
Since guest_mask_notifier can not be used in vhost-user mode due
to buffering implied by unix control socket, force
use_mask_notifier on virtio devices of vhost-user interfaces, and
send correct callfd to the guest at vhost start.
Using guest_notifier_mask function in vhost-user case may
break interrupt mask paradigm, because mask/unmask is not
really done when returning from guest_notifier_mask call, instead
message is posted in a unix socket, and processed later.
Add an option boolean flag 'use_mask_notifier' to disable the use
of guest_notifier_mask in virtio pci.
Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
Signed-off-by: Victor Kaplansky <victork@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.
This commit was created with scripts/clean-includes.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1453832250-766-15-git-send-email-peter.maydell@linaro.org
In 1811e64 'hw/virtio: Add PCIe capability to virtio devices', the
QEMU_PCI_CAP_EXPRESS capability was added to virtio's pci_dev, within
'virtio_pci_realize' - the pci device object realization method.
This occurs to late, as 'pci_qdev_realize' (DeviceClass.realize of
TYPE_PCI_DEVICE) has already been called, without knowing that the
device instance is indeed an "express" instance, thus allocating
insufficient pci config space.
As a result, device may crash upon attempt to write to the PCIE config
space.
Fix, by arming the QEMU_PCI_CAP_EXPRESS capability early in virtio-pci's
own DeviceClass realize method.
This also makes code cleaner, as 'virtio_pci_realize' may now access the
'pci_is_express' predicate when needed.
Signed-off-by: Shmulik Ladkani <shmulik.ladkani@ravellosystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Tested-by: Marcel Apfelbaum <marcel@redhat.com>
The virtio devices are converted to PCI-Express
if they are plugged into a PCI-Express bus and
the 'modern' protocol is enabled.
Devices plugged directly into the Root Complex as
Integrated Endpoints remain PCI.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Guest always get zero when reading queue_enable. This violates
spec. Fixing this by setting the queue_enable to true during any guest
writing and setting it to zero during reset.
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We used to use mmio for notification. This could be slow on some arch
(e.g on x86 without EPT). So this patch introduces pio bar and a pio
notification cap for modern device. This ability is enabled through
property "modern-pio-notify" for virtio pci devices and was disabled
by default. Management can enable when it thinks it was needed.
Benchmarks shows almost no obvious difference compared to legacy
device on machines without ept. Thanks Wenli Quan <wquan@redhat.com>
for the benchmarking.
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We use data match eventfd for 1.0 notification currently. This could
be slow since software decoding is needed for mmio exit. To speed this
up, we can switch to use zero length mmio eventfd for 1.0 notification
since we can examine the queue index directly from the writing
address. KVM kernel module can utilize this by registering it to fast
mmio bus which could be as fast as pio on ept capable machine when
fast mmio is supported by host kernel.
Lots of improvements were seen on a ept capable machine:
Guest RX:(TCP)
size/session/+throughput%/+cpu%/-+per cpu%/
64/1/+1.6807%/[-16.2421%]/[+21.3984%]/
64/2/+0.6091%/[-11.0187%]/[+13.0678%]/
64/4/+0.0553%/[-5.9768%]/[+6.4155%]/
64/8/+0.1206%/[-4.0057%]/[+4.2984%]/
256/1/-0.0031%/[-10.1166%]/[+11.2517%]/
256/2/-0.5058%/[-6.1656%]/+6.0317%]/
...
Guest TX:(TCP)
size/session/+throughput%/+cpu%/-+per cpu%/
64/1/[+18.9183%]/-0.2823%/[+19.2550%]/
64/2/[+13.5714%]/[+2.2675%]/[+11.0533%]/
64/4/[+13.1070%]/[+2.1817%]/[+10.6920%]/
64/8/[+13.0426%]/[+2.0887%]/[+10.7299%]/
256/1/[+36.2761%]/+6.3434%/[+28.1471%]/
...
1024/1/[+44.8873%]/+2.0811%/[+41.9335%]/
...
1024/4/+0.0228%/[-2.2044%]/[+2.2774%]/
...
16384/2/+0.0127%/[-5.0346%]/[+5.3148%]/
...
65535/1/[+0.0062%]/[-4.1183%]/[+4.3017%]/
65535/2/+0.0004%/[-4.2311%]/[+4.4185%]/
65535/4/+0.0107%/[-4.6106%]/[+4.8446%]/
65535/8/-0.0090%/[-5.5178%]/[+5.8306%]/
Latency:(TCP_RR)
size/session/+transaction rate%/+cpu%/-+per cpu%/
64/1/[+6.5248%]/[-9.2882%]/[+17.4322%]/
64/25/[+11.0854%]/[+0.8000%]/[+10.2038%]/
64/50/[+12.1076%]/[+2.4627%]/[+9.4131%]/
256/1/[+5.3677%]/[+10.5669%]/-4.7024%/
256/25/[+5.6402%]/-0.8962%/[+6.5955%]/
256/50/[+5.9685%]/[+1.7766%]/[+4.1188%]/
4096/1/+0.2508%/[-10.4941%]/[+12.0047%]/
4096/25/[+1.8533%]/-0.0273%/+1.8812%/
4096/50/[+1.2156%]/-1.4134%/+2.6667%/
Notes: data with '[]' is the one whose significance is greater than 95%.
Thanks Wenli Quan <wquan@redhat.com> for the benchmarking.
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We don't migrate the followings fields for virtio-pci:
uint32_t dfselect;
uint32_t gfselect;
uint32_t guest_features[2];
struct {
uint16_t num;
bool enabled;
uint32_t desc[2];
uint32_t avail[2];
uint32_t used[2];
} vqs[VIRTIO_QUEUE_MAX];
This will confuse driver if migrating during initialization. Solves
this issue by:
- introduce transport specific callbacks to load and store extra
virtqueue states.
- add a new subsection for virtio to migrate transport specific modern
device state.
- implement pci specific callbacks.
- add a new property for virtio-pci for whether or not to migrate
extra state.
- compat the migration for 2.4 and elder machine types
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
In-kernel ITS emulation on ARM64 will require to supply requester IDs.
These IDs can now be retrieved from the device pointer using new
pci_requester_id() function.
This patch adds pci_dev pointer to KVM GSI routing functions and makes
callers passing it.
x86 architecture does not use requester IDs, but hw/i386/kvm/pci-assign.c
also made passing PCI device pointer instead of NULL for consistency with
the rest of the code.
Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Message-Id: <ce081423ba2394a4efc30f30708fca07656bc500.1444916432.git.p.fedin@samsung.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When CONFIG_LINUX is off, devices "virtio-keyboard-device",
"virtio-mouse-device", "virtio-tablet-device" and
"virtio-input-host-device" aren't compiled in, yet
"virtio-keyboard-pci", "virtio-mouse-pci", "virtio-tablet-pci" and
"virtio-input-host-pci" still are. Attempts to introspect them crash,
e.g.
$ qemu-system-x86_64 -device virtio-tablet-pci,help
**
ERROR:/work/armbru/qemu/qom/object.c:333:object_initialize_with_type: assertion failed: (type != NULL)
Broken in commit 710e2d9 and commit 006a5ed.
Fix by compiling the "virtio-FOO-pci" exactly when compiling the
"virtio-FOO-device": compile "virtio-keyboard-device",
"virtio-mouse-device", "virtio-tablet-device" regardless of
CONFIG_LINUX, and compile "virtio-input-host-pci" only for
CONFIG_LINUX.
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Message-Id: <1444320700-26260-3-git-send-email-armbru@redhat.com>
And do not issue an error_report in that case.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Use the utility function pow2ceil() for rounding up to the next
largest power of 2, rather than inline calculation.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1437741192-20955-3-git-send-email-peter.maydell@linaro.org
Each memory_region_add_subregion must be paired with
memory_region_del_subregion.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>