Commit Graph

617 Commits

Author SHA1 Message Date
Maxime Coquelin
b9ec9bd468 vhost-user: unregister slave req handler at cleanup time
If the backend sends a request just before closing the socket,
the aio dispatcher might schedule its reading after the vhost
device has been cleaned, leading to a NULL pointer dereference
in slave_read();

vhost_user_cleanup() already closes the socket but it is not
enough, the handler has to be unregistered.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-07-03 22:29:49 +03:00
Maxime Coquelin
384b557da1 vhost: ensure vhost_ops are set before calling iotlb callback
This patch fixes a crash that happens when vhost-user iommu
support is enabled and vhost-user socket is closed.

When it happens, if an IOTLB invalidation notification is sent
by the IOMMU, vhost_ops's NULL pointer is dereferenced.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-07-03 22:29:49 +03:00
Mao Zhongyi
9a7c2a5970 pci: Make errp the last parameter of pci_add_capability()
Add Error argument for pci_add_capability() to leverage the errp
to pass info on errors. This way is helpful for its callers to
make a better error handling when moving to 'realize'.

Cc: pbonzini@redhat.com
Cc: rth@twiddle.net
Cc: ehabkost@redhat.com
Cc: mst@redhat.com
Cc: jasowang@redhat.com
Cc: marcel@redhat.com
Cc: alex.williamson@redhat.com
Cc: armbru@redhat.com
Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-07-03 22:29:49 +03:00
Stefan Hajnoczi
c324fd0a39 virtio-pci: use ioeventfd even when KVM is disabled
Old kvm.ko versions only supported a tiny number of ioeventfds so
virtio-pci avoids ioeventfds when kvm_has_many_ioeventfds() returns 0.

Do not check kvm_has_many_ioeventfds() when KVM is disabled since it
always returns 0.  Since commit 8c56c1a592
("memory: emulate ioeventfd") it has been possible to use ioeventfds in
qtest or TCG mode.

This patch makes -device virtio-blk-pci,iothread=iothread0 work even
when KVM is disabled.

I have tested that virtio-blk-pci works under TCG both with and without
iothread.

This patch fixes qemu-iotests 068, which was accidentally merged early
despite the dependency on ioeventfd.

Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Tested-by: Eric Blake <eblake@redhat.com>
Tested-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 20170628184724.21378-7-stefanha@redhat.com
Message-id: 20170615163813.7255-2-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-30 11:03:45 +01:00
Felipe Franciosi
f12c1ebddf vhost-user-scsi: Introduce vhost-user-scsi host device
This commit introduces a vhost-user device for SCSI. This is based
on the existing vhost-scsi implementation, but done over vhost-user
instead. It also uses a chardev to connect to the backend. Unlike
vhost-scsi (today), VMs using vhost-user-scsi can be live migrated.

To use it, start Qemu with a command line equivalent to:

qemu-system-x86_64 \
       -chardev socket,id=vus0,path=/tmp/vus.sock \
       -device vhost-user-scsi-pci,chardev=vus0,bus=pci.0,addr=...

A separate commit presents a sample application linked with libiscsi to
provide a backend for vhost-user-scsi.

Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
Message-Id: <1488479153-21203-4-git-send-email-felipe@nutanix.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-15 11:18:40 +02:00
Peter Maydell
cb8b8ef457 -----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJZMbiwAAoJENro4Ql1lpzlEm0P/RViCB92pz62wdcsjpXvqwBO
 ddhfQqPaGE+BD1tRcrOxDFKTWEnxFN4Oj9zNQ+/FgLcckZ5qAy6PkuCnesG2eJjG
 c43y07e+u89G8L+7zLoAw+fYt8tnb5+ood6Q6WWH9rbNHLaIlMvwzomm3Rkf+1L/
 PKVixcnFBxulTTftLgVUqpFSRrDDlywrmK9gjoDRHeIp7fZTxUib5T720ShszZe5
 C2iym5Ucb5ohEoq4LfY+mPWVGkTdMtsX3Vz5eHsTqoYYY1akTg5CwgGQuqbS/kXN
 7/4nkX6otkxTlU1+ydWEQCpD3orWEUJUeKUlLA48rAckqJyX3mfx94AeIOYseILM
 vMdxQo549ofcann1RtHyPkgfwyl8rrFRZ2xdkYGeSJ2zyv6Ekcxs0r5NXX7DMZBY
 oyjH38UTpHBhZp5l/0KAff0y0FNnDz6JJttgkcqHz8Qd4chE6JP9X9gx+9xkakst
 ytbiFP5NuG9RTwWjFSXObgZl9QU/n+JuJ/kr/LMMTgmPYwONW3NglBGgAoRDu/gC
 4YAmdSWf2UYlQg1IxTn2KT4U4Dn56DEmcaL9yDBH+UJLhWpxle9+D2MVruJN7MPS
 H0/8ytXvmb99DMWOhIdRmOzWCmQ/YAg/CAXmUZPX7sul0b3TNhSDqEi2qnLR84GT
 GSOh5aGbQ2H8DGQioGLV
 =5eL6
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/elmarco/tags/chrfe-pull-request' into staging

# gpg: Signature made Fri 02 Jun 2017 20:12:48 BST
# gpg:                using RSA key 0xDAE8E10975969CE5
# gpg: Good signature from "Marc-André Lureau <marcandre.lureau@redhat.com>"
# gpg:                 aka "Marc-André Lureau <marcandre.lureau@gmail.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 87A9 BD93 3F87 C606 D276  F62D DAE8 E109 7596 9CE5

* remotes/elmarco/tags/chrfe-pull-request:
  char: move char devices to chardev/
  char: make chr_fe_deinit() optionaly delete backend
  char: rename functions that are not part of fe
  char: move CharBackend handling in char-fe unit
  char: generalize qemu_chr_write_all()
  be-hci: use backend functions
  chardev: serial & parallel declaration to own headers
  chardev: move headers to include/chardev
  Remove/replace sysemu/char.h inclusion
  char-win: close file handle except with console
  char-win: rename hcom->file
  char-win: rename win_chr_init/poll win_chr_serial_init/poll
  char-win: remove WinChardev.len
  char-win: simplify win_chr_read()
  char: cast ARRAY_SIZE() as signed to silent warning on empty array

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-05 10:09:14 +01:00
Maxime Coquelin
6dcdd06e3b spec/vhost-user spec: Add IOMMU support
This patch specifies and implements the master/slave communication
to support device IOTLB in slave.

The vhost_iotlb_msg structure introduced for kernel backends is
re-used, making the design close between the two backends.

An exception is the use of the secondary channel to enable the
slave to send IOTLB miss requests to the master.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-06-02 18:57:17 +03:00
Marc-André Lureau
4bbeeba023 vhost-user: add slave-req-fd support
Learn to give a socket to the slave to let him make requests to the
master.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-06-02 18:57:17 +03:00
Marc-André Lureau
2152f3fead vhost-user: add vhost_user to hold the chr
Next patches will add more fields to the structure

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-06-02 18:57:17 +03:00
Maxime Coquelin
020e571b8b vhost: rework IOTLB messaging
This patch reworks IOTLB messaging to prepare for vhost-user
device IOTLB support.

IOTLB messages handling is extracted from vhost-kernel backend,
so that only the messages transport remains backend specifics.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-06-02 18:57:17 +03:00
Maxime Coquelin
fc58bd0d97 vhost: propagate errors in vhost_device_iotlb_miss()
Some backends might want to know when things went wrong.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-06-02 18:57:17 +03:00
Paolo Bonzini
b0ac429f13 virtio: add virtqueue_alloc_element tracepoint
This tracepoint can help diagnosing failures due to memory
fragmentation in the guest.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-06-02 18:57:17 +03:00
Marc-André Lureau
4d43a603c7 char: move CharBackend handling in char-fe unit
Move all the frontend struct and methods to a seperate unit. This avoids
accidentally mixing backend and frontend calls, and helps with readabilty.

Make qemu_chr_replay() a macro shared by both char and char-fe.

Export qemu_chr_write(), and use a macro for qemu_chr_write_all()

(nb: yes, CharBackend is for char frontend :)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-06-02 11:33:53 +04:00
Marc-André Lureau
8228e353d8 chardev: move headers to include/chardev
So they are all in one place. The following patch will move serial &
parallel declarations to the respective headers.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-06-02 11:33:52 +04:00
Maxime Coquelin
3cf7daf8c3 vhost-user: pass message as a pointer to process_message_reply()
process_message_reply() was recently updated to get full message
content instead of only its request field.

There is no need to copy all the struct content into the stack,
so just pass its pointer as const.

Reviewed-by: Jens Freimann <jfreiman@redhat.com>
Reviewed-by: Zhiyong Yang <zhiyong.yang@intel.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-05-25 21:25:28 +03:00
Juan Quintela
68ba3b0743 migration: migration.h was not needed
This files don't use any function from migration.h, so drop it.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-05-18 19:20:59 +02:00
Stefan Hajnoczi
2ccbd47c1d migration/next for 20170517
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJZHCoMAAoJEPSH7xhYctcjyOcQAN82GDYgXj93k40rU/SmZTP7
 blelisGsY5UNo33bLZq07fVwwdk1vIR0OIZvjMyGVWptAX49QJ6BVwX2E5zmb9LW
 AT3rVeyqz8nnC6OwWBxN9bu+sPJ13ibGs1l2j5Kn9jZ6a9rJCC7LOKdo4Dxbs3Uk
 Obw4f7swsozTQPxeHfrsBgFIvcB8qXLjdxsVhj+IWkmp1KDKVg+TWfNFJx30dK0G
 ktVsV0Xu6exEzcnzpTf93Bcv8vt49JRrCka9N5YryPTZmFuGgW291lqviPWiZg/W
 39F3cga5QfDzcs4Z6Lrz3Qeo/q+2n5G5O23UmrJccZ//UQMdeW9sd5udj211aMeq
 I7UdrarIHWRCCVTVdVL7AGJ8xmMIKHsvKRWstw7FEMHQ+lD/sFSfpWBtYdGhAotF
 mf/yncMKb52QbNyIuanoKi8UjU+RCvuslCac87U3fPqz/qYGvhnmO145S/wai1mR
 +FQQXORJOhdsWDqRRz9q8/uXqPwm173+rHHzMgFa3P1X9u1jfLhjJk0g9sDFtyAb
 If4IzOwfuCLJyelcuzzy9SSOzDsGu1LcrBoRgqTugX+MSJXFjWOKKfA1wxnAKkPf
 T2fQIqny2N7VCfpDB1iaCfxnkizIwrYEI3YRkMuJpYU3489x/BJQIILoLo1yEj4G
 vNhq+qJ9V/Uj8X+X5/cL
 =A5DU
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'quintela/tags/migration/20170517' into staging

migration/next for 20170517

# gpg: Signature made Wed 17 May 2017 11:46:36 AM BST
# gpg:                using RSA key 0xF487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>"
# gpg:                 aka "Juan Quintela <quintela@trasno.org>"
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03  4B82 F487 EF18 5872 D723

* quintela/tags/migration/20170517:
  migration: Move check_migratable() into qdev.c
  migration: Move postcopy stuff to postcopy-ram.c
  migration: Move page_cache.c to migration/
  migration: Create migration/blocker.h
  ram: Rename RAM_SAVE_FLAG_COMPRESS to RAM_SAVE_FLAG_ZERO
  migration: Pass Error ** argument to {save,load}_vmstate
  migration: Fix regression with compression threads

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-18 10:05:52 +01:00
Greg Kurz
66453cff9e virtio: allow broken device to notify guest
According to section 2.1.2 of the virtio-1 specification:

"The device SHOULD set DEVICE_NEEDS_RESET when it enters an error state that
a reset is needed. If DRIVER_OK is set, after it sets DEVICE_NEEDS_RESET,
the device MUST send a device configuration change notification to the
driver."

Commit "f5ed36635d8f virtio: stop virtqueue processing if device is broken"
introduced a virtio_error() call that just does that:

- internally mark the device as broken
- set the DEVICE_NEEDS_RESET bit in the status
- send a configuration change notification

Unfortunately, virtio_notify_vector(), called by virtio_notify_config(),
returns right away when the device is marked as broken and the notification
isn't sent in this case.

The spec doesn't say whether a broken device can send notifications
in other situations or not. But since the driver isn't supposed to do
anything but to reset the device, it makes sense to keep the check in
virtio_notify_config().

Marking the device as broken AFTER the configuration change notification was
sent is enough to fix the issue.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-18 00:35:15 +03:00
Juan Quintela
795c40b8bd migration: Create migration/blocker.h
This allows us to remove lots of includes of migration/migration.h

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-05-17 12:04:59 +02:00
Zhiyong Yang
60cd11024f hw/virtio: fix vhost user fails to startup when MQ
Qemu2.7~2.9 and vhost user for dpdk 17.02 release work together
to cause failures of new connection when negotiating to set MQ.
(one queue pair works well).
   Because there exist some bugs in qemu code when introducing
VHOST_USER_PROTOCOL_F_REPLY_ACK to qemu. When vhost_user_set_mem_table
is invoked to deal with the vhost message VHOST_USER_SET_MEM_TABLE
for the second time, qemu indeed doesn't send the messge (The message
needs to be sent only once)but still will be waiting for dpdk's reply
ack, then, qemu is always freezing, while DPDK is always waiting for
next vhost message from qemu.
  The patch aims to fix the bug, MQ can work well.
  The same bug is found in function vhost_user_net_set_mtu, it is fixed
at the same time.
  DPDK related patch is as following:
  http://www.dpdk.org/dev/patchwork/patch/23955/

Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Cc: qemu-stable@nongnu.org
Fixes: ca525ce561 ("vhost-user: Introduce a new protocol feature REPLY_ACK.")
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Jens Freimann <jfreiman@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-05-10 22:04:23 +03:00
Peter Maydell
32c7e0ab75 migration/next for 20170421
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJY+d69AAoJEPSH7xhYctcj/4oQAIFFEyWaqrL9ve5ySiJgdtcY
 zYtiIhZQ+nPuy2i1oDSX+vbMcmkJDDyfO5qLovxyHGkZHniR8HtxNHP+MkZQa07p
 DiSIvd51HvcixIouhbGcoUCU63AYxqNL3o5/TyNpUI72nvsgwl3yfOot7PtutE/F
 r384j8DrOJ9VwC5GGPg27mJvRPvyfDQWfxDCyMYVw153HTuwVYtgiu/layWojJDV
 D2L1KV45ezBuGckZTHt9y6K4J5qz8qHb/dJc+whBBjj4j9T9XOILU9NPDAEuvjFZ
 gHbrUyxj7EiApjHcDZoQm9Raez422ALU30yc9Kn7ik7vSqTxk2Ejq6Gz7y9MJrDn
 KdMj75OETJNjBL+0T9MmbtWts28+aalpTUXtBpmi3eWQV5Hcox2NF1RP42jtD9Pa
 lkrM6jv0nsdNfBPlQ+ZmBTJxysWECcMqy487nrzmPNC8vZfokjXL5be12puho9fh
 ziU4gx9C6/k82S+/H6WD/AUtRiXJM7j4oTU2mnjrsSXQC1JNWqODBOFUo9zsDufl
 vtcrxfPhSD1DwOInFSIBHf/RylcgTkPCL0rPoJ8npNDly6rHFYJ+oIbsn84Z4uYY
 RWvH8xB9wgRlK9L1WdRgOd2q7PaeHQoSSdPOiS9YVEVMVvSW8Es5CRlhcAsw/M/T
 1Tl65cNrjETAuZKL3dLH
 =EsZ5
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/juanquintela/tags/migration/20170421' into staging

migration/next for 20170421

# gpg: Signature made Fri 21 Apr 2017 11:28:13 BST
# gpg:                using RSA key 0xF487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>"
# gpg:                 aka "Juan Quintela <quintela@trasno.org>"
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03  4B82 F487 EF18 5872 D723

* remotes/juanquintela/tags/migration/20170421: (65 commits)
  hmp: info migrate_parameters format tunes
  hmp: info migrate_capability format tunes
  migration: rename max_size to threshold_size
  migration: set current_active_state once
  virtio-rng: stop virtqueue while the CPU is stopped
  migration: don't close a file descriptor while it can be in use
  ram: Remove migration_bitmap_extend()
  migration: Disable hotplug/unplug during migration
  qdev: Move qdev_unplug() to qdev-monitor.c
  qdev: Export qdev_hot_removed
  qdev: qdev_hotplug is really a bool
  migration: Remove MigrationState parameter from migration_is_idle()
  ram: Use RAMBitmap type for coherence
  ram: rename last_ram_offset() last_ram_pages()
  ram: Use ramblock and page offset instead of absolute offset
  ram: Change offset field in PageSearchStatus to page
  ram: Remember last_page instead of last_offset
  ram: Use page number instead of an address for the bitmap operations
  ram: reorganize last_sent_block
  ram: ram_discard_range() don't use the mis parameter
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-21 15:59:27 +01:00
Laurent Vivier
a23a6d1839 virtio-rng: stop virtqueue while the CPU is stopped
If we modify the virtio-rng virqueue while the
vmstate is already migrated we can have some
inconsistencies between the virtqueue state and
the memory content.

To avoid this, stop the virtqueue while the CPU
is stopped.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by:  Amit Shah <amit@kernel.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-04-21 12:25:40 +02:00
Peter Xu
698feb5e13 memory: add section range info for IOMMU notifier
In this patch, IOMMUNotifier.{start|end} are introduced to store section
information for a specific notifier. When notification occurs, we not
only check the notification type (MAP|UNMAP), but also check whether the
notified iova range overlaps with the range of specific IOMMU notifier,
and skip those notifiers if not in the listened range.

When removing an region, we need to make sure we removed the correct
VFIOGuestIOMMU by checking the IOMMUNotifier.start address as well.

This patch is solving the problem that vfio-pci devices receive
duplicated UNMAP notification on x86 platform when vIOMMU is there. The
issue is that x86 IOMMU has a (0, 2^64-1) IOMMU region, which is
splitted by the (0xfee00000, 0xfeefffff) IRQ region. AFAIK
this (splitted IOMMU region) is only happening on x86.

This patch also helps vhost to leverage the new interface as well, so
that vhost won't get duplicated cache flushes. In that sense, it's an
slight performance improvement.

Suggested-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1491562755-23867-2-git-send-email-peterx@redhat.com>
[ehabkost: included extra vhost_iommu_region_del() change from Peter Xu]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-20 15:22:41 -03:00
Jason Wang
375f74f473 vhost: generalize iommu memory region
We assumes the iommu_ops were attached to the root region of address
space. This may not be true for all kinds of IOMMU implementation and
especially after commit 3716d5902d ("pci: introduce a bus master
container"). So fix this by not assuming as->root has iommu_ops,
instead depending on the regions reported by memory listener through:

- register a memory listener to dma_as
- during region_add, if it's a region of IOMMU, register a specific
  IOMMU notifier, and store all notifiers in a list.
- during region_del, compare and delete the IOMMU notifier from the list

This is also a must for making vhost device IOTLB works for all types
of IOMMUs. Note, since we register one notifier during each
.region_add, the IOTLB may be flushed more than one times, this is
suboptimal and could be optimized in the future.

Reported-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Fixes: 3716d5902d ("pci: introduce a bus master container")
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Tested-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2017-03-30 19:09:16 +03:00
Paolo Bonzini
e49a661840 virtio: always use handle_aio_output if registered
Commit ad07cd6 ("virtio-scsi: always use dataplane path if ioeventfd is
active", 2016-10-30) and 9ffe337 ("virtio-blk: always use dataplane
path if ioeventfd is active", 2016-10-30) broke the virtio 1.0
indirect access registers.

The indirect access registers bypass the ioeventfd, so that virtio-blk
and virtio-scsi now repeatedly try to initialize dataplane instead of
triggering the guest->host EventNotifier.  Detect the situation by
checking vq->handle_aio_output; if it is not NULL, trigger the
EventNotifier, which is how the device expects to get notifications
and in fact the only thread-safe manner to deliver them.

Fixes: ad07cd6
Fixes: 9ffe337
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-03-22 17:56:00 +02:00
Fam Zheng
a77690c41d virtio: Fix error handling in virtio_bus_device_plugged
For one thing we shouldn't continue if an error happened, for the other
two steps failing can cause an abort() in error_setg because we reuse
the same errp blindly.

Add error handling checks to fix both issues.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-03-22 17:54:32 +02:00
Marcel Apfelbaum
27ce0f3afc hw/virtio: fix Power Management Control Register for PCI Express virtio devices
Make Power Management State flag writable to conform
with the PCI Express spec.

Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-16 01:46:41 +02:00
Marcel Apfelbaum
d584f1b9ca hw/virtio: fix Link Control Register for PCI Express virtio devices
Make several Link Control Register flags writable to conform
with the PCI Express spec.

Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-16 01:46:41 +02:00
Marcel Apfelbaum
c2cabb3422 hw/virtio: fix error enabling flags in Device Control register
When the virtio devices are PCI Express, make error-enabling flags
writable to respect the PCIe spec.

Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-16 01:46:40 +02:00
Jason Wang
60a8d80234 virtio-pci: reset modern vq meta data
We don't reset proxy->vqs[].{num|desc[]|avail[]|used[]}. This means if
a driver enable the vq without setting vq address after reset. The old
addresses were leaked. Fixing this by resetting modern vq meta data
during device reset.

Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-15 19:59:18 +02:00
Jason Wang
f0edf23978 Revert "virtio: unbreak virtio-pci with IOMMU after caching ring translations"
This reverts commit
96a8821d21. Previous patch is a better
solution which does not require a strict order between virtio and IOMMU.

CC: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-03-15 19:59:00 +02:00
Jason Wang
e45da65322 virtio: validate address space cache during init
We don't check the return value of address_space_cache_init(), this
may lead buggy driver use incorrect region caches. Instead of
triggering an assert, catch and warn this early in
virtio_init_region_cache().

Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-15 19:37:19 +02:00
Jason Wang
e0e2d64409 virtio: destroy region cache during reset
We don't destroy region cache during reset which can make the maps
of previous driver leaked to a buggy or malicious driver that don't
set vring address before starting to use the device. Fix this by
destroy the region cache during reset and validate it before trying to
see them.

Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-15 19:37:19 +02:00
Jason Wang
168e4af3c1 virtio: guard against NULL pfn
To avoid access stale memory region cache after reset, this patch
check the existence of virtqueue pfn for all exported virtqueue access
helpers before trying to use them.

Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-15 19:37:19 +02:00
Jason Wang
96a8821d21 virtio: unbreak virtio-pci with IOMMU after caching ring translations
Commit c611c76417 ("virtio: add MemoryListener to cache ring
translations") registers a memory listener to dma_as. This may not
work when IOMMU is enabled: dma_as(bus_master_as) were initialized in
pcibus_machine_done() after virtio_realize(). This will cause a
segfault. Fixing this by using pci_device_iommu_address_space()
instead to make sure address space were initialized at this time.

With this fix, IOMMU device were required to be initialized before any
virtio-pci devices.

Fixes: c611c76417 ("virtio: add MemoryListener to cache ring translations")
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-02 07:14:28 +02:00
Stefan Hajnoczi
874adf45db virtio: add missing region cache init in virtio_load()
Commit 97cd965c07 ("virtio: use
VRingMemoryRegionCaches for avail and used rings") switched to a memory
region cache to avoid repeated map/unmap operations.

The virtio_load() process is a little tricky because vring addresses are
serialized in two separate places.  VIRTIO 1.0 devices serialize desc
and then a subsection with used and avail.  Legacy devices only
serialize desc.

Live migration of VIRTIO 1.0 devices fails on the destination host with:

  VQ 0 size 0x80 < last_avail_idx 0x12f8 - used_idx 0x0
  Failed to load virtio-blk:virtio
  error while loading state for instance 0x0 of device '0000:00:04.0/virtio-blk'

This happens because the memory region cache is only initialized after
desc is loaded and not after the used and avail subsection is loaded.
If the guest chose memory addresses that don't match the legacy ring
layout then the wrong guest memory location is accessed.

Wait until all ring addresses are known before trying to initialize the
region cache.  Also clarify the incomplete comment about VIRTIO-1 ring
address subsection.

Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
2017-03-02 07:14:28 +02:00
Stefan Hajnoczi
3cdf847329 virtio: invalidate memory in vring_set_avail_event()
Remember to invalidate the avail event field so the memory pages are
marked dirty.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
2017-03-02 07:14:27 +02:00
Cornelia Huck
34c6bf22a8 virtio: guard vring access when setting notification
Switching to vring caches exposed an existing bug in
virtio_queue_set_notification(): We can't access vring structures
if they have not been set up yet. This may happen, for example,
for virtio-blk devices with multiple queues: The code will try to
switch notifiers for every queue, but the guest may have only set up
a subset of them.

Fix this by guarding access to the vring memory by checking for
vring.desc. The first aio poll will iron out any remaining
inconsistencies for later-configured queues (buggy legacy drivers).

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-02 07:14:27 +02:00
Paolo Bonzini
dd3dd4ba7b virtio: check for vring setup in virtio_queue_empty
If the vring has not been set up, there is nothing in the virtqueue.
virtio_queue_host_notifier_aio_poll calls virtio_queue_empty even in
this case; we have to filter it out just like virtio_queue_notify_aio_vq.

Reported-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Laszlo Ersek <lersek@redhat.com>
Tested-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-03-02 07:14:27 +02:00
Michael S. Tsirkin
b4b9862b53 virtio: Fix no interrupt when not creating msi controller
For ARM virt machine, if we use virt-2.7 which will not create ITS node,
the virtio-net can not recieve interrupts so it can't get ip address
through dhcp.
This fixes commit 83d768b(virtio: set ISR on dataplane notifications).

Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-02-17 21:52:30 +02:00
Paolo Bonzini
97cd965c07 virtio: use VRingMemoryRegionCaches for avail and used rings
The virtio-net change is necessary because it uses virtqueue_fill
and virtqueue_flush instead of the more convenient virtqueue_push.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-02-17 21:52:30 +02:00
Paolo Bonzini
ca0176ad83 virtio: check for vring setup in virtio_queue_update_used_idx
If the vring has not been set up, it is not necessary for vring_used_idx
to do anything (as is already the case when the caller is virtio_load).
This is harmless for now, but it will be a problem when the
MemoryRegionCache has not been set up.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-02-17 21:52:30 +02:00
Paolo Bonzini
991976f751 virtio: use VRingMemoryRegionCaches for descriptor ring
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-02-17 21:52:30 +02:00
Paolo Bonzini
c611c76417 virtio: add MemoryListener to cache ring translations
The cached translations are RCU-protected to allow efficient use
when processing virtqueues.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-02-17 21:52:30 +02:00
Paolo Bonzini
5eba0404b9 virtio: use MemoryRegionCache to access descriptors
For now, the cache is created on every virtqueue_pop.  Later on,
direct descriptors will be able to reuse it.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-02-17 21:52:30 +02:00
Paolo Bonzini
9796d0ac8f virtio: use address_space_map/unmap to access descriptors
This makes little difference, but it makes the code change smaller
for the next patch that introduces MemoryRegionCache.  This is
because map/unmap are similar to MemoryRegionCache init/destroy.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-02-17 21:52:30 +02:00
Fam Zheng
0793169870 virtio: Report real progress in VQ aio poll handler
In virtio_queue_host_notifier_aio_poll, not all "!virtio_queue_empty()"
cases are making true progress.

Currently the offending one is virtio-scsi event queue, whose handler
does nothing if no event is pending. As a result aio_poll() will spin on
the "non-empty" VQ and take 100% host CPU.

Fix this by reporting actual progress from virtio queue aio handlers.

Reported-by: Ed Swierk <eswierk@skyportsystems.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Tested-by: Ed Swierk <eswierk@skyportsystems.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-02-17 21:52:30 +02:00
Michael S. Tsirkin
d56ec1e98c vhost: skip ROM sections
vhost does not support RO protections on memory at the moment - adding
ROMs would mean that e.g. a buggy guest might change them in-memory - a
condition from which guest reset does not recover. Not nice.

We also definitely don't want to try logging writes into ROMs -
in particular guests set very high addresses for ROM BARs
so logging these writes would waste a lot of memory.

Maybe ROMs could be supported with the iotlb variant -
not sure, but there seems to be no good reason for virtio
to try to do DMA from ROM. So let's just skip ROM memory.

Suggested-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Tested-by: Laurent Vivier <lvivier@redhat.com>
2017-02-01 03:37:18 +02:00
Paolo Bonzini
c25d97c4ff virtio: make virtio_should_notify static
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-01 03:37:18 +02:00
Cao jin
ee640c625e pci: Convert msix_init() to Error and fix callers
msix_init() reports errors with error_report(), which is wrong when
it's used in realize().  The same issue was fixed for msi_init() in
commit 1108b2f. In order to make the API change as small as possible,
leave the return value check to later patch.

For some devices(like e1000e, vmxnet3, nvme) who won't fail because of
msix_init's failure, suppress the error report by passing NULL error
object.

Bonus: add comment for msix_init.

CC: Jiri Pirko <jiri@resnulli.us>
CC: Gerd Hoffmann <kraxel@redhat.com>
CC: Dmitry Fleytman <dmitry@daynix.com>
CC: Jason Wang <jasowang@redhat.com>
CC: Michael S. Tsirkin <mst@redhat.com>
CC: Hannes Reinecke <hare@suse.de>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Alex Williamson <alex.williamson@redhat.com>
CC: Markus Armbruster <armbru@redhat.com>
CC: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-02-01 03:37:18 +02:00