Commit Graph

1109 Commits

Author SHA1 Message Date
Arseny Krasnov
1e08fd0a46 vhost-vsock: SOCK_SEQPACKET feature bit support
This adds processing of VIRTIO_VSOCK_F_SEQPACKET features bit. Guest
negotiates it with vhost, thus both will know that SOCK_SEQPACKET
supported by peer.

Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
Message-Id: <20210622144747.2949134-1-arseny.krasnov@kaspersky.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-07-16 11:10:45 -04:00
Viresh Kumar
538bb6f121 hw/virtio: add vhost-user-i2c-pci boilerplate
This allows is to instantiate a vhost-user-i2c device as part of a PCI
bus. It is mostly boilerplate which looks pretty similar to the
vhost-user-fs-pci device.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Message-Id: <8a083eaa57d93feaab12acd1f94b225879212f20.1625806763.git.viresh.kumar@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-07-16 11:10:45 -04:00
Viresh Kumar
7221d3b634 hw/virtio: add boilerplate for vhost-user-i2c device
This creates the QEMU side of the vhost-user-i2c device which connects
to the remote daemon. It is based of vhost-user-fs code.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Message-Id: <e80591b52fea4b51631818bb92a798a3daf90399.1625806763.git.viresh.kumar@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-07-16 11:10:40 -04:00
Peter Maydell
86108e23d7 Trivial patches pull request 20210709
-----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEzS913cjjpNwuT1Fz8ww4vT8vvjwFAmDosQwSHGxhdXJlbnRA
 dml2aWVyLmV1AAoJEPMMOL0/L748Jy0QAKj5svKv2Ad5ncvAwYBSRKQxnB4jg0MR
 /jZfshyw6cLL5aHV4HkCmsAVfPEo+f9CmUEXJBkXiJJsvthXAQw/xkskQe3iaRuF
 hQggYDjA94XabkWY35ie0OHVdwNpDzdKPt+gFft4UpstlNJkTb+jUoDi2+bjTFbS
 NsuDi5W4MYwt4r7Zauf1q4298bjNx2652aPXftRhCbbInHv0Nx++tfpDMrO3BTbH
 aINykWi552MjNFP/fzZQcJvKlOx8k/YTDgypIJl5fl4qZeh4HWOYX+92UuNdZDaP
 JFslAu7mF7E5iRTqJCEv3doXwG3HA5llhtYw5gaDGXj6i6GFrswNFtJE0qB/ZB8o
 EARt62u/+Z2Z8GYj2WmSbDXnMwQMaf3GbvTYlNaFV6HLmtIg3eR+DySFoBCj1SP4
 ZgYa3phH/xpE5fPPcnZ6Ae4OzzrEOQaK2PgBhT6wCuY6ZAbY1SrRbXvCuDA+BLyr
 i6hycblGT3LF3YfT5cw5ek+jqliOUcXivzPjomwJVpFdrXKa+iC4JHxQGSn+Wayw
 mXHx7JmQ4oxiizUNOxAEUo4FlZlerN5DyBmY/YuY7IpjuL6DmwchEDcirO72BiKL
 C5npCVx37WJvJ8EM1REo75kkWTzKgMdUMjRjGzBf/MqPTNMe/fXMHF2OeRyOlbcX
 +x5tnSU45fTJ
 =6fZ6
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/vivier2/tags/trivial-branch-for-6.1-pull-request' into staging

Trivial patches pull request 20210709

# gpg: Signature made Fri 09 Jul 2021 21:26:52 BST
# gpg:                using RSA key CD2F75DDC8E3A4DC2E4F5173F30C38BD3F2FBE3C
# gpg:                issuer "laurent@vivier.eu"
# gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>" [full]
# gpg:                 aka "Laurent Vivier <laurent@vivier.eu>" [full]
# gpg:                 aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>" [full]
# Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F  5173 F30C 38BD 3F2F BE3C

* remotes/vivier2/tags/trivial-branch-for-6.1-pull-request:
  util/guest-random: Fix size arg to tail memcpy
  migration: fix typo in mig_throttle_guest_down comment
  target/xtensa/xtensa-semi: Fix compilation problem on Haiku
  hw/virtio: Document *_should_notify() are called within rcu_read_lock()
  misc: Remove redundant new line in perror()
  virtiofsd: Add missing newline in error message
  misc: Fix "havn't" typo
  memory: Display MemoryRegion name in read/write ops trace events
  qemu-option: Drop dead assertion

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-07-11 18:49:25 +01:00
Peter Maydell
42e1d798a6 Block layer patches
- Make blockdev-reopen stable
 - Remove deprecated qemu-img backing file without format
 - rbd: Convert to coroutines and add write zeroes support
 - rbd: Updated MAINTAINERS
 - export/fuse: Allow other users access to the export
 - vhost-user: Fix backends without multiqueue support
 - Fix drive-backup transaction endless drained section
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAmDoRdIRHGt3b2xmQHJl
 ZGhhdC5jb20ACgkQfwmycsiPL9bvgQ/+Ogq24n1UOQc8FEKRYfyhajNToQ9ofzWN
 iLiblSGx2QDq+CauD3qdu6z7DLlqEXeoM4NYM462oIPumptQj+9XZt7ftfh6FLWW
 4yJEbjfnVKOba+vFdJ+E0DStwnPaxYdnrPGd53cwHZfbZh4ZmkpTM350mzHHiLTb
 KYKOgWd+UHZbkYeCVNYTGe30SRBiKeAecTpsVZ5HVhe7LstjByuy5stk8dytLpdV
 YqdKOToZfOp77XiHr8YcLLp1HHBGlr5hw73V4SDas0beCp7hqtnAqsTYyXBue4xO
 4zfD4Gujr5JVOCb0crDTyOmOQY5E+y2dqFoOUF00D5AoN2vj4nfQ9ESkbqlE9BVh
 mgJ1izSokYlN2X8rIwGXNR5fbxRmxxfkAA4rScNRytj1KxDHyrDxrp/k8YFemxSQ
 qwgb/FBm0fcr69evPRzovKwZFhcyPremksluHQE4rZZ66qBQ2cGuDJPE7PWVTpPH
 67JCrIVK/O6n5p+4ilFHmQQ3aP3ol0frMFcboYVRchJ2MhIDTsfFL3F/tTK8hy86
 AmrrdQ1BQIAoKNOKnAmOSOUdExM55OcfPmX69+AhEk2GeWP6kgz5Pks4H3qCiKGf
 YoRk8F1V+N4q+C0mFFovB61bNQ6COIlBuzmD9EtmpDD/Ta3Wib+3ZnoGVIdPS+OI
 jyj+qJxd9z4=
 =kH+r
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Block layer patches

- Make blockdev-reopen stable
- Remove deprecated qemu-img backing file without format
- rbd: Convert to coroutines and add write zeroes support
- rbd: Updated MAINTAINERS
- export/fuse: Allow other users access to the export
- vhost-user: Fix backends without multiqueue support
- Fix drive-backup transaction endless drained section

# gpg: Signature made Fri 09 Jul 2021 13:49:22 BST
# gpg:                using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6
# gpg:                issuer "kwolf@redhat.com"
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* remotes/kevin/tags/for-upstream: (28 commits)
  block: Make blockdev-reopen stable API
  iotests: Test reopening multiple devices at the same time
  block: Support multiple reopening with x-blockdev-reopen
  block: Acquire AioContexts during bdrv_reopen_multiple()
  block: Add bdrv_reopen_queue_free()
  qcow2: Fix dangling pointer after reopen for 'file'
  qemu-img: Improve error for rebase without backing format
  qemu-img: Require -F with -b backing image
  qcow2: Prohibit backing file changes in 'qemu-img amend'
  blockdev: fix drive-backup transaction endless drained section
  vhost-user: Fix backends without multiqueue support
  MAINTAINERS: add block/rbd.c reviewer
  block/rbd: fix type of task->complete
  iotests/fuse-allow-other: Test allow-other
  iotests/308: Test +w on read-only FUSE exports
  export/fuse: Let permissions be adjustable
  export/fuse: Give SET_ATTR_SIZE its own branch
  export/fuse: Add allow-other option
  export/fuse: Pass default_permissions for mount
  util/uri: do not check argument of uri_free()
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-07-10 19:55:21 +01:00
Peter Maydell
ebd1f71002 Machine queue, 2021-07-07
Deprecation:
 * Deprecate pmem=on with non-DAX capable backend file
   (Igor Mammedov)
 
 Feature:
 * virtio-mem: vfio support (David Hildenbrand)
 
 Cleanup:
 * vmbus: Don't make QOM property registration conditional
   (Eduardo Habkost)
 -----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEEWjIv1avE09usz9GqKAeTb5hNxaYFAmDnWBgUHGVoYWJrb3N0
 QHJlZGhhdC5jb20ACgkQKAeTb5hNxaaH4A//SNp6wZ5VtK59KtnrVgGSeeqJFHC8
 NyRDy9vQgYJHwBdvQ77IahAovjV2vSD7m1bl2R4ZGUkUNl0t34oq/BvW9olZHJ6R
 Z52L92kZ8KjfWq99DbMvz3n7maR4mvTLmDcksi459V2+nf+pn9iI9Bux+dRVy6as
 u85yK7rVmkwNKakYSBHsFQBzImkf7ufWvRVe200c5rm46z3t9jVxI7p/q49J8bgi
 OurdkxXHIOAjkVbiWjxIW9pL+uf81+UUPrn6v3Pw62su47Ra5edtHopdTVJb35rh
 YTLnJnnFqXTFn+s9RZWdR8okJZdWU3PA2opeT+pFXqPP11etL59l/j1zCWuVxYCa
 afbEaZiFLTS7vhy8aXpCVi2jI+3OBDvK2+UyS4zcUxs5T25eqTUqUKHhU0zNwK0s
 srBaRbl7Clj1keV6SYRSCh79NxjMskLE9bb36fY3XaUTVoQQ5+SvvErzsZmr4U4p
 /zGf+ilQhLCOgkDxpO0NEAtWV2UlQPhdFJDTMQHACC9GCQvU0meJhwi0UuAZ2QXj
 Yoo+yhcBnOfpbqKaX+Qoc7fKruRNNM7be130ESC3AqeC2NEPXenonnkBFbCYChvB
 elMYABjsKfYwf56n4pa9PKSidDS1ld0XImcqobobqpZ4Fd6rzyPocvz1Q63zPYkd
 presZ5ePekGcW+M=
 =NEaj
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/ehabkost-gl/tags/machine-next-pull-request' into staging

Machine queue, 2021-07-07

Deprecation:
* Deprecate pmem=on with non-DAX capable backend file
  (Igor Mammedov)

Feature:
* virtio-mem: vfio support (David Hildenbrand)

Cleanup:
* vmbus: Don't make QOM property registration conditional
  (Eduardo Habkost)

# gpg: Signature made Thu 08 Jul 2021 20:55:04 BST
# gpg:                using RSA key 5A322FD5ABC4D3DBACCFD1AA2807936F984DC5A6
# gpg:                issuer "ehabkost@redhat.com"
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>" [full]
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF  D1AA 2807 936F 984D C5A6

* remotes/ehabkost-gl/tags/machine-next-pull-request:
  vfio: Disable only uncoordinated discards for VFIO_TYPE1 iommus
  virtio-mem: Require only coordinated discards
  softmmu/physmem: Extend ram_block_discard_(require|disable) by two discard types
  softmmu/physmem: Don't use atomic operations in ram_block_discard_(disable|require)
  vfio: Support for RamDiscardManager in the vIOMMU case
  vfio: Sanity check maximum number of DMA mappings with RamDiscardManager
  vfio: Query and store the maximum number of possible DMA mappings
  vfio: Support for RamDiscardManager in the !vIOMMU case
  virtio-mem: Implement RamDiscardManager interface
  virtio-mem: Don't report errors when ram_block_discard_range() fails
  virtio-mem: Factor out traversing unplugged ranges
  memory: Helpers to copy/free a MemoryRegionSection
  memory: Introduce RamDiscardManager for RAM memory regions
  Deprecate pmem=on with non-DAX capable backend file
  vmbus: Don't make QOM property registration conditional

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-07-09 17:58:38 +01:00
Philippe Mathieu-Daudé
4c6dd9a026 hw/virtio: Document *_should_notify() are called within rcu_read_lock()
Such comments make reviewing this file somehow easier.

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210523094040.3516968-1-philmd@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2021-07-09 18:42:46 +02:00
Kevin Wolf
84affad1fd vhost-user: Fix backends without multiqueue support
dev->max_queues was never initialised for backends that don't support
VHOST_USER_PROTOCOL_F_MQ, so it would use 0 as the maximum number of
queues to check against and consequently fail for any such backend.

Set it to 1 if the backend doesn't have multiqueue support.

Fixes: c90bd505a3
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210705171429.29286-1-kwolf@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-07-09 12:26:05 +02:00
David Hildenbrand
bc072ed403 virtio-mem: Require only coordinated discards
We implement the RamDiscardManager interface and only require coordinated
discarding of RAM to work.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta@cloud.ionos.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Auger Eric <eric.auger@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: teawater <teawaterz@linux.alibaba.com>
Cc: Marek Kedzierski <mkedzier@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210413095531.25603-13-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2021-07-08 15:54:45 -04:00
David Hildenbrand
0fd7616e0f vfio: Support for RamDiscardManager in the vIOMMU case
vIOMMU support works already with RamDiscardManager as long as guests only
map populated memory. Both, populated and discarded memory is mapped
into &address_space_memory, where vfio_get_xlat_addr() will find that
memory, to create the vfio mapping.

Sane guests will never map discarded memory (e.g., unplugged memory
blocks in virtio-mem) into an IOMMU - or keep it mapped into an IOMMU while
memory is getting discarded. However, there are two cases where a malicious
guests could trigger pinning of more memory than intended.

One case is easy to handle: the guest trying to map discarded memory
into an IOMMU.

The other case is harder to handle: the guest keeping memory mapped in
the IOMMU while it is getting discarded. We would have to walk over all
mappings when discarding memory and identify if any mapping would be a
violation. Let's keep it simple for now and print a warning, indicating
that setting RLIMIT_MEMLOCK can mitigate such attacks.

We have to take care of incoming migration: at the point the
IOMMUs get restored and start creating mappings in vfio, RamDiscardManager
implementations might not be back up and running yet: let's add runstate
priorities to enforce the order when restoring.

Acked-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Auger Eric <eric.auger@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: teawater <teawaterz@linux.alibaba.com>
Cc: Marek Kedzierski <mkedzier@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210413095531.25603-10-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2021-07-08 15:54:45 -04:00
David Hildenbrand
2044969f0b virtio-mem: Implement RamDiscardManager interface
Let's properly notify when (un)plugging blocks, after discarding memory
and before allowing the guest to consume memory. Handle errors from
notifiers gracefully (e.g., no remaining VFIO mappings) when plugging,
rolling back the change and telling the guest that the VM is busy.

One special case to take care of is replaying all notifications after
restoring the vmstate. The device starts out with all memory discarded,
so after loading the vmstate, we have to notify about all plugged
blocks.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Auger Eric <eric.auger@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: teawater <teawaterz@linux.alibaba.com>
Cc: Marek Kedzierski <mkedzier@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210413095531.25603-6-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2021-07-08 15:54:45 -04:00
David Hildenbrand
3aca6380fd virtio-mem: Don't report errors when ram_block_discard_range() fails
Any errors are unexpected and ram_block_discard_range() already properly
prints errors. Let's stop manually reporting errors.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Auger Eric <eric.auger@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: teawater <teawaterz@linux.alibaba.com>
Cc: Marek Kedzierski <mkedzier@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210413095531.25603-5-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2021-07-08 15:54:45 -04:00
David Hildenbrand
7a9d5d0282 virtio-mem: Factor out traversing unplugged ranges
Let's factor out the core logic, no need to replicate.

Reviewed-by: Pankaj Gupta <pankaj.gupta@cloud.ionos.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Auger Eric <eric.auger@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: teawater <teawaterz@linux.alibaba.com>
Cc: Marek Kedzierski <mkedzier@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210413095531.25603-4-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2021-07-08 15:54:45 -04:00
Andrew Melnychenko
df07a8f8cb virtio-pci: Changed return values for "notify", "device" and "isr" read.
At some point, after unplugging virtio-pci the virtio device may be unrealised,
but the memory regions may be present in flatview. So, it's a possible situation
when memory region's callbacks are called for "unplugged" device.

Previous two patches made sure this case does not cause QEMU to crash.
This patch adds check for "notify" memory region. Now reads will return "-1" if a virtio
device is not present on a virtio bus.

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1938042
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1743098

Signed-off-by: Andrew Melnychenko <andrew@daynix.com>
Message-Id: <20210609095843.141378-4-andrew@daynix.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-07-03 01:39:33 -04:00
Andrew Melnychenko
bf697371db virtio-pci: Added check for virtio device in PCI config cbs.
Now, if virtio device is not present on virtio-bus - pci config callbacks
will not lead to possible crush. The read will return "-1" which should be
interpreted by a driver that pci device may be unplugged.

Signed-off-by: Andrew Melnychenko <andrew@daynix.com>
Message-Id: <20210609095843.141378-3-andrew@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-07-03 01:39:33 -04:00
Andrew Melnychenko
80ebfd69b9 virtio-pci: Added check for virtio device presence in mm callbacks.
During unplug the virtio device is unplugged from virtio-bus on pci. In some cases,
requests to virtio-pci mm may acquire during/after unplug. Added check that virtio
device is on the bus, for "common" memory region.

Signed-off-by: Andrew Melnychenko <andrew@daynix.com>
Message-Id: <20210609095843.141378-2-andrew@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-07-03 01:39:33 -04:00
Greg Kurz
9cf4fd872d virtio: Clarify MR transaction optimization
The device model batching its ioeventfds in a single MR transaction is
an optimization. Clarify this in virtio-scsi, virtio-blk and generic
virtio code. Also clarify that the transaction must commit before
closing ioeventfds so that no one is tempted to merge the loops
in the start functions error path and in the stop functions.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <162125799728.1394228.339855768563326832.stgit@bahia.lan>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-07-02 11:13:39 -04:00
Pavel Dovgalyuk
3909c07945 virtio: disable ioeventfd for record/replay
virtio devices support separate iothreads waiting for
events from file descriptors. These are asynchronous
events that can't be recorded and replayed, therefore
this patch disables ioeventfd for all devices when
record or replay is enabled.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>
Message-Id: <162125678869.1252810.4317416444097392406.stgit@pasha-ThinkPad-X280>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-07-02 10:20:13 -04:00
Kevin Wolf
50de51387f vhost: Distinguish errors in vhost_dev_get_config()
Instead of just returning 0/-1 and letting the caller make up a
meaningless error message, add an Error parameter to allow reporting the
real error and switch to 0/-errno so that different kind of errors can
be distinguished in the caller.

config_len in vhost_user_get_config() is defined by the device, so if
it's larger than VHOST_USER_MAX_CONFIG_SIZE, this is a programming
error. Turn the corresponding check into an assertion.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210609154658.350308-6-kwolf@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-06-30 13:18:42 +02:00
Kevin Wolf
f2a6e6c4fa vhost: Return 0/-errno in vhost_dev_init()
Instead of just returning 0/-1 and letting the caller make up a
meaningless error message, switch to 0/-errno so that different kinds of
errors can be distinguished in the caller.

This involves changing a few more callbacks in VhostOps to return
0/-errno: .vhost_set_owner(), .vhost_get_features() and
.vhost_virtqueue_set_busyloop_timeout(). The implementations of these
functions are trivial as they generally just send a message to the
backend.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210609154658.350308-4-kwolf@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-06-30 13:16:05 +02:00
Kevin Wolf
28770ff935 vhost: Distinguish errors in vhost_backend_init()
Instead of just returning 0/-1 and letting the caller make up a
meaningless error message, add an Error parameter to allow reporting the
real error and switch to 0/-errno so that different kind of errors can
be distinguished in the caller.

Specifically, in vhost-user, EPROTO is used for all errors that relate
to the connection itself, whereas other error codes are used for errors
relating to the content of the connection. This will allow us later to
automatically reconnect when the connection goes away, without ending up
in an endless loop if it's a permanent error in the configuration.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210609154658.350308-3-kwolf@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-06-30 13:16:03 +02:00
Kevin Wolf
a6945f2287 vhost: Add Error parameter to vhost_dev_init()
This allows callers to return better error messages instead of making
one up while the real error ends up on stderr. Most callers can
immediately make use of this because they already have an Error
parameter themselves. The others just keep printing the error with
error_report_err().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210609154658.350308-2-kwolf@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-06-30 13:15:44 +02:00
Jason Wang
c33f23a419 vhost-vdpa: don't initialize backend_features
We used to initialize backend_features during vhost_vdpa_init()
regardless whether or not it was supported by vhost. This will lead
the unsupported features like VIRTIO_F_IN_ORDER to be included and set
to the vhost-vdpa during vhost_dev_start. Because the
VIRTIO_F_IN_ORDER is not supported by vhost-vdpa so it won't be
advertised to guest which will break the datapath.

Fix this by not initializing the backend_features, so the
acked_features could be built only from guest features via
vhost_net_ack_features().

Fixes: 108a64818e ("vhost-vdpa: introduce vhost-vdpa backend")
Cc: qemu-stable@nongnu.org
Cc: Gautam Dawar <gdawar@xilinx.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2021-06-11 10:30:13 +08:00
Jason Wang
d0416d487b vhost-vdpa: map virtqueue notification area if possible
This patch implements the vq notification mapping support for
vhost-vDPA. This is simply done by using mmap()/munmap() for the
vhost-vDPA fd during device start/stop. For the device without
notification mapping support, we fall back to eventfd based
notification gracefully.

Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2021-06-11 10:30:13 +08:00
Jason Wang
d60c75d28f vhost-vdpa: skip ram device from the IOTLB mapping
vDPA is not tie to any specific hardware, for safety and simplicity,
vhost-vDPA doesn't allow MMIO area to be mapped via IOTLB. Only the
doorbell could be mapped via mmap(). So this patch exclude skip the
ram device from the IOTLB mapping.

Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2021-06-11 10:30:13 +08:00
Xie Yongji
df77d45a51 vhost-vdpa: Remove redundant declaration of address_space_memory
The symbol address_space_memory are already declared in
include/exec/address-spaces.h. So let's add this header file
and remove the redundant declaration in include/hw/virtio/vhost-vdpa.h.

Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20210517123246.999-1-xieyongji@bytedance.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2021-06-05 21:33:46 +02:00
Stefano Garzarella
d0fb9657a3 docs: fix references to docs/devel/tracing.rst
Commit e50caf4a5c ("tracing: convert documentation to rST")
converted docs/devel/tracing.txt to docs/devel/tracing.rst.

We still have several references to the old file, so let's fix them
with the following command:

  sed -i s/tracing.txt/tracing.rst/ $(git grep -l docs/devel/tracing.txt)

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210517151702.109066-2-sgarzare@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2021-06-02 06:51:09 +02:00
Philippe Mathieu-Daudé
cdba7e2f49 cpu: Introduce cpu_virtio_is_big_endian()
Introduce the cpu_virtio_is_big_endian() generic helper to avoid
calling CPUClass internal virtio_is_big_endian() one.

Similarly to commit bf7663c4bd ("cpu: introduce
CPUClass::virtio_is_big_endian()"), we keep 'virtio' in the method
name to hint this handler shouldn't be called anywhere but from the
virtio code.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210517105140.1062037-8-f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-05-26 15:33:59 -07:00
Kevin Wolf
c90bd505a3 vhost-user-blk: Check that num-queues is supported by backend
Creating a device with a number of queues that isn't supported by the
backend is pointless, the device won't work properly and the error
messages are rather confusing.

Just fail to create the device if num-queues is higher than what the
backend supports.

Since the relationship between num-queues and the number of virtqueues
depends on the specific device, this is an additional value that needs
to be initialised by the device. For convenience, allow leaving it 0 if
the check should be skipped. This makes sense for vhost-user-net where
separate vhost devices are used for the queues and custom initialisation
code is needed to perform the check.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1935031
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Message-Id: <20210429171316.162022-7-kwolf@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-05-18 12:57:39 +02:00
Kevin Wolf
04ceb61a40 virtio: Fail if iommu_platform is requested, but unsupported
Commit 2943b53f6 (' virtio: force VIRTIO_F_IOMMU_PLATFORM') made sure
that vhost can't just reject VIRTIO_F_IOMMU_PLATFORM when it was
requested. However, just adding it back to the negotiated flags isn't
right either because it promises support to the guest that the device
actually doesn't support. One example of a vhost-user device that
doesn't have support for the flag is the vhost-user-blk export of QEMU.

Instead of successfully creating a device that doesn't work, just fail
to plug the device when it doesn't support the feature, but it was
requested. This results in much clearer error messages.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1935019
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Message-Id: <20210429171316.162022-6-kwolf@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-05-18 12:57:38 +02:00
Peter Maydell
6005ee07c3 pc,pci,virtio: bugfixes, improvements
Fixes all over the place. Faster boot for virtio. ioeventfd support for
 mmio.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmCeiMEPHG1zdEByZWRo
 YXQuY29tAAoJECgfDbjSjVRpqsIH/A49Av5Bv8huL75lf9GzCx3E1a/z2W9Fphik
 OcQ1ahR+7CRDARub+vTG40MBmZBVefIWjLAj3BwBWzFGPX0DZq0zeI102VzlEVKY
 OeUx8ixuiKOSLcS+QxE7ZXIBL2Pn7l+MFUi4nLMYKti7c/kola7zlB57qsmXh+VD
 AOQ7Utj6NWoi6QocWJsMSCyHCh3Fk9QzcStLlr6/MkSJa1zqv8l22+8oWH07Fk2M
 wZfhrm9k094on28iSejsFYL5e4ROeXUajbOdfyMIxWvAB7boC9Jxk/e0oAbuSB4y
 2f71Gfk3mU6irS7PvrxcKbk6BVD2zxM2WumOchZJgxFAujDO6yg=
 =fvkT
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging

pc,pci,virtio: bugfixes, improvements

Fixes all over the place. Faster boot for virtio. ioeventfd support for
mmio.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Fri 14 May 2021 15:27:13 BST
# gpg:                using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg:                issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream:
  Fix build with 64 bits time_t
  vhost-vdpa: Make vhost_vdpa_get_device_id() static
  hw/virtio: enable ioeventfd configuring for mmio
  hw/smbios: support for type 41 (onboard devices extended information)
  checkpatch: Fix use of uninitialized value
  virtio-scsi: Configure all host notifiers in a single MR transaction
  virtio-scsi: Set host notifiers and callbacks separately
  virtio-blk: Configure all host notifiers in a single MR transaction
  virtio-blk: Fix rollback path in virtio_blk_data_plane_start()
  pc-dimm: remove unnecessary get_vmstate_memory_region() method
  amd_iommu: fix wrong MMIO operations
  virtio-net: Constify VirtIOFeature feature_sizes[]
  virtio-blk: Constify VirtIOFeature feature_sizes[]
  hw/virtio: Pass virtio_feature_get_config_size() a const argument
  x86: acpi: use offset instead of pointer when using build_header()
  amd_iommu: Fix pte_override_page_mask()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

# Conflicts:
#	hw/arm/virt.c
2021-05-16 17:22:46 +01:00
Zenghui Yu
c232b8f453 vhost-vdpa: Make vhost_vdpa_get_device_id() static
As it's only used inside hw/virtio/vhost-vdpa.c.

Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Message-Id: <20210413133737.1574-1-yuzenghui@huawei.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-05-14 10:26:18 -04:00
Pavel Dovgalyuk
b8893a3c86 hw/virtio: enable ioeventfd configuring for mmio
This patch adds ioeventfd flag for virtio-mmio configuration.
It allows switching ioeventfd on and off.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>
Message-Id: <161700379211.1135943.8859209566937991305.stgit@pasha-ThinkPad-X280>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-05-14 10:26:18 -04:00
Philippe Mathieu-Daudé
4c21e3534a hw/virtio: Pass virtio_feature_get_config_size() a const argument
The VirtIOFeature structure isn't modified, mark it const.

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210511104157.2880306-2-philmd@redhat.com>
2021-05-14 08:12:09 -04:00
David Hildenbrand
1a37352277 migrate/ram: remove "ram_bulk_stage" and "fpo_enabled"
The bulk stage is kind of weird: migration_bitmap_find_dirty() will
indicate a dirty page, however, ram_save_host_page() will never save it, as
migration_bitmap_clear_dirty() detects that it is not dirty.

We already fill the bitmap in ram_list_init_bitmaps() with ones, marking
everything dirty - it didn't used to be that way, which is why we needed
an explicit first bulk stage.

Let's simplify: make the bitmap the single source of thuth. Explicitly
handle the "xbzrle_enabled after first round" case.

Regarding XBZRLE (implicitly handled via "ram_bulk_stage = false" right
now), there is now a slight change in behavior:
- Colo: When starting, it will be disabled (was implicitly enabled)
  until the first round actually finishes.
- Free page hinting: When starting, XBZRLE will be disabled (was implicitly
  enabled) until the first round actually finished.
- Snapshots: When starting, XBZRLE will be disabled. We essentially only
  do a single run, so I guess it will never actually get disabled.

Postcopy seems to indirectly disable it in ram_save_page(), so there
shouldn't be really any change.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Juan Quintela <quintela@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Andrey Gruzdev <andrey.gruzdev@virtuozzo.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210216105039.40680-1-david@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-05-13 18:21:13 +01:00
Thomas Huth
ee86213aa3 Do not include exec/address-spaces.h if it's not really necessary
Stop including exec/address-spaces.h in files that don't need it.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210416171314.2074665-5-thuth@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2021-05-02 17:24:51 +02:00
Anton Kuchin
ace66791cd vhost-user-fs: fix features handling
Make virtio-fs take into account server capabilities.

Just returning requested features assumes they all of then are implemented
by server and results in setting unsupported configuration if some of them
are absent.

Signed-off-by: Anton Kuchin <antonkuchin@yandex-team.ru>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
  With changes suggested by Stefan
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-04-13 16:13:41 +01:00
Andrey Gruzdev
1a8e44a89f migration: Inhibit virtio-balloon for the duration of background snapshot
The same thing as for incoming postcopy - we cannot deal with concurrent
RAM discards when using background snapshot feature in outgoing migration.

Fixes: 8518278a6a (migration: implementation
  of background snapshot thread)
Signed-off-by: Andrey Gruzdev <andrey.gruzdev@virtuozzo.com>
Reported-by: David Hildenbrand <david@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210401092226.102804-3-andrey.gruzdev@virtuozzo.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-04-06 18:56:01 +01:00
Jason Wang
d83f46d189 virtio-pci: compat page aligned ATS
Commit 4c70875372 ("pci: advertise a page aligned ATS") advertises
the page aligned via ATS capability (RO) to unbrek recent Linux IOMMU
drivers since 5.2. But it forgot the compat the capability which
breaks the migration from old machine type:

(qemu) qemu-kvm: get_pci_config_device: Bad config data: i=0x104 read:
0 device: 20 cmask: ff wmask: 0 w1cmask:0

This patch introduces a new parameter "x-ats-page-aligned" for
virtio-pci device and turns it on for machine type which is newer than
5.1.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: qemu-stable@nongnu.org
Fixes: 4c70875372 ("pci: advertise a page aligned ATS")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20210406040330.11306-1-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-04-06 07:11:36 -04:00
Yuri Benditovich
51e0e42cab virtio-pci: remove explicit initialization of val
The value is assigned later in this procedure.

Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com>
Message-Id: <20210315115937.14286-3-yuri.benditovich@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-04-01 11:39:12 -04:00
Yuri Benditovich
c3fd706165 virtio-pci: add check for vdev in virtio_pci_isr_read
https://bugzilla.redhat.com/show_bug.cgi?id=1743098
This commit completes the solution of segfault in hot unplug flow
(by commit ccec7e9603).
Added missing check for vdev in virtio_pci_isr_read.
Typical stack of crash:
virtio_pci_isr_read ../hw/virtio/virtio-pci.c:1365 with proxy-vdev = 0
memory_region_read_accessor at ../softmmu/memory.c:442
access_with_adjusted_size at ../softmmu/memory.c:552
memory_region_dispatch_read1 at ../softmmu/memory.c:1420
memory_region_dispatch_read  at ../softmmu/memory.c:1449
flatview_read_continue at ../softmmu/physmem.c:2822
flatview_read at ../softmmu/physmem.c:2862
address_space_read_full at ../softmmu/physmem.c:2875

Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com>
Message-Id: <20210315115937.14286-2-yuri.benditovich@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-04-01 11:39:12 -04:00
Wang Liang
d2adda34a9 virtio-pmem: fix virtio_pmem_resp assign problem
ret in virtio_pmem_resp is a uint32_t variable, which should be assigned
using virtio_stl_p.

The kernel side driver does not guarantee virtio_pmem_resp to be initialized
to zero in advance, So sometimes the flush operation will fail.

Signed-off-by: Wang Liang <wangliangzz@inspur.com>
Message-Id: <20210317024145.271212-1-wangliangzz@126.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta@cloud.ionos.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-03-22 10:17:53 -04:00
Greg Kurz
db8a3772e3 vhost-user: Monitor slave channel in vhost_user_read()
Now that everything is in place, have the nested event loop to monitor
the slave channel. The source in the main event loop is destroyed and
recreated to ensure any pending even for the slave channel that was
previously detected is purged. This guarantees that the main loop
wont invoke slave_read() based on an event that was already handled
by the nested loop.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20210312092212.782255-7-groug@kaod.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-03-22 10:17:53 -04:00
Greg Kurz
a7f523c7d1 vhost-user: Introduce nested event loop in vhost_user_read()
A deadlock condition potentially exists if a vhost-user process needs
to request something to QEMU on the slave channel while processing a
vhost-user message.

This doesn't seem to affect any vhost-user implementation so far, but
this is currently biting the upcoming enablement of DAX with virtio-fs.
The issue is being observed when the guest does an emergency reboot while
a mapping still exits in the DAX window, which is very easy to get with
a busy enough workload (e.g. as simulated by blogbench [1]) :

- QEMU sends VHOST_USER_GET_VRING_BASE to virtiofsd.

- In order to complete the request, virtiofsd then asks QEMU to remove
  the mapping on the slave channel.

All these dialogs are synchronous, hence the deadlock.

As pointed out by Stefan Hajnoczi:

When QEMU's vhost-user master implementation sends a vhost-user protocol
message, vhost_user_read() does a "blocking" read during which slave_fd
is not monitored by QEMU.

The natural solution for this issue is an event loop. The main event
loop cannot be nested though since we have no guarantees that its
fd handlers are prepared for re-entrancy.

Introduce a new event loop that only monitors the chardev I/O for now
in vhost_user_read() and push the actual reading to a one-shot handler.
A subsequent patch will teach the loop to monitor and process messages
from the slave channel as well.

[1] https://github.com/jedisct1/Blogbench

Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20210312092212.782255-6-groug@kaod.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-03-22 10:17:53 -04:00
Greg Kurz
57dc02173c vhost-user: Convert slave channel to QIOChannelSocket
The slave channel is implemented with socketpair() : QEMU creates
the pair, passes one of the socket to virtiofsd and monitors the
other one with the main event loop using qemu_set_fd_handler().

In order to fix a potential deadlock between QEMU and a vhost-user
external process (e.g. virtiofsd with DAX), we want to be able to
monitor and service the slave channel while handling vhost-user
requests.

Prepare ground for this by converting the slave channel to be a
QIOChannelSocket. This will make monitoring of the slave channel
as simple as calling qio_channel_add_watch_source(). Since the
connection is already established between the two sockets, only
incoming I/O (G_IO_IN) and disconnect (G_IO_HUP) need to be
serviced.

This also allows to get rid of the ancillary data parsing since
QIOChannelSocket can do this for us. Note that the MSG_CTRUNC
check is dropped on the way because QIOChannelSocket ignores this
case. This isn't a problem since slave_read() provisions space for
8 file descriptors, but affected vhost-user slave protocol messages
generally only convey one. If for some reason a buggy implementation
passes more file descriptors, no need to break the connection, just
like we don't break it if some other type of ancillary data is
received : this isn't explicitely violating the protocol per-se so
it seems better to ignore it.

The current code errors out on short reads and writes. Use the
qio_channel_*_all() variants to address this on the way.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20210312092212.782255-5-groug@kaod.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-03-22 10:17:53 -04:00
Greg Kurz
de62e49460 vhost-user: Factor out duplicated slave_fd teardown code
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20210312092212.782255-4-groug@kaod.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-03-22 10:17:53 -04:00
Greg Kurz
9e06080bed vhost-user: Fix double-close on slave_read() error path
Some message types, e.g. VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG,
can convey file descriptors. These must be closed before returning
from slave_read() to avoid being leaked. This can currently be done
in two different places:

[1] just after the request has been processed

[2] on the error path, under the goto label err:

These path are supposed to be mutually exclusive but they are not
actually. If the VHOST_USER_NEED_REPLY_MASK flag was passed and the
sending of the reply fails, both [1] and [2] are performed with the
same descriptor values. This can potentially cause subtle bugs if one
of the descriptor was recycled by some other thread in the meantime.

This code duplication complicates rollback for no real good benefit.
Do the closing in a unique place, under a new fdcleanup: goto label
at the end of the function.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20210312092212.782255-3-groug@kaod.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-03-22 10:17:53 -04:00
Greg Kurz
a890557d5a vhost-user: Drop misleading EAGAIN checks in slave_read()
slave_read() checks EAGAIN when reading or writing to the socket
fails. This gives the impression that the slave channel is in
non-blocking mode, which is certainly not the case with the current
code base. And the rest of the code isn't actually ready to cope
with non-blocking I/O.

Just drop the checks everywhere in this function for the sake of
clarity.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20210312092212.782255-2-groug@kaod.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-03-22 10:17:53 -04:00
Laurent Vivier
0ab8c021c6 virtio: Fix virtio_mmio_read()/virtio_mmio_write()
Both functions don't check the personality of the interface (legacy or
modern) before accessing the configuration memory and always use
virtio_config_readX()/virtio_config_writeX().

With this patch, they now check the personality and in legacy mode
call virtio_config_readX()/virtio_config_writeX(), otherwise call
virtio_config_modern_readX()/virtio_config_modern_writeX().

This change has been tested with virtio-mmio guests (virt stretch/armhf and
virt sid/m68k) and virtio-pci guests (pseries RHEL-7.3/ppc64 and /ppc64le).

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210314200300.3259170-1-laurent@vivier.eu>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-03-22 10:17:53 -04:00
Jason Wang
51a81a2118 virtio-net: calculating proper msix vectors on init
Currently, the default msix vectors for virtio-net-pci is 3 which is
obvious not suitable for multiqueue guest, so we depends on the user
or management tools to pass a correct vectors parameter. In fact, we
can simplifying this by calculating the number of vectors on realize.

Consider we have N queues, the number of vectors needed is 2*N + 2
(#queue pairs + plus one config interrupt and control vq). We didn't
check whether or not host support control vq because it was added
unconditionally by qemu to avoid breaking legacy guests such as Minix.

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2021-03-15 16:41:22 +08:00