Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20190308140454.32437-6-marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Take a VhostUserState* that can be pre-allocated, and initialize it
with the associated chardev.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
Message-Id: <20190308140454.32437-4-marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Prior to f6deb6d9 "virtio-balloon: Remove unnecessary MADV_WILLNEED on
deflate", the balloon device issued an madvise() MADV_WILLNEED on
pages removed from the balloon. That would hint to the host kernel
that the pages were likely to be needed by the guest in the near
future.
It's unclear if this is actually valuable or not, and so f6deb6d9
removed this, essentially ignoring balloon deflate requests. However,
concerns have been raised that this might cause a performance
regression by causing extra latency for the guest in certain
configurations.
So, until we can get actual benchmark data to see if that's the case,
this restores the old behaviour, issuing a MADV_WILLNEED when a page is
removed from the balloon.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190306030601.21986-4-david@gibson.dropbear.id.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This fixes a balloon bug with a nasty consequence - potentially
corrupting guest memory - but which is extremely unlikely to be
triggered in practice.
The balloon always works in 4kiB units, but the host could have a
larger page size on certain platforms. Since ed48c59 "virtio-balloon:
Safely handle BALLOON_PAGE_SIZE < host page size" we've handled this
by accumulating requests to balloon 4kiB subpages until they formed a
full host page. Since f6deb6d "virtio-balloon: Remove unnecessary
MADV_WILLNEED on deflate" we essentially ignore deflate requests.
Suppose we have a host with 8kiB pages, and one host page has subpages
A & B. If we get this sequence of events -
inflate A
deflate A
inflate B
- the current logic will discard the whole host page. That's
incorrect because the guest has deflated subpage A, and could have
written important data to it.
This patch fixes the problem by adjusting our state information about
partially ballooned host pages when deflate requests are received.
Fixes: ed48c59 "virtio-balloon: Safely handle BALLOON_PAGE_SIZE < host page size"
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190306030601.21986-3-david@gibson.dropbear.id.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
ed48c59875 "virtio-balloon: Safely handle BALLOON_PAGE_SIZE < host
page size" introduced a new temporary data structure which tracks 4kiB
chunks which have been inserted into the balloon by the guest but
don't yet form a full host page which we can discard.
Unfortunately, I had a thinko and allocated that structure with
g_malloc0() but freed it with a plain free() rather than g_free().
This corrects the problem.
Fixes: ed48c59875
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190306030601.21986-2-david@gibson.dropbear.id.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
The elem could theorically contain both outbuf and inbufs. We move the
free operation to the end of this function to avoid using elem->in_sg
while elem has been freed.
Fixes: c13c4153f7
("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT")
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
CC: Michael S. Tsirkin <mst@redhat.com>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Peter Xu <peterx@redhat.com>
Message-Id: <1552383280-4122-1-git-send-email-wei.w.wang@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Needed when VirtioPCIClass subclasses have their own
class struct with some extra fields.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 20190307080244.9011-2-kraxel@redhat.com
Instead of hard-coding all config switches in the config file
default-configs/s390x-softmmu.mak, let's use the new Kconfig files
to express the necessary dependencies: The S390_CCW_VIRTIO config switch
for the "s390-ccw-virtio" machine now selects all non-optional devices.
And since we already have the VIRTIO_PCI and VIRTIO_MMIO config switches
for the other two virtio transports, this patch also introduces a new
config switch VIRTIO_CCW for the third, s390x-specific virtio transport,
so that all three virtio transports are now handled in the same way.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20190123065618.3520-42-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Instead of including the same list of devices for each target,
set CONFIG_PCI to true, and make the devices default to present
whenever PCI is available. However, s390x does not want all the
PCI devices, so there is a separate symbol to enable them.
Done mostly with the following script:
while read i; do
i=${i%=y}; i=${i#CONFIG_}
sed -i -e'/^config '$i'$/!b' -en \
-e'a\' -e' default y if PCI_DEVICES\' -e' depends on PCI' \
`grep -lw $i hw/*/Kconfig`
done < default-configs/pci.mak
followed by replacing a few "depends on" clauses with "select"
whenever the symbol is not really related to PCI.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20190123065618.3520-31-yang.zhong@intel.com>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The Kconfig files were generated mostly with this script:
for i in `grep -ho CONFIG_[A-Z0-9_]* default-configs/* | sort -u`; do
set fnord `git grep -lw $i -- 'hw/*/Makefile.objs' `
shift
if test $# = 1; then
cat >> $(dirname $1)/Kconfig << EOF
config ${i#CONFIG_}
bool
EOF
git add $(dirname $1)/Kconfig
else
echo $i $*
fi
done
sed -i '$d' hw/*/Kconfig
for i in hw/*; do
if test -d $i && ! test -f $i/Kconfig; then
touch $i/Kconfig
git add $i/Kconfig
fi
done
Whenever a symbol is referenced from multiple subdirectories, the
script prints the list of directories that reference the symbol.
These symbols have to be added manually to the Kconfig files.
Kconfig.host and hw/Kconfig were created manually.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20190123065618.3520-27-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The new feature enables the virtio-balloon device to receive hints of
guest free pages from the free page vq.
A notifier is registered to the migration precopy notifier chain. The
notifier calls free_page_start after the migration thread syncs the dirty
bitmap, so that the free page optimization starts to clear bits of free
pages from the bitmap. It calls the free_page_stop before the migration
thread syncs the bitmap, which is the end of the current round of ram
save. The free_page_stop is also called to stop the optimization in the
case when there is an error occurred in the process of ram saving.
Note: balloon will report pages which were free at the time of this call.
As the reporting happens asynchronously, dirty bit logging must be
enabled before this free_page_start call is made. Guest reporting must be
disabled before the migration dirty bitmap is synchronized.
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
CC: Michael S. Tsirkin <mst@redhat.com>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Peter Xu <peterx@redhat.com>
Message-Id: <1544516693-5395-8-git-send-email-wei.w.wang@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
dgilbert: Dropped kernel header update, fixed up CMD_ID_* name change
Lots of work on tests: BiosTablesTest UEFI app,
vhost-user testing for non-Linux hosts.
Misc cleanups and fixes all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJccBqMAAoJECgfDbjSjVRpvSEIAKYPRNdCBX/SSS/L/tmJS5Zt
8IyU/HW1YJ249vO+aT6z4Q3QPgqNC3KjXC3brx/WRoPZnRroen4rv2Kqnk6SayPa
a52d2ubXKWxb3swdG1CAVzFRhq/ABpgAPx0dr1JW+RXgo2lxpJ4GNYxKMosQTaPE
hRNeXl1XlcIK525kJhFH3Hlij9mTRuY6T7ydpPQd8dUq2dBRaL9RrzZRrkZxCy6l
gQPUqNzPhG0XXyOiJmwYyVX0zGzbYrMLrMQAor2SBIYmU+zv2eZGPJUYxoMTUMzt
YR0WCpvkvPITlAryaBoozAIDYVz8PxBRT1KRwpDal+2rzlm6o+veKDiF8R46gn0=
=GzUz
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pci, pc, virtio: fixes, cleanups, tests
Lots of work on tests: BiosTablesTest UEFI app,
vhost-user testing for non-Linux hosts.
Misc cleanups and fixes all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Fri 22 Feb 2019 15:51:40 GMT
# gpg: using RSA key 281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream: (26 commits)
pci: Sanity test minimum downstream LNKSTA
hw/smbios: fix offset of type 3 sku field
pci: Move NVIDIA vendor id to the rest of ids
virtio-balloon: Safely handle BALLOON_PAGE_SIZE < host page size
virtio-balloon: Use ram_block_discard_range() instead of raw madvise()
virtio-balloon: Rework ballon_page() interface
virtio-balloon: Corrections to address verification
virtio-balloon: Remove unnecessary MADV_WILLNEED on deflate
i386/kvm: ignore masked irqs when update msi routes
contrib/vhost-user-blk: fix the compilation issue
Revert "contrib/vhost-user-blk: fix the compilation issue"
pc-dimm: use same mechanism for [get|set]_addr
tests/data: introduce "uefi-boot-images" with the "bios-tables-test" ISOs
tests/uefi-test-tools: add build scripts
tests: introduce "uefi-test-tools" with the BiosTablesTest UEFI app
roms: build the EfiRom utility from the roms/edk2 submodule
roms: add the edk2 project as a git submodule
vhost-user-test: create a temporary directory per TestServer
vhost-user-test: small changes to init_hugepagefs
vhost-user-test: create a main loop per TestServer
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The virtio-balloon always works in units of 4kiB (BALLOON_PAGE_SIZE), but
we can only actually discard memory in units of the host page size.
Now, we handle this very badly: we silently ignore balloon requests that
aren't host page aligned, and for requests that are host page aligned we
discard the entire host page. The latter can corrupt guest memory if its
page size is smaller than the host's.
The obvious choice would be to disable the balloon if the host page size is
not 4kiB. However, that would break the special case where host and guest
have the same page size, but that's larger than 4kiB. That case currently
works by accident[1] - and is used in practice on many production POWER
systems where 64kiB has long been the Linux default page size on both host
and guest.
To make the balloon safe, without breaking that useful special case, we
need to accumulate 4kiB balloon requests until we have a whole contiguous
host page to discard.
We could in principle do that across all guest memory, but it would require
a large bitmap to track. This patch represents a compromise: we track
ballooned subpages for a single contiguous host page at a time. This means
that if the guest discards all 4kiB chunks of a host page in succession,
we will discard it. This is the expected behaviour in the (host page) ==
(guest page) != 4kiB case we want to support.
If the guest scatters 4kiB requests across different host pages, we don't
discard anything, and issue a warning. Not ideal, but at least we don't
corrupt guest memory as the previous version could.
Warning reporting is kind of a compromise here. Determining whether we're
in a problematic state at realize() time is tricky, because we'd have to
look at the host pagesizes of all memory backends, but we can't really know
if some of those backends could be for special purpose memory that's not
subject to ballooning.
Reporting only when the guest tries to balloon a partial page also isn't
great because if the guest page size happens to line up it won't indicate
that we're in a non ideal situation. It could also cause alarming repeated
warnings whenever a migration is attempted.
So, what we do is warn the first time the guest attempts balloon a partial
host page, whether or not it will end up ballooning the rest of the page
immediately afterwards.
[1] Because when the guest attempts to balloon a page, it will submit
requests for each 4kiB subpage. Most will be ignored, but the one
which happens to be host page aligned will discard the whole lot.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190214043916.22128-6-david@gibson.dropbear.id.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In order to use VirtIOFeature also in other virtio devices, we move
its declaration and the endof() macro (renamed in virtio_endof())
in virtio.h.
We add virtio_feature_get_config_size() function to iterate the array
of VirtIOFeature and to return the config size depending on the
features enabled. (as virtio_net_set_config_size() did)
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20190221103314.58500-5-sgarzare@redhat.com
Message-Id: <20190221103314.58500-5-sgarzare@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Currently, virtio-balloon uses madvise() with MADV_DONTNEED to actually
discard RAM pages inserted into the balloon. This is basically a Linux
only interface (MADV_DONTNEED exists on some other platforms, but doesn't
always have the same semantics). It also doesn't work on hugepages and has
some other limitations.
It turns out that postcopy also needs to discard chunks of memory, and uses
a better interface for it: ram_block_discard_range(). It doesn't cover
every case, but it covers more than going direct to madvise() and this
gives us a single place to update for more possibilities in future.
There are some subtleties here to maintain the current balloon behaviour:
* For now, we just ignore requests to balloon in a hugepage backed region.
That matches current behaviour, because MADV_DONTNEED on a hugepage would
simply fail, and we ignore the error.
* If host page size is > BALLOON_PAGE_SIZE we can frequently call this on
non-host-page-aligned addresses. These would also fail in madvise(),
which we then ignored. ram_block_discard_range() error_report()s calls
on unaligned addresses, so we explicitly check that case to avoid
spamming the logs.
* We now call ram_block_discard_range() with the *host* page size, whereas
we previously called madvise() with BALLOON_PAGE_SIZE. Surprisingly,
this also matches existing behaviour. Although the kernel fails madvise
on unaligned addresses, it will round unaligned sizes *up* to the host
page size. Yes, this means that if BALLOON_PAGE_SIZE < guest page size
we can incorrectly discard more memory than the guest asked us to. I'm
planning to address that soon.
Errors other than the ones discussed above, will now be reported by
ram_block_discard_range(), rather than silently ignored, which means we
have a much better chance of seeing when something is going wrong.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20190214043916.22128-5-david@gibson.dropbear.id.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This replaces the balloon_page() internal interface with
ballon_inflate_page(), with a slightly different interface. The new
interface will make future alterations simpler.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20190214043916.22128-4-david@gibson.dropbear.id.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The virtio-balloon device's verification of the address given to it by the
guest has a number of faults:
* The addresses here are guest physical addresses, which should be
'hwaddr' rather than 'ram_addr_t' (the distinction is admittedly
pretty subtle and confusing)
* We don't check for section.mr being NULL, which is the main way that
memory_region_find() reports basic failures. We really need to check
that before looking at any other section fields, because
memory_region_find() doesn't initialize them on the failure path
* We're passing a length of '1' to memory_region_find(), but really the
guest is requesting that we put the entire page into the balloon,
so it makes more sense to call it with BALLOON_PAGE_SIZE
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20190214043916.22128-3-david@gibson.dropbear.id.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
When the balloon is inflated, we discard memory place in it using madvise()
with MADV_DONTNEED. And when we deflate it we use MADV_WILLNEED, which
sounds like it makes sense but is actually unnecessary.
The misleadingly named MADV_DONTNEED just discards the memory in question,
it doesn't set any persistent state on it in-kernel; all that's necessary
to bring the memory back is to touch it. MADV_WILLNEED in contrast
specifically says that the memory will be used soon and faults it in.
This patch simplify's the balloon operation by dropping the madvise()
on deflate. This might have an impact on performance - it will move a
delay at deflate time until that memory is actually touched, which
might be more latency sensitive. However:
* Memory that's being given back to the guest by deflating the
balloon *might* be used soon, but it equally could just sit around
in the guest's pools until needed (or even be faulted out again if
the host is under memory pressure).
* Usually, the timescale over which you'll be adjusting the balloon
is long enough that a few extra faults after deflation aren't
going to make a difference.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20190214043916.22128-2-david@gibson.dropbear.id.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Detect all invalid configurations (e.g. mingw32 with vhost-user,
non-Linux with vhost-kernel). As a collateral benefit, all vhost-kernel
backends can be now disabled if one wants to reduce the attack surface.
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1543851204-41186-6-git-send-email-pbonzini@redhat.com>
Message-Id: <1550165756-21617-7-git-send-email-pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
vhost-user does not depend on Linux; it can run on any POSIX system. Restrict
vhost-kernel to Linux in hw/virtio/vhost-backend.c, everything else can be
compiled on all POSIX systems.
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1543851204-41186-4-git-send-email-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1550165756-21617-4-git-send-email-pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
For downstream s390x builds, we'd like to be able to build QEMU with
CONFIG_VIRTIO_PCI disabled (since virtio-ccw is used here instead),
but still with CONFIG_PCI enabled. This currently fails since the
virtio-*-pci.o files are still included in the build, but virtio-pci.o
is missing. Use the right config switch CONFIG_VIRTIO_PCI to exclude
the virtio-*-pci.o files from the build.
Reported-by: Miroslav Rezanina <mrezanin@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Use VIRTIO_PCI MACRO instead.
Signed-off-by: Li Qiang <liq3ea@163.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The virtqueue_pop() and virtqueue_get_avail_bytes() routines can use the
INDIRECT table to get the data. It is possible to create a packet which
will lead to the assert message like:
include/exec/memory.h:1995: void
address_space_read_cached(MemoryRegionCache *, hwaddr, void *, int):
Assertion `addr < cache->len && len <= cache->len - addr' failed.
Aborted
To do it the first descriptor should have a link to the INDIRECT table
and set the size of it to 0. It doesn't look good that the guest should
be able to trigger the assert in qemu. Add additional check for the size
of the INDIRECT table, which should not be 0.
Signed-off-by: Dima Stepanov <dimastep@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
In virtio_balloon_get_config() we initialize a struct virtio_balloon_config
which we then copy to guest memory. However, the local variable is not
zero initialized. This works OK at the moment because we initialize
all the fields in it; however an upcoming kernel header change will
add some new fields. If we don't zero out the whole struct then we
will start leaking a small amount of the contents of QEMU's stack
to the guest as soon as we update linux-headers/ to a set of headers
that includes the new fields.
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190118183603.24757-1-peter.maydell@linaro.org
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Virtio console and qga tests also depend on CONFIG_VIRTIO_SERIAL.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Notice that we can't still run tests with it disabled. Both cdrom-test and
drive_del-test use virtio-scsi without checking if it is enabled.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Acked-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
For consistency with other devices, rename
virtio_host_{initfn,pci_info} to virtio_input_host_{initfn,info}.
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently, the vhost-user-test assumes the eventfd is available.
However it's not true because the accel is qtest. So the
'vhost_set_vring_file' will not add fds to the msg and the server
side of vhost-user-test will be broken. The bug is in 'ioeventfd_enabled'.
We should make this function return true if not using kvm accel.
Signed-off-by: Li Qiang <liq3ea@163.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Many of the current virtio-*-pci device types actually represent
3 different types of devices:
* virtio 1.0 non-transitional devices
* virtio 1.0 transitional devices
* virtio 0.9 ("legacy device" in virtio 1.0 terminology)
That would be just an annoyance if it didn't break our device/bus
compatibility QMP interfaces. With these multi-purpose device
types, there's no way to tell management software that
transitional devices and legacy devices require a Conventional
PCI bus.
The multi-purpose device types would also prevent us from telling
management software what's the PCI vendor/device ID for them,
because their PCI IDs change at runtime depending on the bus
where they were plugged.
This patch adds separate device types for each of those virtio
device flavors:
- virtio-*-pci: the existing multi-purpose device types
- Configurable using `disable-legacy` and `disable-modern`
properties
- Legacy driver support is automatically enabled/disabled
depending on the bus where it is plugged
- Supports Conventional PCI and PCI Express buses
(but Conventional PCI is incompatible with
disable-legacy=off)
- Changes PCI vendor/device IDs at runtime
- virtio-*-pci-transitional: virtio-1.0 device supporting legacy drivers
- Supports Conventional PCI buses only, because
it has a PIO BAR
- virtio-*-pci-non-transitional: modern-only
- Supports both Conventional PCI and PCI Express buses
The existing TYPE_* macros for these types will point to an
abstract base type, so existing casts in the code will keep
working for all variants.
A simple test script (tests/acceptance/virtio_version.py) is
included, to check if the new device types are equivalent to
using the `disable-legacy` and `disable-modern` options.
Acked-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Introduce a helper for registering different flavours of virtio
devices. Convert code to use the helper, but keep only the
existing generic types. Transitional and non-transitional device
types will be added by another patch.
Acked-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJcD/usAAoJEPMMOL0/L748br0P/iLL5RjzHJ+vrBsphRNPZ0eM
1wvgWJwvo+4JicebsnTWDmElprgetu2+disXyxSJhBOllSb7lwmxKR2OyHlicu5x
RDSk3CTZZuI/CqN08MlEVZiCuCT1LZuJ8Y0RzXBAsJlT51ZpvwprbXO1oyMjjx2P
UhXFuYIa8Wk+8+zuFnYI4nKPo3o8ra8OrtI2AdIneQ8zWEMvJCWhHqrZHeyuHOzb
N5bYEXi3JSIT2qyHyZlFNmXjPCNuMxhrrBc1yjmo6KJgHkVUgvn61hdod4BzvLsd
DWAfdEamBgP4HuU2fUTVFAYXirK+A4tM+ROblJ/Z/V7RHifoulrdedQzrqdM7FA5
6f4SUo+SSjqY1CYnS+zXp5USu6/ciaYZv9jE7W0WKjXt5kCsy9rlT0V8Q4RlOQji
ZoT6LpcVj0qnfdPdYwdWgpqWbr8G5Y3Xm91a+XLSqzj+xfFJQ6h7fCnF1/Ngn0Ep
o3LOtiJCDSma4maFemV7qhWfuaa20vUwfbRKeOvnTirDUv6oXpsP19kScbH06DeD
Hs9aFgi7XmFWNypWHeZNqy00UwiZmb2GcpqL/vAVwkqdMTgttfBdh5P4srRwwM2D
OQROJaHaBya8mzDO7BldapQOGXRHu9UDCP9gkK+BGGu4Edu0U+eRL/08GoFRM1M4
ra2Dn/AQTIW6lk0N9He0
=+od8
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/vivier2/tags/trivial-patches-pull-request' into staging
Trivial patches (2018-12-11)
# gpg: Signature made Tue 11 Dec 2018 18:02:20 GMT
# gpg: using RSA key F30C38BD3F2FBE3C
# gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>"
# gpg: aka "Laurent Vivier <laurent@vivier.eu>"
# gpg: aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>"
# Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F 5173 F30C 38BD 3F2F BE3C
* remotes/vivier2/tags/trivial-patches-pull-request: (30 commits)
Fixes i386 xchgq test
maint: Grammar fix to mailmap
MAINTAINERS: Update email address for Fam Zheng
cutils: Assert in-range base for string-to-integer conversions
util: vfio-helpers: use ARRAY_SIZE in qemu_vfio_init_pci()
target: hax: fix errors in comment
MAINTAINERS: Use my work email to review Build and test automation patches
MAINTAINERS: Add a missing entry for the NVDIMM device
MAINTAINERS: Add a missing entry to the QMP section
MAINTAINERS: Add a missing entry to SPICE
MAINTAINERS: Add missing entries for the MPS2 machine
MAINTAINERS: Add missing entries for the Canon DIGIC machine
MAINTAINERS: Add missing entries to the vhost section
MAINTAINERS: Add missing entries to the PC Chipset section
MAINTAINERS: Add a missing entry for the sun4m machines
MAINTAINERS: Add a missing entry for the Old World machines
MAINTAINERS: Add a missing entry for the Xilinx S3A-DSP 1800 machine
MAINTAINERS: Add missing entries for the Jazz machine
MAINTAINERS: Add missing entries for the Xilinx ZynqMP machine
MAINTAINERS: Add a missing entry to the SPARC CPU
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Instead of accepting any Object*, change user_creatable_complete() to
require a UserCreatable*. Modify the callers to pass the appropriate
argument, removing redundant dynamic cast checks in object creation.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20181204142023.15982-4-marcandre.lureau@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Pass num_sg by value instead of by pointer, as num_sg is never modified in
virtqueue_map_iovec().
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1541139396-4727-1-git-send-email-dongli.zhang@oracle.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Because they are supposed to remain const.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20181114132931.22624-1-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJbzcCHAAoJEDhwtADrkYZT3YsP/2qE4HNY/htj3IP6vNJuSaqw
CLPRTz7zWmUBTE6FqSkvLsq3X2BMFFLeaIPA9EFcbyn2km6qPqBYgg9ElXXvPZBm
6hDeRIoC8FdRD0Apozd5MGC94/lE47PheDRV8V+4KrGLaaMXEPxMZ0wP4AfdS5pS
6Pt2xuF7nPu1+OWVxMk0fXadGjGLEuOQQmTh3B21J5RaynQ3gtd6h7XFC/LJyOGG
LC/6GyPc0h7KU83VnvrRjH/EOpu1wENgrsvWsS0sem8op35Z+i9jU5BfCp4qFkDy
gCHHUEyEeyexS+W+Tj87eBtK2gfrqQx9ovo8CIsWcUwpKbdD6AMK4FKGsDNMNHab
Kg5u/M+O8nHCB7DuursF+3mqEbZHb05cfKe6JEtiq49EuORMV5hp4Ap966noSwTw
UEU0NJNA1p8EdmXVudyyyYR7wpoSSmZpoenA+bJ3nthK8K0KcU4RUGk6ZEbxfJy+
7ENl+3R2IxmxzgXv/x0tz0uFisaVW1rltTXtMte+ElQsO0qy74iHdfR7JHsmLxj9
CO/ABMVoYsWq2OJv8pWLrdKpT4v3HQLJdHhknyu0ZcJGDyICqX29ULLEhPrNEZvW
rxVxAkiemlaqxlUjbrM46CDQQm+w03OCnk7aCYcV4oK+u5+o3mCag705gMPErapZ
6uOE3fAjiWw43sA31mek
=kPZX
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/armbru/tags/pull-error-2018-10-22' into staging
Error reporting patches for 2018-10-22
# gpg: Signature made Mon 22 Oct 2018 13:20:23 BST
# gpg: using RSA key 3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg: aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653
* remotes/armbru/tags/pull-error-2018-10-22: (40 commits)
error: Drop bogus "use error_setg() instead" admonitions
vpc: Fail open on bad header checksum
block: Clean up bdrv_img_create()'s error reporting
vl: Simplify call of parse_name()
vl: Fix exit status for -drive format=help
blockdev: Convert drive_new() to Error
vl: Assert drive_new() does not fail in default_drive()
fsdev: Clean up error reporting in qemu_fsdev_add()
spice: Clean up error reporting in add_channel()
tpm: Clean up error reporting in tpm_init_tpmdev()
numa: Clean up error reporting in parse_numa()
vnc: Clean up error reporting in vnc_init_func()
ui: Convert vnc_display_init(), init_keyboard_layout() to Error
ui/keymaps: Fix handling of erroneous include files
vl: Clean up error reporting in device_init_func()
vl: Clean up error reporting in parse_fw_cfg()
vl: Clean up error reporting in mon_init_func()
vl: Clean up error reporting in machine_set_property()
vl: Clean up error reporting in chardev_init_func()
qom: Clean up error reporting in user_creatable_add_opts_foreach()
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Calling error_report() in a function that takes an Error ** argument
is suspicious. Convert a few that are actually warnings to
warn_report().
While there, split a warning consisting of multiple sentences to
conform to conventions spelled out in warn_report()'s contract.
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Fam Zheng <famz@redhat.com>
Cc: Wei Huang <wei@redhat.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20181017082702.5581-5-armbru@redhat.com>
Before, we did not clear callback like handle_output when delete
the virtqueue which may result be segmentfault.
The scene is as follows:
1. Start a vm with multiqueue vhost-net,
2. then we write VIRTIO_PCI_GUEST_FEATURES in PCI configuration to
triger multiqueue disable in this vm which will delete the virtqueue.
In this step, the tx_bh is deleted but the callback virtio_net_handle_tx_bh
still exist.
3. Finally, we write VIRTIO_PCI_QUEUE_NOTIFY in PCI configuration to
notify the deleted virtqueue. In this way, virtio_net_handle_tx_bh
will be called and qemu will be crashed.
Although the way described above is uncommon, we had better reinforce it.
CC: qemu-stable@nongnu.org
Signed-off-by: liujunjie <liujunjie23@huawei.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
We've got three places already that provide a prototype for this
function in a .c file - that's ugly. Let's provide a proper prototype
in a header instead, with a proper description why this function should
not be used in most cases.
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
According to documentation, NEED_REPLY_MASK should not be set
for VHOST_USER_SET_MEM_TABLE request in postcopy mode.
This restriction was mistakenly applied to 'reply_supported'
variable, which is local and used only for non-postcopy case.
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
Fixes: 9bb3801994 ("vhost+postcopy: Send address back to qemu")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Message-Id: <20181002140947.4107-1-i.maximets@samsung.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
'fd' received from the vhost side is never freed.
Also, everything (including 'postcopy_listen' state) should be
cleaned up on vhost cleanup.
Fixes: 46343570c0 ("vhost+postcopy: Wire up POSTCOPY_END notify")
Fixes: f82c11165f ("vhost+postcopy: Register shared ufd with postcopy")
Cc: qemu-stable@nongnu.org
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Message-Id: <20181008160536.6332-3-i.maximets@samsung.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Both virtio-blk and virtio-scsi use virtio_queue_empty() as the
loop condition in VQ handlers (virtio_blk_handle_vq,
virtio_scsi_handle_cmd_vq). When a device is marked broken in
virtqueue_pop, for example if a vIOMMU address translation failed, we
want to break out of the loop.
This fixes a hanging problem when booting a CentOS 3.10.0-862.el7.x86_64
kernel with ATS enabled:
$ qemu-system-x86_64 \
... \
-device intel-iommu,intremap=on,caching-mode=on,eim=on,device-iotlb=on \
-device virtio-scsi-pci,iommu_platform=on,ats=on,id=scsi0,bus=pci.4,addr=0x0
The dead loop happens immediately when the kernel boots and initializes
the device, where virtio_scsi_data_plane_handle_cmd will not return:
> ...
> #13 0x00005586602b7793 in virtio_scsi_handle_cmd_vq
> #14 0x00005586602b8d66 in virtio_scsi_data_plane_handle_cmd
> #15 0x00005586602ddab7 in virtio_queue_notify_aio_vq
> #16 0x00005586602dfc9f in virtio_queue_host_notifier_aio_poll
> #17 0x00005586607885da in run_poll_handlers_once
> #18 0x000055866078880e in try_poll_mode
> #19 0x00005586607888eb in aio_poll
> #20 0x0000558660784561 in aio_wait_bh_oneshot
> #21 0x00005586602b9582 in virtio_scsi_dataplane_stop
> #22 0x00005586605a7110 in virtio_bus_stop_ioeventfd
> #23 0x00005586605a9426 in virtio_pci_stop_ioeventfd
> #24 0x00005586605ab808 in virtio_pci_common_write
> #25 0x0000558660242396 in memory_region_write_accessor
> #26 0x00005586602425ab in access_with_adjusted_size
> #27 0x0000558660245281 in memory_region_dispatch_write
> #28 0x00005586601e008e in flatview_write_continue
> #29 0x00005586601e01d8 in flatview_write
> #30 0x00005586601e04de in address_space_write
> #31 0x00005586601e052f in address_space_rw
> #32 0x00005586602607f2 in kvm_cpu_exec
> #33 0x0000558660227148 in qemu_kvm_cpu_thread_fn
> #34 0x000055866078bde7 in qemu_thread_start
> #35 0x00007f5784906594 in start_thread
> #36 0x00007f5784639e6f in clone
With this patch, virtio_queue_empty will now return 1 as soon as the
vdev is marked as broken, after a "virtio: zero sized buffers are not
allowed" error.
To be consistent, update virtio_queue_empty_rcu as well.
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20180910145616.8598-2-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
virtio_queue_get_desc_addr returns 64-bit hwaddr while int is usually 32-bit.
If returned hwaddr is not equal to 0 but least-significant 32 bits are
equal to 0 then this code will not actually stop running queue.
Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru>
Acked-by: Jia He <hejianet@gmail.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Because the cache is sized to include the rings and the event indices,
negotiating the VIRTIO_RING_F_EVENT_IDX feature will result in the size
of the cache changing. And because MemoryRegionCache accesses are
range-checked, if we skip this we end up with an assertion failure.
This happens with OpenBSD 6.3.
Reported-by: Fam Zheng <famz@redhat.com>
Fixes: 97cd965c07
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The generated qapi_event_send_FOO() take an Error ** argument. They
can't actually fail, because all they do with the argument is passing it
to functions that can't fail: the QObject output visitor, and the
@qmp_emit callback, which is either monitor_qapi_event_queue() or
event_test_emit().
Drop the argument, and pass &error_abort to the QObject output visitor
and @qmp_emit instead.
Suggested-by: Eric Blake <eblake@redhat.com>
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180815133747.25032-4-peterx@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Commit message rewritten, update to qapi-code-gen.txt corrected]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Remove KVM specific tests in balloon_page(), instead marking
ballooning as inhibited without KVM_CAP_SYNC_MMU support.
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Check region type first before casting the memory region
to IOMMUMemoryRegion. Otherwise QEMU will abort with below
error message when casting non-IOMMU memory region:
vhost_iommu_region_add: Object 0x561f28bce4f0 is not an
instance of type qemu:iommu-memory-region
Fixes: cb1efcf462 ("iommu: Add IOMMU index argument to notifier APIs")
Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
virtio-rng device causes old guest kernels(2.6.32) to hang on latest qemu.
The driver attempts to read from the virtio-rng device too early in it's
initialization. Qemu detects guest is not ready and returns, resulting in
hang.
To fix handle pending requests when guest is running and driver status is
set to 'VIRTIO_CONFIG_S_DRIVER_OK'.
CC: qemu-stable@nongnu.org
Reported-by: Sergio lopez <slopezpa@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Pankaj Gupta <pagupta@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add support for multiple IOMMU indexes to the IOMMU notifier APIs.
When initializing a notifier with iommu_notifier_init(), the caller
must pass the IOMMU index that it is interested in. When a change
happens, the IOMMU implementation must pass
memory_region_notify_iommu() the IOMMU index that has changed and
that notifiers must be called for.
IOMMUs which support only a single index don't need to change.
Callers which only really support working with IOMMUs with a single
index can use the result of passing MEMTXATTRS_UNSPECIFIED to
memory_region_iommu_attrs_to_index().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180604152941.20374-3-peter.maydell@linaro.org
vDPA support, fix to vhost blk RO bit handling, some include path
cleanups, NFIT ACPI table.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJbEXNvAAoJECgfDbjSjVRpc8gH/R8xrcFrV+k9wwbgYcOcGb6Y
LWjseE31pqJcxRV80vLOdzYEuLStZQKQQY7xBDMlA5vdyvZxIA6FLO2IsiJSbFAk
EK8pclwhpwQAahr8BfzenabohBv2UO7zu5+dqSvuJCiMWF3jGtPAIMxInfjXaOZY
odc1zY2D2EgsC7wZZ1hfraRbISBOiRaez9BoGDKPOyBY9G1ASEgxJgleFgoBLfsK
a1XU+fDM6hAVdxftfkTm0nibyf7PWPDyzqghLqjR9WXLvZP3Cqud4p8N29mY51pR
KSTjA4FYk6Z9EVMltyBHfdJs6RQzglKjxcNGdlrvacDfyFi79fGdiosVllrjfJM=
=3+V0
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
acpi, vhost, misc: fixes, features
vDPA support, fix to vhost blk RO bit handling, some include path
cleanups, NFIT ACPI table.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Fri 01 Jun 2018 17:25:19 BST
# gpg: using RSA key 281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream: (31 commits)
vhost-blk: turn on pre-defined RO feature bit
ACPI testing: test NFIT platform capabilities
nvdimm, acpi: support NFIT platform capabilities
tests/.gitignore: add entry for generated file
arch_init: sort architectures
ui: use local path for local headers
qga: use local path for local headers
colo: use local path for local headers
migration: use local path for local headers
usb: use local path for local headers
sd: fix up include
vhost-scsi: drop an unused include
ppc: use local path for local headers
rocker: drop an unused include
e1000e: use local path for local headers
ioapic: fix up includes
ide: use local path for local headers
display: use local path for local headers
trace: use local path for local headers
migration: drop an unused include
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This is only half of the work, because the proxy devices (virtio-*-pci,
virtio-*-ccw, etc.) are still included unconditionally. It is still a
move in the right direction.
Based-on: <20180522194943.24871-1-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Remove those unneeded includes to speed up the compilation
process a little bit. (Continue 7eceff5b5a cleanup)
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180528232719.4721-13-f4bug@amsat.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
As part of plumbing MemTxAttrs down to the IOMMU translate method,
add MemTxAttrs as an argument to address_space_get_iotlb_entry().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180521140402.23318-12-peter.maydell@linaro.org
This patch introduces VHOST_USER_PROTOCOL_F_HOST_NOTIFIER.
With this feature negotiated, vhost-user backend can register
memory region based host notifiers. And it will allow the guest
driver in the VM to notify the hardware accelerator at the
vhost-user backend directly.
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
When multi queue is enabled e.g. for a virtio-net device,
each queue pair will have a vhost_dev, and the only thing
shared between vhost devs currently is the chardev. This
patch introduces a vhost-user state structure which will
be shared by all vhost devs of the same virtio device.
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Introduce VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD protocol
feature to allow slave to send at most 8 descriptors
in each message to master via ancillary data using the
slave channel.
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch introduces a vhost op for vhost backends to allow
them to filter the memory sections that they can handle.
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The warning is
hw/virtio/vhost-user.c:1319:26: error: suggest braces
around initialization of subobject [-Werror,-Wmissing-braces]
VhostUserMsg msg = { 0 };
^
{}
While the original code is correct, and technically exactly correct
as per ISO C89, both GCC and Clang support plain empty set of braces
as an extension.
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Use qemu_set_nonblock rather than a simple fcntl; cleaner
and I have no reason to change other flags.
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch introduces the support for setting memory region
based host notifiers for virtio device. This is helpful when
using a hardware accelerator for a virtio device, because
hardware heavily depends on the notification, this will allow
the guest driver in the VM to notify the hardware directly.
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add some trace points for IOTLB translation for vhost. After vhost-user
is setup, the only IO path that QEMU will participate should be the
IOMMU translation, so it'll be good we can track this with explicit
timestamps when needed to see how long time we take to do the
translation, and whether there's anything stuck inside. It might be
useful for triaging vhost-user problems.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
qemu should read and report hugetlb page allocation
counts exported in the following kernel patch:
commit 4c3ca37c4a4394978fd0f005625f6064ed2b9a64
Author: Jonathan Helman <jonathan.helman@oracle.com>
Date: Mon Mar 19 11:00:35 2018 -0700
virtio_balloon: export hugetlb page allocation counts
Export the number of successful and failed hugetlb page
allocations via the virtio balloon driver. These 2 counts
come directly from the vm_events HTLB_BUDDY_PGALLOC and
HTLB_BUDDY_PGALLOC_FAIL.
Signed-off-by: Jonathan Helman <jonathan.helman@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
When IOMMU is enabled, we store virtqueue metadata as iova (though it
may has _phys suffix) and access them through dma helpers. Any
translation failures could be reported by IOMMU.
In this case, trying to validate iova against gpa won't work and will
cause a false error reporting. So this patch bypasses the ring
verification if IOMMU is enabled which is similar to the behavior
before 0ca1fd2d68 that calls vhost_memory_map() which is a nop when
IOMMU is enabled.
Fixes: 0ca1fd2d68 ("vhost: Simplify ring verification checks")
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
My rework of section adding combines overlapping or adjoining regions,
but checks they're actually the same underlying RAM block.
Fix the case where two blocks adjoin but don't overlap; that new region
should get added (but not combined), but my previous patch was disallowing it.
Fixes: c1ece84e7c
Reported-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Tested-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Without a dedicated protocol feature, QEMU cannot know whether
the backend can handle VHOST_USER_SET_CONFIG and
VHOST_USER_GET_CONFIG messages.
This patch adds a protocol feature that is only advertised by
QEMU if the device implements the config ops. Vhost user init
fails if the device support the feature but the backend doesn't.
The backend should only send VHOST_USER_SLAVE_CONFIG_CHANGE_MSG
requests if the protocol feature has been negotiated.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Changpeng Liu <changpeng.liu@intel.com>
As soon as vhost-user init is done, the backend may send
VHOST_USER_SLAVE_CONFIG_CHANGE_MSG, so let's set the
notification callback before it.
Also, it will be used to know whether the device supports
the config feature to advertize it or not.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Changpeng Liu <changpeng.liu@intel.com>
Align RAMBlocks to page size alignment, and adjust the merging code
to deal with partial overlap due to that alignment.
This is needed for postcopy so that we can place/fetch whole hugepages
when under userfault.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Wire up a call to VHOST_USER_POSTCOPY_END message to the vhost clients
right before we ask the listener thread to shutdown.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This message is sent just before the end of postcopy to get the
client to stop using userfault since we wont respond to any more
requests. It should close userfaultfd so that any other pages
get mapped to the backing file automatically by the kernel, since
at this point we know we've received everything.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Register a waker function in vhost-user code to be notified when
pages arrive or requests to previously mapped pages get requested.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Resolve fault addresses read off the clients UFD into RAMBlock
and offset, and call back to the postcopy code to ask for the page.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Stash the RAMBlock and offset for later use looking up
addresses.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We need a better way, but at the moment we need the address of the
mappings sent back to qemu so it can interpret the messages on the
userfaultfd it reads.
This is done as a 3 stage set:
QEMU -> client
set_mem_table
mmap stuff, get addresses
client -> qemu
here are the addresses
qemu -> client
OK - now you can use them
That ensures that qemu has registered the new addresses in it's
userfault code before the client starts accessing them.
Note: We don't ask for the default 'ack' reply since we've got our own.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Split the set_mem_table routines in both qemu and libvhost-user
because the postcopy versions are going to be quite different
once changes in the later patches are added. However, this patch
doesn't produce any functional change, just the split.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Notify the vhost-user slave on reception of the 'postcopy-listen'
event from the source.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Register the UFD that comes in as the response to the 'advise' method
with the postcopy code.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Wire up a notifier to send a VHOST_USER_POSTCOPY_ADVISE
message on an incoming advise.
Later patches will fill in the behaviour/contents of the
message.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add a vhost feature flag for postcopy support, and
use the postcopy notifier to check it before allowing postcopy.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The previous commit improved compile time by including less of the
generated QAPI headers. This is impossible for stuff defined directly
in qapi-schema.json, because that ends up in headers that that pull in
everything.
Move everything but include directives from qapi-schema.json to new
sub-module qapi/misc.json, then include just the "misc" shard where
possible.
It's possible everywhere, except:
* monitor.c needs qmp-command.h to get qmp_init_marshal()
* monitor.c, ui/vnc.c and the generated qapi-event-FOO.c need
qapi-event.h to get enum QAPIEvent
Perhaps we'll get rid of those some other day.
Adding a type to qapi/migration.json now recompiles some 120 instead
of 2300 out of 5100 objects.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180211093607.27351-25-armbru@redhat.com>
[eblake: rebase to master]
Signed-off-by: Eric Blake <eblake@redhat.com>
Introduce two vhost-user meassges: VHOST_USER_CREATE_CRYPTO_SESSION
and VHOST_USER_CLOSE_CRYPTO_SESSION. At this point, the QEMU side
support crypto operation in cryptodev host-user backend.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Signed-off-by: Jay Zhou <jianjay.zhou@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Impliment the vhost-crypto's funtions, such as startup,
stop and notification etc. Introduce an enum
QCryptoCryptoDevBackendOptionsType in order to
identify the cryptodev vhost backend is vhost-user
or vhost-kernel-module (If exist).
At this point, the cryptdoev-vhost-user works.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Signed-off-by: Jay Zhou <jianjay.zhou@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In commit 0ca1fd2d68 ("vhost: Simplify ring verification checks"),
it checks the virtqueue desc mapping for 3 times.
Fixed: commit 0ca1fd2d68 ("vhost: Simplify ring verification checks")
Signed-off-by: Jia He <jia.he@hxt-semitech.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>