If CONFIG_SECCOMP is undefined, the option 'elevatedprivileges' remains
compiled. This would make libvirt set the corresponding capability and
then trigger failure during guest startup. This patch moves the code
regarding seccomp command line options to qemu-seccomp.c file and
wraps qemu_opts_foreach finding sandbox option with CONFIG_SECCOMP.
Because parse_sandbox() is moved into qemu-seccomp.c file, change
seccomp_start() to static function.
Signed-off-by: Yi Min Zhao <zyimin@linux.ibm.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Tested-by: Ján Tomko <jtomko@redhat.com>
Acked-by: Eduardo Otubo <otubo@redhat.com>
Xen 4.11 has a new API to directly map guest resources. Among the resources
that can be mapped using this API are ioreq pages.
This patch modifies QEMU to attempt to use the new API should it exist,
falling back to the previous mechanism if it is unavailable.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
When global_log_dirty is enabled VRAM modification tracking never
worked correctly. The address that is passed to xen_hvm_modified_memory()
is not the effective PFN but RAM block address which is not the same
for VRAM.
We need to make a translation for this address into PFN using
physmap. Since there is no way to access physmap properly inside
xen_hvm_modified_memory() let's make it a global structure.
Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Provide a VMSTATE_BOOL_SUB_ARRAY to go with VMSTATE_UINT8_SUB_ARRAY
and friends.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180521140402.23318-23-peter.maydell@linaro.org
As part of plumbing MemTxAttrs down to the IOMMU translate method,
add MemTxAttrs as an argument to address_space_get_iotlb_entry().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180521140402.23318-12-peter.maydell@linaro.org
As part of plumbing MemTxAttrs down to the IOMMU translate method,
add MemTxAttrs as an argument to flatview_translate(); all its
callers now have attrs available.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180521140402.23318-11-peter.maydell@linaro.org
As part of plumbing MemTxAttrs down to the IOMMU translate method,
add MemTxAttrs as an argument to the MemoryRegion valid.accepts
callback. We'll need this for subpage_accepts().
We could take the approach we used with the read and write
callbacks and add new a new _with_attrs version, but since there
are so few implementations of the accepts hook we just change
them all.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180521140402.23318-9-peter.maydell@linaro.org
As part of plumbing MemTxAttrs down to the IOMMU translate method,
add MemTxAttrs as an argument to memory_region_access_valid().
Its callers either have an attrs value to hand, or don't care
and can use MEMTXATTRS_UNSPECIFIED.
The callsite in flatview_access_valid() is part of a recursive
loop flatview_access_valid() -> memory_region_access_valid() ->
subpage_accepts() -> flatview_access_valid(); we make it pass
MEMTXATTRS_UNSPECIFIED for now, until the next several commits
have plumbed an attrs parameter through the rest of the loop
and we can add an attrs parameter to flatview_access_valid().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180521140402.23318-8-peter.maydell@linaro.org
As part of plumbing MemTxAttrs down to the IOMMU translate method,
add MemTxAttrs as an argument to address_space_access_valid().
Its callers either have an attrs value to hand, or don't care
and can use MEMTXATTRS_UNSPECIFIED.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180521140402.23318-6-peter.maydell@linaro.org
As part of plumbing MemTxAttrs down to the IOMMU translate method,
add MemTxAttrs as an argument to address_space_map().
Its callers either have an attrs value to hand, or don't care
and can use MEMTXATTRS_UNSPECIFIED.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180521140402.23318-5-peter.maydell@linaro.org
As part of plumbing MemTxAttrs down to the IOMMU translate method,
add MemTxAttrs as an argument to address_space_translate()
and address_space_translate_cached(). Callers either have an
attrs value to hand, or don't care and can use MEMTXATTRS_UNSPECIFIED.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180521140402.23318-4-peter.maydell@linaro.org
As part of plumbing MemTxAttrs down to the IOMMU translate method,
add MemTxAttrs as an argument to tb_invalidate_phys_addr().
Its callers either have an attrs value to hand, or don't care
and can use MEMTXATTRS_UNSPECIFIED.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180521140402.23318-3-peter.maydell@linaro.org
Add more detail to the documentation for memory_region_init_iommu()
and other IOMMU-related functions and data structures.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20180521140402.23318-2-peter.maydell@linaro.org
Depending on the host abi, float16, aka uint16_t, values are
passed and returned either zero-extended in the host register
or with garbage at the top of the host register.
The tcg code generator has so far been assuming garbage, as that
matches the x86 abi, but this is incorrect for other host abis.
Further, target/arm has so far been assuming zero-extended results,
so that it may store the 16-bit value into a 32-bit slot with the
high 16-bits already clear.
Rectify both problems by mapping "f16" in the helper definition
to uint32_t instead of (a typedef for) uint16_t. This forces
the host compiler to assume garbage in the upper 16 bits on input
and to zero-extend the result on output.
Cc: qemu-stable@nongnu.org
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Message-id: 20180522175629.24932-1-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* New command-line option: --preconfig
This option allows pausing QEMU and allow the configuration
using QMP commands before running board initialization code.
* New QMP set-numa-node, now made possible because of --preconfig
* Small update on -numa error messages
-----BEGIN PGP SIGNATURE-----
iQIcBAABCAAGBQJbDy2jAAoJECgHk2+YTcWm8OgP/As0judZ7JwT5F5fR4nxtQPr
EODeEI+e/IahGH58Bafx24Fvxy0HZIgwqBDnPgclA8d3ebwmDQIejgk5/00zq2xv
cktRpzXGSBGJCVNctVz8xqN91m0lPtVgeRvXUJPn3hthXRSLO4p0vbyOW8g/C+O2
+dEcGqifAQatyxs9gYmmWoUia8zle/v1bd2v/x/DwiFW9cd47yDr3+ChHh6+nx0W
uh2zymD+ykVeNJ9WSxc3k8zTzQnuxbDK1fNbZsztk9KQDMWG3+u7KRNoJ86/7hKL
RMUfRCUMBgemuZsrFF8wSsha27e9VgCw4oR8dQ5AnTgMjK6nch/839XuFWh7j5qu
ntS4vt1v6IczdflKXX7IUILAgvVffLh2SCGWlKq9Q1eKo+CZTCea+iSbb2QobAE+
9TyuqXVKCsGKLoqkbK39d1UjZFhDQJhMsyuOHreKEINmJRmLI5qrNbEaaysb+mMB
DnFE1NiWjE+6X4w9U1l8lnKNUakOPNDG8cbgtuiEKDC9e3OZeiS8yyPLaQajELSR
Ig7N2GxrLMSfS3LNMSwkACILzVeu03aFoHyCPrpUqD8mpxidN0mmFVup8dawtkJ1
A5dHrKDmzlOT6DivJOgT8/Os/wMo3ErPcvyu69rTUO9PKSbwcFPwi6bMMDzgb4xO
kAles9X7HrHxSeJiggRW
=RHIj
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/ehabkost/tags/numa-next-pull-request' into staging
NUMA queue, 2018-05-30
* New command-line option: --preconfig
This option allows pausing QEMU and allow the configuration
using QMP commands before running board initialization code.
* New QMP set-numa-node, now made possible because of --preconfig
* Small update on -numa error messages
# gpg: Signature made Thu 31 May 2018 00:02:59 BST
# gpg: using RSA key 2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6
* remotes/ehabkost/tags/numa-next-pull-request:
tests: functional tests for QMP command set-numa-node
qmp: add set-numa-node command
qmp: permit query-hotpluggable-cpus in preconfig state
tests: extend qmp test with preconfig checks
cli: add --preconfig option
tests: qapi-schema tests for allow-preconfig
qapi: introduce new cmd option "allow-preconfig"
hmp: disable monitor in preconfig state
qapi: introduce preconfig runstate
numa: split out NumaOptions parsing into set_numa_options()
numa: postpone options post-processing till machine_run_board_init()
numa: clarify error message when node index is out of range in -numa dist, ...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This option allows pausing QEMU in the new RUN_STATE_PRECONFIG state,
allowing the configuration of QEMU from QMP before the machine jumps
into board initialization code of machine_run_board_init()
The intent is to allow management to query machine state and additionally
configure it using previous query results within one QEMU instance
(i.e. eliminate the need to start QEMU twice, 1st to query board specific
parameters and 2nd for actual VM start using query results for
additional parameters).
The new option complements -S option and could be used with or without
it. The difference is that -S pauses QEMU when the machine is completely
initialized with all devices wired up and ready to execute guest code
(QEMU needs only to unpause VCPUs to let guest execute its code),
while the "preconfig" option pauses QEMU early before board specific init
callback (machine_run_board_init) is executed and allows the configuration
of machine parameters which will be used by board init code.
When early introspection/configuration is done, command 'exit-preconfig'
should be used to exit RUN_STATE_PRECONFIG and transition to the next
requested state (i.e. if -S is used then QEMU will pause the second
time when board/device initialization is completed or start guest
execution if -S isn't provided on CLI)
PS:
Initially 'preconfig' is planned to be used for configuring numa
topology depending on board specified possible cpus layout.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1526059483-42847-1-git-send-email-imammedo@redhat.com>
[ehabkost: Changed "since 2.13" to "since 3.0"]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
New option will be used to allow commands, which are prepared/need
to run, during preconfig state. Other commands that should be able
to run in preconfig state, should be amended to not expect machine
in initialized state or deal with it.
For compatibility reasons, commands that don't use new flag
'allow-preconfig' explicitly are not permitted to run in
preconfig state but allowed in all other states like they used
to be.
Within this patch allow following commands in preconfig state:
qmp_capabilities
query-qmp-schema
query-commands
query-command-line-options
query-status
exit-preconfig
to allow qmp connection, basic introspection and moving to the next
state.
PS:
set-numa-node and query-hotpluggable-cpus will be enabled later in
a separate patches.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1526057503-39287-1-git-send-email-imammedo@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[ehabkost: Changed "since 2.13" to "since 3.0"]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
it will allow to reuse set_numa_options() for parsing
configuration commands received via QMP interface
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1525423069-61903-3-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
in preparation for numa options to being handled via QMP before
machine_run_board_init(), move final numa configuration checks
and processing to machine_run_board_init() so it could take into
account both CLI (via parse_numa_opts()) and QMP input
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1525423069-61903-2-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
So far we relied on job->ret and strerror() to produce an error message
for failed jobs. Not surprisingly, this tends to result in completely
useless messages.
This adds a Job.error field that can contain an error string for a
failing job, and a parameter to job_completed() that sets the field. As
a default, if NULL is passed, we continue to use strerror(job->ret).
All existing callers are changed to pass NULL. They can be improved in
separate patches.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
gdb_handlesig()'s behaviour is not entirely obvious at first
glance. Add a doc comment for it, and also add a comment
explaining why it's ok for gdb_do_syscallv() to ignore
gdb_handlesig()'s return value. (Coverity complains about
this: CID 1390850.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20180515181958.25837-1-peter.maydell@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
In thunk_type_align() and thunk_type_size() we currently return
-1 if the value at the type_ptr isn't one of the TYPE_* values
we understand. However, this should never happen, and if it does
then the calling code will go confusingly wrong because none
of the callsites try to handle an error return. Switch to an
assertion instead, so that if this does somehow happen we'll have
a nice clear backtrace of what happened rather than a weird crash
or misbehaviour.
This also silences various Coverity complaints about not handling
the negative return value (CID 1005735, 1005736, 1005738, 1390582).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180514174616.19601-1-peter.maydell@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Beginning of merging vDPA, new PCI ID, a new virtio balloon stat, intel
iommu rework fixing a couple of security problems (no CVEs yet), fixes
all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJbBX2cAAoJECgfDbjSjVRpOEYIAIR6KGkwbAJ9SnO9B71DQHl1
yYYgM7i2HwyZ1YPnXOYWnI1lzQ1bARTf2krQJFGmfjlDaueFf9KnXdNByoVCmG8m
UhF/rQp3DcJ4wTABktPtME8gWdQxKPmDxlN5W3f29Zrm3g9S+Hshi+sfPZUkBxL4
gQMFRctb2SxvQXG+lusHVwo1oF6pzGZMmX35906he3m4xS/cfoeCP7Qj6nSvHZq7
lsLoOeYxHtXWA9gTYxpd7zW+hhUxkspoOqcXySHfO7e5enJANaulTxKuC0T+6HL4
O2iUM+1wjUYE0tQcNJ6x7emA82k5OdG2OMD6gbR1oSdquttJo7+4R+goqpb44rc=
=NUoY
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pc, pci, virtio, vhost: fixes, features
Beginning of merging vDPA, new PCI ID, a new virtio balloon stat, intel
iommu rework fixing a couple of security problems (no CVEs yet), fixes
all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Wed 23 May 2018 15:41:32 BST
# gpg: using RSA key 281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream: (28 commits)
intel-iommu: rework the page walk logic
util: implement simple iova tree
intel-iommu: trace domain id during page walk
intel-iommu: pass in address space when page walk
intel-iommu: introduce vtd_page_walk_info
intel-iommu: only do page walk for MAP notifiers
intel-iommu: add iommu lock
intel-iommu: remove IntelIOMMUNotifierNode
intel-iommu: send PSI always even if across PDEs
nvdimm: fix typo in label-size definition
contrib/vhost-user-blk: enable protocol feature for vhost-user-blk
hw/virtio: Fix brace Werror with clang 6.0.0
libvhost-user: Send messages with no data
vhost-user+postcopy: Use qemu_set_nonblock
virtio: support setting memory region based host notifier
vhost-user: support receiving file descriptors in slave_read
vhost-user: add Net prefix to internal state structure
linux-headers: add kvm header for mips
linux-headers: add unistd.h on all arches
update-linux-headers.sh: unistd.h, kvm consistency
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJbBGT2AAoJEIlPj0hw4a6Q0BIQAIciEH80ZLPSQPJciRoBCB/s
1BUsbFelfeHcTXAOxiahmaQbzzffosifK0uJAH5Quy5RotnKknguRgfgAl8nWX34
vXBbYKOmdudNwk3wIhcnnLfiRvNcdF4yDVVArsnvEbCWUBV2mpTrhuqmepZMiY7s
GQU42ly09WQePBPFW19RS/kxsaR1lSuGIuSxtc+0QvDQpqMRK3HslHrjD0jgUISh
pitqL3ztPHpJN1d9KI9fNT066magjnAkWV1gSYH9MAFvWV6JVuXqyJPY/x+YcZEU
uPgZdEmONIoFqLD7yEHf/kXtCoRZavnvX3t0OPvUOPrN8eeui/UrtOyElY0NxmS/
RGKggOAjN4VYdAfXHqLIWHkkE8CicFkukr6hk2nyfDKT63ZTR6TfEESrweL3clej
J2dQ1bPvmAhlPU7tFeo8drAFgeIjLl9QAXEtp2+1FXGYBOT4avwtnt4rrGpnSbpL
wV6SObd4d8yJR1JrA42jXuL2qlubainlceg4dSFGf84iUx68Sq+DXM+ssbVxs4Tc
VEJ2nIGK2OF0jafFUV4sqZ7iDQoor0A/SGiOHk8j3kr/kWbidid6opl+MRcr/II3
CYHyR8IH3LLY/CY18ppVbuCNxovPTilwRsleNy6ZxM1uFeWIOGWUy0WutPTi8wq8
Ksa1MVPftDaMOLL2WE5C
=e+YI
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/sstabellini-http/tags/xen-20180522-tag' into staging
Xen 2018/05/22
# gpg: Signature made Tue 22 May 2018 19:44:06 BST
# gpg: using RSA key 894F8F4870E1AE90
# gpg: Good signature from "Stefano Stabellini <stefano.stabellini@eu.citrix.com>"
# gpg: aka "Stefano Stabellini <sstabellini@kernel.org>"
# Primary key fingerprint: D04E 33AB A51F 67BA 07D3 0AEA 894F 8F48 70E1 AE90
* remotes/sstabellini-http/tags/xen-20180522-tag:
xen_disk: be consistent with use of xendev and blkdev->xendev
xen_disk: use a single entry iovec
xen_backend: make the xen_feature_grant_copy flag private
xen_disk: remove use of grant map/unmap
xen_backend: add an emulation of grant copy
xen: remove other open-coded use of libxengnttab
xen_disk: remove open-coded use of libxengnttab
xen_backend: add grant table helpers
xen: add a meaningful declaration of grant_copy_segment into xen_common.h
checkpatch: generalize xen handle matching in the list of types
xen-hvm: create separate function for ioreq server initialization
xen_pt: Present the size of 64 bit BARs correctly
configure: Add explanation for --enable-xen-pci-passthrough
xen/pt: use address_space_memory object for memory region hooks
xen-pvdevice: Introduce a simplistic xen-pvdevice save state
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Create a new header file, move the bochs vbe dispi interface
defines to it, so they can be used outside vga code.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20180522165058.15404-2-kraxel@redhat.com
This patch fixes a potential small window that the DMA page table might
be incomplete or invalid when the guest sends domain/context
invalidations to a device. This can cause random DMA errors for
assigned devices.
This is a major change to the VT-d shadow page walking logic. It
includes but is not limited to:
- For each VTDAddressSpace, now we maintain what IOVA ranges we have
mapped and what we have not. With that information, now we only send
MAP or UNMAP when necessary. Say, we don't send MAP notifies if we
know we have already mapped the range, meanwhile we don't send UNMAP
notifies if we know we never mapped the range at all.
- Introduce vtd_sync_shadow_page_table[_range] APIs so that we can call
in any places to resync the shadow page table for a device.
- When we receive domain/context invalidation, we should not really run
the replay logic, instead we use the new sync shadow page table API to
resync the whole shadow page table without unmapping the whole
region. After this change, we'll only do the page walk once for each
domain invalidations (before this, it can be multiple, depending on
number of notifiers per address space).
While at it, the page walking logic is also refactored to be simpler.
CC: QEMU Stable <qemu-stable@nongnu.org>
Reported-by: Jintack Lim <jintack@cs.columbia.edu>
Tested-by: Jintack Lim <jintack@cs.columbia.edu>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Introduce a simplest iova tree implementation based on GTree.
CC: QEMU Stable <qemu-stable@nongnu.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
For UNMAP-only IOMMU notifiers, we don't need to walk the page tables.
Fasten that procedure by skipping the page table walk. That should
boost performance for UNMAP-only notifiers like vhost.
CC: QEMU Stable <qemu-stable@nongnu.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
SECURITY IMPLICATION: this patch fixes a potential race when multiple
threads access the IOMMU IOTLB cache.
Add a per-iommu big lock to protect IOMMU status. Currently the only
thing to be protected is the IOTLB/context cache, since that can be
accessed even without BQL, e.g., in IO dataplane.
Note that we don't need to protect device page tables since that's fully
controlled by the guest kernel. However there is still possibility that
malicious drivers will program the device to not obey the rule. In that
case QEMU can't really do anything useful, instead the guest itself will
be responsible for all uncertainties.
CC: QEMU Stable <qemu-stable@nongnu.org>
Reported-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
That is not really necessary. Removing that node struct and put the
list entry directly into VTDAddressSpace. It simplfies the code a lot.
Since at it, rename the old notifiers_list into vtd_as_with_notifiers.
CC: QEMU Stable <qemu-stable@nongnu.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Fixes: commit da6789c27c ("nvdimm: add a macro for property "label-size"")
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Cc: Haozhong Zhang <haozhong.zhang@intel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch introduces the support for setting memory region
based host notifiers for virtio device. This is helpful when
using a hardware accelerator for a virtio device, because
hardware heavily depends on the notification, this will allow
the guest driver in the VM to notify the hardware directly.
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This adds a minimal query-jobs implementation that shouldn't pose many
design questions. It can later be extended to expose more information,
and especially job-specific information.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
BlockJob has fields .offset and .len, which are actually misnomers today
because they are no longer tied to block device sizes, but just progress
counters. As such they make a lot of sense in generic Jobs.
This patch moves the fields to Job and renames them to .progress_current
and .progress_total to describe their function better.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
The transition to the READY state was still performed in the BlockJob
layer, in the same function that sent the BLOCK_JOB_READY QMP event.
This patch brings the state transition to the Job layer and implements
the QMP event using a notifier called from the Job layer, like we
already do for other events related to state transitions.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Instead of having a 'bool ready' in BlockJob, add a function that
derives its value from the job status.
At the same time, this fixes the behaviour to match what the QAPI
documentation promises for query-block-job: 'true if the job may be
completed'. When the ready flag was introduced in commit ef6dbf1e46,
the flag never had to be reset to match the description because after
being ready, the jobs would immediately complete and disappear.
Job transactions and manual job finalisation were introduced only later.
With these changes, jobs may stay around even after having completed
(and they are not ready to be completed a second time), however their
patches forgot to reset the ready flag.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
This moves the logic that implements job transactions from BlockJob to
Job.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
This doesn't actually move any transaction code to Job yet, but it
renames the type for transactions from BlockJobTxn to JobTxn and makes
them contain Jobs rather than BlockJobs
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
block_job_finish_sync() doesn't contain anything block job specific any
more, so it can be moved to Job.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
This moves the .complete callback that tells a READY job to complete
from BlockJobDriver to JobDriver. The wrapper function job_complete()
doesn't require anything block job specific any more and can be moved
to Job.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
block_job_drain() contains a blk_drain() call which cannot be moved to
Job, so add a new JobDriver callback JobDriver.drain which has a common
implementation for all BlockJobs. In addition to this we keep the
existing BlockJobDriver.drain callback that is called by the common
drain implementation for all block jobs.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
block_job_cancel_async() did two things that were still block job
specific:
* Setting job->force. This field makes sense on the Job level, so we can
just move it. While at it, rename it to job->force_cancel to make its
purpose more obvious.
* Resetting the I/O status. This can't be moved because generic Jobs
don't have an I/O status. What the function really implements is a
user resume, except without entering the coroutine. Consequently, it
makes sense to call the .user_resume driver callback here which
already resets the I/O status.
The old block_job_cancel_async() has two separate if statements that
check job->iostatus != BLOCK_DEVICE_IO_STATUS_OK and job->user_paused.
However, the former condition always implies the latter (as is
asserted in block_job_iostatus_reset()), so changing the explicit call
of block_job_iostatus_reset() on the former condition with the
.user_resume callback on the latter condition is equivalent and
doesn't need to access any BlockJob specific state.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
This moves the finalisation of a single job from BlockJob to Job.
Some part of this code depends on job transactions, and job transactions
call this code, we introduce some temporary calls from Job functions to
BlockJob ones. This will be fixed once transactions move to Job, too.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Go through the Job layer in order to send QMP events. For the moment,
these functions only call a notifier in the BlockJob layer that sends
the existing commands.
This uses notifiers rather than JobDriver callbacks because internal
users of jobs won't receive QMP events, but might still be interested
in getting notified for the events.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>