The code that calculates the legacy ACPI table size for migration
compatibility uses max_cpus when calculating legacy_aml_len (the size of
the DSDT and SSDT tables). However, the SSDT grows according to APIC ID
limit, not max_cpus.
The bug is not triggered very often because of the 4k alignment on the
table size. But it can be triggered if you are unlucky enough to cross a
4k boundary.
Change the legacy_aml_len calculation to use apic_id_limit, to calculate
the right size.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Qemu crash in the source side while migrating, after starting ipmi service inside vm.
./x86_64-softmmu/qemu-system-x86_64 --enable-kvm -smp 4 -m 4096 \
-drive file=/work/suse/suse11_sp3_64_vt,format=raw,if=none,id=drive-virtio-disk0,cache=none \
-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0 \
-vnc :99 -monitor vc -device ipmi-bmc-sim,id=bmc0 -device isa-ipmi-kcs,bmc=bmc0,ioport=0xca2
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffec4268700 (LWP 7657)]
__memcpy_ssse3_back () at ../sysdeps/x86_64/multiarch/memcpy-ssse3-back.S:2757
(gdb) bt
#0 __memcpy_ssse3_back () at ../sysdeps/x86_64/multiarch/memcpy-ssse3-back.S:2757
#1 0x00005555559ef775 in memcpy (__len=3, __src=0xc1421c, __dest=<optimized out>)
at /usr/include/bits/string3.h:51
#2 qemu_put_buffer (f=0x555557a97690, buf=0xc1421c <Address 0xc1421c out of bounds>, size=3)
at migration/qemu-file.c:346
#3 0x00005555559eef66 in vmstate_save_state (f=f@entry=0x555557a97690,
vmsd=0x555555f8a5a0 <vmstate_ISAIPMIKCSDevice>, opaque=0x555557231160,
vmdesc=vmdesc@entry=0x55555798cc40) at migration/vmstate.c:333
#4 0x00005555557cfe45 in vmstate_save (f=f@entry=0x555557a97690, se=se@entry=0x555557231de0,
vmdesc=vmdesc@entry=0x55555798cc40) at /mnt/sdb/zyy/qemu/migration/savevm.c:720
#5 0x00005555557d2be7 in qemu_savevm_state_complete_precopy (f=0x555557a97690,
iterable_only=iterable_only@entry=false) at /mnt/sdb/zyy/qemu/migration/savevm.c:1128
#6 0x00005555559ea102 in migration_completion (start_time=<synthetic pointer>,
old_vm_running=<synthetic pointer>, current_active_state=<optimized out>,
s=0x5555560eaa80 <current_migration.44078>) at migration/migration.c:1707
#7 migration_thread (opaque=0x5555560eaa80 <current_migration.44078>) at migration/migration.c:1855
#8 0x00007ffff3900dc5 in start_thread (arg=0x7ffec4268700) at pthread_create.c:308
#9 0x00007fffefc6c71d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
Signed-off-by: Zhuang Yanying <ann.zhuangyanying@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Device ivshmem property use64=0 is designed to make the device
expose a 32 bit shared memory BAR instead of 64 bit one. The
default is a 64 bit BAR, except pc-1.2 and older retain a 32 bit
BAR. A 32 bit BAR can support only up to 1 GiB of shared memory.
This worked as designed until commit 5400c02 accidentally flipped
its sense: since then, we misinterpret use64=0 as use64=1 and vice
versa. Worse, the default got flipped as well. Devices
ivshmem-plain and ivshmem-doorbell are not affected.
Fix by restoring the test of IVShmemState member not_legacy_32bit
that got messed up in commit 5400c02. Also update its
initialization for devices ivhsmem-plain and ivshmem-doorbell.
Without that, they'd regress to 32 bit BARs.
Cc: qemu-stable@nongnu.org
Signed-off-by: Zhuang Yanying <ann.zhuangyanying@huawei.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Dataplane has been omitting forever the step of setting ISR when
an interrupt is raised. This caused little breakage, because the
specification actually says that ISR may not be updated in MSI mode.
Some versions of the Windows drivers however didn't clear MSI mode
correctly, and proceeded using polling mode (using ISR, not the used
ring index!) for crashdump and hibernation. If it were just crashdump
and hibernation it would not be a big deal, but recent releases of
Windows do not really shut down, but rather log out and hibernate to
make the next startup faster. Hence, this manifested as a more serious
hang during shutdown with e.g. Windows 8.1 and virtio-win 1.8.0 RPMs.
Newer versions fixed this, while older versions do not use MSI at all.
The failure has always been there for virtio dataplane, but it became
visible after commits 9ffe337 ("virtio-blk: always use dataplane path
if ioeventfd is active", 2016-10-30) and ad07cd6 ("virtio-scsi: always
use dataplane path if ioeventfd is active", 2016-10-30) made virtio-blk
and virtio-scsi always use the dataplane code under KVM. The good news
therefore is that it was not a bug in the patches---they were doing
exactly what they were meant for, i.e. shake out remaining dataplane bugs.
The fix is not hard, so it's worth arranging for the broken drivers.
The virtio_should_notify+event_notifier_set pair that is common to
virtio-blk and virtio-scsi dataplane is replaced with a new public
function virtio_notify_irqfd that also sets ISR. The irqfd emulation
code now need not set ISR anymore, so virtio_irq is removed.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Farhan Ali <alifm@linux.vnet.ibm.com>
Tested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This will be needed once dataplane will be able to set it outside
the big QEMU lock.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Farhan Ali <alifm@linux.vnet.ibm.com>
Tested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Following the recent refactoring of virtio notifiers [1], more specifically
the patch ed08a2a0b ("virtio: use virtio_bus_set_host_notifier to
start/stop ioeventfd") that uses virtio_bus_set_host_notifier [2]
by default, core virtio code requires 'ioeventfd_started' to be set
to true/false when the host notifiers are configured.
When vhost is stopped and started, however, there is a stop followed by
another start. Since ioeventfd_started was never set to true, the 'stop'
operation triggered by virtio_bus_set_host_notifier() will not result
in a call to virtio_pci_ioeventfd_assign(assign=false). This leaves
the memory regions with stale notifiers and results on the next start
triggering the following assertion:
kvm_mem_ioeventfd_add: error adding ioeventfd: File exists
Aborted
This patch reintroduces (hopefully in a cleaner way) the concept
that was present with ioeventfd_disabled before the refactoring.
When ioeventfd_grabbed>0, ioeventfd_started tracks whether ioeventfd
should be enabled or not, but ioeventfd is actually not started at
all until vhost releases the host notifiers.
[1] http://lists.nongnu.org/archive/html/qemu-devel/2016-10/msg07748.html
[2] http://lists.nongnu.org/archive/html/qemu-devel/2016-10/msg07760.html
Reported-by: Felipe Franciosi <felipe@nutanix.com>
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reported-by: Alex Williamson <alex.williamson@redhat.com>
Fixes: ed08a2a0b ("virtio: use virtio_bus_set_host_notifier to start/stop ioeventfd")
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Tested-by: Farhan Ali <alifm@linux.vnet.ibm.com>
Tested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We must check for new virtqueue buffers after re-enabling notifications.
This prevents the race condition where the guest added buffers just
after we stopped popping the virtqueue but before we re-enabled
notifications.
I think the virtio-crypto code was based on virtio-net but this crucial
detail was missed. virtio-net does not have the race condition because
it processes the virtqueue one more time after re-enabling
notifications.
Cc: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Device ivshmem property use64=0 is designed to make the device
expose a 32 bit shared memory BAR instead of 64 bit one. The
default is a 64 bit BAR, except pc-1.2 and older retain a 32 bit
BAR. A 32 bit BAR can support only up to 1 GiB of shared memory.
This worked as designed until commit 5400c02 accidentally flipped
its sense: since then, we misinterpret use64=0 as use64=1 and vice
versa. Worse, the default got flipped as well. Devices
ivshmem-plain and ivshmem-doorbell are not affected.
Fix by restoring the test of IVShmemState member not_legacy_32bit
that got messed up in commit 5400c02. Also update its
initialization for devices ivhsmem-plain and ivshmem-doorbell.
Without that, they'd regress to 32 bit BARs.
Signed-off-by: Zhuang Yanying <ann.zhuangyanying@huawei.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1479385863-7648-1-git-send-email-ann.zhuangyanying@huawei.com>
PC will use this field in other way, so move it outside the common
code so PC could set a different value, i.e. all CPUs
regardless of where they are coming from (-smp X | -device cpu...).
It's quick and dirty hack as it could be implemented in more generic
way in MashineClass. But do it in simple way since only PC is affected
so far.
Later we can generalize it when another affected target gets support
for -device cpu.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1479212236-183810-3-git-send-email-imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
This reverts commit 080ac219cc.
Legacy FW_CFG_NB_CPUS will be reused instead of 'etc/boot-cpus'
fw_cfg file since it does the same and there is no point
to maintaing duplicate guest ABI, if it can be helped.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1479212236-183810-2-git-send-email-imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Lots of fixes all over the place.
Unfortunately, this does not yet fix a regression with vhost
introduced by the last pull, the issue is typically this error:
kvm_mem_ioeventfd_add: error adding ioeventfd: File exists
followed by QEMU aborting.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJYKyfxAAoJECgfDbjSjVRpI4oH/2ZBpUxT/neq4ezX0bou5+1R
lQ1m0VI1JDbay5uYw0Z0rUY7aLP0kX2XLu2jNBZg7fGz3+BPhqAoEjkGdlUran79
fEnuYCvMMokQNzvMaHv+lFXO/MuEJthdDeUJyi4B0NU0sseutiz8opXuSWIC8Ncx
pyqRb8AfgKnsUSRizEVfiOWI1fk+zsTFbSyUwajwKi5E7RNEuHwLiqk5VFhzrrTX
nLwnUvlH7NrcDfliam9ziadhguo5cwCE4jBlMpeHnW5tGalNRulvF5EgwXybIdrU
JaR6lzQabOcoaAuJJ/dgo336B1Ef3JA/hogqfTW4unJGL5VVkWT0HLZ9OV42NEg=
=ibZy
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
virtio, vhost, pc, pci: documentation, fixes and cleanups
Lots of fixes all over the place.
Unfortunately, this does not yet fix a regression with vhost
introduced by the last pull, the issue is typically this error:
kvm_mem_ioeventfd_add: error adding ioeventfd: File exists
followed by QEMU aborting.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* remotes/mst/tags/for_upstream: (28 commits)
docs: add PCIe devices placement guidelines
virtio: drop virtio_queue_get_ring_{size,addr}()
vhost: drop legacy vring layout bits
vhost: adapt vhost_verify_ring_mappings() to virtio 1 ring layout
nvdimm acpi: introduce NVDIMM_DSM_MEMORY_SIZE
nvdimm acpi: use aml_name_decl to define named object
nvdimm acpi: rename nvdimm_dsm_reserved_root
nvdimm acpi: fix two comments
nvdimm acpi: define DSM return codes
nvdimm acpi: rename nvdimm_acpi_hotplug
nvdimm acpi: cleanup nvdimm_build_fit
nvdimm acpi: rename nvdimm_plugged_device_list
docs: improve the doc of Read FIT method
nvdimm acpi: clean up nvdimm_build_acpi
pc: memhp: stop handling nvdimm hotplug in pc_dimm_unplug
pc: memhp: move nvdimm hotplug out of memory hotplug
nvdimm acpi: drop the lock of fit buffer
qdev: hotplug: drop HotplugHandler.post_plug callback
vhost: migration blocker only if shared log is used
virtio-net: mark VIRTIO_NET_F_GSO as legacy
...
Message-id: 1479237527-11846-1-git-send-email-mst@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Allow the PCIHostDeviceAddress structure to work as the host property
in vfio-pci when it has it's default value of all fields set to ~0. In
this form the property indicates a non-existant device but given the
field bit sizes gets asserted as excess (and invalid) precision
overflows the string buffer. The BDF of an invalid device
"FFFF:FF:FF.F" is returned instead.
Signed-off-by: Daniel Oram <daniel.oram@gmail.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Message-Id: <71f06765c4ba16dcd71cbf78e877619948f04ed9.1478777270.git.daniel.oram@gmail.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
These are not used anymore.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The legacy vring layout is not used anymore as we use the separate
mappings even for legacy devices.
This patch simply removes it.
This also fixes a bug with virtio 1 devices when the vring descriptor table
is mapped at a higher address than the used vring because the following
function may return an insanely great value:
hwaddr virtio_queue_get_ring_size(VirtIODevice *vdev, int n)
{
return vdev->vq[n].vring.used - vdev->vq[n].vring.desc +
virtio_queue_get_used_size(vdev, n);
}
and the mapping fails.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
With virtio 1, the vring layout is split in 3 separate regions of
contiguous memory for the descriptor table, the available ring and the
used ring, as opposed with legacy virtio which uses a single region.
In case of memory re-mapping, the code ensures it doesn't affect the
vring mapping. This is done in vhost_verify_ring_mappings() which assumes
the device is legacy.
This patch changes vhost_verify_ring_mappings() to check the mappings of
each part of the vring separately.
This works for legacy mappings as well.
Cc: qemu-stable@nongnu.org
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
and use it to replace the raw number
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
to make the code more clearer
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Rename it to nvdimm_dsm_handle_reserved_root_method
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
fixed the English issue and code-style issue
Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
and use these codes to refine the code
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Rename it to nvdimm_plug()
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
inline buf_size to refine the code a bit
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Its behavior has been changed as the nvdimm device which is being
realized also will be handled in this function, so rename it to
reflect the fact
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
To make the code more clearer, we
1) check ram_slots first, and build ssdt & nfit only when it is available
2) use nvdimm_get_plugged_device_list() to check if there is nvdimm device
plugged
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
as it is never called when nvdimm hotplug happens
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
as they use completely different way to handle hotplug event
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
as there is a global lock to protect vm-exit handlers and
QMP/monitor, this lock can be dropped
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
as nvdimm acpi is okay to build fit when the nvdimm device
has not been 'realized'
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Commit 31190ed7 added a migration blocker in vhost_dev_init() to
check if memfd would succeed. It is better if this blocker first
checks if vhost backend requires shared log. This will avoid a
situation where a blocker is added inappropriately (e.g. shared
log allocation fails when vhost backend doesn't support it).
Signed-off-by: Rafael David Tinoco <rafael.tinoco@canonical.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
virtio 1.0 spec says this is a legacy feature bit,
hide it from guests in modern mode.
Note: for cross-version migration compatibility,
we keep the bit set in host_features.
The result will be that a guest migrating cross-version
will see host features change under it.
As guests only seem to read it once, this should
not be an issue. Meanwhile, will work to fix guests to
ignore this bit in virtio1 mode, too.
Cc: qemu-stable@nongnu.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Legacy features are those that transitional devices only
expose on the legacy interface.
Allow different ones per device class.
Cc: qemu-stable@nongnu.org # dependency for the next patch
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
We should not use cpu_to_le16() here, instead each of device/function
value is stored in a 8 byte field.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reported-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently the virtio-crypto device hasn't supported
hotpluggable and live migration well. Let's tag it
as not hotpluggable and migration actively and reopen
them once we support them well.
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The function does not fully initialize the returned VirtQueueElement and should
be used only internally from the virtio module.
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The function undoes the effect of virtqueue_pop and doesn't do anything
destructive or irreversible so virtqueue_unpop is a more fitting name.
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Using uint8_t for source id will lose bus num and get the
wrong/invalid IOTLB entry. Fixing by using uint16_t instead and
enlarge level shift.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Version: GnuPG v1
iQEcBAABAgAGBQJYKrs3AAoJEO8Ells5jWIRb1EH/3bAz/HiOl5COURpvwvMgq0E
J3Mr91FPrgepHSeB3cqBBRNqDT2B1gx7DO5dWCQSFMxfGlZ1VX12MGC8ZtBK+8US
AxNqfkzLM5tDFGn6LN4OZgV6uEJhKwSPzyasOXHzwZ7+xWG8RQ8pQc+Lj5PlCcqw
esrm0EN7mwrw76RnDjDaDsUMiVxCZ0GBo3vWhEsXpCjUing14OFKk4pvQm0uIsac
RL5R9xAGmTddN/0cKPG8FXh5xAOb1g/7HPHj97KIc4JrDrK1vrCF4ft6TqIOErGm
OJyNX1o65+7HDsfz6vn2Bmiin4UN1mSQ5paCptS8Z5lyGGY6dQPs51dYThCQZ3I=
=VMkX
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'jasowang/tags/net-pull-request' into staging
# gpg: Signature made Tue 15 Nov 2016 07:37:27 AM GMT
# gpg: using RSA key 0xEF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211
* jasowang/tags/net-pull-request:
docs: fix COLO architecture diagram
net: fix sending of data with -net socket, listen backend
net: skip virtio-net config of deleted nic's peers
Message-id: 1479195830-4725-1-git-send-email-jasowang@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1373816
qemu core dump happens during repetitive unpug-plug
with multiple queues and Windows RSS-capable guest.
If back-end delete requested during virtio-net device
initialization, driver still can try configure the device
for multiple queues. The virtio-net device is expected
to be removed as soon as the initialization is done.
Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
If the pnv machine type is compiled on a 32-bit host, the unsigned long
(host) type is 32-bit. This means that the hweight_long() used to
calculate the number of allowed cores only considers the low 32 bits of
the cores_mask variable, and can thus return 0 in some circumstances.
This corrects the bug.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Suggested-by: Richard Henderson <rth@twiddle.net>
[clg: replaced hweight_long() by ctpop64() ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
High addresses can overflow the uint32_t pcba variable after the 8byte
shift.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The XSCOM addresses for the core registers are encoded in a slightly
different way on POWER8 and POWER9.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When using the serial console in the GTK interface of QEMU (and
QEMU has been compiled with CONFIG_VTE), it is possible to trigger
the assert() statement in vty_receive() in spapr_vty.c by pasting
a chunk of text with length > 16 into the QEMU window.
Most of the other serial backends seem to simply drop characters
that they can not handle, so I think we should also do the same in
spapr-vty to fix this issue.
Buglink: https://bugs.launchpad.net/qemu/+bug/1639322
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
daa2369 "spapr_pci: Add a 64-bit MMIO window" subtly broke migration from
qemu-2.7 to the current version. It split the device's MMIO window into
two pieces for 32-bit and 64-bit MMIO.
The patch included backwards compatibility code to convert the old property
into the new format. However, the property value was also transferred in
the migration stream and compared with a (probably unwise) VMSTATE_EQUAL.
So, the "raw" value from 2.7 is compared to the new style converted value
from (pre-)2.8 giving a mismatch and migration failure.
Although it would be technically possible to fix this in a way allowing
backwards migration, that would leave an ugly legacy around indefinitely.
This patch takes the simpler approach of bumping the migration version,
dropping the unwise VMSTATE_EQUAL (and some equally unwise ones around it)
and ignoring them on an incoming migration.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
PnvChip is defined twice and this can confuse old compilers :
CC ppc64-softmmu/hw/ppc/pnv_xscom.o
In file included from qemu.git/hw/ppc/pnv.c:29:
qemu.git/include/hw/ppc/pnv.h:60: error: redefinition of typedef ‘PnvChip’
qemu.git/include/hw/ppc/pnv_xscom.h:24: note: previous declaration of ‘PnvChip’ was here
make[1]: *** [hw/ppc/pnv.o] Error 1
make[1]: *** Waiting for unfinished jobs....
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
powernv has some code (derived from the spapr equivalent) used in device
tree generation which depends on the CPU's compatibility mode / logical
PVR. However, compatibility modes don't make sense on powernv - at least
not as a property controlled by the host - because the guest in powernv
has full hypervisor level access to the virtual system, and so owns the
PCR (Processor Compatibility Register) which implements compatiblity modes.
Note: the new logic doesn't take into account kvmppc_smt_threads() like the
old version did. However, if core->nr_threads exceeds kvmppc_smt_threads()
then things will already be broken and clamping the value in the device
tree isn't going to save us.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
For the purposes of byte_count_limit verification, add a new flag that
identifies read_cd as sometimes returning data, then check the BCL in
its command handler after we know that it will indeed return data.
Reported-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1477970211-25754-2-git-send-email-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
The printscreen/sysrq and pause/break keys currently don't work for guests
using -usbdevice keyboard when accessed through vnc with a gtk-vnc based
client.
The reason for this is a mismatch between gtk-vnc and qemu in how these keys
should be mapped to XT keycodes.
On the original IBM XT these keys behaved differently than other keys.
Quoting from https://www.win.tue.nl/~aeb/linux/kbd/scancodes-1.html:
The keys PrtSc/SysRq and Pause/Break are special. The former produces
scancode e0 2a e0 37 when no modifier key is pressed simultaneously, e0 37
together with Shift or Ctrl, but 54 together with (left or right) Alt. (And
one gets the expected sequences upon release. But see below.) The latter
produces scancode sequence e1 1d 45 e1 9d c5 when pressed (without modifier)
and nothing at all upon release. However, together with (left or right)
Ctrl, one gets e0 46 e0 c6, and again nothing at release. It does not
repeat.
Gtk-vnc supports the 'QEMU Extended Key Event Message' RFB extension to send
raw XT keycodes directly to qemu, but the specification doesn't explicitly
specify how to map such long/complicated keycode sequences. From the spec
(https://github.com/rfbproto/rfbproto/blob/master/rfbproto.rst#qemu-extended-key-event-message)
The keycode is the XT keycode that produced the keysym. An XT keycode is an
XT make scancode sequence encoded to fit in a single U32 quantity. Single
byte XT scancodes with a byte value less than 0x7f are encoded as is.
2-byte XT scancodes whose first byte is 0xe0 and second byte is less than
0x7f are encoded with the high bit of the first byte set
hid.c currently expects the keycode sequence with shift/ctl for sysrq (e0 37
-> 0xb7 in RFB), whereas gtk-vnc uses the sequence with alt (0x54).
Likewise, hid.c expects the code without modifiers (e1 1d 45 -> 0xc5 in
RFB), whereas gtk-vnc sends the keycode sequence with ctrl for pause (e0 46
-> 0xc6 in RFB).
See keymaps.cvs in gtk-vnc for the mapping used:
https://git.gnome.org/browse/gtk-vnc/tree/src/keymaps.csv#n150
Now, it isn't obvious to me which sequence is really "right", but as the
0x54/0xc6 keycodes are currently unused in hid.c, supporting both seems like
the pragmatic solution to me. The USB HID keyboard boot protocol used by
hid.c doesn't have any other mapping applicable to these keys.
The other guest keyboard interfaces (ps/2, virtio, ..) are not affected,
because they handle these keys differently.
Signed-off-by: Peter Korsgaard <peter@korsgaard.com>
Message-id: 20161028145132.1702-1-peter@korsgaard.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
In usbredir destroy dispatch function, it doesn't free the vm change
state handler once registered in usbredir_realize function. This will
lead a memory leak issue. This patch avoid this.
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 58216976.d0236b0a.77b99.bcd6@mx.google.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
In ehci_init_transfer function, if the 'cpage' is bigger than 4,
it doesn't free the 'p->sgl' once allocated previously thus leading
a memory leak issue. This patch avoid this.
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Message-id: 5821c0f4.091c6b0a.e0c92.e811@mx.google.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
If the guest attempts to talk to a nonexistent device over i2c,
the i2c_start_transfer() function will return non-zero, indicating
that the bus is signalling a NACK. Similarly, if the i2c_send()
function returns nonzero then the target device returned a NACK.
Handle this possibility in the bitbang_i2c code, by returning
the state machine to the STOPPED state and returning the NACK
bit to the guest.
This bit of missing functionality was spotted by Coverity
(it noticed that we weren't checking the return value from
i2c_start_transfer()).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1477332749-27098-1-git-send-email-peter.maydell@linaro.org
According to ARM DUI 0552A 4.2.10. NVIC set pending status
also for disabled interrupts. Correct the logic for
when interrupts are marked pending both on input level
transition and when interrupts are dismissed, to match
the NVIC behaviour rather than the 11MPCore GIC.
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The Cadence UART device emulator calculates speed by dividing the
baud rate by a 'baud rate generator' & 'baud rate divider' value.
The device specification defines these register values to be
non-zero and within certain limits. Add checks for these limits
to avoid errors like divide by zero.
Reported-by: Huawei PSIRT <psirt@huawei.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 1477596278-1470-1-git-send-email-ppandit@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Introduce this field to control whether ACPI build is enabled by a
particular machine or accelerator.
It defaults to true if the machine itself supports ACPI build. Xen
accelerator will disable it because Xen is in charge of building ACPI
tables for the guest.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Tested-by: Sander Eikelenboom <linux@eikelenboom.it>
Olaf Hering reported a build failure due to an undefined reference
to 'qemu_log_vprintf'. Explicitely including qemu/log.h seems to
fix the issue.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Tested-by: Olaf Hering <olaf@aepfle.de>
Commit 35c5a52d "acpi: do not use TARGET_PAGE_SIZE" changed struct
NvdimmDsmIn from a variable-size structure to a fixed-size structure of
4096 bytes. It forgot to adjust an assert in
nvdimm_dsm_set_label_data(..., NvdimmDsmIn *in, ...):
assert(sizeof(*in) + sizeof(*set_label_data) + set_label_data->length <=
4096);
which could crash QEMU when guest writes NVDIMM labels.
Fix it by replacing sizeof(*in) by offsetof(NvdimmDsmIn, arg3).
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reported-by: Dan Williams <dan.j.williams@intel.com>
Tested-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The initialization was missed before, resulting in some
bad data in the smbus case.
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
I misunderstood the workings of the power settings, the power off
is a force off operation and there needs to be a separate graceful
shutdown operation. So replace the force off operation with a
graceful shutdown.
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The original commit:
commit 67aa56fc03
Author: Corey Minyard <cminyard@mvista.com>
Date: Thu Dec 17 12:50:06 2015 -0600
ipmi: Add an external connection simulation interface
defined a new variable CONFIG_IPMI_EXTERN, but then went
on to mistakely use the pre-existing CONFIG_IPMI_LOCAL
variable.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This is allowed by the IPMI specification for graceful shutdown,
so implement it.
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
When issuing a chassis 'powerdown' control command, the routine
qemu_system_shutdown_request() should be used to exit the guest.
qemu_system_powerdown_request() will initiate a soft shutdown which is
not what is required by the IPMI (28.3 Chassis Control Command):
0h = power down. Force system into soft off (S4/S45) state. This
is for 'emergency' management power down actions. The command does
not initiate a clean shut-down of the operating system prior to
powering down the system
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Get rid of the unnecessary mutex, it was a vestige
of something else that was not done. That way we don't
have to free it.
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
No hotplug support, make sure it doesn't happen.
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
_GPE.E04 is dedicated for nvdimm device hotplug
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
_FIT is required for hotplug support, guest will inquire the updated
device info from it if a hotplug event is received
As FIT buffer is not completely mapped into guest address space, so a
new function, Read FIT whose UUID is UUID
648B9CF2-CDA1-4312-8AD9-49C4AF32BD62, handle 0x10000, function index
is 0x1, is reserved by QEMU to read the piece of FIT buffer. The buffer
is concatenated before _FIT return
Refer to docs/specs/acpi-nvdimm.txt for detailed design
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The buffer is used to save the FIT info for all the presented nvdimm
devices which is updated after the nvdimm device is plugged or
unplugged. In the later patch, it will be used to construct NVDIMM
ACPI _FIT method which reflects the presented nvdimm devices after
nvdimm hotplug
As FIT buffer can not completely mapped into guest address space,
OSPM will exit to QEMU multiple times, however, there is the race
condition - FIT may be changed during these multiple exits, so that
some rules are introduced:
1) the user should hold the @lock to access the buffer and
2) mark @dirty whenever the buffer is updated.
@dirty is cleared for the first time OSPM gets fit buffer, if
dirty is detected in the later access, OSPM will restart the
access
As fit should be updated after nvdimm device is successfully realized
so that a new hotplug callback, post_hotplug, is introduced
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
For each NVDIMM present or intended to be supported by platform,
platform firmware also exposes an ACPI Namespace Device under
the root device
So it builds nvdimm devices for all slots to support vNVDIMM hotplug
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
There are some names repeatedly used in acpi code, define them
as macros to refine the code
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Rename it as dsm_out_buf_siz is more descriptive
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
As the arch dependent info, TARGET_PAGE_SIZE, has been dropped
from nvdimm acpi code, it can be compiled arch-independently
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
As the function only has 5 args, we use local7 instead of it
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
As ARG3 is a reserved name, we rename it to FARG
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
According to ACPI 6.0 spec, "Memory Device Physical Address
Region Base" in memdev is defined as "This field provides the
Device Physical Address base of the region". This field should
be zero in our case
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Based on ACPI spec:
RegionOffset := TermArg => Integer
However, Named object is not a TermArg.
This patch moves OperationRegion to NCAL() and uses localX as
its RegionOffset
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently, 'RLEN' is the totally buffer size written by QEMU and it is
ACPI internally used only. The buffer size returned to guest should
not include 'RLEN' itself
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Make crypto operations are executed asynchronously,
so that other QEMU threads and monitor couldn't
be blocked at the virtqueue handling context.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We use an opaque point to the VirtIOCryptoReq which
can support different packets based on different
algorithms.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Introduces VirtIOCryptoReq structure to store
crypto request so that we can easily support
asynchronous crypto operation in the future.
At present, we only support cipher and algorithm
chaining.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Realize the symmetric algorithm control queue handler,
including plain cipher and chainning algorithms.
Currently the control queue is used to create and
close session for symmetric algorithm.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Expose the capacity of algorithms supported by
virtio crypto device to the frontend driver using
pci configuration space.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch adds virtio-crypto-pci, which is the pci proxy for the virtio
crypto device.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Introduce the virtio crypto realization, I'll
finish the core code in the following patches. The
thoughts came from virtio net realization.
For more information see:
http://qemu-project.org/Features/VirtioCrypto
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The virtfs_reset() function is called either when the virtio-9p device
gets reset, or when the client starts a new 9P session. In both cases,
if it finds fids from a previous session, the following is printed in
the monitor:
9pfs:virtfs_reset: One or more uncluncked fids found during reset
For example, if a linux guest with a mounted 9P share is reset from the
monitor with system_reset, the message will be printed. This is excessive
since these fids are now clunked and the state is clean.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
A buggy or malicious guest could pass the id of an already opened fid and
cause QEMU to abort. Let's return EINVAL to the guest instead.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
The xattrcreate operation only makes sense on a freshly cloned fid
actually, since any open state would be leaked because of the fid_type
change. This is indeed what the linux kernel client does:
fid = clone_fid(fid);
[...]
retval = p9_client_xattrcreate(fid, name, value_len, flags);
This patch also reverts commit ff55e94d23 since we are sure that a fid
with type P9_FID_NONE doesn't have a previously allocated xattr.
Signed-off-by: Greg Kurz <groug@kaod.org>
We shouldn't allow guests to create extended attribute with arbitrary sizes.
On linux hosts, the limit is XATTR_SIZE_MAX. Let's use it.
Signed-off-by: Greg Kurz <groug@kaod.org>
The v9fs_xattr_read() and v9fs_xattr_write() are passed a guest
originated offset: they must ensure this offset does not go beyond
the size of the extended attribute that was set in v9fs_xattrcreate().
Unfortunately, the current code implement these checks with unsafe
calculations on 32 and 64 bit values, which may allow a malicious
guest to cause OOB access anyway.
Fix this by comparing the offset and the xattr size, which are
both uint64_t, before trying to compute the effective number of bytes
to read or write.
Suggested-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-By: Guido Günther <agx@sigxcpu.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
The 'len' in V9fsXattr comes from the 'size' argument in setxattr()
function in guest. The setxattr() function's declaration is this:
int setxattr(const char *path, const char *name,
const void *value, size_t size, int flags);
and 'size' is treated as u64 in linux kernel client code:
int p9_client_xattrcreate(struct p9_fid *fid, const char *name,
u64 attr_size, int flags)
So the 'len' should have an type of 'uint64_t'.
The 'copied_len' in V9fsXattr is used to account for copied bytes, it
should also have an type of 'uint64_t'.
Suggested-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
Currently, 9pfs sets the 'copied_len' field in V9fsXattr
to -1 to tag xattr walk fid. As the 'copied_len' is also
used to account for copied bytes, this may make confusion. This patch
add a bool 'xattrwalk_fid' to tag the xattr walk fid.
Suggested-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
Now the kernel commit 05f0c03fbac1 ("vfio-pci: Allow to mmap
sub-page MMIO BARs if the mmio page is exclusive") allows VFIO
to mmap sub-page BARs. This is the corresponding QEMU patch.
With those patches applied, we could passthrough sub-page BARs
to guest, which can help to improve IO performance for some devices.
In this patch, we expand MemoryRegions of these sub-page
MMIO BARs to PAGE_SIZE in vfio_pci_write_config(), so that
the BARs could be passed to KVM ioctl KVM_SET_USER_MEMORY_REGION
with a valid size. The expanding size will be recovered when
the base address of sub-page BAR is changed and not page aligned
any more in guest. And we also set the priority of these BARs'
memory regions to zero in case of overlap with BARs which share
the same page with sub-page BARs in guest.
Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
When a PCI device is reset, pci_do_device_reset resets all BAR addresses
in the relevant PCIDevice's config buffer.
The VFIO configuration space stays untouched, so the guest OS may choose
to skip restoring the BAR addresses as they would seem intact. The PCI
device may be left non-operational.
One example of such a scenario is when the guest exits S3.
Fix this by resetting the BAR addresses in the VFIO configuration space
as well.
Signed-off-by: Ido Yariv <ido@wizery.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
As reported in the link below, user has a PCI device with a 4KB BAR
which contains the MSI-X table. This seems to hit a corner case in
the kernel where the region reports being mmap capable, but the sparse
mmap information reports a zero sized range. It's not entirely clear
that the kernel is incorrect in doing this, but regardless, we need
to handle it. To do this, fill our mmap array only with non-zero
sized sparse mmap entries and add an error return from the function
so we can tell the difference between nr_mmaps being zero based on
sparse mmap info vs lack of sparse mmap info.
NB, this doesn't actually change the behavior of the device, it only
removes the scary "Failed to mmap ... Performance may be slow" error
message. We cannot currently create an mmap over the MSI-X table.
Link: http://lists.nongnu.org/archive/html/qemu-discuss/2016-10/msg00009.html
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>