Currently the VFIOContainer iommu_data field contains a union with
different information for different host iommu types. However:
* It only actually contains information for the x86-like "Type1" iommu
* Because we have a common listener the Type1 fields are actually used
on all IOMMU types, including the SPAPR TCE type as well
In fact we now have a general structure for the listener which is unlikely
to ever need per-iommu-type information, so this patch removes the union.
In a similar way we can unify the setup of the vfio memory listener in
vfio_connect_container() that is currently split across a switch on iommu
type, but is effectively the same in both cases.
The iommu_data.release pointer was only needed as a cleanup function
which would handle potentially different data in the union. With the
union gone, it too can be removed.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
In irqfd mode, current code attempts to set a resamplefd whatever
the type of the IRQ. For an edge-sensitive IRQ this attempt fails
and as a consequence, the whole irqfd setup fails and we fall back
to the slow mode. This patch bypasses the resamplefd setting for
non level-sentive IRQs.
Signed-off-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
unmask EventNotifier might not be initialized in case of edge
sensitive irq. Using EventNotifier pointers make life simpler to
handle the edge-sensitive irqfd setup.
Signed-off-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
With current implementation, eventfd VFIO signaling is first set up and
then irqfd is setup, if supported and allowed.
This start sequence causes several issues with IRQ forwarding setup
which, if supported, is transparently attempted on irqfd setup:
IRQ forwarding setup is likely to fail if the IRQ is detected as under
injection into the guest (active at irqchip level or VFIO masked).
This currently always happens because the current sequence explicitly
VFIO-masks the IRQ before setting irqfd.
Even if that masking were removed, we couldn't prevent the case where
the IRQ is under injection into the guest.
So the simpler solution is to remove this 2-step startup and directly
attempt irqfd setup. This is what this patch does.
Also in case the eventfd setup fails, there is no reason to go farther:
let's abort.
Signed-off-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
libqos/ahci and tests/fdc-test are under my purview also,
include them in the appropriate stanzas.
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1443117055-29240-1-git-send-email-jsnow@redhat.com
ICC bus impl has been droped, so all icc related files are not useful
any more; delete them.
Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
After CPU hotplug has been converted to BUS-less hot-plug infrastructure,
the only function ICC bus performs is to propagate reset to LAPICs. However
LAPIC could be reset by registering its reset handler after all device are
initialized.
Do so and drop ~30LOC of not needed anymore ICCBus related code.
Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
During reset some devices (such as hpet, rtc) might send IRQ to APIC
which changes APIC's state from default one it's supposed to have
at machine startup time.
Fix this by resetting APIC after devices have been reset to cancel
any changes that qemu_devices_reset() might have done to its state.
Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
When ICC bus/bridge is removed, APIC MMIO will be left
unmapped since it was mapped into system's address space
indirectly by ICC bridge.
Fix it by moving mapping into APIC code, so it would be
possible to remove ICC bus/bridge code later.
Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
When doing a re-initialization of a CPU core, the default state is to _not_
have 64-bit long mode enabled. This means the LME (long mode enable) and LMA
(long mode active) bits in the EFER model-specific register should be cleared.
However, the EFER state is part of the CPU environment which is
preserved by do_cpu_init(), so if EFER.LME and EFER.LMA were set at the
time an INIT IPI was received, they will remain set after the init completes.
This is contrary to what the Intel architecture manual describes and what
happens on real hardware, and it leaves the CPU in a weird state that the
guest can't clear.
To fix this, the 'efer' member of the CPUX86State structure has been moved
to an area outside the region preserved by do_cpu_init(), so that it can
be properly re-initialized by x86_cpu_reset().
Signed-off-by: Bill Paul <wpaul@windriver.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Richard Henderson <rth@twiddle.net>
CC: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
ABM is only implemented as a single instruction set by AMD; all AMD
processors support both instructions or neither. Intel considers POPCNT
as part of SSE4.2, and LZCNT as part of BMI1, but Intel also uses AMD's
ABM flag to indicate support for both POPCNT and LZCNT. It has to be
added to Haswell and Broadwell because Haswell, by adding LZCNT, has
completed the ABM.
Tested with "qemu-kvm -cpu Haswell-noTSX,enforce" (and also with older
machine types) on an Haswell-EP machine.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
There's one report of migration breaking due to missing MSR_TSC_AUX
save/restore. Fix this by adding a new subsection that saves the state
of this MSR.
https://bugzilla.redhat.com/show_bug.cgi?id=1261797
Reported-by: Xiaoqing Wei <xwei@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
CC: Marcelo Tosatti <mtosatti@redhat.com>
CC: Richard Henderson <rth@twiddle.net>
CC: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The function is now only used from within a single file.
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Convert the kvm_default_features and kvm_default_unset_features arrays
into a simple list of property/value pairs that will be applied to
X86CPU objects when using KVM.
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The code in smp_parse already checks the topology information for
sockets * cores * threads < cpus and bails out with an error in
that case. However, it is still possible to supply a bad configuration
the other way round, e.g. with:
qemu-system-xxx -smp 4,sockets=1,cores=4,threads=2
QEMU then still starts the guest, with topology configuration that
is rather incomprehensible and likely not what the user wanted.
So let's add another check to refuse such wrong configurations.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
In order to simplify arguments of function, introduce a new struct
named X86CPUTopoInfo.
Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
New features:
guest RAM buffer overrun mitigation
RAM physical address gaps for memory hotplug
(except refactoring which got some review comments)
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJWDo8IAAoJECgfDbjSjVRpCwgH/jmj2sYXmP2ywGxsHOS7JKEF
DyF9crBbrNnkB0+qpON0vmCeoFx5AX6CeyWC0w2bFy4z2yN9lb7jKenkp/guHWne
eX4x5RVpvW9Ed1l9v4vGuI+5IB3gvZEXQB4hiAMz5fXMCVs0OZ4dyRODHqyXKMvy
lBCdb0YVvZOPYxRYhnAllOt0uBLLY8pl5i6QGekFkfQMCrsLagySqLPkRNTR0l8O
2PNd3oBPJi5Qb2jWyJNS45mPMDU6lEIiZSbzn7zAUVduu15hqS9VYZPlZzrNazSu
7hx6Zegq0G1MMpiVhwlpi5Ov1hqAA+zAIl4QcTN31ueHYdxD/x310nAqtm7Eov4=
=iFA1
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
virtio,pc features, fixes
New features:
guest RAM buffer overrun mitigation
RAM physical address gaps for memory hotplug
(except refactoring which got some review comments)
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Fri 02 Oct 2015 15:04:56 BST using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
* remotes/mst/tags/for_upstream:
vhost-user-test: fix predictable filename on tmpfs
vhost-user-test: use tmpfs by default
pc: memhp: force gaps between DIMM's GPA
memhp: extend address auto assignment to support gaps
vhost-user: unit test for new messages
vhost-user-test: do not reinvent glib-compat.h
virtio: Notice when the system doesn't support MSIx at all
pc: Add a comment explaining why pc_compat_2_4() doesn't exist
exec: allocate PROT_NONE pages on top of RAM
oslib: allocate PROT_NONE pages on top of RAM
oslib: rework anonimous RAM allocation
virtio-net: correctly drop truncated packets
virtio: introduce virtqueue_discard()
virtio: introduce virtqueue_unmap_sg()
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIVAwUAVg56qLRIkN7ePJvAAQjWVRAAtnKDUPIcgctKZBYyMGQR+/BuhVA65s1j
IiNmbmMCXxsYu7X62JsQqDo7SK/HZEdhebEJi3DH4FkWJZxw7TFPeS2NZDuO/Q31
HbqWP5BUOH8Ow8nJDvgUtHffpqBL7R8OIgAu7y/6XF4CXfLKwkxAn5cN4ua61acY
NfbnUuXl7FYZeDl+JiQzxtARgonjmwxlV84C9swuJmFM1MTfKVtre8puppItsslq
OmlhnYk5K70oo7rNDCnMXvRdEEO0k8syNT+yXkHuQ+0OVTaNIAf5M1kvJVo8hrWV
u1E87UwKU+29Ocs9JOWnAIV+NkVNjPip0YtpZQJn0oy/EiaMS/5oy9uEEh9L906u
aZzcFaIgQfRagPnw4z6o9cxtW4RvgQ7Bi8Ll8dfPMv3TF+REmaV1ZkmoCHe9usBN
Ix14JfKSQnR+LiZjetiQ+V34ZLC4ZZn7KsEWIGjRe9WI+EqJAcT81PMM5dXkYuxj
6tPVs6vdkE5jy8FFBjyfXi9MflKiQKcsTvImwQp9ORKwJfLnkyQoIStOtqnLwewn
00L6tQDVGWQ790gJVbYEfcDfD4p6SJ2sj20mtPJ6lyMR5oEVjzXWl81YzZvqE1/S
K665ZJBHqVba+g7m70z64u3tS/Zsex9K6ruNAFTULpdlEu6VMTpcPsUCfMpE0/4+
XN4lh1hnY1A=
=89TV
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/riku/tags/pull-linux-user-20151002' into staging
First set of Linux-user que patches for 2.5
# gpg: Signature made Fri 02 Oct 2015 13:38:00 BST using RSA key ID DE3C9BC0
# gpg: Good signature from "Riku Voipio <riku.voipio@iki.fi>"
# gpg: aka "Riku Voipio <riku.voipio@linaro.org>"
* remotes/riku/tags/pull-linux-user-20151002:
linux-user: assert that target_mprotect cannot fail
linux-user/signal.c: Use setup_rt_frame() instead of setup_frame() for target openrisc
linux-user/syscall.c: Add EAGAIN to host_to_target_errno_table for
linux-user: add name_to_handle_at/open_by_handle_at
linux-user: Return target error number in do_fork()
linux-user: fix cmsg conversion in case of multiple headers
linux-user: remove MAX_ARG_PAGES limit
linux-user: remove unused image_info members
linux-user: Treat --foo options the same as -foo
linux-user: use EXIT_SUCCESS and EXIT_FAILURE
linux-user: Add proper error messages for bad options
linux-user: Add -help
linux-user: Exit 0 when -h is used
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
vhost-user-test uses getpid to create a unique filename. This name is
predictable, and a security problem. Instead, use a tmp directory
created by mkdtemp, which is a suggested best practice.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Most people don't run make check by default, so they skip vhost-user
unit tests. Solve this by using tmpfs instead, unless hugetlbfs is
specified (using an environment variable).
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
mapping DIMMs non contiguously allows to workaround
virtio bug reported earlier:
http://lists.nongnu.org/archive/html/qemu-devel/2015-08/msg00522.html
in this case guest kernel doesn't allocate buffers
that can cross DIMM boundary keeping each buffer
local to a DIMM.
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
setting gap to TRUE will make sparse DIMM
address auto allocation, leaving gaps between
a new DIMM address and preceeding existing DIMM.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Data is empty for now, but do make sure master
sets the new feature bit flag.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
glib-compat.h has the gunk to support both old-style and new-style
gthread functions. Use it instead of reinventing it.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Tested-by: Marc-André Lureau <marcandre.lureau@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1265196
The following command fails on an NFS mountpoint:
$ qemu-img create -f qcow2 -o preallocation=falloc disk.img 262144
Formatting 'disk.img', fmt=qcow2 size=262144 encryption=off cluster_size=65536 preallocation='falloc' lazy_refcounts=off
qemu-img: disk.img: Could not preallocate data for the new file: Bad file descriptor
The reason turns out to be because NFS doesn't support the
posix_fallocate call. glibc emulates it instead. However glibc's
emulation involves using the pread(2) syscall. The pread syscall
fails with EBADF if the file descriptor is opened without the read
open-flag (ie. open (..., O_WRONLY)).
I contacted glibc upstream about this, and their response is here:
https://bugzilla.redhat.com/show_bug.cgi?id=1265196#c9
There are two possible fixes: Use Linux fallocate directly, or (this
fix) work around the problem in qemu by opening the file with O_RDWR
instead of O_WRONLY.
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1265196
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Disabling I/O limits from a BDS also drains all pending throttled
requests, so it should be done at the beginning of bdrv_close() with
the rest of the bdrv_drain() calls before the BlockDriver is closed.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
As of 934659c460, $QEMU_IO is generally no
longer a program name, and therefore "sudo -n $QEMU_IO" will no longer
work.
Fix this by copying the qemu-io invocation function from common.config,
making it use $sudo for invoking $QEMU_IO_PROG, and then use that
function instead of $QEMU_IO.
Reported-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Commit 50b7b000 improved HMP error messages, but forgot to update
qemu-iotests to match.
Reported-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
aio_worker() wrote the return code to the wrong variable.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Tested-by: Guangmu Zhu <guangmuzhu@gmail.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
According to the Pop:
"Subsystem reset operates only on those elements in the configuration
which are not CPUs".
As this is what we actually do, let's simply rename the function.
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Message-Id: <1443689387-34473-6-git-send-email-jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Existing code missed to set a parent for the quiesce and hotplug event.
While this didn't matter in practise, new introspection APIs basically now
do an object_unref(object_new(T)), which loops forever.
When trying to remove the event facility bus, the code tries to
unparent all childs on the bus, so they are properly deleted and therefore removed.
As object_unparent() on these child devices doesn't work, as there is no parent,
we loop forever.
Let's fix this by adding the event facility as a parent. Also switch from
object_initialize to object_new, so the only valid reference is in fact the
parent property. This makes it more obvious when the device (state) is actually
gone (and how the reference counting works).
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Message-Id: <1443689387-34473-4-git-send-email-jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Let's expose some virtual/fake registers as virtualization specific
registers.
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Message-Id: <1443689387-34473-3-git-send-email-jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Some gcc versions (e.g. Fedora 22 gcc 5.1.1) seem to use floating
point registers for spilling and filling of general purpose registers.
As the BIOS does not activate the AFP register setting of CR0 this can
cause data exception program checks.
Disallow floating point in the BIOS as a simple solution.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Message-Id: <1443689387-34473-2-git-send-email-jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Set the Microblaze CPU PC in the reset instead of setting it
in the realize. This is required as the PC is zeroed in the
reset function and causes problems in some situations.
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
During mirror, if the target device does not support zero init, a
mirror may result in a corrupted image for sync="full" mode.
This is due to how the initial dirty bitmap is set up prior to copying
data - we did not mark sectors as dirty that are unallocated. This
means those unallocated sectors are skipped over on the target, and for
a device without zero init, invalid data may reside in those holes.
If both of the following conditions are true, then we will explicitly
mark all sectors as dirty:
1.) sync = "full"
2.) bdrv_has_zero_init(target) == false
If the target does support zero init, but a target image is passed in
with data already present (i.e. an "existing" image), it is assumed the
data present in the existing image is valid data for those sectors.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 91ed4bc5bda7e2b09eb508b07c83f4071fe0b3c9.1443705220.git.jcody@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
And do not issue an error_report in that case.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
pc_compat_2_4() doesn't exist, and we shouldn't create one. Add a
comment explaining why the function doesn't exist and why pc_compat_*()
functions are deprecated.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This inserts a read and write protected page between RAM and QEMU
memory, for file-backend RAM.
This makes it harder to exploit QEMU bugs resulting from buffer
overflows in devices using variants of cpu_physical_memory_map,
dma_memory_map etc.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
This inserts a read and write protected page between RAM and QEMU
memory. This makes it harder to exploit QEMU bugs resulting from buffer
overflows in devices using variants of cpu_physical_memory_map,
dma_memory_map etc.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
At the moment we first allocate RAM, sometimes more than necessary for
alignment reasons. We then free the extra RAM.
Rework this to avoid the temporary allocation: reserve the
range by mapping it with PROT_NONE, then use just the
necessary range with MAP_FIXED.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
When packet is truncated during receiving, we drop the packets but
neither discard the descriptor nor add and signal used
descriptor. This will lead several issues:
- sg mappings are leaked
- rx will be stalled if a lots of packets were truncated
In order to be consistent with vhost, fix by discarding the descriptor
in this case.
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch introduces virtqueue_discard() to discard a descriptor and
unmap the sgs. This will be used by the patch that will discard
descriptor when packet is truncated.
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Factor out sg unmapping logic. This will be reused by the patch that
can discard descriptor.
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Andrew James <andrew.james@hpe.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>