To prepare for migration to realize, let's use a local error
object in vfio_initfn. Also let's use the same error prefix for all
error messages.
On top of the 1-1 conversion, we start using a common error prefix for
all error messages. We also introduce a similar warning prefix which will
be used later on.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
- a patch to add a vdc->reset() handler to virtio-9p
- a bunch of patches to fix various memory leaks (thanks to Li Qiang)
- some code cleanups for 9pfs
-----BEGIN PGP SIGNATURE-----
iEYEABECAAYFAlgE59oACgkQAvw66wEB28IrzgCePLq8RVQHvxJKcx9CO1XQaXCE
Dp4AoJm+CDVuaBd+cojoUmwGaLmZG8lb
=8nIw
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/gkurz/tags/for-upstream' into staging
This pull request contains:
- a patch to add a vdc->reset() handler to virtio-9p
- a bunch of patches to fix various memory leaks (thanks to Li Qiang)
- some code cleanups for 9pfs
# gpg: Signature made Mon 17 Oct 2016 16:01:46 BST
# gpg: using DSA key 0x02FC3AEB0101DBC2
# gpg: Good signature from "Greg Kurz <groug@kaod.org>"
# gpg: aka "Greg Kurz <groug@free.fr>"
# gpg: aka "Greg Kurz <gkurz@fr.ibm.com>"
# gpg: aka "Greg Kurz <gkurz@linux.vnet.ibm.com>"
# gpg: aka "Gregory Kurz (Groug) <groug@free.fr>"
# gpg: aka "Gregory Kurz (Cimai Technology) <gkurz@cimai.com>"
# gpg: aka "Gregory Kurz (Meiosys Technology) <gkurz@meiosys.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 2BD4 3B44 535E C0A7 9894 DBA2 02FC 3AEB 0101 DBC2
* remotes/gkurz/tags/for-upstream:
9pfs: fix memory leak in v9fs_write
9pfs: fix memory leak in v9fs_link
9pfs: fix memory leak in v9fs_xattrcreate
9pfs: fix information leak in xattr read
virtio-9p: add reset handler
9pfs: only free completed request if not flushed
9pfs: drop useless check in pdu_free()
9pfs: use coroutine_fn annotation in hw/9pfs/9p.[ch]
9pfs: use coroutine_fn annotation in hw/9pfs/co*.[ch]
9pfs: fsdev: drop useless extern annotation for functions
9pfs: fix potential host memory leak in v9fs_read
9pfs: allocate space for guest originated empty strings
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
If an error occurs when marshalling the transfer length to the guest, the
v9fs_write() function doesn't free an IO vector, thus leading to a memory
leak. This patch fixes the issue.
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Reviewed-by: Greg Kurz <groug@kaod.org>
[groug, rephrased the changelog]
Signed-off-by: Greg Kurz <groug@kaod.org>
The v9fs_link() function keeps a reference on the source fid object. This
causes a memory leak since the reference never goes down to 0. This patch
fixes the issue.
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Reviewed-by: Greg Kurz <groug@kaod.org>
[groug, rephrased the changelog]
Signed-off-by: Greg Kurz <groug@kaod.org>
The 'fs.xattr.value' field in V9fsFidState object doesn't consider the
situation that this field has been allocated previously. Every time, it
will be allocated directly. This leads to a host memory leak issue if
the client sends another Txattrcreate message with the same fid number
before the fid from the previous time got clunked.
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Reviewed-by: Greg Kurz <groug@kaod.org>
[groug, updated the changelog to indicate how the leak can occur]
Signed-off-by: Greg Kurz <groug@kaod.org>
9pfs uses g_malloc() to allocate the xattr memory space, if the guest
reads this memory before writing to it, this will leak host heap memory
to the guest. This patch avoid this.
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
Virtio devices should implement the VirtIODevice->reset() function to
perform necessary cleanup actions and to bring the device to a quiescent
state.
In the case of the virtio-9p device, this means:
- emptying the list of active PDUs (i.e. draining all in-flight I/O)
- freeing all fids (i.e. close open file descriptors and free memory)
That's what this patch does.
The reset handler first waits for all active PDUs to complete. Since
completion happens in the QEMU global aio context, we just have to
loop around aio_poll() until the active list is empty.
The freeing part involves some actions to be performed on the backend,
like closing file descriptors or flushing extended attributes to the
underlying filesystem. The virtfs_reset() function already does the
job: it calls free_fid() for all open fids not involved in an ongoing
I/O operation. We are sure this is the case since we have drained
the PDU active list.
The current code implements all backend accesses with coroutines, but we
want to stay synchronous on the reset path. We can either change the
current code to be able to run when not in coroutine context, or create
a coroutine context and wait for virtfs_reset() to complete. This patch
goes for the latter because it results in simpler code.
Note that we also need to create a dummy PDU because it is also an API
to pass the FsContext pointer to all backend callbacks.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
If a PDU has a flush request pending, the current code calls pdu_free()
twice:
1) pdu_complete()->pdu_free() with pdu->cancelled set, which does nothing
2) v9fs_flush()->pdu_free() with pdu->cancelled cleared, which moves the
PDU back to the free list.
This works but it complexifies the logic of pdu_free().
With this patch, pdu_complete() only calls pdu_free() if no flush request
is pending, i.e. qemu_co_queue_next() returns false.
Since pdu_free() is now supposed to be called with pdu->cancelled cleared,
the check in pdu_free() is dropped and replaced by an assertion.
Signed-off-by: Greg Kurz <groug@kaod.org>
All these functions either call the v9fs_co_* functions which have the
coroutine_fn annotation, or pdu_complete() which calls qemu_co_queue_next().
Let's mark them to make it obvious they execute in coroutine context.
Signed-off-by: Greg Kurz <groug@kaod.org>
All these functions use the v9fs_co_run_in_worker() macro, and thus always
call qemu_coroutine_self() and qemu_coroutine_yield().
Let's mark them to make it obvious they execute in coroutine context.
Signed-off-by: Greg Kurz <groug@kaod.org>
In 9pfs read dispatch function, it doesn't free two QEMUIOVector
object thus causing potential memory leak. This patch avoid this.
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Greg Kurz <groug@kaod.org>
If a guest sends an empty string paramater to any 9P operation, the current
code unmarshals it into a V9fsString equal to { .size = 0, .data = NULL }.
This is unfortunate because it can cause NULL pointer dereference to happen
at various locations in the 9pfs code. And we don't want to check str->data
everywhere we pass it to strcmp() or any other function which expects a
dereferenceable pointer.
This patch enforces the allocation of genuine C empty strings instead, so
callers don't have to bother.
Out of all v9fs_iov_vunmarshal() users, only v9fs_xattrwalk() checks if
the returned string is empty. It now uses v9fs_string_size() since
name.data cannot be NULL anymore.
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
[groug, rewritten title and changelog,
fix empty string check in v9fs_xattrwalk()]
Signed-off-by: Greg Kurz <groug@kaod.org>
Currently, the MMIO space for accessing PCI on pseries guests begins at
1 TiB in guest address space. Each PCI host bridge (PHB) has a 64 GiB
chunk of address space in which it places its outbound PIO and 32-bit and
64-bit MMIO windows.
This scheme as several problems:
- It limits guest RAM to 1 TiB (though we have a limited fix for this
now)
- It limits the total MMIO window to 64 GiB. This is not always enough
for some of the large nVidia GPGPU cards
- Putting all the windows into a single 64 GiB area means that naturally
aligning things within there will waste more address space.
In addition there was a miscalculation in some of the defaults, which meant
that the MMIO windows for each PHB actually slightly overran the 64 GiB
region for that PHB. We got away without nasty consequences because
the overrun fit within an unused area at the beginning of the next PHB's
region, but it's not pretty.
This patch implements a new scheme which addresses those problems, and is
also closer to what bare metal hardware and pHyp guests generally use.
Because some guest versions (including most current distro kernels) can't
access PCI MMIO above 64 TiB, we put all the PCI windows between 32 TiB and
64 TiB. This is broken into 1 TiB chunks. The first 1 TiB contains the
PIO (64 kiB) and 32-bit MMIO (2 GiB) windows for all of the PHBs. Each
subsequent TiB chunk contains a naturally aligned 64-bit MMIO window for
one PHB each.
This reduces the number of allowed PHBs (without full manual configuration
of all the windows) from 256 to 31, but this should still be plenty in
practice.
We also change some of the default window sizes for manually configured
PHBs to saner values.
Finally we adjust some tests and libqos so that it correctly uses the new
default locations. Ideally it would parse the device tree given to the
guest, but that's a more complex problem for another time.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
On real hardware, and under pHyp, the PCI host bridges on Power machines
typically advertise two outbound MMIO windows from the guest's physical
memory space to PCI memory space:
- A 32-bit window which maps onto 2GiB..4GiB in the PCI address space
- A 64-bit window which maps onto a large region somewhere high in PCI
address space (traditionally this used an identity mapping from guest
physical address to PCI address, but that's not always the case)
The qemu implementation in spapr-pci-host-bridge, however, only supports a
single outbound MMIO window, however. At least some Linux versions expect
the two windows however, so we arranged this window to map onto the PCI
memory space from 2 GiB..~64 GiB, then advertised it as two contiguous
windows, the "32-bit" window from 2G..4G and the "64-bit" window from
4G..~64G.
This approach means, however, that the 64G window is not naturally aligned.
In turn this limits the size of the largest BAR we can map (which does have
to be naturally aligned) to roughly half of the total window. With some
large nVidia GPGPU cards which have huge memory BARs, this is starting to
be a problem.
This patch adds true support for separate 32-bit and 64-bit outbound MMIO
windows to the spapr-pci-host-bridge implementation, each of which can
be independently configured. The 32-bit window always maps to 2G.. in PCI
space, but the PCI address of the 64-bit window can be configured (it
defaults to the same as the guest physical address).
So as not to break possible existing configurations, as long as a 64-bit
window is not specified, a large single window can be specified. This
will appear the same way to the guest as the old approach, although it's
now implemented by two contiguous memory regions rather than a single one.
For now, this only adds the possibility of 64-bit windows. The default
configuration still uses the legacy mode.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Currently the default PCI host bridge for the 'pseries' machine type is
constructed with its IO windows in the 1TiB..(1TiB + 64GiB) range in
guest memory space. This means that if > 1TiB of guest RAM is specified,
the RAM will collide with the PCI IO windows, causing serious problems.
Problems won't be obvious until guest RAM goes a bit beyond 1TiB, because
there's a little unused space at the bottom of the area reserved for PCI,
but essentially this means that > 1TiB of RAM has never worked with the
pseries machine type.
This patch fixes this by altering the placement of PHBs on large-RAM VMs.
Instead of always placing the first PHB at 1TiB, it is placed at the next
1 TiB boundary after the maximum RAM address.
Technically, this changes behaviour in a migration-breaking way for
existing machines with > 1TiB maximum memory, but since having > 1 TiB
memory was broken anyway, this seems like a reasonable trade-off.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
The 'spapr-pci-host-bridge' represents the virtual PCI host bridge (PHB)
for a PAPR guest. Unlike on x86, it's routine on Power (both bare metal
and PAPR guests) to have numerous independent PHBs, each controlling a
separate PCI domain.
There are two ways of configuring the spapr-pci-host-bridge device: first
it can be done fully manually, specifying the locations and sizes of all
the IO windows. This gives the most control, but is very awkward with 6
mandatory parameters. Alternatively just an "index" can be specified
which essentially selects from an array of predefined PHB locations.
The PHB at index 0 is automatically created as the default PHB.
The current set of default locations causes some problems for guests with
large RAM (> 1 TiB) or PCI devices with very large BARs (e.g. big nVidia
GPGPU cards via VFIO). Obviously, for migration we can only change the
locations on a new machine type, however.
This is awkward, because the placement is currently decided within the
spapr-pci-host-bridge code, so it breaks abstraction to look inside the
machine type version.
So, this patch delegates the "default mode" PHB placement from the
spapr-pci-host-bridge device back to the machine type via a public method
in sPAPRMachineClass. It's still a bit ugly, but it's about the best we
can do.
For now, this just changes where the calculation is done. It doesn't
change the actual location of the host bridges, or any other behaviour.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
The existing implementation remains same and ics-base is introduced. The
type name "ics" is retained, and all the related functions renamed as
ics_simple_*
This will allow different implementations for the source controllers
such as the MSI support of PHB3 on Power8 which uses in-memory state
tables for example.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
[ clg: added ICS_BASE_GET_CLASS and related fixes, based on :
http://patchwork.ozlabs.org/patch/646010/ ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Instead of an array of fixed sized blocks, use a list, as we will need
to have sources with variable number of interrupts. SPAPR only uses
a single entry. Native will create more. If performance becomes an
issue we can add some hashed lookup but for now this will do fine.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[ move the initialization of list to xics_common_initfn,
restore xirr_owner after migration and move restoring to
icp_post_load]
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
[ clg: removed the icp_post_load() changes from nikunj patchset v3:
http://patchwork.ozlabs.org/patch/646008/ ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Rather than machine instances having backward-compatible option
defaults that need to be repeatedly re-enabled for every new machine
type we introduce, we set the defaults appropriate for newer machine
types, then add code to explicitly disable instance options as needed
to maintain compatibility with older machine types.
Currently pseries-2.5 does not inherit from pseries-2.6 in this
fashion, which is okay at the moment since we do not have any
instance compatibility options for pseries-2.6+ currently.
We will make use of this in future patches though, so fix it here.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
[dwg: Extended to make 2.7 inherit from 2.8 as well]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Needed to make sure usb redirection is prepared to actually handle the
callback from the usb host adapter. Without this interrupt endpoints
don't work on xhci.
Note: On ehci the usb_wakeup() call only schedules a BH for the actual
work, which hides this bug because the allocation happens before ehci
calls back even without this patch.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Message-id: 1476096313-7730-1-git-send-email-kraxel@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The default DMA offset is set to 3. When the property is not set by
the consumer, the default causes DMA access to be shifted by 3
bytes. In PXA, this results in incorrect DMA access, leading to error
notification in the USB controller driver. A better default would be
0, so that there is no offset, when the consumer does not specify one.
Signed-off-by: Vijay Kumar B. <vijaykumar@zilogic.com>
Reviewed-by: Deepak S. <deepak@zilogic.com>
Message-id: 1475060958-7760-1-git-send-email-vijaykumar@zilogic.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
snprintf return value is *not* the number of chars written into the
buffer, but the number of chars needed. So in case the buffer is too
small you can go alloc a bigger one and try again. But that also means
you can't simply use the return value for the next snprintf call
without checking beforehand that things did actually fit.
Problem is that usb_desc_create_serial didn't perform that check, so a
loooong path string (can happen with deep pci-bridge nesting) results in
the third snprintf call smashing the stack.
Fix this by throwing out all the snpintf calls and use g_strdup_printf
instead.
https://bugzilla.redhat.com/show_bug.cgi?id=1381630
Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1475659998-22045-1-git-send-email-kraxel@redhat.com
All callsites have a XHCIEPContext pointer anyway, so we can just pass
it directly instead of fiddeling with slotid and epid.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1474965172-30321-9-git-send-email-kraxel@redhat.com
xhci_kick_epctx is a xhci_kick_ep variant which takes an XHCIEPContext
as input instead of slotid and epid. So in case we have a XHCIEPContext
at hand at the callsite we can just pass it directly.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1474965172-30321-7-git-send-email-kraxel@redhat.com
xhci has a fixed number of 24 (TD_QUEUE) XHCITransfer structs per
endpoint, which turns out to be a problem for usb3 devices with 32 (or
more) bulk streams. xhci re-checks the trb rings on every finished
transfer to make sure it'll pick up any pending work. But that scheme
breaks in case the first transfer of a ring can't be started because we
ran out of XHCITransfer structs already.
So remove static XHCITransfer array from XHCIEPContext. Use a linked
list instead, and allocate/free XHCITransfer as needed. Add helper
functions to allocate & initialize and to cleanup & release
XHCITransfer structs. That also simplifies trb management, we never
have to realloc XHCITransfer->trbs because we don't reuse XHCITransfer
structs any more.
New dynamic limit for in-flight xhci transfers per endpoint is
number-of-streams + 16.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1474965172-30321-5-git-send-email-kraxel@redhat.com
EV_QUEUE must not change because an array of that size is part of live
migration data. Hard-code current value there, so we can touch TD_QUEUE
without breaking live migration.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1474965172-30321-3-git-send-email-kraxel@redhat.com
Needed to avoid we run in circles forever in case the guest builds
an endless loop with link trbs.
Reported-by: Li Qiang <liqiang6-s@360.cn>
Tested-by: P J P <ppandit@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1476096382-7981-1-git-send-email-kraxel@redhat.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABCAAGBQJX+LTGAAoJEHAbT2saaT5ZIBwH+wfho+xxruEjro6qPvSAtdKk
BBsOWBfBoqWfbAbOxxCO8ina2nA7p5XbyzSXUr94nZhvZMB9BkgL6la03gdS0Yr2
jHf0J9mM8fIbMQFsEKGOPcdpvU7VEXeFwridZYzypiRvbNSdWK3SKVBKgz2ADNhb
l4Tos81IZeH/mw8HcU3XgSGSTV4JuKP4XsnmwlFMa8/sWM/X3vVgx5IG26KURZQm
pW720jcX0meSfji5YvhspfbBbp1g2EorTZb6iLcZf+OUIB6XkViMisVasnyOo2HJ
cehPlhAHixwq1kXGItc1fs11VloZ6hvEZ7kZ615jAdsD2sGJObtGDxgyJW3+gPo=
=HPHj
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mjt/tags/trivial-patches-fetch' into staging
trivial patches for 2016-10-08
# gpg: Signature made Sat 08 Oct 2016 09:56:38 BST
# gpg: using RSA key 0x701B4F6B1A693E59
# gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>"
# gpg: aka "Michael Tokarev <mjt@corpit.ru>"
# gpg: aka "Michael Tokarev <mjt@debian.org>"
# Primary key fingerprint: 6EE1 95D1 886E 8FFB 810D 4324 457C E0A0 8044 65C5
# Subkey fingerprint: 7B73 BAD6 8BE7 A2C2 8931 4B22 701B 4F6B 1A69 3E59
* remotes/mjt/tags/trivial-patches-fetch: (26 commits)
net/filter-mirror: Fix mirror initial check typo
virtio: rename the bar index field name in VirtIOPCIProxy
linux-user: include <poll.h> instead of <sys/poll.h>
char: fix missing return in error path for chardev TLS init
CODING_STYLE: Fix a typo ("have" vs. "has")
bitmap: refine and move BITMAP_{FIRST/LAST}_WORD_MASK
build-sys: fix find-in-path
m68k: change default system clock for m5208evb
exec: remove unused compacted argument
usb: ehci: fix memory leak in ehci_process_itd
qapi: make the json schema files more regular.
maint: Add module_block.h to .gitignore
MAINTAINERS: Some updates related to the SH4 machines
MAINTAINERS: Add some more MIPS related files
MAINTAINERS: Add usermode related config files
MAINTAINERS: Add some more pattern to recognize all win32 related files
MAINTAINERS: Add some more rocker related files
MAINTAINERS: Add header files to CRIS section
MAINTAINERS: Add some more files to the virtio section
MAINTAINERS: Add some SPARC machine related files
...
# Conflicts:
# MAINTAINERS
The Trigger Mode field of IOAPIC must match the Trigger Mode in
the IRTE according to VT-d Spec 5.1.5.1.
Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Now all the usages of the old version of VMSTATE_VIRTIO_DEVICE are gone,
so we can get rid of the conditionals, and the old macro.
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Use the new VMSTATE_VIRTIO_DEVICE macro.
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Use the new VMSTATE_VIRTIO_DEVICE macro.
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Use the new VMSTATE_VIRTIO_DEVICE macro.
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Use the new VMSTATE_VIRTIO_DEVICE macro.
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Use the new VMSTATE_VIRTIO_DEVICE macro.
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Use the new VMSTATE_VIRTIO_DEVICE macro. The device virtio-gpu is
special because it actually does not adhere to the virtio migration
schema, because device state is last.
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Use the new VMSTATE_VIRTIO_DEVICE macro.
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Use the new VMSTATE_VIRTIO_DEVICE macro.
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Use the new VMSTATE_VIRTIO_DEVICE macro.
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Use the new VMSTATE_VIRTIO_DEVICE macro.
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In most cases the functions passed to VMSTATE_VIRTIO_DEVICE
only call the virtio_load and virtio_save wrappers. Some include some
pre- and post- massaging too. The massaging is better expressed
as such in the VMStateDescription.
Let us prepare for changing the semantic of the VMSTATE_VIRTIO_DEVICE
macro so that it is more similar to the other VMSTATE_*_DEVICE macros
in a sense that it is a field definition.
The preprocessor conditionals are going to be removed as soon as
every usage is converted to the new semantic.
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This error is caused by a buggy guest: let's switch the device to the
broken state instead of terminating QEMU.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The virtio_scsi_bad_req() function is called when a guest sends a
request with missing or ill-sized headers. This generally happens
when the virtio_scsi_parse_req() function returns an error.
With this patch, virtio_scsi_bad_req() will mark the device as broken,
detach the request from the virtqueue and free it, instead of forcing
QEMU to exit.
In nearly all locations where virtio_scsi_bad_req() is called, the only
thing to do next is to return to the caller.
The virtio_scsi_handle_cmd_req_prepare() function is an exception though.
It is called in a loop by virtio_scsi_handle_cmd_vq() and passed requests
freshly popped from a cmd virtqueue; virtio_scsi_handle_cmd_req_prepare()
does some sanity checks on the request and returns a boolean flag to
indicate whether the request should be queued or not. In the latter case,
virtio_scsi_handle_cmd_req_prepare() has detected a non-fatal error and
sent a response back to the guest.
We have now a new condition to take into account: the device is broken
and should stop all processing.
The return value of virtio_scsi_handle_cmd_req_prepare() is hence changed
to an int. A return value of zero means that the request should be queued.
Other non-fatal error cases where the request shoudn't be queued return
a negative errno (values are vaguely inspired by the error condition, but
the only goal here is to discriminate the case we're interested in).
And finally, if virtio_scsi_bad_req() was called, -EINVAL is returned. In
this case, virtio_scsi_handle_cmd_vq() detaches and frees already queued
requests, instead of submitting them.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
All these errors are caused by a buggy guest: let's switch the device to
the broken state instead of terminating QEMU. Also we detach the element
from the virtqueue and free it.
If this happens, virtio_net_flush_tx() also returns -EINVAL, so that all
callers can stop processing the virtqueue immediatly.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
All these errors are caused by a buggy guest: let's switch the device to
the broken state instead of terminating QEMU. Also we detach the element
from the virtqueue and free it.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This error is caused by a buggy guest: let's switch the device to the
broken state instead of terminating QEMU. Also we detach the element
from the virtqueue and free it.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
All these errors are caused by a buggy guest: QEMU should not exit.
With this patch, if virtio_blk_handle_request() detects a buggy request, it
marks the device as broken and returns an error to the caller so it takes
appropriate action.
In the case of virtio_blk_handle_vq(), we detach the request from the
virtqueue, free its allocated memory and stop popping new requests.
We don't need to bother about multireq since virtio_blk_handle_request()
errors out early and mrb.num_reqs == 0.
In the case of virtio_blk_dma_restart_bh(), we need to detach and free all
queued requests as well.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
A broken guest may send a request without providing buffers for the reply
or for the request itself, and virtqueue_pop() will return an element with
either in_num == 0 or out_num == 0.
All 9P requests are expected to start with the following 7-byte header:
uint32_t size_le;
uint8_t id;
uint16_t tag_le;
If iov_to_buf() fails to return these 7 bytes, then something is wrong in
the guest.
In both cases, it is wrong to crash QEMU, since the root cause lies in the
guest.
This patch hence does the following:
- keep the check of in_num since pdu_complete() assumes it has enough
space to store the reply and we will send something broken to the guest
- let iov_to_buf() handle out_num == 0, since it will return 0 just like
if the guest had provided an zero-sized buffer.
- call virtio_error() to inform the guest that the device is now broken,
instead of aborting
- detach the request from the virtqueue and free it
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Some functions that were called from the dataplane code are now only used
locally:
virtio_blk_init_request()
virtio_blk_handle_request()
virtio_blk_submit_multireq()
since commit "03de2f527499 virtio-blk: do not use vring in dataplane", and
virtio_blk_free_request()
since commit "6aa46d8ff1ee virtio: move VirtQueueElement at the beginning
of the structs".
This patch converts them to static.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Ports enter a "throttled" state when writing to the chardev would block.
The current output VirtQueueElement is kept around until the chardev
becomes writable again.
There are several places in the virtio-serial lifecycle where the
VirtQueueElement should be thrown away. For example, if the virtio
device is reset then virtqueue elements are no longer valid.
This patch adds the discard_throttle_data() function to unmap the
scatter-gather list and decrement vq->inuse. This ensures that the
VirtQueueElement is freed properly.
Cc: amit.shah@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Make sure to unmap the scatter-gather list and decrement vq->inuse
before freeing requests in virtio_blk_reset().
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
During device reset or similar situations a VirtQueueElement needs to be
freed without pushing it onto the used ring or rewinding the virtqueue.
Extract a new function to do this.
Later patches add virtio_detach_element() calls to existing device so
that scatter-gather lists are unmapped and vq->inuse goes back to zero
during device reset. Currently some devices don't bother and simply
call g_free(elem) which is not a clean way to throw away a
VirtQueueElement.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Workaround for long standing issue where Linux kernel
assigns hotplugged CPU to 1st numa node as it discards
proximity for possible CPUs from SRAT after it's parsed.
_PXM method allows linux query proximity directly from
hotplugged CPU object, which allows Linux to assing CPU
to the correct numa node.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Replace repeated pattern
for (i = 0; i < nb_numa_nodes; i++) {
if (test_bit(idx, numa_info[i].node_cpu)) {
...
break;
with a helper function to lookup numa node index for cpu.
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add support for enabling the virtio 1.0 "emergency write"
(VIRTIO_CONSOLE_F_EMERG_WRITE) feature. The previous patch introduced
the plumbing required for this; now we expose the virtio feature to
the guest. The feature is disabled for compatibility machines to avoid
exposing a new feature to existing guests.
As required by the virtio 1.0 spec, the emergency write functionality
is available to the guest even if the guest doesn't negotatiate the
feature, as well as before feature negotation.
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add the infrastructure required for the virtio 1.0 "emergency write"
(VIRTIO_CONSOLE_F_EMERG_WRITE) feature. Because we don't touch the
size of the configuration area, guests will not be able to actually
make use of this without further patches.
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Since there in wrapper around madvise(), the virtio-balloon
code is able to work without the precompiled directive, the
directive can be removed.
Signed-off-by: Liang Li <liang.z.li@intel.com>
Suggested-by: Thomas Huth <thuth@redhat.com>
Reviewd-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
the bar index names are much similar to the bar memory regions,
distinguish them to improve the code readability.
Signed-off-by: Chen Fan <fan.chen@easystack.cn>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The shipping default setting for the Freescale M5208EVB board is to run
the CPU at 166.67MHz. The current qemu emulation code for this board is
defaulting to 66MHz. This results in time appearing to run way to slowly.
So a "sleep 5" in a standard ColdFire Linux build takes almost 15
seconds in real time to actually complete.
Change the hard coded default to match the default hardware setting.
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Thomas Huth <huth@tuxfamily.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
While processing isochronous transfer descriptors(iTD), if the page
select(PG) field value is out of bands it will return. In this
situation the ehci's sg list is not freed thus leading to a memory
leak issue. This patch avoid this.
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Almost all block devices are qdevified by now. This allows us to go back
from the BlockBackend to the DeviceState. xen_disk is the last device
that is missing. We'll remember in the BlockBackend if a xen_disk is
attached and can then disable any features that require going from a BB
to the DeviceState.
While at it, clearly mark the function used by xen_disk as legacy even
in its name, not just in TODO comments.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
A couple of distributors are compiling their distributions
with "-mcpu=power8" for ppc64le these days, so the user sooner
or later runs into a crash there when not explicitely specifying
the "-cpu POWER8" option to QEMU (which is currently using POWER7
for the "pseries" machine by default). Due to this reason, the
linux-user target already switched to POWER8 a while ago (see commit
de3f1b9841). Since the softmmu target
of course has the same problem, we should switch there to POWER8 for
the newer machine types, too.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
If the user passes an alias name and a property to -cpu, QEMU fails to
find the CPU definition and exits.
$ qemu-system-ppc64 -cpu POWER8E,compat=power7
qemu-system-ppc64: Unable to find sPAPR CPU Core definition
This happens because spapr_get_cpu_core_type() passes the full string from
the command line (i.e. "POWER8E,compat=power7") to ppc_cpu_lookup_alias(),
instead of the alias name piece only (i.e. "POWER8E").
The fix is to pass model_pieces[0] to ppc_cpu_lookup_alias().
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
KVM-PR currently does not support transactional memory, and the
implementation in TCG is just a fake. We should not announce TM
support in the ibm,pa-features property when running on such a
system, so disable it by default and only enable it if the KVM
implementation supports it (i.e. recent versions of KVM-HV).
These changes are based on some earlier work from Anton Blanchard
(thanks!).
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The current code uses pa_features_206 for POWERPC_MMU_2_06, and
for everything else, it uses pa_features_207. This is bad in some
cases because there is also a "degraded" MMU version of ISA 2.06,
called POWERPC_MMU_2_06a, which should of course use the flags for
2.06 instead. And there is also the possibility that the user runs
the pseries machine with a POWER5+ or even 970 processor. In that
case we certainly do not want to set the flags for 2.07, and rather
simply skip the setting of the pa-features property instead.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The function spapr_populate_cpu_dt() has become quite big
already, and since we likely have to extend the pa-features
property for every new processor generation, it is nicer
if we put the related code into a separate function.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Now that 2.7 is released, create the pseries-2.8 machine type and add the
boilerplate compatiblity macro stuff. There's nothing new to put into the
2.7 compatiliby properties yet, but we'll need something eventually, so
we might as well get it ready now.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
A typo introduced in f19661c8 prevents qemu from building when configured
with --enable-trace-backend=dtrace.
Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
There was an error with some of the register implementation assuming
there are 16 priority queues supported when the IP only supports 8. This
patch corrects the registers to only support 8 queues.
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 33bf2d28326d22875602234b8b15cf56fb678333.1474911607.git.alistair.francis@xilinx.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add a generic loader to QEMU which can be used to load images or set
memory values.
Internally inside QEMU this is a device. It is a strange device that
provides no hardware interface but allows QEMU to monkey patch memory
specified when it is created. To be able to do this it has a reset
callback that does the memory operations.
This device allows the user to monkey patch memory. To be able to do
this it needs a backend to manage the datas, the same as other
memory-related devices. In this case as the backend is so trivial we
have merged it with the frontend instead of creating and maintaining a
seperate backend.
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Markus Armbruster <armbru@redhat.com>
Message-id: 10f2a9dce5e5e11b6c6d959415b0ad6ee22bcba5.1475195078.git.alistair.francis@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
If GIC ITS is supported, add description in ACPI MADT table, then guest
could use ITS when booting with ACPI.
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Message-id: 1474616617-366-9-git-send-email-eric.auger@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
If supported by the configuration, ITS will be added automatically.
This patch also renames v2m_phandle to msi_phandle because it's now used
by both MSI implementations.
Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1474616617-366-7-git-send-email-eric.auger@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The ITS control frame is in-kernel emulated while accesses to the
GITS_TRANSLATER are mediated through the KVM_SIGNAL_MSI ioctl (MSI
direct MSI injection advertised by the CAP_SIGNAL_MSI capability)
the kvm_gsi_direct_mapping is explicitly set to false to emphasize the
difference with GICv2M. Direct mapping cannot work with ITS since
the content of the MSI data is not the target interrupt ID but an
eventd id.
GSI routing is advertised (kvm_gsi_routing_allowed) as well as
msi/irqfd signaling (kvm_msi_via_irqfd_allowed).
The MSI frame (GITS_TRANSLATER) absolute GPA is computed on first
kvm_its_send_msi() call. It is then passed through KVM_SIGNAL_MSI
ioctl.
Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Message-id: 1474616617-366-6-git-send-email-eric.auger@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This is the basic skeleton for both KVM and software-emulated ITS.
Since we already prepare status structure, we also introduce complete
VMState description. But, because we currently have no migratable
implementations, we also set unmigratable flag.
Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1474616617-366-3-git-send-email-eric.auger@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Advertise gsi routing and set up irqchip routing entries for
GIC SPIs.
This is not mandated as long as MSI routing is not used
(because the kernel sets a default irqchip routing table).
However once MSI routing gets used (for VIRTIO-PCI vhost for
example), the first call to KVM_SET_GSI_ROUTING overrides the
kernel default irqchip table.
If no routing entry exists for the GSI, any IRQFD signaling for
this GSI will fail.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1474616617-366-2-git-send-email-eric.auger@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andrew Jones <drjones@redhat.com>
Message-id: 1474641676-25017-1-git-send-email-drjones@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
I'm now saving all 3 of the pll entries; only 2 were saved before.
There are a couple of times that were previously stored as offsets
from 'now' calculated before saving; with vmstate it's easier
to store the 'now' and fix it up on reload.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-id: 1474977735-10156-3-git-send-email-dgilbert@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
I've converted the fields in it's main data structure
to fixed size types in ways that look sane.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-id: 1474977735-10156-2-git-send-email-dgilbert@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Initialization of a class instance cannot depend on its own properties
as these are not yet set. Move parts of integratorcm_init() that depend
on the "memsz" property to the newly added integratorcm_realize().
This fixes: https://bugs.launchpad.net/qemu/+bug/1624726
Signed-off-by: Jakub Jermar <jakub@jermar.eu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add missed out mappings. These mappings are from the "Intel PXA27x
Processor Developer's Kit User Guide".
Signed-off-by: Vijay Kumar B. <vijaykumar@zilogic.com>
Reviewed-by: Deepak S. <deepak@zilogic.com>
Message-id: 1475063033-8176-3-git-send-email-vijaykumar@zilogic.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
According to the manual the (5, 5) corresponds to backspace key, and
not Enter key. Linux kernel maps (5, 4) to the enter key. Fixing it up
to match the mapping in the Linux kernel.
Signed-off-by: Vijay Kumar B. <vijaykumar@zilogic.com>
Reviewed-by: Deepak S. <deepak@zilogic.com>
Message-id: 1475063033-8176-2-git-send-email-vijaykumar@zilogic.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add the STM32F2xx ADC device. This device randomly
generates values on each read.
This also includes creating a hw/adc directory.
Signed-off-by: Alistair Francis <alistair@alistair23.me>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 3240e660adaf537f55a63ce06096e844aece8cda.1474742262.git.alistair@alistair23.me
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
If correctly configured allow the STM32F2xx timer to print
out the PWM duty cycle information.
Signed-off-by: Alistair Francis <alistair@alistair23.me>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Message-id: cdb59039a25e061615713a94b40797baa12ea9f9.1474742262.git.alistair@alistair23.me
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Cleanup the individual DeviceState and SysBusDevice
variables to re-use the same variable for each
device.
Signed-off-by: Alistair Francis <alistair@alistair23.me>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Message-id: fc5d75a57d320b69704df2c1146ff0fd482e4a88.1474742262.git.alistair@alistair23.me
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Version: GnuPG v2
iQEcBAABCAAGBQJX8GfGAAoJEMo1YkxqkXHGFGQH/27Zjku3/aImefu+BkpYt3Zn
xbSqnPPpmqZqJmjf5JNIJm3YP9wYMk0BrSzitUox/1SAGR/2FoAc78ODVVUgisJD
NmJI4yHxow4KmqHpjoQ5lZEeWZOPg+KhNEtVOf+QuedrrSpsQCoe5AEPSQKP9Ngg
OoxMj/Iu+RZawl7QFizqAvHNkfS9Km02P56BnXsh02Ie1QeckPE/lTj1J1HTI5it
vlQk7GXCoAhbXnlzKPXN1iiPL+7G7QJZ9INF28l1JWX8chO4XK7yqywfjPtL5Ej8
PeZM18kSXrpnSffktnfTlUfa3Ancle7bN+BnkY5Ws7E1njX2WkxyIRfV1V9krHU=
=6JV5
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/famz/tags/for-upstream' into staging
# gpg: Signature made Sun 02 Oct 2016 02:49:58 BST
# gpg: using RSA key 0xCA35624C6A9171C6
# gpg: Good signature from "Fam Zheng <famz@redhat.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 5003 7CB7 9706 0F76 F021 AD56 CA35 624C 6A91 71C6
* remotes/famz/tags/for-upstream:
docker: Build in a clean directory
smbios: fix uuid copy
xenpv: Fix qemu_uuid compiling error
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We have to change the vmstate version due to changes in statistics counters.
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Message-Id: <1474921408-24710-5-git-send-email-hpoussin@reactos.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This interface will be used by HMP commands 'info irq' and 'info pic'.
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Message-Id: <1474921408-24710-2-git-send-email-hpoussin@reactos.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
So now edu device can support both line or msi interrupt, depending on
how user configures it.
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1475067819-21413-1-git-send-email-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
qemu tracks guest time based on vector [base_rtc, last_update], in which
last_update stands for a monotonic tick which is actually uptime of the
host.
according to rtc implementation codes of recent releases and upstream,
after
migration, the time base vector [base_rtc, last_update] isn't updated to
coordinate with the destionation host, ie. qemu doesnt update last_update
to
uptime of the destination host.
what problem have we got because of this bug? after migration, guest time
may
jump back to several days ago, that will make some critical business
applications,
such as lotus notes, malfunction.
this patch is trying to fix the problem. first, when vmsave in progress,
we
rtc_update_time to refresh time stamp in cmos array, then during
vmrestore,
we rtc_set_time to update qemu base_rtc and last_update variable according
to time
stamp in cmos array.
Signed-off-by: Junlian Bell <zhongjun@sangfor.com.cn>
Message-Id: <20160926124101.2364-1-zhongjun@sangfor.com.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: David Kiarie <davidkiarie4@gmail.com>
Message-Id: <1475553808-13285-2-git-send-email-davidkiarie4@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Fix a memory leak in ide_register_restart_cb() in hw/ide/core.c and add
idebus_unrealize() in hw/ide/qdev.c to have calls to
qemu_del_vm_change_state_handler() to deal with the dangling change
state handler during hot-unplugging ide devices which might lead to a
crash.
Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1474995212-10580-1-git-send-email-ashijeetacharya@gmail.com
[Minor whitespace fix --js]
Signed-off-by: John Snow <jsnow@redhat.com>
Similar to existing fixes for IDE (87ac25fd) and ATAPI (7f951b2d), the
AIOCB must be cleared in the callback. Otherwise, we may accidentally
try to reset a dangling pointer in bdrv_aio_cancel() from a port reset.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1474575040-32079-2-git-send-email-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
ATA8-APT defines the state transitions for both a host controller and
for the hardware device during the lifecycle of a DMA transfer, in
section 9.7 "DMA command protocol."
One of the interesting tidbits here is that when a device transitions
from DDMA0 ("Prepare state") to DDMA1 ("Data_Transfer State"), it can
choose to set either BSY or DRQ to signal this transition, but not both.
as ide_sector_dma_start is the last point in our preparation process
before we begin the real data transfer process (for either AHCI or BMDMA),
this is the correct transition point for DDMA0 to DDMA1.
I have chosen !BSY && DRQ for QEMU to make the transition from DDMA0 the
most obvious.
Reported-by: Benjamin David Lunt <fys@fysnet.net>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Tested-by: Stefan Weil <sw@weilnetz.de>
Message-id: 1470175541-19344-1-git-send-email-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
We can teach Xen to drain and flush each device as it needs to, instead
of trying to flush ALL devices. This removes the last user of
blk_flush_all.
The function is therefore removed under the premise that any new uses
of blk_flush_all would be the wrong paradigm: either flush the single
device that requires flushing, or use an appropriate flush_all mechanism
from outside of the BlkBackend layer.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Convert rc4030 to VMState.
Now saving the whole 16 entries rather than 15.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Hervé Poussineau <hpoussin@reactos.org>
Tested-by: Hervé Poussineau <hpoussin@reactos.org>
[Yongbok Kim: edited commit message]
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Since 9c5ce8db, the uuid is wrongly copied, as QemuUUID 'in' argument is
already a pointer.
Fixes ASAN complaining:
hw/smbios/smbios.c:489:5: runtime error: load of address 0x7fffcdb91b00
with insufficient space for an object of type '__int128 unsigned'
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20160928143810.25558-1-marcandre.lureau@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
[Warp the long error message line in commit message. - Fam]
Signed-off-by: Fam Zheng <famz@redhat.com>
9c5ce8db2 switched the type of qemu_uuid and this should have followed.
Fix it.
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <1474968011-29382-1-git-send-email-famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
The trace points for hw/virtio/virtio-balloon.c were mistakenly put
in the top level trace-events file, instead of util/trace-events in
commit 270ab88f7c
Author: Daniel P. Berrange <berrange@redhat.com>
Date: Thu Jun 16 09:39:57 2016 +0100
trace: split out trace events for hw/virtio/ directory
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1473872624-23285-5-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The trace points for hw/mem/pc-dimm.c were mistakenly put
in the hw/i386/trace-events file, instead of hw/mem/trace-events
in
commit 5eb76e480b
Author: Daniel P. Berrange <berrange@redhat.com>
Date: Thu Jun 16 09:40:10 2016 +0100
trace: split out trace events for hw/i386/ directory
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1473872624-23285-4-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This fixes problems with translated set 1, where most make code were wrong.
This fixes problems with set 3 for extended keys (like arrows) and lot of other keys.
Added a FIXME for set 3, where most keys must not (by default) deliver a break code.
Detailed list of changes on untranslated set 2:
- change of ALTGR break code from 0xe4 to 0xf0 0x08
- change of ALTGR_R break code from 0xe0 0xe4 to 0xe0 0xf0 0x08
- change of F7 make code from 0x02 to 0x83
- change of F7 break code from 0xf0 0x02 to 0xf0 0x83
- change of PRINT make code from 0xe0 0x7c to 0xe0 0x12 0xe0 0x7c
- change of PRINT break code from 0xe0 0xf0 0x7c to 0xe0 0xf0 0x7c 0xe0 0xf0 0x12
- change of PAUSE key: new make code = old make code + old break code, no more break code
- change on RO break code from 0xf3 to 0xf0 0x51
- change on KP_COMMA break code from 0xfe to 0xf0 0x6d
Detailed list of changes on translated set 2 (the most commonly used):
- change of PRINT make code from 0xe0 0x37 to 0xe0 0x2a 0xe0 0x37
- change of PRINT break code from 0xe0 0xb7 to 0xe0 0xb7 0xe0 0xaa
- change of PAUSE key: new make code = old make code + old break code, no more break code
Reference:
http://www.computer-engineering.org/ps2keyboard/scancodes1.htmlhttp://www.computer-engineering.org/ps2keyboard/scancodes2.htmlhttp://www.computer-engineering.org/ps2keyboard/scancodes3.html
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Message-id: 1473969987-5890-5-git-send-email-hpoussin@reactos.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Change ps2_put_keycode to get an untranslated scancode, which is translated if needed.
As qemu_input_key_value_to_scancode() gives translated scancodes, untranslate them
in ps2_keyboard_event first before giving them to ps2_put_keycode.
Results are not changed, except for some keys in translated set 3.
Translation table is available at
https://www.win.tue.nl/~aeb/linux/kbd/scancodes-10.html
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Message-id: 1473969987-5890-4-git-send-email-hpoussin@reactos.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
When getting scancode, current scancode must be preceded from reply ack.
When setting scancode, we must reject invalid scancodes.
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Message-id: 1473969987-5890-3-git-send-email-hpoussin@reactos.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
According to the PoP, subchannels are only considered operational if
they are enabled _and_ the device number is valid. With the current
checks being enabled _or_ having a valid device number was
sufficient. This caused qemu to allow IO on subchannels that were not
enabled.
Fix the checks to require both bits to be set.
Signed-off-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Now that each S390 PCI device uses an IO region as MSIX region. The
code in s390_translate_iommu() will never be triggered. Let's remove
it.
Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
For efficiency we now assign one msix io region for each pci device
and provide it with the pointer to the zPCI device as opaque
parameter. In addition, we remove msix address space and add msix io
region as a subregion to the root memory region of pci device.
Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Pull mr variable declarations at the top of the functions instead of
mixing them up with the code. This is in preparation for followup
patches.
Signed-off-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Copy data operated on during request from/to local buffers to/from
the grant references.
Before grant copy operation local buffers must be allocated what is
done by calling ioreq_init_copy_buffers. For the 'read' operation,
first, the qemu device invokes the read operation on local buffers
and on the completion grant copy is called and buffers are freed.
For the 'write' operation grant copy is performed before invoking
write by qemu device.
A new value 'feature_grant_copy' is added to recognize when the
grant copy operation is supported by a guest.
Signed-off-by: Paulina Szubarczyk <paulinaszubarczyk@gmail.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Functions of type FindSysbusDeviceFunc currently return an integer.
However, this return value is always ignored by the caller in
find_sysbus_device().
This changes the function type to return void, to avoid confusion over
the function semantics.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
CPUState is a fairly common pointer to pass to these helpers. This means
if you need other arguments for the async_run_on_cpu case you end up
having to do a g_malloc to stuff additional data into the routine. For
the current users this isn't a massive deal but for MTTCG this gets
cumbersome when the only other parameter is often an address.
This adds the typedef run_on_cpu_func for helper functions which has an
explicit CPUState * passed as the first parameter. All the users of
run_on_cpu and async_run_on_cpu have had their helpers updated to use
CPUState where available.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
[Sergey Fedorov:
- eliminate more CPUState in user data;
- remove unnecessary user data passing;
- fix target-s390x/kvm.c and target-s390x/misc_helper.c]
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au> (ppc parts)
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> (s390 parts)
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <1470158864-17651-3-git-send-email-alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It's 2.8 now, and maybe it's time to switch IOAPIC default version to
0x20.
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1474608795-23058-1-git-send-email-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
x86 vIOMMUs still lack of a complete IOMMU notifier mechanism.
Before that is achieved, let's open a door for vhost DMAR support,
which only requires cache invalidations (UNMAP operations).
Meanwhile, convert hw_error() to error_report() and exit(1), to make
the error messages cleaner and obvious (no CPU registers will be dumped).
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1474606948-14391-4-git-send-email-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This uses the wrong frame size for packets composed of multiple
descriptors.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
This uses the wrong frame size for packets composed of multiple
descriptors.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
ColdFire Fast Ethernet Controller uses buffer descriptors to manage
data flow to/fro receive & transmit queues. While transmitting
packets, it could continue to read buffer descriptors if a buffer
descriptor has length of zero and has crafted values in bd.flags.
Set upper limit to number of buffer descriptors.
Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
This patch fixes 2 issues:
1. Bits set in EIAC register should be cleared
from IMS when EIAM is not used.
2. Only bit that corresonds to the interrupt being
raised should be cleared.
See spec. 10.2.4.7 Interrupt Auto Clear
Signed-off-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Do not raise ACK interrupts when
RFCTL.ACKDIS bit is set (see spec. 10.2.5.16).
Signed-off-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Interrupt mask for legacy OTHER causes should
not apply to MSI-X OTHER cause.
Signed-off-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
This patch fixes incorrect check for
interrypt type being used.
PBSCLR register is valid for MSI-X only.
See spec. 10.2.3.13 MSI—X PBA Clear
Signed-off-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
CTRL_EXT.EIAME bit controls clearing of IAM bits,
but current code clears IMS bits instead.
See spec. 10.2.2.5 Extended Device Control Register.
Signed-off-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Before this patch first netdev queue only was flushed.
Signed-off-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
hw/net/e1000e_core.c:56: warning: e1000e_set_interrupt_cause declared inline after being called
hw/net/e1000e_core.c:56: warning: previous declaration of e1000e_set_interrupt_cause was here
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
This allows increasing the rx queue size up to 1024: unlike with tx,
guests don't put in huge S/G lists into RX so the risk of running into
the max 1024 limitation due to some off-by-one seems small.
It's helpful for users like OVS-DPDK which don't do any buffering on the
host - 1K roughly matches 500 entries in tun + 256 in the current rx
queue, which seems to work reasonably well. We could probably make do
with ~750 entries but virtio spec limits us to powers of two.
It might be a good idea to specify an s/g size limit in a future
version.
It also might be possible to make the queue size smaller down the road, 64
seems like the minimal value which will still work (as guests seem to
assume a queue full of 1.5K buffers is enough to process the largest
incoming packet, which is ~64K). No one actually asked for this, and
with virtio 1 guests can reduce ring size without need for host
configuration, so don't bother with this for now.
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Jason Wang <jasowang@redhat.com>
Suggested-by: Patrik Hermansson <phermansson@gmail.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>