MIG_CMD_PACKAGED is a migration command that wraps a chunk of migration
stream inside a package whose length can be determined purely by reading
its header. The destination guarantees that the whole MIG_CMD_PACKAGED
is read off the stream prior to parsing the contents.
This is used by postcopy to load device state (from the package)
while leaving the main stream free to receive memory pages.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
The state of the postcopy process is managed via a series of messages;
* Add wrappers and handlers for sending/receiving these messages
* Add state variable that track the current state of postcopy
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Postcopy needs to have two migration streams loading concurrently;
one from memory (with the device state) and the other from the fd
with the memory transactions.
Split the core of qemu_loadvm_state out so we can use it for both.
Allow the inner loadvm loop to quit and cause the parent loops to
exit as well.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Open a return path, and handle messages that are received upon it.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Add migrate_send_rp_message to send a message from destination to source along the return path.
(It uses a mutex to let it be called from multiple threads)
Add migrate_send_rp_shut to send a 'shut' message to indicate
the destination is finished with the RP.
Add migrate_send_rp_ack to send a 'PONG' message in response to a PING
Use it in the MSG_RP_PING handler
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Add two src->dest commands:
* OPEN_RETURN_PATH - To request that the destination open the return path
* PING - Request an acknowledge from the destination
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Create QEMU_VM_COMMAND section type for sending commands from
source to destination. These commands are not intended to convey
guest state but to control the migration process.
For use in postcopy.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
In postcopy we're going to need to perform the complete phase
for postcopiable devices at a different point, start out by
renaming all of the 'complete's to make the difference obvious.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1446203414-4013-7-git-send-email-kraxel@redhat.com
These messages are disabled by default; a perfect usecase for tracepoints,
which in fact already exist. Add the missing information to them and
stop using qemu_log_mask.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The function qemu_savevm_state_cancel is called after the migration
in migration_thread, it seems strange to 'cancel' it after completion,
rename it to qemu_savevm_state_cleanup looks better.
Signed-off-by: Liang Li <liang.z.li@intel.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>al3
Reviewed-by: Amit Shah <amit.shah@redhat.com>al3
Signed-off-by: Juan Quintela <quintela@redhat.com>al3
The event throttling state machine is hard to understand. I'm not
sure it's entirely correct. Rewrite it in a more straightforward
manner:
State 1: No event sent recently (less than evconf->rate ns ago)
Invariant: evstate->timer is not pending, evstate->qdict is null
On event: send event, arm timer, goto state 2
State 2: Event sent recently, no additional event being delayed
Invariant: evstate->timer is pending, evstate->qdict is null
On event: store it in evstate->qdict, goto state 3
On timer: goto state 1
State 3: Event sent recently, additional event being delayed
Invariant: evstate->timer is pending, evstate->qdict is non-null
On event: store it in evstate->qdict, goto state 3
On timer: send evstate->qdict, clear evstate->qdict,
arm timer, goto state 2
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1444921716-9511-3-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
These messages are disabled by default; a perfect usecase for tracepoints.
Convert them over.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Replace error_report() and use tracing instead. It's not an error to get
a connection or a disconnection, so silence this and trace it instead.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Thibaut Collet <thibaut.collet@6wind.com>
The malloc vtable is not supported anymore in glib, because it broke
when constructors called g_malloc. Remove tracing of g_malloc,
g_realloc and g_free calls.
Note that, for systemtap users, glib also provides tracepoints
glib.mem_alloc, glib.mem_free, glib.mem_realloc, glib.slice_alloc
and glib.slice_free.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 1442417924-25831-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Add virglrenderer library detection. Add 3d mode to virtio-gpu,
wire up virglrenderer library. When in 3d mode render using the
new context management and texture scanout callbacks.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
In irqfd mode, current code attempts to set a resamplefd whatever
the type of the IRQ. For an edge-sensitive IRQ this attempt fails
and as a consequence, the whole irqfd setup fails and we fall back
to the slow mode. This patch bypasses the resamplefd setting for
non level-sentive IRQs.
Signed-off-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
This is a start on using size_t more in qemu-file and friends;
it fixes up QEMUFilePutBufferFunc and QEMUFileGetBufferFunc
to take size_t lengths and return ssize_t return values (like read(2))
and fixes up all the different implementations of them.
Note that I've not yet followed this deeply into bdrv_ implementations.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <1439463094-5394-5-git-send-email-dgilbert@redhat.com>
Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
The code that gets run at the end of the migration process
is getting large, and I'm about to add more for postcopy.
Split it into a separate function.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <1439463094-5394-3-git-send-email-dgilbert@redhat.com>
Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
In some cases, we need to disable copy-on-read, and just
read the data.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Message-id: 1441682913-14320-2-git-send-email-wency@cn.fujitsu.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
Specifying an emulated PCI vendor/device ID can be useful for testing
various quirk paths, even though the behavior and functionality of
the device with bogus IDs is fully unsupportable. We need to use a
uint32_t for the vendor/device IDs, even though the registers
themselves are only 16-bit in order to be able to determine whether
the value is valid and user set.
The same support is added for subsystem vendor/device ID, though these
have the possibility of being useful and supported for more than a
testing tool. An emulated platform might want to impose their own
subsystem IDs or at least hide the physical subsystem ID. Windows
guests will often reinstall drivers due to a change in subsystem IDs,
something that VM users may want to avoid. Of course careful
attention would be required to ensure that guest drivers do not rely
on the subsystem ID as a basis for device driver quirks.
All of these options are added using the standard experimental option
prefix and should not be considered stable.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
This is just another quirk, for reset rather than affecting memory
regions. Move it to our new quirks file.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Config windows make use of an address register and a data register.
In VGA cards, these are often used to provide real mode code in the
BIOS an easy way to access MMIO registers since the window often
resides in an I/O port register. When the MMIO register has a mirror
of PCI config space, we need to trap those accesses and redirect them
to emulated config space.
The previous version of this functionality made use of a single
MemoryRegion and single match address. This version uses separate
MemoryRegions for each of the address and data registers and allows
for multiple match addresses. This is useful for Nvidia cards which
have two ranges which index into PCI config space.
The previous implementation is left for the follow-on patch for a more
reviewable diff.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Another rework of this quirk, this time to update to the new quirk
structure. We can handle the address and data registers with
separate MemoryRegions and a quirk specific data structure, making the
code much more understandable.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
The Nvidia 0x3d0 quirk makes use of a two separate registers and gives
us our first chance to make use of separate memory regions for each to
simplify the code a bit.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
This is an easy quirk that really doesn't need a data structure if
its own. We can pass vdev as the opaque data and access to the
MemoryRegion isn't required.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Create a vendor:device ID helper that we'll also use as we rework the
rest of the quirks. Re-reading the config entries, even if we get
more blacklist entries, is trivial overhead and only incurred during
device setup. There's no need to typedef the blacklist structure,
it's a static private data type used once. The elements get bumped
up to uint32_t to avoid future maintenance issues if PCI_ANY_ID gets
used for a blacklist entry (avoiding an actual hardware match). Our
test loop is also crying out to be simplified as a for loop.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
This allows vfio_msi* tracing. The MSI/X interrupt tracing is also
pulled out of #ifdef DEBUG_VFIO to avoid a recompile for tracing this
path. A few cycles to read the message is hardly anything if we're
already in QEMU.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Rename functions and tracing callbacks so that we can trace vfio_intx*
to see all the INTx related activities.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
There's quite a bit of cleanup that can be done to the RTL8168 quirk,
as well as the tracing to prevent a spew of uninteresting accesses
for anything else the driver might choose to use the window registers
for besides the MSI-X table. There should be no functional change,
but it's now possible to get compact and useful traces by enabling
vfio_rtl8168_quirk*, ex:
vfio_rtl8168_quirk_write 0000:04:00.0 [address]: 0x1f000
vfio_rtl8168_quirk_read 0000:04:00.0 [address]: 0x8001f000
vfio_rtl8168_quirk_read 0000:04:00.0 [data]: 0xfee0100c
vfio_rtl8168_quirk_write 0000:04:00.0 [address]: 0x1f004
vfio_rtl8168_quirk_read 0000:04:00.0 [address]: 0x8001f004
vfio_rtl8168_quirk_read 0000:04:00.0 [data]: 0x0
vfio_rtl8168_quirk_write 0000:04:00.0 [address]: 0x1f008
vfio_rtl8168_quirk_read 0000:04:00.0 [address]: 0x8001f008
vfio_rtl8168_quirk_read 0000:04:00.0 [data]: 0x49b1
vfio_rtl8168_quirk_write 0000:04:00.0 [address]: 0x1f00c
vfio_rtl8168_quirk_read 0000:04:00.0 [address]: 0x8001f00c
vfio_rtl8168_quirk_read 0000:04:00.0 [data]: 0x0
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABCAAGBQJV+CwGAAoJEL6G67QVEE/fResQAKiHbjRRPjtCNjAvVixd2ewa
O39TXlgQol4EiMKgsrIJf33yaEJQIj5ElNfKOUysgcLdGfL69+XWGQ5WgoHZx40d
0Iiy8rGOTmCAQMgQYkRmJyayPTkK96jt8rl9psE0ab7JhS4CA2NbgnPWLLzVFwEx
0BJ0SgHvzIGYy0N+9aQ7lVVUUja/Ksg64/6AAPpBHMkBZkOruk132E9B0D0mL7kL
rka3OMgLpKqginKD4t3MKII1CnR5iSS2NNB/fJxVzrWK84Wv1/SbD1QnSlHPFWl6
ffeD9j3F8ihFVdi0nssxK6kHYZW+dAeC8VPxpLcnFffHiNa7yU4XGQxmMuR3F/W/
Su/R6W9JSP1dY6MCvCPjJNa2t9AW5iG0pGm4MckoZp4H6F46OPuxb0/GWoz/9prU
S7BPLoB3h7h3otmokIL2MvqlU/5lfqUhlhW7w7ZS6fTNXUT2amFlq2UJZpFuEt0b
3kAsAaGAq4wk5QB04lSbxW+u/F669L0dobu2FtOHiHECe3bihrCxk0OckzdA0fOP
kZ14jIsvagXgWG2NAMQFKKXL3OCpfbObEm+mQp6JR6y108TwdXR3XYCudVHAHyK7
GS+rhTdOtUgtQgpJG97RgdBd1nvil2dZ+NizX9DXu5EhT6le3PKijIOkq/6TLw5H
5qAYBZCGQXl1bNrmifcH
=6TWk
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/berrange/tags/vnc-crypto-v9-for-upstream' into staging
Merge vnc-crypto-v9
# gpg: Signature made Tue 15 Sep 2015 15:32:38 BST using RSA key ID 15104FDF
# gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>"
# gpg: aka "Daniel P. Berrange <berrange@redhat.com>"
* remotes/berrange/tags/vnc-crypto-v9-for-upstream:
ui: convert VNC server to use QCryptoTLSSession
ui: fix return type for VNC I/O functions to be ssize_t
crypto: introduce new module for handling TLS sessions
crypto: add sanity checking of TLS x509 credentials
crypto: introduce new module for TLS x509 credentials
crypto: introduce new module for TLS anonymous credentials
crypto: introduce new base module for TLS credentials
qom: allow QOM to be linked into tools binaries
crypto: move crypto objects out of libqemuutil.la
tests: remove repetition in unit test object deps
qapi: allow override of default enum prefix naming
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Introduce a QCryptoTLSSession object that will encapsulate
all the code for setting up and using a client/sever TLS
session. This isolates the code which depends on the gnutls
library, avoiding #ifdefs in the rest of the codebase, as
well as facilitating any possible future port to other TLS
libraries, if desired. It makes use of the previously
defined QCryptoTLSCreds object to access credentials to
use with the session. It also includes further unit tests
to validate the correctness of the TLS session handshake
and certificate validation. This is functionally equivalent
to the current TLS session handling code embedded in the
VNC server, and will obsolete it.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
If the administrator incorrectly sets up their x509 certificates,
the errors seen at runtime during connection attempts are very
obscure and difficult to diagnose. This has been a particular
problem for people using openssl to generate their certificates
instead of the gnutls certtool, because the openssl tools don't
turn on the various x509 extensions that gnutls expects to be
present by default.
This change thus adds support in the TLS credentials object to
sanity check the certificates when QEMU first loads them. This
gives the administrator immediate feedback for the majority of
common configuration mistakes, reducing the pain involved in
setting up TLS. The code is derived from equivalent code that
has been part of libvirt's TLS support and has been seen to be
valuable in assisting admins.
It is possible to disable the sanity checking, however, via
the new 'sanity-check' property on the tls-creds object type,
with a value of 'no'.
Unit tests are included in this change to verify the correctness
of the sanity checking code in all the key scenarios it is
intended to cope with. As part of the test suite, the pkix_asn1_tab.c
from gnutls is imported. This file is intentionally copied from the
(long since obsolete) gnutls 1.6.3 source tree, since that version
was still under GPLv2+, rather than the GPLv3+ of gnutls >= 2.0.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Introduce a QCryptoTLSCredsX509 class which is used to
manage x509 certificate TLS credentials. This will be
the preferred credential type offering strong security
characteristics
Example CLI configuration:
$QEMU -object tls-creds-x509,id=tls0,endpoint=server,\
dir=/path/to/creds/dir,verify-peer=yes
The 'id' value in the -object args will be used to associate the
credentials with the network services. For example, when the VNC
server is later converted it would use
$QEMU -object tls-creds-x509,id=tls0,.... \
-vnc 127.0.0.1:1,tls-creds=tls0
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Introduce a QCryptoTLSCredsAnon class which is used to
manage anonymous TLS credentials. Use of this class is
generally discouraged since it does not offer strong
security, but it is required for backwards compatibility
with the current VNC server implementation.
Simple example CLI configuration:
$QEMU -object tls-creds-anon,id=tls0,endpoint=server
Example using pre-created diffie-hellman parameters
$QEMU -object tls-creds-anon,id=tls0,endpoint=server,\
dir=/path/to/creds/dir
The 'id' value in the -object args will be used to associate the
credentials with the network services. For example, when the VNC
server is later converted it would use
$QEMU -object tls-creds-anon,id=tls0,.... \
-vnc 127.0.0.1:1,tls-creds=tls0
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Introduce a QCryptoTLSCreds class to act as the base class for
storing TLS credentials. This will be later subclassed to provide
handling of anonymous and x509 credential types. The subclasses
will be user creatable objects, so instances can be created &
deleted via 'object-add' and 'object-del' QMP commands respectively,
or via the -object command line arg.
If the credentials cannot be initialized an error will be reported
as a QMP reply, or on stderr respectively.
The idea is to make it possible to represent and manage TLS
credentials independently of the network service that is using
them. This will enable multiple services to use the same set of
credentials and minimize code duplication. A later patch will
convert the current VNC server TLS code over to use this object.
The representation of credentials will be functionally equivalent
to that currently implemented in the VNC server with one exception.
The new code has the ability to (optionally) load a pre-generated
set of diffie-hellman parameters, if the file dh-params.pem exists,
whereas the current VNC server will always generate them on startup.
This is beneficial for admins who wish to avoid the (small) time
sink of generating DH parameters at startup and/or avoid depleting
entropy.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add a reason to grab calls and trace points,
so it is easier to debug grab related ui issues.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJV8bU4AAoJEIlPj0hw4a6QIuUP/2zKkoU+KAO1/V5f2WBTwzZc
8X/t+yGMRaQS9ibWldg/kLJ+uqHt1O0XUDyoLFK03jfBd3bJDpGuVAKe39XQmNov
y0f+ytGDtLCRglBw2jJT1tu29y3GbCXYxLKLj9vHEoCt4OEdh5xQlwK5ZkzT+SOF
Qxnx+5rWMb3xnzxlfg354IJ0AGq1qZemkdhqwUJ66/mFKGRxjavn1cCqcb93tbMU
UYKdEkoATRPRrTIhLepUnb3x3fMtlKgZJdqpVDQ3+mwXLGa2C31qJe1h/ac8HVCj
1Rqj8h4va23LntOLS3AIYQcfDjDj1AQbfVKhpZzkYce3kPkXmJ+JwJ6CMQch0Bgw
bD6q8/5sJ30Weyi0Yp+ZjVWH2LVXYguf1csPw510c+ZJIsYTDv+AxF63hVmmdp8G
8B5YHhVMKkUtgrammdardjFBhl2XF+zn072RMh6KBAruI7YBAxo0hbRjoy2EWx0h
Z93VgcBZ6n6iYNlxpQ8kNxbdnJXo4mgHMBTTe9aOkfXArvllrfJZIWsi5aScrqbb
aP5RbFCoRWJVA2qOWywJL8W+rLtTK9244yuqwbhaxcBVw8/fH8VhJD2XxS7yozxS
LZwoYO7pjLpqwfnnqtnXOVjWD7aVlEGKWQSe7EV9wIDPrSU/RpBhP09kIu1yCqgM
Qki6v4d94v3S5Ounwl4n
=7+ii
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/sstabellini/tags/xen-2015-09-10-tag' into staging
xen-2015-09-10
# gpg: Signature made Thu 10 Sep 2015 17:52:08 BST using RSA key ID 70E1AE90
# gpg: Good signature from "Stefano Stabellini <stefano.stabellini@eu.citrix.com>"
* remotes/sstabellini/tags/xen-2015-09-10-tag: (29 commits)
xen/pt: Don't slurp wholesale the PCI configuration registers
xen/pt: Check for return values for xen_host_pci_[get|set] in init
xen/pt: Move bulk of xen_pt_unregister_device in its own routine.
xen/pt: Make xen_pt_unregister_device idempotent
xen/pt: Log xen_host_pci_get/set errors in MSI code.
xen/pt: Log xen_host_pci_get in two init functions
xen/pt: Remove XenPTReg->data field.
xen/pt: Check if reg->init function sets the 'data' past the reg->size
xen/pt: Sync up the dev.config and data values.
xen/pt: Use xen_host_pci_get_[byte|word] instead of dev.config
xen/pt: Use XEN_PT_LOG properly to guard against compiler warnings.
xen/pt/msi: Add the register value when printing logging and error messages
xen: use errno instead of rc for xc_domain_add_to_physmap
xen/pt: xen_host_pci_config_read returns -errno, not -1 on failure
xen/pt: Make xen_pt_msi_set_enable static
xen/pt: Update comments with proper function name.
xen/HVM: atomically access pointers in bufioreq handling
xen-hvm: When using xc_domain_add_to_physmap also include errno when reporting
xen, gfx passthrough: add opregion mapping
xen, gfx passthrough: register host bridge specific to passthrough
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The current trace prototypes and (matching) trace calls lead to
"unorthodox" PCI BDF notation in at least the stderr trace backend. For
example, the four BARs of a QXL video card at 00:01.0 (bus 0, slot 1,
function 0) are traced like this (PID and timestamps removed):
pci_update_mappings_add d=0x7f14a73bf890 00:00.1 0,0x84000000+0x4000000
pci_update_mappings_add d=0x7f14a73bf890 00:00.1 1,0x80000000+0x4000000
pci_update_mappings_add d=0x7f14a73bf890 00:00.1 2,0x88200000+0x2000
pci_update_mappings_add d=0x7f14a73bf890 00:00.1 3,0xd060+0x20
The slot and function values are in reverse order.
Stick with the conventional BDF notation.
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Don Koch <dkoch@verizon.com>
Cc: qemu-trivial@nongnu.org
Fixes: 7828d75045
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
s390 guest initialization is modified to make use of new s390-storage-keys
device. Old code that globally allocated storage key array is removed.
The new device enables storage key access for kvm guests.
Cache storage key QOM objects in frequently used helper functions to avoid a
performance hit every time we use one of these functions.
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
If mirror has more free buffers than IOV_MAX, preadv(2)/pwritev(2)
EINVAL failures may be encountered.
It is possible to trigger this by setting granularity to a low value
like 8192.
This patch stops appending chunks once IOV_MAX is reached.
The spurious EINVAL failure can be reproduced with a qcow2 image file
and the following QMP invocation:
qmp.command('drive-mirror', device='virtio0', target='/tmp/r7.s1',
granularity=8192, sync='full', mode='absolute-paths',
format='raw')
While the guest is running dd if=/dev/zero of=/var/tmp/foo oflag=direct
bs=4k.
Cc: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1435761950-26714-1-git-send-email-stefanha@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
Drop .can_receive and move the semantics into minimac2_rx, by returning
0.
That is once minimac2_rx returns 0, incoming packets will be queued
until the queue is explicitly flushed. We do this when s->regs[R_STATE0]
or s->regs[R_STATE1] is changed in minimac2_write.
Also drop the unused trace point.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1436955553-22791-9-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
To make sections optional, we need to do it at the beggining of the code.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
This includes a new section that for now just stores the current qemu state.
Right now, there are only one way to control what is the state of the
target after migration.
- If you run the target qemu with -S, it would start stopped.
- If you run the target qemu without -S, it would run just after migration finishes.
The problem here is what happens if we start the target without -S and
there happens one error during migration that puts current state as
-EIO. Migration would ends (notice that the error happend doing block
IO, network IO, i.e. nothing related with migration), and when
migration finish, we would just "continue" running on destination,
probably hanging the guest/corruption data, whatever.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Use the order of incoming RAMBlocks from the source to record
an index number; that then allows us to sort the destination
local RAMBlock list to match the source.
Now that the RAMBlocks are known to be in the same order, this
simplifies the RDMA Registration step which previously tried to
match RAMBlocks based on offset (which isn't guaranteed to match).
Looking at the existing compress code, I think it was erroneously
relying on an assumption of matching ordering, which this fixes.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
In the next patch we remove the hash on the destination,
rdma_delete_block does two things with the hash which can be avoided:
a) The caller passes the offset and rdma_delete_block looks it up
in the hash; fixed by getting the caller to pass the block
b) The hash gets recreated after deletion; fixed by making that
conditional on the hash being initialised.
While this function is currently only used during cleanup, Michael
asked that we keep it general for future dynamic block registration
work.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
We need the names of RAMBlocks as they're loaded for RDMA,
reuse a slightly modified ram_control_load_hook:
a) Pass a 'data' parameter to use for the name in the block-reg
case
b) Only some hook types now require the presence of a hook function.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
In a later patch the block name will be used to match up two views
of the block list. Keep a copy of the block name with the local block
list.
(At some point it could be argued that it would be best just to let
migration see the innards of RAMBlock and avoid the need to use
foreach).
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael R. Hines <mrhines@us.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
This patch aims at optimizing IRQ handling using irqfd framework.
Instead of handling the eventfds on user-side they are handled on
kernel side using
- the KVM irqfd framework,
- the VFIO driver virqfd framework.
the virtual IRQ completion is trapped at interrupt controller
This removes the need for fast/slow path swap.
Overall this brings significant performance improvements.
Signed-off-by: Alvise Rigo <a.rigo@virtualopensystems.com>
Signed-off-by: Eric Auger <eric.auger@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Vikram Sethi <vikrams@codeaurora.org>
Acked-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Changes:
* improve dp8393x network card and rc4030 chipset emulation
* support misaligned R6 and MSA memory accesses
* support MIPS eXtended and Large Physical Addressing
* add Config5.FRE bit and ERETNC instruction (Config5.LLB)
* support ememsize on MALTA
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJVeppzAAoJEFIRjjwLKdprnk8H/1owSOreh0sMFbosvqlEhjXl
lvjjuprWMdX+8M1JlaDvTbw6+LDB3Rihp3A6/I9A0GFiZaORmPzg7efULAI1H6ST
0HfxMAO17eW+PJ3lvk0HidNDr01+RzTvwpizrHgQ9WJubJv0xREU+YG5yn1gPS4N
aMMTKCAQFDba7iQQLKXUYvLz76+xyzW4VIvHVLx/SU86yPg9T7CwLpppipR8+zY5
3BC4NUw/xLlS0LCYQGM8XmYgBiQ6lEAz/Y29bGlUg+LeYysjSgNSeoNbOs1M3kQp
X0Hn7b28I1CjM2wZQ9GkT/ig+jhMvw27motnAe8vKood4ytfcor+dCCS13sE8fg=
=F564
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/lalrae/tags/mips-20150612' into staging
MIPS patches 2015-06-12
Changes:
* improve dp8393x network card and rc4030 chipset emulation
* support misaligned R6 and MSA memory accesses
* support MIPS eXtended and Large Physical Addressing
* add Config5.FRE bit and ERETNC instruction (Config5.LLB)
* support ememsize on MALTA
# gpg: Signature made Fri Jun 12 09:38:11 2015 BST using RSA key ID 0B29DA6B
# gpg: Good signature from "Leon Alrae <leon.alrae@imgtec.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 8DD3 2F98 5495 9D66 35D4 4FC0 5211 8E3C 0B29 DA6B
* remotes/lalrae/tags/mips-20150612: (29 commits)
target-mips: enable XPA and LPA features
target-mips: remove misleading comments in translate_init.c
target-mips: add MTHC0 and MFHC0 instructions
target-mips: add CP0.PageGrain.ELPA support
target-mips: support Page Frame Number Extension field
target-mips: extend selected CP0 registers to 64-bits in MIPS32
target-mips: correct MFC0 for CP0.EntryLo in MIPS64
net/dp8393x: fix hardware reset
net/dp8393x: correctly reset in_use field
net/dp8393x: add load/save support
net/dp8393x: add PROM to store MAC address
net/dp8393x: QOM'ify
net/dp8393x: use dp8393x_ prefix for all functions
net/dp8393x: do not use old_mmio accesses
net/dp8393x: always calculate proper checksums
dma/rc4030: convert to QOM
dma/rc4030: use trace events instead of custom logging
dma/rc4030: document register at offset 0x210
dma/rc4030: do not use old_mmio accesses
dma/rc4030: use AddressSpace and address_space_rw in users
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Split qemu_savevm_state_begin to:
qemu_savevm_state_header That writes the initial file header.
qemu_savevm_state_begin That sets up devices and does the first
device pass.
Used later in postcopy.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
For historic reasons, ram migration have been on arch_init.c. Just
split it into migration/ram.c, the same that happened with block.c.
There is only code movement, no changes altogether.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
This patch adds the core code for virtio gpu emulation,
covering 2d support.
Written by Dave Airlie and Gerd Hoffmann.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Exit with an error (instead of simply logging a trace event)
whenever the same fw_cfg file name is added multiple times via
one of the fw_cfg_add_file[_callback]() host-side API calls.
Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
From this point forward, any guest-side writes to the fw_cfg
data register will be treated as no-ops. This patch also removes
the unused host-side API function fw_cfg_add_callback(), which
allowed the registration of a callback to be executed each time
the guest completed a full overwrite of a given fw_cfg data item.
Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
This patch adds the code requested to assign interrupts to
a guest. The interrupts are mediated through user handled
eventfds only.
Signed-off-by: Eric Auger <eric.auger@linaro.org>
Tested-by: Vikram Sethi <vikrams@codeaurora.org>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Minimal VFIO platform implementation supporting register space
user mapping but not IRQ assignment.
Signed-off-by: Kim Phillips <kim.phillips@linaro.org>
Signed-off-by: Eric Auger <eric.auger@linaro.org>
Tested-by: Vikram Sethi <vikrams@codeaurora.org>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
This is to reduce VIO noise while debugging PCI DMA.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Introduce a preliminary framework in virt-acpi-build.c with the main
ACPI build functions. It exposes the generated ACPI contents to
guest over fw_cfg.
The required ACPI v5.1 tables for ARM are:
- RSDP: Initial table that points to XSDT
- RSDT: Points to FADT GTDT MADT tables
- FADT: Generic information about the machine
- GTDT: Generic timer description table
- MADT: Multiple APIC description table
- DSDT: Holds all information about system devices/peripherals, pointed by FADT
Signed-off-by: Shannon Zhao <zhaoshenglong@huawei.com>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-id: 1432522520-8068-5-git-send-email-zhaoshenglong@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When memory hot unplug fails, this patch adds support to send
QMP event to notify mgmt about this failure.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
- implements QEMU hardware part of memory hot unplug protocol
described at "docs/spec/acpi_mem_hotplug.txt"
- handles memory remove notification event
- handles device eject notification
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch adds tracing code for all SIGP orders (including the destination
vcpu and the resulting condition code).
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Message-Id: <1424783731-43426-6-git-send-email-jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
* remotes/qmp-unstable/queue/qmp:
docs: add memory-hotplug.txt
qemu-options.hx: improve -m description
virtio-balloon: Add some trace events
virtio-balloon: Fix balloon not working correctly when hotplug memory
pc-dimm: add a function to calculate VM's current RAM size
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
It looks like the dtrace trace code gets upset if you have trace names
with __ in, which the migration/rdma.c code does.
Rename the functions and the associated traces.
Fixes: 733252deb8
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reported-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Andreas Färber <afaerber@suse.de>
Message-id: 1424105885-12149-1-git-send-email-dgilbert@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Add trace calls. Convert some #ifdef DEBUG printfs to trace.
Signed-off-by: Don Koch <dkoch@verizon.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
this patch finally introduces multiread support to virtio-blk. While
multiwrite support was there for a long time, read support was missing.
The complete merge logic is moved into virtio-blk.c which has
been the only user of request merging ever since. This is required
to be able to merge chunks of requests and immediately invoke callbacks
for those requests. Secondly, this is required to switch to
direct invocation of coroutines which is planned at a later stage.
The following benchmarks show the performance of running fio with
4 worker threads on a local ram disk. The numbers show the average
of 10 test runs after 1 run as warmup phase.
| 4k | 64k | 4k
MB/s | rd seq | rd rand | rd seq | rd rand | wr seq | wr rand
--------------+--------+---------+--------+---------+--------+--------
master | 1221 | 1187 | 4178 | 4114 | 1745 | 1213
multiread | 1829 | 1189 | 4639 | 4110 | 1894 | 1216
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Turn all the D/DD/DDDPRINTFs into trace events
Turn most of the fprintf(stderr, into error_report
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Mostly on the load side, so that when we get a complaint about
a migration failure we can figure out what it didn't like.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
A bunch of fixes all over the place. Also, beginning to generalize acpi build
code for reuse by ARM.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJUx465AAoJECgfDbjSjVRpzewIAI/tzV1oCR1D/YDYBYpiK68W
85JJbyR90DpS9unjrkeUHEnJgkegCk8dMXlWJOlshpwxDw2khC2ol0yS6siwC6Z/
1peL9E5zHz2H8KWfH6JlhqLETovZxjd5Uv3q1mWULvK+zZcPzeQDCky5I8mbEw4b
0LGDGX8mcLlDnit9mnAbgHu7cbqGa0jtXoJTFveKdxQtHdyj4cAg0wCjOLhnEo6s
fJP7K1TJ2Ptiwwlk2cnj8T4Z9AoJkWjpFfr94dST2KqR3z5j8OUZYYhifrZa3e8t
qxO/UatY4IwSnsmWCn/hvzlHZFa03sc9nPIkAlj96j78sHqPafDJxCdnwlX8pF8=
=Kfwr
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pci, pc, virtio fixes and cleanups
A bunch of fixes all over the place. Also, beginning to generalize acpi build
code for reuse by ARM.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Tue 27 Jan 2015 13:12:25 GMT using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
* remotes/mst/tags/for_upstream:
pc-dimm: Add Error argument to pc_existing_dimms_capacity
pc-dimm: Make pc_existing_dimms_capacity global
pc: Fix DIMMs capacity calculation
smbios: Don't report unknown CPU speed (fix SVVP regression)
smbios: Fix dimm size calculation when RAM is multiple of 16GB
bios-linker-loader: move source to common location
bios-linker-loader: move header to common location
virtio: fix feature bit checks
bios-tables-test: split piix4 and q35 tests
acpi: build_append_nameseg(): add padding if necessary
acpi: update generated hex files
acpi-test: update expected DSDT
pc: acpi: fix WindowsXP BSOD when memory hotplug is enabled
pci: Split pcie_host_mmcfg_map()
Add some trace calls to pci.c.
ich9: add disable_s3, disable_s4, s4_val properties
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The ioreq-server API added to Xen 4.5 offers better security than
the existing Xen/QEMU interface because the shared pages that are
used to pass emulation request/results back and forth are removed
from the guest's memory space before any requests are serviced.
This prevents the guest from mapping these pages (they are in a
well known location) and attempting to attack QEMU by synthesizing
its own request structures. Hence, this patch modifies configure
to detect whether the API is available, and adds the necessary
code to use the API if it is.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
A new common module is created. It implements all functions
that have no device specificity (PCI, Platform).
This patch only consists in move (no functional changes)
Signed-off-by: Kim Phillips <kim.phillips@linaro.org>
Signed-off-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
vfio_get_device now takes a VFIODevice as argument. The function is split
into 2 parts: vfio_get_device which is generic and vfio_populate_device
which is bus specific.
3 new fields are introduced in VFIODevice to store dev_info.
vfio_put_base_device is created.
Signed-off-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
This structure is going to be shared by VFIOPCIDevice and
VFIOPlatformDevice. VFIOBAR includes it.
vfio_eoi becomes an ops of VFIODevice specialized by parent device.
This makes possible to transform vfio_bar_write/read into generic
vfio_region_write/read that will be used by VFIOPlatformDevice too.
vfio_mmap_bar becomes vfio_map_region
Signed-off-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
This patch removes all DPRINTF and replace them by trace points.
A few DPRINTF used in error cases were transformed into error_report.
Signed-off-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
MSI-X works slightly different than INTx; the doorbell
registers are not necessarily used as MSI-X interrupts
are directed anyway. So the head pointer on the
reply queue needs to be updated as soon as a frame
is completed, and we can set the doorbell only
when in INTx mode.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Windows requires the frames to be unmapped, otherwise we run
into a race condition where the updated frame data is not
visible to the guest.
With that we can simplify the queue algorithm and use a bitmap
for tracking free frames.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Improve queue logging by displaying head and tail pointer
of the completion queue.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The windows driver is sending several init_firmware commands
when in MSI-X mode. It is, however, using only the first
queue. So disregard any additional init_firmware commands
until the HBA is reset.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The EFI firmware doesn't handle unit attentions properly,
so we need to clear the Power On/Reset unit attention upon
initial reset.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
To ease debugging we should be decoding
the register names.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The trace events already contain the function name, so the actual
message doesn't need to contain any of these informations.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Device models should access their block backends only through the
block-backend.h API. Convert them, and drop direct includes of
inappropriate headers.
Just four uses of BlockDriverState are left:
* The Xen paravirtual block device backend (xen_disk.c) opens images
itself when set up via xenbus, bypassing blockdev.c. I figure it
should go through qmp_blockdev_add() instead.
* Device model "usb-storage" prompts for keys. No other device model
does, and this one probably shouldn't do it, either.
* ide_issue_trim_cb() uses bdrv_aio_discard() instead of
blk_aio_discard() because it fishes its backend out of a BlockAIOCB,
which has only the BlockDriverState.
* PC87312State has an unused BlockDriverState[] member.
The next two commits take care of the latter two.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Let QEMU propagate the cpu state to kvm. If kvm doesn't yet support it, it is
silently ignored as kvm will still handle the cpu state itself in that case.
The state is not synced back, thus kvm won't have a chance to actively modify
the cpu state. To do so, control has to be given back to QEMU (which is already
done so in all relevant cases).
Setting of the cpu state can fail either because kvm doesn't support the
interface yet, or because the state is invalid/not supported. Failed attempts
will be traced
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
CC: Andreas Faerber <afaerber@suse.de>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
This patch makes sure that halting a cpu and stopping a cpu are two different
things. Stopping a cpu will also set the cpu halted - this is needed for common
infrastructure to work (note that the stop and stopped flag cannot be used for
our purpose because they are already used by other mechanisms).
A cpu can be halted ("waiting") when it is operating. If interrupts are
disabled, this is called a "disabled wait", as it can't be woken up anymore. A
stopped cpu is treated like a "disabled wait" cpu, but in order to prepare for a
proper cpu state synchronization with the kvm part, we need to track the real
logical state of a cpu.
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
CC: Andreas Faerber <afaerber@suse.de>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
kvmclock fix. This was reverted last minute for 2.1, but it is now back
with the problematic case fixed.
Note: I will soon switch to a subkey for signing purposes. To verify
future signed pull requests from me, please update my key with
"gpg --recv-keys 9B4D86F2". You should see 3 new subkeys---the
one for signing will be a 2048-bit RSA key, 4E6B09D7.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABAgAGBQJUJXmEAAoJEBvWZb6bTYbyuZoP+gJScjcoHVL27YL6GqCsLzQ+
swk/6TDaboR5+t3Tt1vHOWOvtXmoZlxp+sQU50ltiKS1mwRTsMAhEeTInTNPLJ6z
aKX46QdvNib2bYdUgnl8nqOiKQAZiA6rR8os2lWJ3E8n1n6qBbeWwneQK2YvRaVf
RMjmDm7sjhZt5h868O7SPNDSjk8k5TgYj5MOsDugFxQr6JXGxt8xmB+Ml6v6Ne+y
ufDchXtBhANIkr1aNMeOBYdX+/wuY+fltjnG11xpuAL1a63F/b3EGvPoqUHQCZSd
PzHmkSZV0z3rmgVTpcAHxSw2MxUEgmXLvxWAQTahrzGUi8mH3hF7z7zNqfULZWa1
3hctUdlf9aWbIgdfEdBZqOjol58TYvgCjIa7W2upsvEf7sjctpCZnMmBi8vMM3FQ
qv7S4139jN70+7zNY8CCnXPuOY63rsFTAs8XCPF7DL3HESzm7x8MEFgW7/7iJtwQ
PNJVbJX8+FpDmdAy/fwkMKA8BvjmNBuIZrr9/IfFJ6uM70//brq3OXq25agToBpZ
QYWW5CXYeQXurPBK1Q6XkGx61mNuc5zOveVS+8MhJKw9SiL/cIhQzrh3fgFtIP+e
63LlufFPMA2+fhBJctptIoU9+XeUwalQic92ZqepuXB/OZV6vhhV9/K2zZMUMpl9
B/Gg3cd1w5rMHg6O2kov
=nP/e
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
Usual mix of patches, the most important being Alex and Marcelo's
kvmclock fix. This was reverted last minute for 2.1, but it is now back
with the problematic case fixed.
Note: I will soon switch to a subkey for signing purposes. To verify
future signed pull requests from me, please update my key with
"gpg --recv-keys 9B4D86F2". You should see 3 new subkeys---the
one for signing will be a 2048-bit RSA key, 4E6B09D7.
# gpg: Signature made Fri 26 Sep 2014 15:34:44 BST using RSA key ID 9B4D86F2
# gpg: Good signature from "Paolo Bonzini <pbonzini@redhat.com>"
# gpg: aka "Paolo Bonzini <bonzini@gnu.org>"
* remotes/bonzini/tags/for-upstream:
kvm/valgrind: don't mark memory as initialized
po: fix conflict with %.mo rule in rules.mak
kvmvapic: fix migration when VM paused and when not running Windows
serial: check if backed by a physical serial port at realize time
serial: reset state at startup
target-i386: update fp status fix
hw/dma/i8257: Silence phony error message
kvmclock: Ensure time in migration never goes backward
kvmclock: Ensure proper env->tsc value for kvmclock_current_nsec calculation
Introduce cpu_clean_all_dirty
pit: fix pit interrupt can't inject into vm after migration
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This exceeded the trace argument limit for LTTNG UST and wasn't really
needed as the flags value is stored anyway. Dropping this fixes the
compile failure for UST. It can probably be merged with the previous
trace shortening patch.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Recent traces rework introduced 2 tracepoints with 13 and 20
arguments. When dtrace backend is selected
(--enable-trace-backend=dtrace), compile fails as
sys/sdt.h defines DTRACE_PROBE up to DTRACE_PROBE12 only.
This splits long tracepoints.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
A few files have been renamed without updating their comment here. A
few events have been added in the wrong place. Clean that up.
Comments with no space after the '#' look ugly and confuse
cleanup-trace-events.pl. Insert a space.
scripts/cleanup-trace-events.pl is now happy again.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1411476811-24251-5-git-send-email-armbru@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Event monitor_protocol_event is unused since commit 7517517. Drop it.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1411476811-24251-4-git-send-email-armbru@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Event megasas_io_read was added in commit e8f943c, but never used.
Drop it.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1411476811-24251-3-git-send-email-armbru@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
iscsi_aio_write16_cb, iscsi_aio_writev, iscsi_aio_read16_cb, and
iscsi_aio_readv have not not been in use since commit
063c3378a9 ("block/iscsi: introduce
bdrv_co_{readv, writev, flush_to_disk}").
These were the only trace events in block/iscsi.c so drop the the
trace.h include.
Cc: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1411394595-15300-4-git-send-email-stefanha@redhat.com
This trace event was added in commit
840a178c94 ("usb: mtp filesharing") but
never used.
Cc: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1411394595-15300-3-git-send-email-stefanha@redhat.com
This trace event has not been in use since commit
b002254dbd ("virtio-blk: Unify
{non-,}dataplane's request handlings").
Cc: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1411394595-15300-2-git-send-email-stefanha@redhat.com
This converts many kinds of debug prints to traces.
This implements packets logging to avoid unnecessary calculations if
usb_ohci_td_pkt_short/usb_ohci_td_pkt_long is not enabled.
This makes OHCI errors (such as "DMA error") invisible by default.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Convert into trace event. Otherwise the message
dma: unregistered DMA channel used nchan=0 dma_pos=0 dma_len=1
gets printed every time and fills up the log-file with 50 MiB / minute.
Signed-off-by: Philipp Hahn <hahn@univention.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
With this patch the qemu console core stops using PixelFormat and pixman
format codes side-by-side, pixman format code is the primary way to
specify the DisplaySurface format:
* DisplaySurface stops carrying a PixelFormat field.
* qemu_create_displaysurface_from() expects a pixman format now.
Functions to convert PixelFormat to pixman_format_code_t (and back)
exist for those who still use PixelFormat. As PixelFormat allows
easy access to masks and shifts it will probably continue to exist.
[ xenfb added by Benjamin Herrenschmidt ]
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Add some trace events to virtio-rng for easier debugging
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Amos Kong <akong@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This adds a couple of tcg specific trace-events which are useful for
tracing execution though tcg generated blocks. It's been tested with
lttng user space tracing but is generic enough for all systems. The tcg
events are:
* translate_block - when a subject block is translated
* exec_tb - when a translated block is entered
* exec_tb_exit - when we exit the translated code
* exec_tb_nocache - special case translations
Of course we can only trace the entrance to the first block of a chain
as each block will jump directly to the next when it can. See the -d
nochain patch to allow more complete tracing at the expense of
performance.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
We have the experience that the guest doesn't stop successfully
though it was instructed to shut down.
The root cause may be not in QEMU mostly. However, QEMU is often
suspected at the beginning just because the issue occurred in
virtualization environment.
Therefore, we need to affirm that QEMU received the shutdown
request and raised ACPI irq from "virsh shutdown" command,
virt-manger or stopping QEMU process to the VM .
So that we can affirm the problems was belonged to the Guset OS
rather than the QEMU itself.
When we stop guests by "virsh shutdown" command or virt-manger,
or stopping QEMU process, qemu_system_powerdown_request() or
qemu_system_shutdown_request() is called. Then the below functions
in main_loop_should_exit() of Vl.c are called roughly in the
following order.
if (qemu_powerdown_requested())
qemu_system_powerdown()
monitor_protocol_event(QEVENT_POWERDOWN, NULL)
OR
if(qemu_shutdown_requested()}
monitor_protocol_event(QEVENT_SHUTDOWN, NULL);
The tracepoint of monitor_protocol_event() already exists, but no
tracepoints are defined for qemu_system_powerdown_request() and
qemu_system_shutdown_request(). So this patch adds two tracepoints for
the two functions. We believe that it will become much easier to
isolate the problem mentioned above by these tracepoints.
Signed-off-by: Yang Zhiyong <yangzy.fnst@cn.fujitsu.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Currently SPAPR PHB keeps track of all allocated MSI (here and below
MSI stands for both MSI and MSIX) interrupt because
XICS used to be unable to reuse interrupts. This is a problem for
dynamic MSI reconfiguration which happens when guest reloads a driver
or performs PCI hotplug. Another problem is that the existing
implementation can enable MSI on 32 devices maximum
(SPAPR_MSIX_MAX_DEVS=32) and there is no good reason for that.
This makes use of new XICS ability to reuse interrupts.
This reorganizes MSI information storage in sPAPRPHBState. Instead of
static array of 32 descriptors (one per a PCI function), this patch adds
a GHashTable when @config_addr is a key and (first_irq, num) pair is
a value. GHashTable can dynamically grow and shrink so the initial limit
of 32 devices is gone.
This changes migration stream as @msi_table was a static array while new
@msi_devs is a dynamic hash table. This adds temporary array which is
used for migration, it is populated in "spapr_pci"::pre_save() callback
and expanded into the hash table in post_load() callback. Since
the destination side does not know the number of MSI-enabled devices
in advance and cannot pre-allocate the temporary array to receive
migration state, this makes use of new VMSTATE_STRUCT_VARRAY_ALLOC macro
which allocates the array automatically.
This resets the MSI configuration space when interrupts are released by
the ibm,change-msi RTAS call.
This fixed traces to be more informative.
This changes vmstate_spapr_pci_msi name from "...lsi" to "...msi" which
was incorrect by accident. As the internal representation changed,
thus bumps migration version number.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[agraf: drop g_malloc_n usage]
Signed-off-by: Alexander Graf <agraf@suse.de>
This implements interrupt release function so IRQs can be returned back
to the pool for reuse in cases such as PCI hot plug.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
The current allocator returns IRQ numbers from a pool and does not
support IRQs reuse in any form as it did not keep track of what it
previously returned, it only keeps the last returned IRQ. Some use
cases such as PCI hot(un)plug may require IRQ release and reallocation.
This moves an allocator from SPAPR to XICS.
This switches IRQ users to use new API.
This uses LSI/MSI flags to know if interrupt is allocated.
The interrupt release function will be posted as a separate patch.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
Add mhp_pc_dimm_assigned_slot & mhp_pc_dimm_assigned_address
events to trace which address and slot where assigned to
plugged in PC_DIMM device on target-i386 machine.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add events for tracing accesses to memory hotplug IO ports.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently only single TCE entry per request is supported (H_PUT_TCE).
However PAPR+ specification allows multiple entry requests such as
H_PUT_TCE_INDIRECT and H_STUFF_TCE. Having less transitions to the host
kernel via ioctls, support of these calls can accelerate IOMMU operations.
This implements H_STUFF_TCE and H_PUT_TCE_INDIRECT.
This advertises "multi-tce" capability to the guest if the host kernel
supports it (KVM_CAP_SPAPR_MULTITCE) or guest is running in TCG mode.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
Modern Linux kernels support last POWERPC CPUs so when a kernel boots,
in most cases it can find a matching cpu_spec in the kernel's cpu_specs
list. However if the kernel is quite old, it may be missing a definition
of the actual CPU. To provide an ability for old kernels to work on modern
hardware, a Processor Compatibility Mode has been introduced
by the PowerISA specification.
>From the hardware prospective, it is supported by the Processor
Compatibility Register (PCR) which is defined in PowerISA. The register
enables one of the compatibility modes (2.05/2.06/2.07).
Since PCR is a hypervisor privileged register and cannot be
directly accessed from the guest, the mode selection is done via
ibm,client-architecture-support (CAS) RTAS call using which the guest
specifies what "raw" and "architected" CPU versions it supports.
QEMU works out the best match, changes a "cpu-version" property of
every CPU and notifies the guest about the change by setting these
properties in the buffer passed as a response on a custom H_CAS hypercall.
This implements ibm,client-architecture-support parameters parsing
(now only for PVRs) and cooks the device tree diff with new values for
"cpu-version", "ibm,ppc-interrupt-server#s" and
"ibm,ppc-interrupt-server#s" properties.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
The PAPR+ specification defines a ibm,client-architecture-support (CAS)
RTAS call which purpose is to provide a negotiation mechanism for
the guest and the hypervisor to work out the best compatibility parameters.
During the negotiation process, the guest provides an array of various
options and capabilities which it supports, the hypervisor adjusts
the device tree and (optionally) reboots the guest.
At the moment the Linux guest calls CAS method at early boot so SLOF
gets called. SLOF allocates a memory buffer for the device tree changes
and calls a custom KVMPPC_H_CAS hypercall. QEMU parses the options,
composes a diff for the device tree, copies it to the buffer provided
by SLOF and returns to SLOF. SLOF updates the device tree and returns
control to the guest kernel. Only then the Linux guest parses the device
tree so it is possible to avoid unnecessary reboot in most cases.
The device tree diff is a header with an update format version
(defined as 1 in this patch) followed by a device tree with the properties
which require update.
If QEMU detects that it has to reboot the guest, it silently does so
as the guest expects reboot to happen because this is usual pHyp firmware
behavior.
This defines custom KVMPPC_H_CAS hypercall. The current SLOF already
has support for it.
This implements stub which returns very basic tree (root node,
no properties) to the guest.
As the return buffer does not contain any change, no change in behavior is
expected.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
This allows guests to have a different timebase origin from the host.
This is needed for migration, where a guest can migrate from one host
to another and the two hosts might have a different timebase origin.
However, the timebase seen by the guest must not go backwards, and
should go forwards only by a small amount corresponding to the time
taken for the migration.
This is only supported for recent POWER hardware which has the TBU40
(timebase upper 40 bits) register. That includes POWER6, 7, 8 but not
970.
This adds kvm_access_one_reg() to access a special register which is not
in env->spr. This requires kvm_set_one_reg/kvm_get_one_reg patch.
The feature must be present in the host kernel.
This bumps vmstate_spapr::version_id and enables new vmstate_ppc_timebase
only for it. Since the vmstate_spapr::minimum_version_id remains
unchanged, migration from older QEMU is supported but without
vmstate_ppc_timebase.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
Exploit the new api for userspace-controlled cmma. If supported, enable
cmma during kvm initialization and register a reset handler for cmma,
which is also called directly from the load IPL code.
The reset functionality is needed to reset the cmma state of the guest
pages, e.g. if a system reset is triggered via qemu monitor; otherwise
this could result in data corruption.
A guest triggered reboot may now lead to multiple cmma resets; this is
OK, however, as this is slowpath anyway and the simplest way to achieve
the intended effects.
Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
* remotes/bonzini/scsi-next:
[PATCH] block/iscsi: bump year in copyright notice
block/iscsi: allow cluster_size of 4K and greater
block/iscsi: clarify the meaning of ISCSI_CHECKALLOC_THRES
block/iscsi: speed up read for unallocated sectors
block/iscsi: allow fall back to WRITE SAME without UNMAP
MAINTAINERS: mark megasas as maintained
megasas: Add MSI support
megasas: Enable MSI-X support
megasas: Implement LD_LIST_QUERY
scsi: Improve error messages more
scsi-disk: Improve error messager if can't get version number
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Nasty 0xe0 logic is gone. We map through QKeyCode now, giving us a
nice, readable mapping table.
Quick smoke test in OpenFirmware looks ok. Careful check from arch
maintainers would be very nice, especially on the capslock and numlock
logic. I'm not fully sure whenever I got it translated correctly and
also what it is supposed to do in the first place ...
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
s390x introduced helper functions for getting/setting one_regs with
commit 860643bc. However, nothing about these is s390-specific.
Alexey Kardashevskiy had already posted a general version, so let's
merge the two patches and massage the code a bit.
CC: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some hardware instances do support MSI, so we should do likewise.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Newer firmware implement a LD_LIST_QUERY command, and due to a driver
issue no drives might be detected if this command isn't supported.
So add emulation for this command, too.
Cc: qemu-stable@nongnu.org
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some ONE_REGS on s390 are not protected by a capability. Older kernels
might not provide those and return an error. Fortunately these registers
are only critical for the migration path. There is no need to error out
on reset and normal runtime. Furthermore, these kernels don't provide
a proper dirty bitmap anyway, so let's use tracing for those errors.
Also provide generic one reg helper to simplify the code.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Implementation of a USB Media Transfer Device device for easy
filesharing. Read-only. No access control inside qemu, it will
happily export any file it is able to open to the guest, i.e.
standard unix access rights for the qemu process apply.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
This replaces DPRINTF macro with tracepoints.
This moves some messages from migration.c to savevm.c.
This adds tracepoint to signal about fileds failed to migrate.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
The throttling delay calculation was using an inaccurate sector count to
calculate the time to sleep. This broke rate-limiting for the block
mirror job.
Move the delay calculation into mirror_iteration() where we know how
many sectors were transferred. This lets us calculate an accurate delay
time.
Reported-by: Joaquim Barrera <jbarrera@ac.upc.edu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This adds @idstr to savevm_section_start and savevm_section_end
tracepoints.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
It might be useful for tracing migration.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Handle the new CCW_CMD_SET_IND_ADAPTER command enabling adapter interrupts
on guest request. When active, host->guest notifications will be handled
via global_indicator -> queue indicators instead of queue indicators +
subchannel I/O interrupt. Indicators for virtqueues may be present at an
offset.
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
This patch introduces the hypervisor call H_GET_TCE which is basically the
reverse of H_PUT_TCE, as defined in the Power Architecture Platform
Requirements (PAPR).
The hcall H_GET_TCE is required by the kdump kernel which is calling it to
retrieve the TCE set up by the panicing kernel.
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
PR KVM lacks support of many SPRs in set/get one register API but it does
really break PR KVM. So convert them to switchable traces for now.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
- sclp event facility: cleanup structure. This allows to use
realize/unrealize as well as migration support via vmsd
- reboot: Two fixes that make reboot much more reliable
- ipl: make elf loading more robust
- flic interrupt controller: This allows to migrate floating
interrupts, as well as clear them on reset etc.
- enable async_pf feature of KVM on s390
- several sclp fixes and cleanups
- several sigp fixes and cleanups
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJTDwZVAAoJEBF7vIC1phx8lx4P/Rv+UVD9XDFFF8yHuye1am40
NpRGjdarQ/9QUkS4gqyKwYUvIjAClk5id7U2d5zrfdc8XC49AH0ZhVFMdRupaOon
AUqXjOXD5zAh9bfUcewg1EK1P1VuKcp0hyh0jFlIqk9Xmidw8N5guQ6iBoTqGJD5
UYTp0PuSqIjY1RCuF4fCTCurzRd1+J2oKcQBip7BSWlVuWZlg2/hPxoIraLezlz2
huwOU9tkSGXwSRv4C6fCcukEwlqnvkE6W0MCrHrcb2T8xYwAR2Jjs0TsscbKxb+t
lIjZRiCxBrFwOLUqGN8DMYtZPffR+cigZ5bYb4o3PPJ0DQL4vLQVd8SPMPrdJhbb
M7UOaeTclSTQuzmM/Uuc1pmrFc8PDq0dg50dT3weH2bW8aSgyqutYGpmUcm1Q6kq
JLFuyswOBr1vS9o0TlBunP4+TqJJrnGvtIQ4EbRZm7zP78mBaIIrUcAZlbgOI+XI
cSjtFXkBOCz0j28J9GSHrsWMC7RQ179TGdcH/FjDpu0dNDOxH7eH5gZPQoQDAqwC
SjstqJdIFnd0qxOB1EqcgMUxbSqQYq3hoGvJ644ZrMA3T5trBn0fSw3J9ZU/qAK7
EvOKRacMfcacIj4l0aEQgpwqVmktwIYnkfetX/QAKw/4AImJz/R9GRkmYgjCfOH8
/CUfXM71zWLEdv1o5uJ5
=toIt
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/borntraeger/tags/kvm-s390-20140227' into staging
Several features, fixes and cleanups for kvm/s390:
- sclp event facility: cleanup structure. This allows to use
realize/unrealize as well as migration support via vmsd
- reboot: Two fixes that make reboot much more reliable
- ipl: make elf loading more robust
- flic interrupt controller: This allows to migrate floating
interrupts, as well as clear them on reset etc.
- enable async_pf feature of KVM on s390
- several sclp fixes and cleanups
- several sigp fixes and cleanups
* remotes/borntraeger/tags/kvm-s390-20140227: (22 commits)
s390x/ipl: Fix crash of ELF images with arbitrary entry points
s390x/kvm: Rework priv instruction handlers
s390x/kvm: Add missing SIGP CPU RESET order
s390x/kvm: Rework SIGP INITIAL CPU RESET handler
s390x/cpu: Use ioctl to reset state in the kernel
s390-ccw.img: new binary rom to match latest fixes
s390-ccw.img: Fix sporadic errors with ccw boot image - initialize css
s390-ccw.img: Fix sporadic reboot hangs: Initialize next_idx
s390x/event-facility: exploit realize/unrealize
s390x/event-facility: add support for live migration
s390x/event-facility: code restructure
s390x/event-facility: some renaming
s390x/sclp: Fixed setting of condition code register
s390x/sclp: Add missing checks to SCLP handler
s390x/sclp: Fixed the size of sccb and code parameter
s390x/eventfacility: mask out commands
s390x/virtio-hcall: Specification exception for illegal subcodes
s390x/virtio-hcall: Add range check for hypervisor call
s390x/kvm: Fixed bad SIGP SET-ARCHITECTURE handler
s390x/async_pf: Check for apf extension and enable pfault
...
Conflicts:
linux-headers/linux/kvm.h
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This patch implements a floating-interrupt controller device (flic)
which interacts with the s390 flic kvm_device.
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Introduces two simple functions:
int kvm_device_ioctl(int fd, int type, ...);
int kvm_create_device(KVMState *s, uint64_t type, bool test);
These functions wrap the basic ioctl-based interactions with KVM in a
way similar to other KVM ioctl wrappers.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Message-id: 1392687720-26806-4-git-send-email-christoffer.dall@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
s/offet/offset/
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
n_start can be actually calculated from offset. The number of
sectors to be allocated(n_end - n_start) can be passed in in
num. By removing n_start and n_end, we can save two parameters.
The side effect is there is a bug in qcow2.c:preallocate() that
passes incorrect n_start to qcow2_alloc_cluster_offset() is
fixed. The bug can be triggerred by a larger cluster size than
the default value(65536), for example:
./qemu-img create -f qcow2 \
-o 'cluster_size=131072,preallocation=metadata' file.img 4G
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This patch adds support for special usb descriptors used by microsoft
windows. They allow more fine-grained control over driver binding and
adding entries to the registry for configuration.
As this is a guest-visible change the "msos-desc" compat property
has been added to turn this off for 1.7 + older
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
# By Paolo Bonzini (17) and others
# Via Stefan Hajnoczi
* stefanha/block: (48 commits)
qemu-iotests: filter QEMU monitor \r\n
aio: make aio_poll(ctx, true) block with no fds
block: clean up bdrv_drain_all() throttling comments
qcow2: use start_of_cluster() and offset_into_cluster() everywhere
qemu-img: decrease progress update interval on convert
qemu-img: round down request length to an aligned sector
qemu-img: dynamically adjust iobuffer size during convert
block/iscsi: set bs->bl.opt_transfer_length
block: add opt_transfer_length to BlockLimits
block/iscsi: set bdi->cluster_size
qemu-img: fix usage instruction for qemu-img convert
qemu-img: add support for skipping zeroes in input during convert
qemu-nbd: add doc for option -f
qemu-iotests: add test for snapshot in qemu-img convert
qemu-img: add -l for snapshot in convert
qemu-iotests: add 058 internal snapshot export with qemu-nbd case
qemu-nbd: support internal snapshot export
snapshot: distinguish id and name in load_tmp
qemu-iotests: Split qcow2 only cases in 048
qemu-iotests: Clean up spaces in usage output
...
Message-id: 1386347807-27359-1-git-send-email-stefanha@redhat.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
Bugfixes for uas emulation.
Add remote wakeup support for ehci.
Add suspend support for xhci.
Misc minor tweaks and fixes.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJSmEXxAAoJEEy22O7T6HE47t0QALonQORRj0IUAH0cOdfAhlQ3
tGMQksBCYevBatKt4iZQgkw6H0jwse6QfsgsG2dfznEO+ZWsrt9cxe1UrqxbK2PN
2PY/I9Ke1iP6tjcf9ftjqt+mZcAg/FHrbua5hb8zXRQnqu2jr0y3Cp7k2Jax4j4d
Zl2FJ+sd4lGNR3Qpb85Muxtii8XERmMqvAit72VN4VAW4iE+SQAFSOgzBC512b55
wLVc6DrbnM8I4AVJQ8RH2pMQau0/aBHFbU8By2RKbymkJmIG2nFqLH6eSJ19QgzY
CmX8yGDJM5LGAGRZCeDSeuilxFU/WCSoTtkL8cPcYUv4cSTm+forzxhVz+CVOeVu
JJsWNkaIxu4mxfRyADjUKkWoKX7ACro3ErfAWHdv8hwuhZ4uD6cf2++nXVDK9dq4
yLL2nR4YG0NTOdQNKrsUbltf9gC5cWqNRgVMJ5VfqIBGtjXdTbpGpcUEFuDDegjk
GhfN8lcpqgnFj0U4fAGLxHYXHvJRpNeWzEEANPuEYnWr2tSrgBWKkYLaooTDHt5r
FUE6lmKL+BzQYnXfWWqh1fZoiBzzrMaT3OkHc2vx/SrGLuO/rVWTzXsFQI+NGPHp
XxuyocFoKZA2yGr9h6eBBp9mtd5y0oOVxBR0WbkgvmbyxkX7Zq9r2PSoDOm26oE3
5kmApAnSij83aT06Qe8P
=2yvC
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'kraxel/tags/pull-usb-1' into staging
Improvements for usb3 bulk stream (usb core, xhci).
Bugfixes for uas emulation.
Add remote wakeup support for ehci.
Add suspend support for xhci.
Misc minor tweaks and fixes.
# gpg: Signature made Thu 28 Nov 2013 11:44:49 PM PST using RSA key ID D3E87138
# gpg: Can't check signature: public key not found
# By Hans de Goede (11) and others
# Via Gerd Hoffmann
* kraxel/tags/pull-usb-1:
usb: move usb_{hi,lo} helpers to header file.
usb: add vendor request defines
trace-events: Clean up after removal of old usb-host code
Revert "usb-tablet: Don't claim wakeup capability for USB-2 version"
ehci: implement port wakeup
xhci: Call usb_device_alloc/free_streams
usb: Add usb_device_alloc/free_streams
usb: Add max_streams attribute to endpoint info
uas: s/ui/iu/
uas: Fix response iu struct definition
uas: Bounds check tags when using streams
uas: Streams are numbered 1-y, rather then 0-x
uas: Fix / cleanup usb_uas_task error handling
uas: Only use report iu-s for task_mgmt status reporting
scsi: Add 2 new sense codes needed by uas
xhci: add support for suspend/resume
xhci: Add a few missing checks for disconnected devices
Message-id: 1385712381-30918-1-git-send-email-kraxel@redhat.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
Writing zeroes to a file can be done by punching a hole if
MAY_UNMAP is set.
Note that in this case ENOTSUP is not ignored, but makes
the block layer fall back to the generic implementation.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This will be used by the SCSI layer.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Commit b5613fd neglected to drop the trace events along with the code.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Update portsc register and raise irq in case a suspended
port is woken up, so remote wakeup works on our ehci ports.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
# By Stefan Weil (8) and others
# Via Michael Tokarev
* mjt/trivial-patches:
tests/.gitignore: ignore test-throttle
exec: Fix broken build for MinGW (regression)
kvm: Fix compiler warning (clang)
tcg-sparc: Fix parenthesis warning
Makefile: Remove some more files when cleaning
target-i386: Fix segment cache dump
iov: avoid "orig_len may be used unitialized" warning
vscclient: remove unnecessary use of uninitialized variable
trace-events: Clean up with scripts/cleanup-trace-events.pl again
tci: Fix qemu-alpha on 32 bit hosts (wrong assertions)
*-user: Improve documentation for lock_user function
MAINTAINERS: Add missing entry to filelist for TCI target
translate-all: Fix formatting of dump output
*-user: Fix typo in comment (ulocking -> unlocking)
docs: Fix IO port number for CPU present bitmap.
q35: Fix typo in constant DEFUALT -> DEFAULT.
configure: Undefine _FORTIFY_SOURCE prior using it
Message-id: 1379696296-32105-1-git-send-email-mjt@msgid.tls.msk.ru
Event qxl_render_blit_guest_primary_initialized is unused since commit
c58c7b9, drop it.
Commit 42e5b4c moved hw/ppc/xics.c to hw/intc/xics.c without updating
the comment in trace-events.
"scripts/cleanup-trace-events.pl trace-events | diff trace-events" is
now clean again.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
KVM request types are normally defined using hex constants but QEMU traces
print decimal values instead, which is not very convenient.
This changes the request type format from %d to %x.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This includes pc and pci cleanups and enhancements,
and a virtio bugfix for level interrupts.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
iQEcBAABAgAGBQJSIveoAAoJECgfDbjSjVRp2C8IAL7DE0oM0jfEB5DAd8jlULHx
hA8RP21rFzyU8PwtHB+72+C1ImldBge4hvhI+qbsm6PoW3RCeV/lbESIRTiv8dCO
pGUOFmv8MfJAH+WWFsle5mRisoTksYQWWBMHCOqvmaY4JL9pBQOhCLHVhV1XfjtL
hO7uGrWmlijeILv5CxYyPMYuOEdVvRSZKzE+Fp2YKfNstiQrS5fJIlqmwCHrlneW
l2atnt2d9ZV1K8QYiGg4GRVbSAMJvA1wum+0F4gnXIz9yAeOt+Ht1s8cNKQDMouJ
r2OyVgPM9aS/XaO6ejct1Sjo7Vgh/Ublrpw3lFqV/qHix6rEHwy2I3JHFEJPjvk=
=SytJ
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'mst/tags/for_anthony' into staging
pc,pci,virtio fixes and cleanups
This includes pc and pci cleanups and enhancements,
and a virtio bugfix for level interrupts.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Sun 01 Sep 2013 03:15:36 AM CDT using RSA key ID D28D5469
# gpg: Can't check signature: public key not found
# By Michael S. Tsirkin (3) and others
# Via Michael S. Tsirkin
* mst/tags/for_anthony:
virtio_pci: fix level interrupts with irqfd
pc: reduce duplication, fix PIIX descriptions
hw: Clean up bogus default boot order
pci: add config space access traces
pc: fix regression for 64 bit PCI memory
pci: Introduce helper to retrieve a PCI device's DMA address space
Message-id: 1378023590-11109-1-git-send-email-mst@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
This converts old style fprintf to traces.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[agraf: change patch subject]
Signed-off-by: Alexander Graf <agraf@suse.de>
This adds pci_cfg_read and pci_cfg_write traces for config spaces
accesses.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This is quite handy to debug softmmu targets.
Reviewed-by: Andreas Faerber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1375016242-32651-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Introduces a new Xen PV PCI device which will act as a binding point for
PV drivers for Xen.
The device has parameterized vendor-id, device-id and revision to allow to
be configured as a binding point for any vendor's PV drivers.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
They're all wrong since (at least) Paolo's big source tree
reorganization. Need to shuffle some event declarations around to
keep them under the correct source file comment.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Dropped event Unused since
mirror_cow 884fea4
paio_complete 47e6b25
paio_cancel 47e6b25
usb_ehci_data 0ce668b
megasas_qf_dequeue never used
megasas_handle_frame never used
megasas_io_continue never used
megasas_iovec_map_failed never used
megasas_dcmd_map_failed never used
milkymist_softusb_mouse_event 4c15ba9
xen_map_block 6506e4f
xen_unmap_block 6506e4f
qemu_spice_start 67be672
qemu_spice_stop 67be672
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
If a user chooses to turn on the auto-converge migration capability
these changes detect the lack of convergence and throttle down the
guest. i.e. force the VCPUs out of the guest for some duration
and let the migration thread catchup and help converge.
Verified the convergence using the following :
- Java Warehouse workload running on a 20VCPU/256G guest(~80% busy)
- OLTP like workload running on a 80VCPU/512G guest (~80% busy)
Sample results with Java warehouse workload : (migrate speed set to 20Gb and
migrate downtime set to 4seconds).
(qemu) info migrate
capabilities: xbzrle: off auto-converge: off <----
Migration status: active
total time: 1487503 milliseconds
expected downtime: 519 milliseconds
transferred ram: 383749347 kbytes
remaining ram: 2753372 kbytes
total ram: 268444224 kbytes
duplicate: 65461532 pages
skipped: 64901568 pages
normal: 95750218 pages
normal bytes: 383000872 kbytes
dirty pages rate: 67551 pages
---
(qemu) info migrate
capabilities: xbzrle: off auto-converge: on <----
Migration status: completed
total time: 241161 milliseconds
downtime: 6373 milliseconds
transferred ram: 28235307 kbytes
remaining ram: 0 kbytes
total ram: 268444224 kbytes
duplicate: 64946416 pages
skipped: 64903523 pages
normal: 7044971 pages
normal bytes: 28179884 kbytes
Signed-off-by: Chegu Vinod <chegu_vinod@hp.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
backup_start() creates a block job that copies a point-in-time snapshot
of a block device to a target block device.
We call backup_do_cow() for each write during backup. That function
reads the original data from the block device before it gets
overwritten. The data is then written to the target device.
Currently backup cluster size is hardcoded to 65536 bytes.
[I made a number of changes to Dietmar's original patch and folded them
in to make code review easy. Here is the full list:
* Drop BackupDumpFunc interface in favor of a target block device
* Detect zero clusters with buffer_is_zero() and use bdrv_co_write_zeroes()
* Use 0 delay instead of 1us, like other block jobs
* Unify creation/start functions into backup_start()
* Simplify cleanup, free bitmap in backup_run() instead of cb
* function
* Use HBitmap to avoid duplicating bitmap code
* Use bdrv_getlength() instead of accessing ->total_sectors
* directly
* Delete the backup.h header file, it is no longer necessary
* Move ./backup.c to block/backup.c
* Remove #ifdefed out code
* Coding style and whitespace cleanups
* Use bdrv_add_before_write_notifier() instead of blockjob-specific hooks
* Keep our own in-flight CowRequest list instead of using block.c
tracked requests. This means a little code duplication but is much
simpler than trying to share the tracked requests list and use the
backup block size.
* Add on_source_error and on_target_error error handling.
* Use trace events instead of DPRINTF()
-- stefanha]
Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
# By Paolo Bonzini (11) and others
# Via Paolo Bonzini
* bonzini/iommu-for-anthony:
memory: clean up phys_page_find
memory: populate FlatView for new address spaces
memory: limit sections in the radix tree to the actual address space size
s390x: reduce TARGET_PHYS_ADDR_SPACE_BITS to 62
memory: fix address space initialization/destruction
memory: make memory_global_sync_dirty_bitmap take an AddressSpace
memory: do not duplicate memory_region_destructor_none
memory: Rename readable flag to romd_mode
memory: Replace open-coded memory_region_is_romd
memory: allow memory_region_find() to run on non-root memory regions
memory: assert that PhysPageEntry's ptr does not overflow
exec: eliminate stq_phys_notdirty
exec: make qemu_get_ram_ptr private
exec: eliminate qemu_put_ram_ptr
exec: remove obsolete comment
Message-id: 1369414987-8839-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
qemu_co_queue_next(&queue) arranges that the next queued coroutine is
run at a later point in time. This deferred restart is useful because
the caller may not want to transfer control yet.
This behavior was implemented using QEMUBH in the past, which meant that
CoQueue (and hence CoMutex and CoRwlock) had a dependency on the
AioContext event loop. This hidden dependency causes trouble when we
move to a world with multiple event loops - now qemu_co_queue_next()
needs to know which event loop to schedule the QEMUBH in.
After pondering how to stash AioContext I realized the best solution is
to not use AioContext at all. This patch implements the deferred
restart behavior purely in terms of coroutines and no longer uses
QEMUBH.
Here is how it works:
Each Coroutine has a wakeup queue that starts out empty. When
qemu_co_queue_next() is called, the next coroutine is added to our
wakeup queue. The wakeup queue is processed when we yield or terminate.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Commit e7a09b92b7 added a trace at each
memory freeing, but unfortunately inverted size and pointer when printing
them. Fix trace.
This also led to a compilation error on 32 bit hosts:
In file included from include/trace.h:4:0,
from trace/generated-events.c:3:
./trace/generated-tracers.h: In function ‘trace_qemu_anon_ram_free’:
./trace/generated-tracers.h:64:9: error: format ‘%zu’ expects argument of type
‘size_t’, but argument 3 has type ‘void *’ [-Werror=format]
./trace/generated-tracers.h:64:9: error: format ‘%p’ expects argument of type
‘void *’, but argument 4 has type ‘size_t’ [-Werror=format]
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Message-id: 1369045989-14016-1-git-send-email-hpoussin@reactos.org
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
We switched from qemu_memalign to mmap() but then we don't modify
qemu_vfree() to do a munmap() over free(). Which we cannot do
because qemu_vfree() frees memory allocated by qemu_{mem,block}align.
Introduce a new function that does the munmap(), luckily the size is
available in the RAMBlock.
Reported-by: Amos Kong <akong@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Amos Kong <akong@redhat.com>
Message-id: 1368454796-14989-3-git-send-email-pbonzini@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This is preparatory to the introduction of a separate freeing API.
Reported-by: Amos Kong <akong@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Amos Kong <akong@redhat.com>
Message-id: 1368454796-14989-2-git-send-email-pbonzini@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This provides a way to detect the cast that leads to a (reproducible)
crash even when QOM cast debugging is disabled.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1368188203-3407-6-git-send-email-pbonzini@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patch enable us to know exit reason of KVM_RUN. It will help us
know where the trouble is caused.
Signed-off-by: Kazuya Saito <saito.kazuya@jp.fujitsu.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This patch adds tracepoints at ioctl to kvm. Tracing these ioctl is
useful for clarification whether the cause of troubles is qemu or kvm.
Signed-off-by: Kazuya Saito <saito.kazuya@jp.fujitsu.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This fixes the following error:
In file included from qemu/include/trace.h:4:0,
from trace/generated-events.c:3:
./trace/generated-tracers.h: In function ‘trace_pvscsi_get_sg_list’:
./trace/generated-tracers.h:4271:9: error: format ‘%lu’ expects argument of
type ‘long unsigned int’, but argument 4 has type ‘size_t’ [-Werror=format]
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Report the supported speeds for device and port in the error message.
Also add the speeds to the tracepoint. And while being at it drop
the redundant error message in usb_desc_attach, usb_device_attach will
report the error anyway.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Yan Vugenfirer <yan@daynix.com>
[ Rename files to vmw_pvscsi, fix setting of hostStatus in
pvscsi_request_cancelled - Paolo ]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reimplement usb-host on top of libusb.
Reasons to do this:
(1) Largely rewritten from scratch, nice opportunity to kill historical
cruft.
(2) Offload usbfs handling to libusb.
(3) Have a single portable code base instead of bsd + linux variants.
(4) Bring usb-host support to any platform supported by libusbx.
For now this goes side-by-side to the existing code. That is only to
simplify regression testing though, at the end of the day I want remove
the old code and support libusb exclusively. Merge early in 1.5 cycle,
remove the old code after 1.5 release or something like this.
Thanks to qdev the old and new code can coexist nicely on linux. Just
use "-device usb-host-linux" to use the old linux driver instead of the
libusb one (which takes over the "usb-host" name).
The bsd driver isn't qdev'ified so it isn't that easy for bsd.
I didn't bother making it runtime switchable, so you have to rebuild
qemu with --disable-libusb to get back the old code.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Check for port reset first and skip everything else then.
Add sanity checks for PLS updates.
Add PLC notification when entering PLS_U0 state.
This gets host-initiated port resume going on win8.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Make gui update rate adaption code in gui_update() actually work.
Sprinkle in a tracepoint so you can see the code at work. Remove
the update rate adaption code in vnc and make vnc simply use the
generic bits instead.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Hardcode depth to 32 bpp. It effectively was that way before because
that is the default surface depth, this just makes it explicit in the
code.
Rename depth to new_depth to make it consistent with the new_width +
new_height names. In theory we can make new_depth changeable (i.e.
allow the guest to fill in -- say -- 16 there). In practice the guests
don't try, the X-Server refuses to start if you ask it to use 16bpp
depth (via DefaultDepth in the Screen section).
Always return the correct rmask+gmask+bmask values for the given
new_depth.
Fix mode setting to also verify at new_depth to make sure we have a
correct DisplaySurface, even if the current video mode happes to be
16bpp (set by vgabios via bochs vbe interface). While being at it
switch over to use qemu_create_displaysurface_from, so the surface is
backed by guest-visible video memory and we save a memcpy.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
# By Kevin Wolf (22) and Peter Lieven (1)
# Via Stefan Hajnoczi
* stefanha/block: (23 commits)
block: Fix direct use of protocols as driver for bdrv_open()
qcow2: Gather clusters in a looping loop
qcow2: Move cluster gathering to a non-looping loop
qcow2: Allow requests with multiple l2metas
qcow2: Use byte granularity in qcow2_alloc_cluster_offset()
qcow2: Prepare handle_alloc/copied() for byte granularity
qcow2: handle_copied(): Implement non-zero host_offset
qcow2: handle_copied(): Get rid of keep_clusters parameter
qcow2: handle_copied(): Get rid of nb_clusters parameter
qcow2: Factor out handle_copied()
qcow2: Clean up handle_alloc()
qcow2: Finalise interface of handle_alloc()
qcow2: handle_alloc(): Get rid of keep_clusters parameter
qcow2: handle_alloc(): Get rid of nb_clusters parameter
qcow2: Factor out handle_alloc()
qcow2: Decouple cluster allocation from cluster reuse code
qcow2: Change handle_dependency to byte granularity
qcow2: Improve check for overlapping allocations
qcow2: Handle dependencies earlier
qcow2: Remove bogus unlock of s->lock
...
This patch enables us to know RunState transition. It will be userful
for investigation when the trouble occured in special event such like
live migration, shutdown, suspend, and so on.
Signed-off-by: Kazuya Saito <saito.kazuya@jp.fujitsu.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Decouple DisplaySurface allocation & deallocation from DisplayState.
Replace dpy_gfx_resize + dpy_gfx_setdata with a dpy_gfx_replace_surface
function.
This handles the graphic hardware emulation.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Split callbacks into separate Ops struct. Pass DisplayChangeListener
pointer as first argument to all callbacks. Uninline a bunch of
display functions and move them from console.h to console.c
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Move global variables into a struct so multiple thread pools can be
supported in the future.
This patch does not change thread-pool.h interfaces. There is still a
global thread pool and it is not yet possible to create/destroy
individual thread pools. Moving the variables into a struct first makes
later patches easier to review.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
This patch allows to specify multiple directories where qemu should look
for data files. To implement that the behavior of the -L switch is
slightly different now: Instead of replacing the data directory the
path specified will be appended to the data directory list. So when
specifiying -L multiple times all directories specified will be checked,
in the order they are specified on the command line, instead of just the
last one.
Additionally the default paths are always appended to the directory
data list. This allows to specify a incomplete directory (such as the
seabios out/ directory) via -L. Anything not found there will be loaded
from the default paths, so you don't have to create a symlink farm for
all the rom blobs.
For trouble-shooting a tracepoint has been added, logging which blob
has been loaded from which location.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1362739344-8068-1-git-send-email-kraxel@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Add streams support to the xhci emulation. No secondary streams yet,
only linear stream arays are supported for now.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Add a new virtio transport that uses channel commands to perform
virtio operations.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Provide a mechanism for qemu to provide fully virtual subchannels to
the guest.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Provide handlers for (most) channel I/O instructions.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Yet another optimization is to extend the mirroring iteration to include more
adjacent dirty blocks. This limits the number of I/O operations and makes
mirroring efficient even with a small granularity. Most of the infrastructure
is already in place; we only need to put a loop around the computation of
the origin and sector count of the iteration.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
With AIO support in place, we can start copying more than one chunk
in parallel. This patch introduces the required infrastructure for
this: the buffer is split into multiple granularity-sized chunks,
and there is a free list to access them.
Because of copy-on-write, a single operation may already require
multiple chunks to be available on the free list.
In addition, two different iterations on the HBitmap may want to
copy the same cluster. We avoid this by keeping a bitmap of in-flight
I/O operations, and blocking until the previous iteration completes.
This should be a pretty rare occurrence, though; as long as there is
no overlap the next iteration can start before the previous one finishes.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
There is really no change in the behavior of the job here, since
there is still a maximum of one in-flight I/O operation between
the source and the target. However, this patch already introduces
the AIO callbacks (which are unmodified in the next patch)
and some of the logic to count in-flight operations and only
complete the job when there is none.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When mirroring runs, the backing files for the target may not yet be
ready. However, this means that a copy-on-write operation on the target
would fill the missing sectors with zeros. Copy-on-write only happens
if the granularity of the dirty bitmap is smaller than the cluster size
(and only for clusters that are allocated in the source after the job
has started copying). So far, the granularity was fixed to 1MB; to avoid
the problem we detected the situation and required the backing files to
be available in that case only.
However, we want to lower the granularity for efficiency, so we need
a better solution. The solution is to always copy a whole cluster the
first time it is touched. The code keeps a bitmap of clusters that
have already been allocated by the mirroring job, and only does "manual"
copy-on-write if the chunk being copied is zero in the bitmap.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This actually uses the dirty bitmap in the block layer, and converts
mirroring to use an HBitmapIter.
Reviewed-by: Laszlo Ersek <lersek@redhat.com> (except block/mirror.c parts)
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
HBitmaps provides an array of bits. The bits are stored as usual in an
array of unsigned longs, but HBitmap is also optimized to provide fast
iteration over set bits; going from one bit to the next is O(logB n)
worst case, with B = sizeof(long) * CHAR_BIT: the result is low enough
that the number of levels is in fact fixed.
In order to do this, it stacks multiple bitmaps with progressively coarser
granularity; in all levels except the last, bit N is set iff the N-th
unsigned long is nonzero in the immediately next level. When iteration
completes on the last level it can examine the 2nd-last level to quickly
skip entire words, and even do so recursively to skip blocks of 64 words or
powers thereof (32 on 32-bit machines).
Given an index in the bitmap, it can be split in group of bits like
this (for the 64-bit case):
bits 0-57 => word in the last bitmap | bits 58-63 => bit in the word
bits 0-51 => word in the 2nd-last bitmap | bits 52-57 => bit in the word
bits 0-45 => word in the 3rd-last bitmap | bits 46-51 => bit in the word
So it is easy to move up simply by shifting the index right by
log2(BITS_PER_LONG) bits. To move down, you shift the index left
similarly, and add the word index within the group. Iteration uses
ffs (find first set bit) to find the next word to examine; this
operation can be done in constant time in most current architectures.
Setting or clearing a range of m bits on all levels, the work to perform
is O(m + m/W + m/W^2 + ...), which is O(m) like on a regular bitmap.
When iterating on a bitmap, each bit (on any level) is only visited
once. Hence, The total cost of visiting a bitmap with m bits in it is
the number of bits that are set in all bitmaps. Unless the bitmap is
extremely sparse, this is also O(m + m/W + m/W^2 + ...), so the amortized
cost of advancing from one bit to the next is usually constant.
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Many callers pass size_t, which gets silently truncated to uint32_t.
Harmless, because all practical sizes are well below 4GiB. Clean it
up anyway. Size overflow now fails assertions.
Bonus: saves a whole bunch of silly casts.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
virtio-blk-data-plane is a subset implementation of virtio-blk. It only
handles read, write, and flush requests. It does this using a dedicated
thread that executes an epoll(2)-based event loop and processes I/O
using Linux AIO.
This approach performs very well but can be used for raw image files
only. The number of IOPS achieved has been reported to be several times
higher than the existing virtio-blk implementation.
Eventually it should be possible to unify virtio-blk-data-plane with the
main body of QEMU code once the block layer and hardware emulation is
able to run outside the global mutex.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The virtio-blk-data-plane cannot access memory using the usual QEMU
functions since it executes outside the global mutex and the memory APIs
are this time are not thread-safe.
This patch introduces a virtqueue module based on the kernel's vhost
vring code. The trick is that we map guest memory ahead of time and
access it cheaply outside the global mutex.
Once the hardware emulation code can execute outside the global mutex it
will be possible to drop this code.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Add a new spice chardev to allow arbitrary communication between the
host and the Spice client via the spice server.
Examples:
This allows the Spice client to have a special port for the qemu
monitor:
... -chardev spiceport,name=org.qemu.monitor,id=monitorport
-mon chardev=monitorport
v2:
- remove support for chardev to chardev linking
- conditionnaly compile with SPICE_SERVER_VERSION
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
This patch adds tracing / debugging calls to the XICS interrupt controller
implementation used on the pseries machine.
Signed-off-by: Ben Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Now that we have separate status and length fields in USBPacket
update the completion tracepoint to log both.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Starting with commit 1c380f9460 dma
transfers can actually fail. This patch makes ehci keep track
of the busmaster bit in pci config space, by setting/clearing the
dma_context pointer. Attempts to dma without context will result
in raising HSE (Host System Error) interrupt and stopping the host
controller.
This patch fixes WinXP not booting with a usb stick attached to ehci.
Root cause is seabios activating ehci so you can boot from the stick,
and WinXP clearing the busmaster bit before resetting the host
controller, leading to ehci actually trying dma while it is disabled.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
* kraxel/usb.69: (31 commits)
usb-redir: Allow redirecting super speed devices to high speed controllers
usb-redir: Allow to attach USB 2.0 devices to 1.1 host controller
usb-redir: Use reject rather the disconnect on bad ep info
usb-redir: Add an usbredir_setup_usb_eps() helper function
usb-redir: Add support for input pipelining
usb-redir: Add support for 32 bits bulk packet length
combined-packet: Add a workaround for Linux usbfs + live migration
usb: Add packet combining functions
uhci: Don't crash on device disconnect
uhci: Add a uhci_handle_td_error() helper function
usb/ehci-pci: add helper to create ich9 usb controllers
usb/ehci-pci: add ich9 00:1a.* variant
usb/ehci-pci: dynamic type generation
uhci: add ich9 00:1a.* variants
uhci: stick irq routing info into UHCIInfo too.
uhci: dynamic type generation
xilinx_zynq: add USB controllers
usb/ehci: add sysbus variant
usb/ehci: split into multiple source files
usb/ehci: Guard definition of EHCI_DEBUG
...
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Add a generic thread-pool. The code is roughly based on posix-aio-compat.c,
with some changes, especially the following:
- use QemuSemaphore instead of QemuCond;
- separate the state of the thread from the return code of the worker
function. The return code is totally opaque for the thread pool;
- do not busy wait when doing cancellation.
A more generic threadpool (but still specific to I/O so that in the future
it can use special scheduling classes or PI mutexes) can have many uses:
it allows more flexibility in raw-posix.c and can more easily be extended
to Win32, and it will also be used to do an msync of the persistent bitmap.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* kraxel/usb.68: (36 commits)
xhci: fix usb name in caps
xhci: make number of interrupters and slots configurable
xhci: allow disabling interrupters
xhci: flush endpoint context unconditinally
xhci: fix function name in error message
uhci: Use only one queue for ctrl endpoints
uhci: Retry to fill the queue while waiting for td completion
uhci: Always mark a queue valid when we encounter it
uhci: When the guest marks a pending td non-active, cancel the queue
uhci: Detect guest td re-use
uhci: Verify queue has not been changed by guest
uhci: Immediately free queues on device disconnect
uhci: Store ep in UHCIQueue
uhci: Make uhci_fill_queue() actually operate on an UHCIQueue
uhci: Add uhci_read_td() helper function
uhci: Rename UHCIAsync->td to UHCIAsync->td_addr
uhci: Move emptying of the queue's asyncs' queue to uhci_queue_free
uhci: Drop unnecessary forward declaration of some static functions
uhci: Don't retry on error
uhci: cleanup: Add an unlink call to uhci_async_cancel()
...
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
According to the spec a guest can unlink a qh, and then as soon as frindex
has changed by 1 since the unlink, assume it is idle and re-use it. However
for various reasons, we cannot simply consider a qh as unlinked if we've not
seen it for 1 frame. This means that it is possible for a guest to re-use /
restart the queue while we still see its old state. This patch adds a safety
check for this, and "early" retires queues when they were changed by the guest.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
This patch adds the implementation of a new job that mirrors a disk to
a new image while letting the guest continue using the old image.
The target is treated as a "black box" and data is copied from the
source to the target in the background. This can be used for several
purposes, including storage migration, continuous replication, and
observation of the guest I/O in an external program. It is also a
first step in replacing the inefficient block migration code that is
part of QEMU.
The job is possibly never-ending, but it is logically structured into
two phases: 1) copy all data as fast as possible until the target
first gets in sync with the source; 2) keep target in sync and
ensure that reopening to the target gets a correct (full) copy
of the source data.
The second phase is indicated by the progress in "info block-jobs"
reporting the current offset to be equal to the length of the file.
When the job is cancelled in the second phase, QEMU will run the
job until the source is clean and quiescent, then it will report
successful completion of the job.
In other words, the BLOCK_JOB_CANCELLED event means that the target
may _not_ be consistent with a past state of the source; the
BLOCK_JOB_COMPLETED event means that the target is consistent with
a past state of the source. (Note that it could already happen
that management lost the race against QEMU and got a completion
event instead of cancellation).
It is not yet possible to complete the job and switch over to the target
disk. The next patches will fix this and add many refinements to the
basic idea introduced here. These include improved error management,
some tunable knobs and performance optimizations.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>