qcow2 writes a cluster reference count on every cluster update. This causes
performance to crater when using anything but cache=writeback. This is most
noticeable when using savevm. Right now, qcow2 isn't a reliable format
regardless of the type of cache your using because metadata is not updated in
the correct order. Considering this, I think it's somewhat reasonable to use
writeback caching by default with qcow2 files.
It at least avoids the massive performance regression for users until we sort
out the issues in qcow2.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5879 c046a42c-6fe2-441c-8c8c-71466251a162
Balloon devices allow you to ask the guest to allocate memory. This allows you
to release that memory. It's mostly useful for freeing up large chunks of
memory from cooperative guests.
Ballooning is supported by both Xen and VirtIO.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5873 c046a42c-6fe2-441c-8c8c-71466251a162
Virtio-blk is a paravirtual block device based on VirtIO. It can be used by
specifying the if=virtio parameter to the -drive parameter.
When using -enable-kvm, it can achieve very good performance compared to IDE or
SCSI.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5870 c046a42c-6fe2-441c-8c8c-71466251a162
Changeset r5636 changed the timers to run in the alarm callback. The
alarm callback can only be called as frequently as the host alarm timer
fires. For older Linux hosts and possibly non-Linux hosts, this can be
as high as a 1ms granularity.
icount calculates the select timeout based on the next deadline and
select is usually capable of sleeping for a short period of time than
alarm timer granularity. This means that changing the timer callbacks
to be based on alarm firing caused timers to fire much later than they
ought to when using icount.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5796 c046a42c-6fe2-441c-8c8c-71466251a162
This patch enhances QEMU's built-in debugger for SMP guest debugging.
Using the thread support of the gdb remote protocol, each VCPU is mapped
on a pseudo thread and exposed to the gdb frontend. This way you can
easy switch the focus of gdb between the VCPUs and observe their states.
On breakpoint hit, the focus is automatically adjusted just as for
normal multi-threaded application under gdb control.
Furthermore, the patch propagates breakpoint and watchpoint insertions
or removals to all CPUs, not just the current one as it was the case so
far. Without this, SMP guest debugging was practically unfeasible.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5743 c046a42c-6fe2-441c-8c8c-71466251a162
This is pure code motion. The savevm code is all common code so we can build
it once and share the object with all executables.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5700 c046a42c-6fe2-441c-8c8c-71466251a162
KVM's live migration support included support for exec: URLs, allowing system
state to be written or received via an arbitrary popen()ed subprocess. This
provides a convenient way to pipe state through a compression algorithm or an
arbitrary network transport on its way to its destination, and a convenient way
to write state to disk; libvirt's qemu driver currently uses migration to exec:
targets for this latter purpose.
This version of the patch refactors now-common code from migrate-tcp.c into
migrate.c.
Signed-off-by: Charles Duffy <Charles_Duffy@messageone.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5694 c046a42c-6fe2-441c-8c8c-71466251a162
Currently tap does not generate signals on I/O; this causes
network latency to be dependent on the timer tick (1ms without
dyntick, guest dependent with dyntick). By generating a signal
on I/O, we can inform the guest immediately that a packet has
arrived.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5688 c046a42c-6fe2-441c-8c8c-71466251a162
host_alarm_timer fires in a separate thread. The windows build current
uses SetEvent() and WaitEvent() to then notify the main thread. This is
functionally equivalent to what we're doing in Unix with pipe(). So let's
just #ifdef the pipe() code on Windows since it doesn't build there anyway.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5637 c046a42c-6fe2-441c-8c8c-71466251a162
This further cleans up the main loop getting it a lot closer to what a main
loop should be.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5636 c046a42c-6fe2-441c-8c8c-71466251a162
Changing the default IO timeout to 5 s (#5578) made a race visible
between the alarm_timer and select() in main_loop_wait(): If the timer
fired before select was able to block, the full select() timeout could
have been applied instead of returning immediately. Since #5578, this
causes heavy problems to the Musicpal board emulation with stalls up to
5 s, but also with some older Linux guest kernels.
The following patch introduces a pipe that is written to by
host_alarm_handler and select()'ed in main_loop_wait(). This avoids
prevents that select() blocks though a timer has fired and waits for
processing.
Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5633 c046a42c-6fe2-441c-8c8c-71466251a162
This patch adds very basic KVM support. KVM is a kernel module for Linux that
allows userspace programs to make use of hardware virtualization support. It
current supports x86 hardware virtualization using Intel VT-x or AMD-V. It
also supports IA64 VT-i, PPC 440, and S390.
This patch only implements the bare minimum support to get a guest booting. It
has very little impact the rest of QEMU and attempts to integrate nicely with
the rest of QEMU.
Even though this implementation is basic, it is significantly faster than TCG.
Booting and shutting down a Linux guest:
w/TCG: 1:32.36 elapsed 84% CPU
w/KVM: 0:31.14 elapsed 59% CPU
Right now, KVM is disabled by default and must be explicitly enabled with
-enable-kvm. We can enable it by default later when we have had better
testing.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5627 c046a42c-6fe2-441c-8c8c-71466251a162
It is safe not to set dpy_refresh and that's used to indicate that the display
doesn't need updates. This saves us two wakeups per second.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5583 c046a42c-6fe2-441c-8c8c-71466251a162
The motivating goal behind this is to allow other tools to use the CharDriver
code. This patch is pure code motion except for the Makefile changes and the
copyright/header in qemu-char.c.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5580 c046a42c-6fe2-441c-8c8c-71466251a162
The goal of this series is to move the CharDriverState code out of vl.c and
into its own file, qemu-char.c. This patch moves around some declarations so
the next patch can be pure code motion.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5579 c046a42c-6fe2-441c-8c8c-71466251a162
With the recent changes to the main loop, we no longer have unconditional
polling. This means we can now sleep in select() for much longer than we
previously did. This patch increases our select() sleep time from 10ms to 5s
which is effectively unlimited since we're going to wake up sooner than that
in almost all circumstances.
With this patch, I see the number of wake-ups with an idle dynamic ticks guest
drop from 80 per second to about 15 times per second.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5578 c046a42c-6fe2-441c-8c8c-71466251a162
Tidy up win32 main loop bits, allow timeout >= 1s, and force timeout to 0 if
there is a pending bottom half.
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5577 c046a42c-6fe2-441c-8c8c-71466251a162
This patch makes qemu keep track of the character devices in use and
implements a "info chardev" monitor command to print a list.
qemu_chr_open() sticks the devices into a linked list now. It got a new
argument (label), so there is a name for each device. It also assigns a
filename to each character device. By default it just copyes the
filename passed in. Individual drivers can fill in something else
though. qemu_chr_open_pty() sets the filename to name of the pseudo tty
allocated.
Output looks like this:
(qemu) info chardev
monitor: filename=unix:/tmp/run.sh-26827/monitor,server,nowait
serial0: filename=unix:/tmp/run.sh-26827/console,server
serial1: filename=pty:/dev/pts/5
parallel0: filename=vc:640x480
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5575 c046a42c-6fe2-441c-8c8c-71466251a162
The current DMA routines are driven by a call in main_loop_wait() after every
select.
This patch converts the DMA code to be driven by a constantly rescheduled
bottom half. The advantage of using a scheduled bottom half is that we can
stop scheduling the bottom half when there no DMA channels are runnable. This
means we can potentially detect this case and sleep longer in the main loop.
The only two architectures implementing DMA_run() are cris and i386. For cris,
I converted it to a simple repeating bottom half. I've only compile tested
this as cris does not seem to work on a 64-bit host. It should be functionally
identical to the previous implementation so I expect it to work.
For x86, I've made sure to only fire the DMA bottom half if there is a DMA
channel that is runnable. The effect of this is that unless you're using sb16
or a floppy disk, the DMA bottom half never fires.
You probably should test this malc. My own benchmarks actually show slight
improvement by it's possible the change in timing could affect your demos.
Since v1, I've changed the code to use a BH instead of a timer. cris at least
seems to depend on faster than 10ms polling.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5573 c046a42c-6fe2-441c-8c8c-71466251a162
Bottom halves are supposed to not complete until the next iteration of the main
loop. This is very important to ensure that guests can not cause stack
overflows in the block driver code. Right now, if you attempt to schedule a
bottom half within a bottom half callback, you will enter an infinite loop.
This patch uses the same logic that we use for the IOHandler loop to make the
bottom half processing robust in list manipulation while in a callback.
This patch also introduces idle scheduling for bottom halves. qemu_bh_poll()
returns an indication of whether any bottom halves were successfully executed.
qemu_aio_wait() uses this to immediately return if a bottom half was executed
instead of waiting for a completion notification.
qemu_bh_schedule_idle() works around this by not reporting the callback has
run in the qemu_bh_poll loop. qemu_aio_wait() probably needs some refactoring
but that would require a larger code audit. idle scheduling seems like a good
compromise.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5572 c046a42c-6fe2-441c-8c8c-71466251a162
This patch fixes migration so that it works on Win32. This requires using
socket specific calls since sockets cannot be treated like file descriptors
on win32.
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5525 c046a42c-6fe2-441c-8c8c-71466251a162
The live migration code broke the windows build. As part of this
change, I've switched the BIOS path to C:\Program Files\Qemu instead of
/c/Program Files/Qemu. The later is only valid when launching from MSYS
but the former is always valid.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5524 c046a42c-6fe2-441c-8c8c-71466251a162
This patch changes the cache= option to accept none, writeback, or writethough
to control the host page cache behavior. By default, writethrough caching is
now used which internally is implemented by using O_DSYNC to open the disk
images. When using -snapshot, writeback is used by default since data integrity
it not at all an issue.
cache=none has the same behavior as cache=off previously. The later syntax is
still supported by now deprecated. I also cleaned up the O_DIRECT
implementation to avoid many of the #ifdefs.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5485 c046a42c-6fe2-441c-8c8c-71466251a162
This patch adds an ethernet announce function that will minimize downtime
when doing a live migration. This code originates from KVM.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5477 c046a42c-6fe2-441c-8c8c-71466251a162
This patch introduces a command line parameter and monitor command for starting
a live migration. The next patch will provide an example of how to use these
parameters.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5476 c046a42c-6fe2-441c-8c8c-71466251a162
This patch allows QEMUFile's read and write operations to return
negative error codes. This is necessary to detect things like closed
streams during live migration.
It also removes unused code for QEMUFileFD write path. Finally, it
makes sure to avoid attempting to flush an output buffer if the file
is only being used for input. This was spotted by Uri Lublin.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5474 c046a42c-6fe2-441c-8c8c-71466251a162
Replace signalfd with signal handler/pipe. There is no way to interrupt
the CPU execution loop when a file descriptor becomes readable. This
results in a large performance regression in sparc emulation during
bootup.
This patch switches us to signal handler/pipe which was originally
suggested by Ian Jackson. The signal handler lets us interrupt the
CPU emulation loop while the write to a pipe lets us avoid the
select/signal race condition.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5451 c046a42c-6fe2-441c-8c8c-71466251a162
Introduce a max_cpus per-machine variable, allowing individual boards
to limit it's number of CPUs. Check requested number of CPUs in setup
code and exit if it exceeds the supported number for the machine.
This also renders the static MAX_CPUS check obsolete, so remove this
from vl.c.
Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5443 c046a42c-6fe2-441c-8c8c-71466251a162
This patch replaces the static memory savevm/loadvm handler with a "live" one.
This handler is used even if performing a non-live migration.
The key difference between this handler and the previous is that each page is
prefixed with the address of the page. The QEMUFile rate limiting code, in
combination with the live migration dirty tracking bits, is used to determine
which pages should be sent and how many should be sent.
The live save code "converges" when the number of dirty pages reaches a fixed
amount. Currently, this is 10 pages. This is something that should eventually
be derived from whatever the bandwidth limitation is.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5437 c046a42c-6fe2-441c-8c8c-71466251a162
The current savevm/loadvm protocol has some draw backs. It does not support
the ability to do progressive saving which means it cannot be used for live
checkpointing or migration. The sections sizes are 32-bit integers which
means that it will not function when using more than 4GB of memory for a guest.
It attempts to seek within the output file which means it cannot be streamed.
The current protocol also is pretty lax about how it supports forward
compatibility. If a saved section version is greater than what the restore
code support, the restore code generally treats the saved data as being in
whatever version it supports. This means that restoring a saved VM on an older
version of QEMU will likely result in silent guest failure.
This patch introduces a new version of the savevm protocol. It has the
following features:
* Support for progressive save of sections (for live checkpoint/migration)
* An asynchronous API for doing save
* Support for interleaving multiple progressive save sections
(for future support of memory hot-add/storage migration)
* Fully streaming format
* Strong section version checking
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5434 c046a42c-6fe2-441c-8c8c-71466251a162