Changing the default IO timeout to 5 s (#5578) made a race visible
between the alarm_timer and select() in main_loop_wait(): If the timer
fired before select was able to block, the full select() timeout could
have been applied instead of returning immediately. Since #5578, this
causes heavy problems to the Musicpal board emulation with stalls up to
5 s, but also with some older Linux guest kernels.
The following patch introduces a pipe that is written to by
host_alarm_handler and select()'ed in main_loop_wait(). This avoids
prevents that select() blocks though a timer has fired and waits for
processing.
Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5633 c046a42c-6fe2-441c-8c8c-71466251a162
This patch adds very basic KVM support. KVM is a kernel module for Linux that
allows userspace programs to make use of hardware virtualization support. It
current supports x86 hardware virtualization using Intel VT-x or AMD-V. It
also supports IA64 VT-i, PPC 440, and S390.
This patch only implements the bare minimum support to get a guest booting. It
has very little impact the rest of QEMU and attempts to integrate nicely with
the rest of QEMU.
Even though this implementation is basic, it is significantly faster than TCG.
Booting and shutting down a Linux guest:
w/TCG: 1:32.36 elapsed 84% CPU
w/KVM: 0:31.14 elapsed 59% CPU
Right now, KVM is disabled by default and must be explicitly enabled with
-enable-kvm. We can enable it by default later when we have had better
testing.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5627 c046a42c-6fe2-441c-8c8c-71466251a162
It is safe not to set dpy_refresh and that's used to indicate that the display
doesn't need updates. This saves us two wakeups per second.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5583 c046a42c-6fe2-441c-8c8c-71466251a162
The motivating goal behind this is to allow other tools to use the CharDriver
code. This patch is pure code motion except for the Makefile changes and the
copyright/header in qemu-char.c.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5580 c046a42c-6fe2-441c-8c8c-71466251a162
The goal of this series is to move the CharDriverState code out of vl.c and
into its own file, qemu-char.c. This patch moves around some declarations so
the next patch can be pure code motion.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5579 c046a42c-6fe2-441c-8c8c-71466251a162
With the recent changes to the main loop, we no longer have unconditional
polling. This means we can now sleep in select() for much longer than we
previously did. This patch increases our select() sleep time from 10ms to 5s
which is effectively unlimited since we're going to wake up sooner than that
in almost all circumstances.
With this patch, I see the number of wake-ups with an idle dynamic ticks guest
drop from 80 per second to about 15 times per second.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5578 c046a42c-6fe2-441c-8c8c-71466251a162
Tidy up win32 main loop bits, allow timeout >= 1s, and force timeout to 0 if
there is a pending bottom half.
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5577 c046a42c-6fe2-441c-8c8c-71466251a162
This patch makes qemu keep track of the character devices in use and
implements a "info chardev" monitor command to print a list.
qemu_chr_open() sticks the devices into a linked list now. It got a new
argument (label), so there is a name for each device. It also assigns a
filename to each character device. By default it just copyes the
filename passed in. Individual drivers can fill in something else
though. qemu_chr_open_pty() sets the filename to name of the pseudo tty
allocated.
Output looks like this:
(qemu) info chardev
monitor: filename=unix:/tmp/run.sh-26827/monitor,server,nowait
serial0: filename=unix:/tmp/run.sh-26827/console,server
serial1: filename=pty:/dev/pts/5
parallel0: filename=vc:640x480
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5575 c046a42c-6fe2-441c-8c8c-71466251a162
The current DMA routines are driven by a call in main_loop_wait() after every
select.
This patch converts the DMA code to be driven by a constantly rescheduled
bottom half. The advantage of using a scheduled bottom half is that we can
stop scheduling the bottom half when there no DMA channels are runnable. This
means we can potentially detect this case and sleep longer in the main loop.
The only two architectures implementing DMA_run() are cris and i386. For cris,
I converted it to a simple repeating bottom half. I've only compile tested
this as cris does not seem to work on a 64-bit host. It should be functionally
identical to the previous implementation so I expect it to work.
For x86, I've made sure to only fire the DMA bottom half if there is a DMA
channel that is runnable. The effect of this is that unless you're using sb16
or a floppy disk, the DMA bottom half never fires.
You probably should test this malc. My own benchmarks actually show slight
improvement by it's possible the change in timing could affect your demos.
Since v1, I've changed the code to use a BH instead of a timer. cris at least
seems to depend on faster than 10ms polling.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5573 c046a42c-6fe2-441c-8c8c-71466251a162
Bottom halves are supposed to not complete until the next iteration of the main
loop. This is very important to ensure that guests can not cause stack
overflows in the block driver code. Right now, if you attempt to schedule a
bottom half within a bottom half callback, you will enter an infinite loop.
This patch uses the same logic that we use for the IOHandler loop to make the
bottom half processing robust in list manipulation while in a callback.
This patch also introduces idle scheduling for bottom halves. qemu_bh_poll()
returns an indication of whether any bottom halves were successfully executed.
qemu_aio_wait() uses this to immediately return if a bottom half was executed
instead of waiting for a completion notification.
qemu_bh_schedule_idle() works around this by not reporting the callback has
run in the qemu_bh_poll loop. qemu_aio_wait() probably needs some refactoring
but that would require a larger code audit. idle scheduling seems like a good
compromise.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5572 c046a42c-6fe2-441c-8c8c-71466251a162
This patch fixes migration so that it works on Win32. This requires using
socket specific calls since sockets cannot be treated like file descriptors
on win32.
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5525 c046a42c-6fe2-441c-8c8c-71466251a162
The live migration code broke the windows build. As part of this
change, I've switched the BIOS path to C:\Program Files\Qemu instead of
/c/Program Files/Qemu. The later is only valid when launching from MSYS
but the former is always valid.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5524 c046a42c-6fe2-441c-8c8c-71466251a162
This patch changes the cache= option to accept none, writeback, or writethough
to control the host page cache behavior. By default, writethrough caching is
now used which internally is implemented by using O_DSYNC to open the disk
images. When using -snapshot, writeback is used by default since data integrity
it not at all an issue.
cache=none has the same behavior as cache=off previously. The later syntax is
still supported by now deprecated. I also cleaned up the O_DIRECT
implementation to avoid many of the #ifdefs.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5485 c046a42c-6fe2-441c-8c8c-71466251a162
This patch adds an ethernet announce function that will minimize downtime
when doing a live migration. This code originates from KVM.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5477 c046a42c-6fe2-441c-8c8c-71466251a162
This patch introduces a command line parameter and monitor command for starting
a live migration. The next patch will provide an example of how to use these
parameters.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5476 c046a42c-6fe2-441c-8c8c-71466251a162
This patch allows QEMUFile's read and write operations to return
negative error codes. This is necessary to detect things like closed
streams during live migration.
It also removes unused code for QEMUFileFD write path. Finally, it
makes sure to avoid attempting to flush an output buffer if the file
is only being used for input. This was spotted by Uri Lublin.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5474 c046a42c-6fe2-441c-8c8c-71466251a162
Replace signalfd with signal handler/pipe. There is no way to interrupt
the CPU execution loop when a file descriptor becomes readable. This
results in a large performance regression in sparc emulation during
bootup.
This patch switches us to signal handler/pipe which was originally
suggested by Ian Jackson. The signal handler lets us interrupt the
CPU emulation loop while the write to a pipe lets us avoid the
select/signal race condition.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5451 c046a42c-6fe2-441c-8c8c-71466251a162
Introduce a max_cpus per-machine variable, allowing individual boards
to limit it's number of CPUs. Check requested number of CPUs in setup
code and exit if it exceeds the supported number for the machine.
This also renders the static MAX_CPUS check obsolete, so remove this
from vl.c.
Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5443 c046a42c-6fe2-441c-8c8c-71466251a162
This patch replaces the static memory savevm/loadvm handler with a "live" one.
This handler is used even if performing a non-live migration.
The key difference between this handler and the previous is that each page is
prefixed with the address of the page. The QEMUFile rate limiting code, in
combination with the live migration dirty tracking bits, is used to determine
which pages should be sent and how many should be sent.
The live save code "converges" when the number of dirty pages reaches a fixed
amount. Currently, this is 10 pages. This is something that should eventually
be derived from whatever the bandwidth limitation is.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5437 c046a42c-6fe2-441c-8c8c-71466251a162
The current savevm/loadvm protocol has some draw backs. It does not support
the ability to do progressive saving which means it cannot be used for live
checkpointing or migration. The sections sizes are 32-bit integers which
means that it will not function when using more than 4GB of memory for a guest.
It attempts to seek within the output file which means it cannot be streamed.
The current protocol also is pretty lax about how it supports forward
compatibility. If a saved section version is greater than what the restore
code support, the restore code generally treats the saved data as being in
whatever version it supports. This means that restoring a saved VM on an older
version of QEMU will likely result in silent guest failure.
This patch introduces a new version of the savevm protocol. It has the
following features:
* Support for progressive save of sections (for live checkpoint/migration)
* An asynchronous API for doing save
* Support for interleaving multiple progressive save sections
(for future support of memory hot-add/storage migration)
* Fully streaming format
* Strong section version checking
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5434 c046a42c-6fe2-441c-8c8c-71466251a162
To support live migration, we override QEMUFile so that instead of writing to
disk, the save/restore state happens over a network connection.
This patch makes QEMUFile read/write operations function pointers so that we
can override them for live migration.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5352 c046a42c-6fe2-441c-8c8c-71466251a162
Instead of having (current)three command line switches -std-vga,
-cirrusvga and -vmwarevga, provide one -vga switch which takes
an argument, so that:
qemu -std-vga becomes qemu -vga std
qemu -cirrusvga becomes qemu -vga cirrus
qemu -vmwarevga becomes qemu -vga vmware
Update documentation accordingly.
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5335 c046a42c-6fe2-441c-8c8c-71466251a162
Right now, we sprinkle #if defined(QEMU_IMG) && defined(QEMU_NBD) all over the
code. It's ugly and causes us to have to build multiple object files for
linking against qemu and the tools.
This patch introduces a new file, qemu-tool.c which contains enough for
qemu-img, qemu-nbd, and QEMU to all share the same objects.
This also required getting qemu-nbd to be a bit more Windows friendly. I also
changed the Windows block-raw to use normal IO instead of overlapping IO since
we don't actually do AIO yet on Windows. I changed the various #if 0's to
#if WIN32_AIO to make it easier for someone to eventually fix AIO on Windows.
After this patch, there are no longer any #ifdef's related to qemu-img and
qemu-nbd.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5226 c046a42c-6fe2-441c-8c8c-71466251a162
This patch adds support for removing USB devices by host address.
Which is usefull for things like libvirtd because there is no easy way to
find guest USB address of the host device.
In other words you can now do:
usb_add host:3.5
...
usb_del host:3.5
Before the patch 'usb_del' did not support 'host:' notation.
----
Syntax for specifying auto connect filters has been improved.
Old syntax was
host:bus.dev
host:pid:vid
New syntax is
host:auto:bus.dev[:pid:vid]
In both the cases any attribute can be set to "*".
New syntax is more flexible and lets you do things like
host:3.*:5533:* /* grab any device on bus 3 with vendor id 5533 */
It's now possible to remove auto filters. For example:
usb_del host:auto:3.*:5533:*
Active filters are printed after all host devices in 'info usb' output.
Which now looks like this:
Device 1.1, speed 480 Mb/s
Hub: USB device 1d6b:0002, EHCI Host Controller
Device 1.4, speed 480 Mb/s
Class 00: USB device 1058:0704, External HDD
Auto filters:
Device 3.* ID *:*
Signed-off-by: Max Krasnyansky <maxk@kernel.org>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5205 c046a42c-6fe2-441c-8c8c-71466251a162
This patch introduces signalfd() to work around the signal/select race in
checking for AIO completions. For platforms that don't support signalfd(), we
emulate it with threads.
There was a long discussion about this approach. I don't believe there are any
fundamental problems with this approach and I believe eliminating the use of
signals is a good thing.
I've tested Windows and Linux using Windows and Linux guests. I've also checked
for disk IO performance regressions.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5187 c046a42c-6fe2-441c-8c8c-71466251a162
When CONFIG_SLIRP is not defined, we should not try to use
-net user as a default.
Patch from Jeremy Fitzhardinge <jeremy@goop.org> (who is a Citrix
staff member).
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5092 c046a42c-6fe2-441c-8c8c-71466251a162
The direction bit in the control register should not be directly
set using PPWCONTROL. The kernel gives the following debug message.
parport0 (ppdev0): use data_reverse for this!
More over setting the data pins to forward mode does not work,
perhaps a bug in the Linux PP driver. The right way to do this is
to use PPDATADIR to set the direction. The patch checks if the
user is toggling the direction bit, and invokes PPDATADIR to
do the job.
Signed-off-by: Vijay Kumar B <vijaykumar@bravegnu.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5063 c046a42c-6fe2-441c-8c8c-71466251a162
Add idle field to DisplayState struct, so drivers can figure
the display is idle and take advantage of that.
The xen framebuffer driver will use this to communicate the
idle state to the guest, so it knows it can stop doing updates
to a virtual display which is invisible anyway.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Samuel Thibault <samuel.thibault@eu.citrix.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5056 c046a42c-6fe2-441c-8c8c-71466251a162
This patch makes qemu handle signals better. It sets the request_shutdown
flag, making the main_loop exit and qemu taking the usual exit route, with
atexit handlers being called and so on, instead of qemu just being killed
by the signal.
To avoid calling vm_start() from the signal handler main_loop() got an
additional check so qemu_system_shutdown_request() works even when the
vm is in stopped state.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5055 c046a42c-6fe2-441c-8c8c-71466251a162
QEMU can now automatically grab host USB devices that match the filter.
For now I just extended 'host:X.Y' and 'host:VID:PID' syntax to handle
wildcards. So for example if you do something like
usb_add host:5.*
QEMU will automatically grab any non-hub device with host address 5.*.
Same with the 'host:PID:*', we grab any device that matches PID.
Filtering itself is very generic so we can probably add more elaborate
syntax like 'host:BUS.ADDR:VID:PID'. So that we can do 'host:5.*:6000:*'.
Anyway, it's implemented using a periodic timer that scans host devices
and grabs those that match the filter. Timer is started when the first
filter is added.
We now keep the list of all host devices that we grabbed to make sure that
we do not grab the same device twice.
btw It's currently possible to grab the same host device more than once.
ie You can just do "usb_add host:1.1" more than once, which of course does
not work. So this patch fixes that issue too.
Along with auto disconnect patch that I send a minute ago the setup is very
seamless now. You can just allocate some usb ports to the VMs and plug/unplug
devices at any time.
Signed-off-by: Max Krasnyansky <maxk@kernel.org>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5048 c046a42c-6fe2-441c-8c8c-71466251a162
I got really annoyed by the fact that you have to manually do
usb_del in the monitor when host device is unplugged and decided
to fix it :)
Basically we now automatically remove guest USB device
when the actual host device is disconnected.
At first I've extended set_fd_handlerX() stuff to support checking
for exceptions on fds. But unfortunately usbfs code does not wake up
user-space process when device is removed, which means we need a
timer to periodically check if device is still there. So I removed
fd exception stuff and implemented it with the timer.
Signed-off-by: Max Krasnyansky <maxk@kernel.org>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5047 c046a42c-6fe2-441c-8c8c-71466251a162
This patch upgrades the emulated UART to 16550A, the code comes from
xen-unstable. The main improvement was introduced with the following patch and
subsequent email thread:
http://lists.xensource.com/archives/html/xen-devel/2007-12/msg00129.html
The changes compared to previous version are:
- change clock_gettime to qemu_get_clock
- no token bucket anymore;
- fixed a small bug handling IRQs; this was the problem that prevented
kgdb to work over the serial (thanks to Jason Wessel for the help
spotting and reproducing this bug).
- many many style fixes;
- savevm version number increased;
- not including termios.h and sys/ioctl.h anymore, declaring static
constants in qemu-char.h instead;
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@4993 c046a42c-6fe2-441c-8c8c-71466251a162
This patch allows to display the "Password:" prompt if we use encrypted
disk with "-nographic" option.
It also modifies management of "-nographic" to not override user's
choices for "-serial", "-parallel" and "-monitor".
When qemu has to ask a password with "-nographic" with a multiplexed
serial interface, it forces the focus to the monitor and restore
original focus after.
Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@4979 c046a42c-6fe2-441c-8c8c-71466251a162
This patch repairs the management of encrypted disk images and allows to
enter the password.
Changelog:
v2:
- move read_password() before do_loadvm()
- really start monitor if output is stdio.
Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@4976 c046a42c-6fe2-441c-8c8c-71466251a162
This patch moves the pty char device imlementation away from the generic
filehandle code. It tries to detect as good as possible whenever there
is someone connected to the slave pty device and only send data down the
road in case someone is listening. Unfortunaly we have to poll via
timer once in a while to check the status because we have to use read()
on the master pty to figure the status (returns -EIO when unconnected).
Poll intervall for an idle guest is one second, when the guest sends
data to the virtual device linked to the pty we check more frequently.
The point for all of this is to avoid qemu blocking and not responding
any more. Writing to the master pty handle succeeds even when nobody is
connected to (and reading from) to the slave end of the pty. The kernel
just bufferes the writes. And as soon as the kernel buffer is full the
write() call blocks forever ...
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@4956 c046a42c-6fe2-441c-8c8c-71466251a162
Save 1.5MB (32bit) or 3MB (64bit) memory by keeping ioport tables
sparse and use a test against NULL instead.
Signed-off-by: Samuel Thibault <samuel.thibault@eu.citrix.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@4927 c046a42c-6fe2-441c-8c8c-71466251a162
When using -daemonize, we want to avoid chdir() until after we've opened the
block devices. It's also perfectly fine to use -dameonize along with SDL.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@4924 c046a42c-6fe2-441c-8c8c-71466251a162
Hopefully someday will be merged with cs4231.c (SPARC version)
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@4741 c046a42c-6fe2-441c-8c8c-71466251a162
Currently tap does not generate signals on I/O; this causes
network latency to be dependent on the timer tick (1ms without
dyntick, guest dependent with dyntick). By generating a signal
on I/O, we can inform the guest immediately that a packet has
arrived.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@4341 c046a42c-6fe2-441c-8c8c-71466251a162