Trace events outside the global mutex cannot be used with the simple
trace backend since it is not thread-safe. There is no check to prevent
them being enabled so people sometimes learn this the hard way.
This patch restructures the simple trace backend with a ring buffer
suitable for multiple concurrent writers. A writeout thread empties the
trace buffer when threshold fill levels are reached. Should the
writeout thread be unable to keep up with trace generation, records will
simply be dropped.
Each time events are dropped a special record is written to the trace
file indicating how many events were dropped. The event ID is
0xfffffffffffffffe and its signature is dropped(uint32_t count).
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
To prepare splitting up KVM and TCG CPU entry/exit, move the debug
exception into cpus.c and invoke cpu_handle_debug_exception on return
from qemu_cpu_exec.
This also allows to clean up the debug request signaling: We can assign
the job of informing main-loop to qemu_system_debug_request and stop the
calling cpu directly in cpu_handle_debug_exception. That means a debug
stop will now only be signaled via debug_requested and not additionally
via vmstop_requested.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Instead of fiddling with debug_requested and vmstop_requested directly,
introduce qemu_system_debug_request and turn qemu_system_vmstop_request
into a public interface. This aligns those services with exiting ones in
vl.c.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Define and use dedicated constants for vm_stop reasons, they actually
have nothing to do with the EXCP_* defines used so far. At this chance,
specify more detailed reasons so that VM state change handlers can
evaluate them.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
First of all, vm_can_run is a misnomer, it actually means "no request
pending". Moreover, there is no need to check all pending requests
twice, the first time via the inner loop check and then again when
actually processing the requests. We can simply remove the inner loop
and do the checks directly.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
If there is any pending request that requires us to leave the inner loop
if main_loop, makes sure we do this as soon as possible by enforcing
non-blocking IO processing.
At this change, move variable definitions out of the inner loop to
improve readability.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
A pending vmstop request is also a reason to leave the inner main loop.
So far we ignored it, and pending stop requests issued over VCPU threads
were simply ignored.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
If some I/O operation ends up calling qemu_system_reset_request in VCPU
context, we record this and inform the io-thread, but we do not
terminate the VCPU loop. This can lead to fairly unexpected behavior if
the triggering reset operation is supposed to work synchronously.
Fix this for TCG (when run in deterministic I/O mode) by setting the
VCPU on stop and issuing a cpu_exit. KVM requires some more work on its
VCPU loop.
[ ported from qemu-kvm ]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Also use qemu_strdup() instead of strdup() in bootindex code.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Watch this:
(qemu) drive_add 0 if=none
(qemu) info block
none0: type=hd removable=0 [not inserted]
(qemu) drive_del none0
Segmentation fault (core dumped)
add_init_drive() is confused about drive_init()'s failure modes, and
cleans up when it shouldn't. This leaves the DriveInfo with member
opts dangling. drive_del attempts to free it, and dies.
drive_init() behaves as follows:
* If it created a drive with media, it returns its DriveInfo.
* If it created a drive without media, it clears *fatal_error and
returns NULL.
* If it couldn't create a drive, it sets *fatal_error and returns
NULL.
Of its three callers:
* drive_init_func() is correct.
* usb_msd_init() assumes drive_init() failed when it returns NULL.
This is correct only because it always passes option "file", and
"drive without media" can't happen then.
* add_init_drive() assumes drive_init() failed when it returns NULL.
This is incorrect.
Clean up drive_init() to return NULL on failure and only on failure.
Drop its parameter fatal_error.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Let the callers build the optstr. Only one wants to. All the others
become simpler, because they don't have to worry about escaping '%'.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We silently ignore multiple definitions for the same drive:
$ qemu-system-x86_64 -nodefaults -vnc :1 -S -monitor stdio -drive if=ide,index=1,file=tmp.qcow2 -drive if=ide,index=1,file=nonexistant
QEMU 0.13.50 monitor - type 'help' for more information
(qemu) info block
ide0-hd1: type=hd removable=0 file=tmp.qcow2 backing_file=tmp.img ro=0 drv=qcow2 encrypted=0
With if=none, this can become quite confusing:
$ qemu-system-x86_64 -nodefaults -vnc :1 -S -monitor stdio -drive if=none,index=1,file=tmp.qcow2,id=eins -drive if=none,index=1,file=nonexistant,id=zwei -device ide-drive,drive=eins -device ide-drive,drive=zwei
qemu-system-x86_64: -device ide-drive,drive=zwei: Property 'ide-drive.drive' can't find value 'zwei'
The second -device fails, because it refers to drive zwei, which got
silently ignored.
Make multiple drive definitions fail cleanly.
Unfortunately, there's code that relies on multiple drive definitions
being silently ignored: main() merrily adds default drives even when
the user already defined these drives. Fix that up.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Before, type & index were hidden in printf-like fmt, ... parameters,
which get expanded into an option string. Rather inconvenient for
uses later in this series.
New IF_DEFAULT to ask for the machine's default interface. Before,
that was done by having no option "if" in the option string.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
strtosz() needs to return a 64 bit type even on 32 bit
architectures. Otherwise qemu-img will fail to create disk
images >= 2GB
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Stefan Weil reported the regression caused by
ec990eb622 as follows
> The second regression also occurs with MIPS malta.
> Networking no longer works with the default pcnet nic.
>
> This is caused because the reset function for pcnet is no
> longer called during system boot. The result in an invalid
> mac address (all zero) and a non-working nic.
>
> For this second regression I still have no simple solution.
> Of course mips_malta.c should be converted to qdev which
> would fix both problems (but only for malta system emulation).
The issue is, it is assumed that all qbuses, qdeves are under
main_system_bus. But there are qbuses whose parent is NULL. So it
is necessary to trigger reset for those qbuses.
(On the other hand, if NULL is passed to qdev_create(), its parent bus
is main_system_bus.)
Ideally those buses should be moved under bus controller
device which is qdev. But it's not done yet.
So register qbus reset handler for qbus whose parent is NULL.
Reported-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: "Michael S. Tsirkin" <mst@redhat.com>
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Avoid the warning below by using snprintf:
../libhw64/vl.o(.text+0x78d4): In function `get_boot_devices_list':
/src/qemu/vl.c:763: warning: sprintf() is often misused, please use snprintf()
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Action that depends on fully initialized device model should register
with this notifier chain.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
If bootindex is specified on command line a string that describes device
in firmware readable way is added into sorted list. Later this list will
be passed into firmware to control boot order.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
qxl is a paravirtual graphics card. The qxl device is the bridge
between the guest and the spice server (aka libspice-server). The
spice server will send the rendering commands to the spice client, which
will actually render them.
The spice server is also able to render locally, which is done in case
the guest wants read something from video memory. Local rendering is
also used to support display over vnc and sdl.
qxl is activated using "-vga qxl". qxl supports multihead, additional
cards can be added via '-device qxl".
[ v2: add copyright to files ]
[ v2: use qemu-common.h for standard includes ]
[ v2: create separate qxl-vga device for primary ]
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
This patch changes the reset handling so that qdev has no knowledge of the
global system reset. Instead, a new bus/device level function is introduced
that allows all devices/buses on the bus/device to be reset using a depth
first transversal.
N.B. we have to expose the implicit system bus because we have various hacks
that result in an implicit system bus existing. Instead, we ought to have an
explicitly created system bus that we can trigger reset from. That's a topic
for a future patch though.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
VM state change notifications are invoked from vm_start()/vm_stop().
Trace these state changes so we can reason about the state of the VM
from trace output.
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Since commit 4bed983730 an .fd_read()
handler that deletes its IOHandler is exposed to .fd_write() being
called on the deleted IOHandler.
This patch fixes deletion so that .fd_read() and .fd_write() are never
called on an IOHandler that is marked for deletion.
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
strtosz() returns -1 on error. It now supports human unit formats in
eg. 1.0G, with better error handling.
The following suffixes are supported:
B/b = bytes
K/k = KB
M/m = MB
G/g = GB
T/t = TB
This patch changes -numa and -m input to use strtosz().
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Move timer init functions to a new file, qemu-timer-common.c. Make other
critical timer functions inlined to preserve performance in
qemu-timer.c, also move muldiv64() (used by the inline functions)
to qemu-timer.h.
Adjust block/raw-posix.c and simpletrace.c to use get_clock() directly.
Remove a similar/duplicate definition in qemu-tool.c.
Adjust hw/omap_clk.c to include qemu-timer.h because muldiv64() is used
there.
After this change, tracing can be used also for user code and
simpletrace on Win32.
Cc: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Acked-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Expaned '-mon' arg to allow a 'pretty=on' flag. This makes the
monitor pretty print its replies to easy human debugging / reading
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
vl.c has a Sun-specific hack to supply a prototype for madvise(),
but the call site has apparently moved to arch_init.c.
Haiku doesn't implement madvise() in favor of posix_madvise().
OpenBSD and Solaris 10 don't implement posix_madvise() but madvise().
MinGW implements neither.
Check for madvise() and posix_madvise() in configure and supply qemu_madvise()
as wrapper. Prefer madvise() over posix_madvise() due to flag availability.
Convert all callers to use qemu_madvise() and QEMU_MADV_*.
Note that on Solaris the warning is fixed by moving the madvise() prototype,
not by qemu_madvise() itself. It helps with porting though, and it simplifies
most call sites.
v7 -> v8:
* Some versions of MinGW have no sys/mman.h header. Reported by Blue Swirl.
v6 -> v7:
* Adopt madvise() rather than posix_madvise() semantics for returning errors.
* Use EINVAL in place of ENOTSUP.
v5 -> v6:
* Replace two leftover instances of POSIX_MADV_NORMAL with QEMU_MADV_INVALID.
Spotted by Blue Swirl.
v4 -> v5:
* Introduce QEMU_MADV_INVALID, suggested by Alexander Graf.
Note that this relies on -1 not being a valid advice value.
v3 -> v4:
* Eliminate #ifdefs at qemu_advise() call sites. Requested by Blue Swirl.
This will currently break the check in kvm-all.c by calling madvise() with
a supported flag, which will not fail. Ideas/patches welcome.
v2 -> v3:
* Reuse the *_MADV_* defines for QEMU_MADV_*. Suggested by Alexander Graf.
* Add configure check for madvise(), too.
Add defines to Makefile, not QEMU_CFLAGS.
Convert all callers, untested. Suggested by Blue Swirl.
* Keep Solaris' madvise() prototype around. Pointed out by Alexander Graf.
* Display configure check results.
v1 -> v2:
* Don't rely on posix_madvise() availability, add qemu_madvise().
Suggested by Blue Swirl.
Signed-off-by: Andreas Färber <afaerber@opensolaris.org>
Cc: Blue Swirl <blauwirbel@gmail.com>
Cc: Alexander Graf <agraf@suse.de>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
With that patch applied you'll actually see the guests screen in the
spice client. This does *not* bring qxl and full spice support though.
This is basically the qxl vga mode made more generic, so it plays
together with any qemu-emulated gfx card. You can display stdvga or
cirrus via spice client. You can have both vnc and spice enabled and
clients connected at the same time.
Add -spice command line switch. Has support setting passwd and port for
now. With this patch applied the spice client can successfully connect
to qemu. You can't do anything useful yet though.
This patch drops DT_VNC. The display types are only used to select
select the local display (i.e. curses, sdl, coca, ...). Remote
displays (for now only vnc, spice will follow) can be enabled
independently.
This patch adds an optional command line switch '-trace' to specify the
filename to write traces to, when qemu starts.
Eg, If compiled with the 'simple' trace backend,
[temp@system]$ qemu -trace FILENAME IMAGE
Allows the binary traces to be written to FILENAME instead of the option
set at config-time.
Signed-off-by: Prerna Saxena <prerna@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
This is equivalent to SM_PASSTHROUGH security model.
The only exception is, failure of privilige operation like chown
are ignored. This makes a passthrough like security model usable
for people who runs kvm as non root
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
When making copy of arguments we were doing partial copy
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Fix a warning from OpenBSD linker:
../libhw32/vl.o(.text+0x5c3c): In function `main':
/src/qemu/vl.c:2335: warning: sprintf() is often misused, please use snprintf()
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Switch tree to lookup-by-name using qemu_find_opts().
Also hook up virtfs options so qemu_find_opts works for them too.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
When a 'cont' is issued on a VM that's just waiting for an incoming
migration, the VM reboots and boots into the guest, possibly corrupting
its storage since it could be shared with another VM running elsewhere.
Ensure that a VM started with '-incoming' is only run when an incoming
migration successfully completes.
A new qerror, QERR_MIGRATION_EXPECTED, is added to signal that 'cont'
failed due to no incoming migration has been attempted yet.
Reported-by: Laine Stump <laine@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
We already set sockets to nonzero in the code above.
So this if statement always evaluates true. Remove it.
Signed-off-by: Joel Schopp <jschopp@austin.ibm.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
These functions are also used for kvm under !CONFIG_IOTHREAD, having
'tcg' in their name is just misleading.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Synchronize RAM blocks with the target and migrate using name/offset
pairs. This ensures both source and target have the same view of
RAM and that we get the right bits into the right slot.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>