2004-04-04 19:21:17 +04:00
|
|
|
\input texinfo @c -*- texinfo -*-
|
2006-05-01 01:58:41 +04:00
|
|
|
@c %**start of header
|
|
|
|
@setfilename qemu-tech.info
|
2010-02-06 01:52:00 +03:00
|
|
|
|
|
|
|
@documentlanguage en
|
|
|
|
@documentencoding UTF-8
|
|
|
|
|
2006-05-01 01:58:41 +04:00
|
|
|
@settitle QEMU Internals
|
|
|
|
@exampleindent 0
|
|
|
|
@paragraphindent 0
|
|
|
|
@c %**end of header
|
2004-04-04 19:21:17 +04:00
|
|
|
|
2010-02-06 01:51:59 +03:00
|
|
|
@ifinfo
|
|
|
|
@direntry
|
|
|
|
* QEMU Internals: (qemu-tech). The QEMU Emulator Internals.
|
|
|
|
@end direntry
|
|
|
|
@end ifinfo
|
|
|
|
|
2004-04-04 19:21:17 +04:00
|
|
|
@iftex
|
|
|
|
@titlepage
|
|
|
|
@sp 7
|
|
|
|
@center @titlefont{QEMU Internals}
|
|
|
|
@sp 3
|
|
|
|
@end titlepage
|
|
|
|
@end iftex
|
|
|
|
|
2006-05-01 01:58:41 +04:00
|
|
|
@ifnottex
|
|
|
|
@node Top
|
|
|
|
@top
|
|
|
|
|
|
|
|
@menu
|
|
|
|
* Introduction::
|
|
|
|
* QEMU Internals::
|
|
|
|
* Regression Tests::
|
|
|
|
* Index::
|
|
|
|
@end menu
|
|
|
|
@end ifnottex
|
|
|
|
|
|
|
|
@contents
|
|
|
|
|
|
|
|
@node Introduction
|
2004-04-04 19:21:17 +04:00
|
|
|
@chapter Introduction
|
|
|
|
|
2006-05-01 01:58:41 +04:00
|
|
|
@menu
|
2011-10-10 14:48:23 +04:00
|
|
|
* intro_features:: Features
|
|
|
|
* intro_x86_emulation:: x86 and x86-64 emulation
|
|
|
|
* intro_arm_emulation:: ARM emulation
|
|
|
|
* intro_mips_emulation:: MIPS emulation
|
|
|
|
* intro_ppc_emulation:: PowerPC emulation
|
|
|
|
* intro_sparc_emulation:: Sparc32 and Sparc64 emulation
|
|
|
|
* intro_xtensa_emulation:: Xtensa emulation
|
|
|
|
* intro_other_emulation:: Other CPU emulation
|
2006-05-01 01:58:41 +04:00
|
|
|
@end menu
|
|
|
|
|
|
|
|
@node intro_features
|
2004-04-04 19:21:17 +04:00
|
|
|
@section Features
|
|
|
|
|
|
|
|
QEMU is a FAST! processor emulator using a portable dynamic
|
|
|
|
translator.
|
|
|
|
|
|
|
|
QEMU has two operating modes:
|
|
|
|
|
|
|
|
@itemize @minus
|
|
|
|
|
2007-09-17 01:08:06 +04:00
|
|
|
@item
|
2008-10-09 22:52:04 +04:00
|
|
|
Full system emulation. In this mode (full platform virtualization),
|
|
|
|
QEMU emulates a full system (usually a PC), including a processor and
|
|
|
|
various peripherals. It can be used to launch several different
|
|
|
|
Operating Systems at once without rebooting the host machine or to
|
|
|
|
debug system code.
|
2004-04-04 19:21:17 +04:00
|
|
|
|
2007-09-17 01:08:06 +04:00
|
|
|
@item
|
2008-10-09 22:52:04 +04:00
|
|
|
User mode emulation. In this mode (application level virtualization),
|
|
|
|
QEMU can launch processes compiled for one CPU on another CPU, however
|
|
|
|
the Operating Systems must match. This can be used for example to ease
|
|
|
|
cross-compilation and cross-debugging.
|
2004-04-04 19:21:17 +04:00
|
|
|
@end itemize
|
|
|
|
|
|
|
|
As QEMU requires no host kernel driver to run, it is very safe and
|
|
|
|
easy to use.
|
|
|
|
|
|
|
|
QEMU generic features:
|
|
|
|
|
2007-09-17 01:08:06 +04:00
|
|
|
@itemize
|
2004-04-04 19:21:17 +04:00
|
|
|
|
|
|
|
@item User space only or full system emulation.
|
|
|
|
|
2006-05-01 01:58:41 +04:00
|
|
|
@item Using dynamic translation to native code for reasonable speed.
|
2004-04-04 19:21:17 +04:00
|
|
|
|
2008-10-09 22:52:04 +04:00
|
|
|
@item
|
|
|
|
Working on x86, x86_64 and PowerPC32/64 hosts. Being tested on ARM,
|
|
|
|
HPPA, Sparc32 and Sparc64. Previous versions had some support for
|
|
|
|
Alpha and S390 hosts, but TCG (see below) doesn't support those yet.
|
2004-04-04 19:21:17 +04:00
|
|
|
|
|
|
|
@item Self-modifying code support.
|
|
|
|
|
|
|
|
@item Precise exceptions support.
|
|
|
|
|
2008-10-09 22:52:04 +04:00
|
|
|
@item
|
|
|
|
Floating point library supporting both full software emulation and
|
|
|
|
native host FPU instructions.
|
|
|
|
|
2004-04-04 19:21:17 +04:00
|
|
|
@end itemize
|
|
|
|
|
|
|
|
QEMU user mode emulation features:
|
2007-09-17 01:08:06 +04:00
|
|
|
@itemize
|
2004-04-04 19:21:17 +04:00
|
|
|
@item Generic Linux system call converter, including most ioctls.
|
|
|
|
|
|
|
|
@item clone() emulation using native CPU clone() to use Linux scheduler for threads.
|
|
|
|
|
2007-09-17 01:08:06 +04:00
|
|
|
@item Accurate signal handling by remapping host signals to target signals.
|
2004-04-04 19:21:17 +04:00
|
|
|
@end itemize
|
|
|
|
|
2008-10-09 22:52:04 +04:00
|
|
|
Linux user emulator (Linux host only) can be used to launch the Wine
|
2012-04-16 08:31:11 +04:00
|
|
|
Windows API emulator (@url{http://www.winehq.org}). A BSD user emulator for BSD
|
2008-10-09 22:52:04 +04:00
|
|
|
hosts is under development. It would also be possible to develop a
|
|
|
|
similar user emulator for Solaris.
|
|
|
|
|
2004-04-04 19:21:17 +04:00
|
|
|
QEMU full system emulation features:
|
2007-09-17 01:08:06 +04:00
|
|
|
@itemize
|
2008-10-09 22:52:04 +04:00
|
|
|
@item
|
|
|
|
QEMU uses a full software MMU for maximum portability.
|
|
|
|
|
|
|
|
@item
|
2009-08-11 02:07:24 +04:00
|
|
|
QEMU can optionally use an in-kernel accelerator, like kvm. The accelerators
|
|
|
|
execute some of the guest code natively, while
|
2008-10-09 22:52:04 +04:00
|
|
|
continuing to emulate the rest of the machine.
|
|
|
|
|
|
|
|
@item
|
|
|
|
Various hardware devices can be emulated and in some cases, host
|
|
|
|
devices (e.g. serial and parallel ports, USB, drives) can be used
|
|
|
|
transparently by the guest Operating System. Host device passthrough
|
|
|
|
can be used for talking to external physical peripherals (e.g. a
|
|
|
|
webcam, modem or tape drive).
|
|
|
|
|
|
|
|
@item
|
|
|
|
Symmetric multiprocessing (SMP) even on a host with a single CPU. On a
|
|
|
|
SMP host system, QEMU can use only one CPU fully due to difficulty in
|
|
|
|
implementing atomic memory accesses efficiently.
|
|
|
|
|
2004-04-04 19:21:17 +04:00
|
|
|
@end itemize
|
|
|
|
|
2006-05-01 01:58:41 +04:00
|
|
|
@node intro_x86_emulation
|
2008-10-09 22:52:04 +04:00
|
|
|
@section x86 and x86-64 emulation
|
2004-04-04 19:21:17 +04:00
|
|
|
|
|
|
|
QEMU x86 target features:
|
|
|
|
|
2007-09-17 01:08:06 +04:00
|
|
|
@itemize
|
2004-04-04 19:21:17 +04:00
|
|
|
|
2007-09-17 01:08:06 +04:00
|
|
|
@item The virtual x86 CPU supports 16 bit and 32 bit addressing with segmentation.
|
2008-10-09 22:52:04 +04:00
|
|
|
LDT/GDT and IDT are emulated. VM86 mode is also supported to run
|
|
|
|
DOSEMU. There is some support for MMX/3DNow!, SSE, SSE2, SSE3, SSSE3,
|
|
|
|
and SSE4 as well as x86-64 SVM.
|
2004-04-04 19:21:17 +04:00
|
|
|
|
|
|
|
@item Support of host page sizes bigger than 4KB in user mode emulation.
|
|
|
|
|
|
|
|
@item QEMU can emulate itself on x86.
|
|
|
|
|
2007-09-17 01:08:06 +04:00
|
|
|
@item An extensive Linux x86 CPU test program is included @file{tests/test-i386}.
|
2004-04-04 19:21:17 +04:00
|
|
|
It can be used to test other x86 virtual CPUs.
|
|
|
|
|
|
|
|
@end itemize
|
|
|
|
|
|
|
|
Current QEMU limitations:
|
|
|
|
|
2007-09-17 01:08:06 +04:00
|
|
|
@itemize
|
2004-04-04 19:21:17 +04:00
|
|
|
|
2008-10-09 22:52:04 +04:00
|
|
|
@item Limited x86-64 support.
|
2004-04-04 19:21:17 +04:00
|
|
|
|
|
|
|
@item IPC syscalls are missing.
|
|
|
|
|
2007-09-17 01:08:06 +04:00
|
|
|
@item The x86 segment limits and access rights are not tested at every
|
2004-04-04 19:21:17 +04:00
|
|
|
memory access (yet). Hopefully, very few OSes seem to rely on that for
|
|
|
|
normal use.
|
|
|
|
|
|
|
|
@end itemize
|
|
|
|
|
2006-05-01 01:58:41 +04:00
|
|
|
@node intro_arm_emulation
|
2004-04-04 19:21:17 +04:00
|
|
|
@section ARM emulation
|
|
|
|
|
|
|
|
@itemize
|
|
|
|
|
|
|
|
@item Full ARM 7 user emulation.
|
|
|
|
|
|
|
|
@item NWFPE FPU support included in user Linux emulation.
|
|
|
|
|
|
|
|
@item Can run most ARM Linux binaries.
|
|
|
|
|
|
|
|
@end itemize
|
|
|
|
|
2007-07-11 14:24:28 +04:00
|
|
|
@node intro_mips_emulation
|
|
|
|
@section MIPS emulation
|
|
|
|
|
|
|
|
@itemize
|
|
|
|
|
|
|
|
@item The system emulation allows full MIPS32/MIPS64 Release 2 emulation,
|
|
|
|
including privileged instructions, FPU and MMU, in both little and big
|
|
|
|
endian modes.
|
|
|
|
|
|
|
|
@item The Linux userland emulation can run many 32 bit MIPS Linux binaries.
|
|
|
|
|
|
|
|
@end itemize
|
|
|
|
|
|
|
|
Current QEMU limitations:
|
|
|
|
|
|
|
|
@itemize
|
|
|
|
|
|
|
|
@item Self-modifying code is not always handled correctly.
|
|
|
|
|
|
|
|
@item 64 bit userland emulation is not implemented.
|
|
|
|
|
|
|
|
@item The system emulation is not complete enough to run real firmware.
|
|
|
|
|
2007-07-12 13:03:30 +04:00
|
|
|
@item The watchpoint debug facility is not implemented.
|
|
|
|
|
2007-07-11 14:24:28 +04:00
|
|
|
@end itemize
|
|
|
|
|
2006-05-01 01:58:41 +04:00
|
|
|
@node intro_ppc_emulation
|
2004-04-04 19:21:17 +04:00
|
|
|
@section PowerPC emulation
|
|
|
|
|
|
|
|
@itemize
|
|
|
|
|
2007-09-17 01:08:06 +04:00
|
|
|
@item Full PowerPC 32 bit emulation, including privileged instructions,
|
2004-04-04 19:21:17 +04:00
|
|
|
FPU and MMU.
|
|
|
|
|
|
|
|
@item Can run most PowerPC Linux binaries.
|
|
|
|
|
|
|
|
@end itemize
|
|
|
|
|
2006-05-01 01:58:41 +04:00
|
|
|
@node intro_sparc_emulation
|
2008-10-09 22:52:04 +04:00
|
|
|
@section Sparc32 and Sparc64 emulation
|
2004-04-04 19:21:17 +04:00
|
|
|
|
|
|
|
@itemize
|
|
|
|
|
2007-04-05 22:40:23 +04:00
|
|
|
@item Full SPARC V8 emulation, including privileged
|
2005-07-02 18:31:34 +04:00
|
|
|
instructions, FPU and MMU. SPARC V9 emulation includes most privileged
|
2007-10-20 12:09:05 +04:00
|
|
|
and VIS instructions, FPU and I/D MMU. Alignment is fully enforced.
|
2004-04-04 19:21:17 +04:00
|
|
|
|
2007-10-20 12:09:05 +04:00
|
|
|
@item Can run most 32-bit SPARC Linux binaries, SPARC32PLUS Linux binaries and
|
|
|
|
some 64-bit SPARC Linux binaries.
|
2005-07-02 18:31:34 +04:00
|
|
|
|
|
|
|
@end itemize
|
|
|
|
|
|
|
|
Current QEMU limitations:
|
|
|
|
|
2007-09-17 01:08:06 +04:00
|
|
|
@itemize
|
2005-07-02 18:31:34 +04:00
|
|
|
|
|
|
|
@item IPC syscalls are missing.
|
|
|
|
|
2007-11-25 21:40:20 +03:00
|
|
|
@item Floating point exception support is buggy.
|
2005-07-02 18:31:34 +04:00
|
|
|
|
|
|
|
@item Atomic instructions are not correctly implemented.
|
|
|
|
|
2008-10-09 22:52:04 +04:00
|
|
|
@item There are still some problems with Sparc64 emulators.
|
|
|
|
|
|
|
|
@end itemize
|
|
|
|
|
2011-10-10 14:48:23 +04:00
|
|
|
@node intro_xtensa_emulation
|
|
|
|
@section Xtensa emulation
|
|
|
|
|
|
|
|
@itemize
|
|
|
|
|
|
|
|
@item Core Xtensa ISA emulation, including most options: code density,
|
|
|
|
loop, extended L32R, 16- and 32-bit multiplication, 32-bit division,
|
2012-11-29 19:53:20 +04:00
|
|
|
MAC16, miscellaneous operations, boolean, FP coprocessor, coprocessor
|
|
|
|
context, debug, multiprocessor synchronization,
|
2011-10-10 14:48:23 +04:00
|
|
|
conditional store, exceptions, relocatable vectors, unaligned exception,
|
|
|
|
interrupts (including high priority and timer), hardware alignment,
|
|
|
|
region protection, region translation, MMU, windowed registers, thread
|
|
|
|
pointer, processor ID.
|
|
|
|
|
2012-11-29 19:53:20 +04:00
|
|
|
@item Not implemented options: data/instruction cache (including cache
|
|
|
|
prefetch and locking), XLMI, processor interface. Also options not
|
|
|
|
covered by the core ISA (e.g. FLIX, wide branches) are not implemented.
|
2011-10-10 14:48:23 +04:00
|
|
|
|
|
|
|
@item Can run most Xtensa Linux binaries.
|
|
|
|
|
|
|
|
@item New core configuration that requires no additional instructions
|
|
|
|
may be created from overlay with minimal amount of hand-written code.
|
|
|
|
|
|
|
|
@end itemize
|
|
|
|
|
2008-10-09 22:52:04 +04:00
|
|
|
@node intro_other_emulation
|
|
|
|
@section Other CPU emulation
|
2004-04-04 19:21:17 +04:00
|
|
|
|
2008-10-09 22:52:04 +04:00
|
|
|
In addition to the above, QEMU supports emulation of other CPUs with
|
|
|
|
varying levels of success. These are:
|
|
|
|
|
|
|
|
@itemize
|
|
|
|
|
|
|
|
@item
|
|
|
|
Alpha
|
|
|
|
@item
|
|
|
|
CRIS
|
|
|
|
@item
|
|
|
|
M68k
|
|
|
|
@item
|
|
|
|
SH4
|
2004-04-04 19:21:17 +04:00
|
|
|
@end itemize
|
|
|
|
|
2006-05-01 01:58:41 +04:00
|
|
|
@node QEMU Internals
|
2004-04-04 19:21:17 +04:00
|
|
|
@chapter QEMU Internals
|
|
|
|
|
2006-05-01 01:58:41 +04:00
|
|
|
@menu
|
|
|
|
* QEMU compared to other emulators::
|
|
|
|
* Portable dynamic translation::
|
|
|
|
* Condition code optimisations::
|
|
|
|
* CPU state optimisations::
|
|
|
|
* Translation cache::
|
|
|
|
* Direct block chaining::
|
|
|
|
* Self-modifying code and translated code invalidation::
|
|
|
|
* Exception support::
|
|
|
|
* MMU emulation::
|
2008-10-09 22:52:04 +04:00
|
|
|
* Device emulation::
|
2006-05-01 01:58:41 +04:00
|
|
|
* Hardware interrupts::
|
|
|
|
* User emulation specific details::
|
|
|
|
* Bibliography::
|
|
|
|
@end menu
|
|
|
|
|
|
|
|
@node QEMU compared to other emulators
|
2004-04-04 19:21:17 +04:00
|
|
|
@section QEMU compared to other emulators
|
|
|
|
|
|
|
|
Like bochs [3], QEMU emulates an x86 CPU. But QEMU is much faster than
|
|
|
|
bochs as it uses dynamic compilation. Bochs is closely tied to x86 PC
|
|
|
|
emulation while QEMU can emulate several processors.
|
|
|
|
|
|
|
|
Like Valgrind [2], QEMU does user space emulation and dynamic
|
|
|
|
translation. Valgrind is mainly a memory debugger while QEMU has no
|
|
|
|
support for it (QEMU could be used to detect out of bound memory
|
|
|
|
accesses as Valgrind, but it has no support to track uninitialised data
|
|
|
|
as Valgrind does). The Valgrind dynamic translator generates better code
|
|
|
|
than QEMU (in particular it does register allocation) but it is closely
|
|
|
|
tied to an x86 host and target and has no support for precise exceptions
|
|
|
|
and system emulation.
|
|
|
|
|
|
|
|
EM86 [4] is the closest project to user space QEMU (and QEMU still uses
|
|
|
|
some of its code, in particular the ELF file loader). EM86 was limited
|
|
|
|
to an alpha host and used a proprietary and slow interpreter (the
|
|
|
|
interpreter part of the FX!32 Digital Win32 code translator [5]).
|
|
|
|
|
|
|
|
TWIN [6] is a Windows API emulator like Wine. It is less accurate than
|
|
|
|
Wine but includes a protected mode x86 interpreter to launch x86 Windows
|
2004-09-05 20:04:16 +04:00
|
|
|
executables. Such an approach has greater potential because most of the
|
2004-04-04 19:21:17 +04:00
|
|
|
Windows API is executed natively but it is far more difficult to develop
|
|
|
|
because all the data structures and function parameters exchanged
|
|
|
|
between the API and the x86 code must be converted.
|
|
|
|
|
|
|
|
User mode Linux [7] was the only solution before QEMU to launch a
|
|
|
|
Linux kernel as a process while not needing any host kernel
|
|
|
|
patches. However, user mode Linux requires heavy kernel patches while
|
|
|
|
QEMU accepts unpatched Linux kernels. The price to pay is that QEMU is
|
|
|
|
slower.
|
|
|
|
|
2008-10-09 22:52:04 +04:00
|
|
|
The Plex86 [8] PC virtualizer is done in the same spirit as the now
|
|
|
|
obsolete qemu-fast system emulator. It requires a patched Linux kernel
|
|
|
|
to work (you cannot launch the same kernel on your PC), but the
|
|
|
|
patches are really small. As it is a PC virtualizer (no emulation is
|
|
|
|
done except for some privileged instructions), it has the potential of
|
|
|
|
being faster than QEMU. The downside is that a complicated (and
|
|
|
|
potentially unsafe) host kernel patch is needed.
|
2004-04-04 19:21:17 +04:00
|
|
|
|
|
|
|
The commercial PC Virtualizers (VMWare [9], VirtualPC [10], TwoOStwo
|
|
|
|
[11]) are faster than QEMU, but they all need specific, proprietary
|
|
|
|
and potentially unsafe host drivers. Moreover, they are unable to
|
|
|
|
provide cycle exact simulation as an emulator can.
|
|
|
|
|
2008-10-09 22:52:04 +04:00
|
|
|
VirtualBox [12], Xen [13] and KVM [14] are based on QEMU. QEMU-SystemC
|
|
|
|
[15] uses QEMU to simulate a system where some hardware devices are
|
|
|
|
developed in SystemC.
|
|
|
|
|
2006-05-01 01:58:41 +04:00
|
|
|
@node Portable dynamic translation
|
2004-04-04 19:21:17 +04:00
|
|
|
@section Portable dynamic translation
|
|
|
|
|
|
|
|
QEMU is a dynamic translator. When it first encounters a piece of code,
|
|
|
|
it converts it to the host instruction set. Usually dynamic translators
|
|
|
|
are very complicated and highly CPU dependent. QEMU uses some tricks
|
|
|
|
which make it relatively easily portable and simple while achieving good
|
|
|
|
performances.
|
|
|
|
|
2008-10-09 22:52:04 +04:00
|
|
|
After the release of version 0.9.1, QEMU switched to a new method of
|
|
|
|
generating code, Tiny Code Generator or TCG. TCG relaxes the
|
|
|
|
dependency on the exact version of the compiler used. The basic idea
|
|
|
|
is to split every target instruction into a couple of RISC-like TCG
|
|
|
|
ops (see @code{target-i386/translate.c}). Some optimizations can be
|
|
|
|
performed at this stage, including liveness analysis and trivial
|
|
|
|
constant expression evaluation. TCG ops are then implemented in the
|
|
|
|
host CPU back end, also known as TCG target (see
|
|
|
|
@code{tcg/i386/tcg-target.c}). For more information, please take a
|
|
|
|
look at @code{tcg/README}.
|
2004-04-04 19:21:17 +04:00
|
|
|
|
2006-05-01 01:58:41 +04:00
|
|
|
@node Condition code optimisations
|
2004-04-04 19:21:17 +04:00
|
|
|
@section Condition code optimisations
|
|
|
|
|
2008-10-09 22:52:04 +04:00
|
|
|
Lazy evaluation of CPU condition codes (@code{EFLAGS} register on x86)
|
|
|
|
is important for CPUs where every instruction sets the condition
|
|
|
|
codes. It tends to be less important on conventional RISC systems
|
2009-05-10 22:23:46 +04:00
|
|
|
where condition codes are only updated when explicitly requested. On
|
|
|
|
Sparc64, costly update of both 32 and 64 bit condition codes can be
|
|
|
|
avoided with lazy evaluation.
|
2008-10-09 22:52:04 +04:00
|
|
|
|
|
|
|
Instead of computing the condition codes after each x86 instruction,
|
|
|
|
QEMU just stores one operand (called @code{CC_SRC}), the result
|
|
|
|
(called @code{CC_DST}) and the type of operation (called
|
|
|
|
@code{CC_OP}). When the condition codes are needed, the condition
|
|
|
|
codes can be calculated using this information. In addition, an
|
|
|
|
optimized calculation can be performed for some instruction types like
|
|
|
|
conditional branches.
|
2004-04-04 19:21:17 +04:00
|
|
|
|
2008-06-03 23:51:57 +04:00
|
|
|
@code{CC_OP} is almost never explicitly set in the generated code
|
2004-04-04 19:21:17 +04:00
|
|
|
because it is known at translation time.
|
|
|
|
|
2009-05-10 22:23:46 +04:00
|
|
|
The lazy condition code evaluation is used on x86, m68k, cris and
|
|
|
|
Sparc. ARM uses a simplified variant for the N and Z flags.
|
2004-04-04 19:21:17 +04:00
|
|
|
|
2006-05-01 01:58:41 +04:00
|
|
|
@node CPU state optimisations
|
2004-04-04 19:21:17 +04:00
|
|
|
@section CPU state optimisations
|
|
|
|
|
2008-10-09 22:52:04 +04:00
|
|
|
The target CPUs have many internal states which change the way it
|
|
|
|
evaluates instructions. In order to achieve a good speed, the
|
|
|
|
translation phase considers that some state information of the virtual
|
|
|
|
CPU cannot change in it. The state is recorded in the Translation
|
|
|
|
Block (TB). If the state changes (e.g. privilege level), a new TB will
|
|
|
|
be generated and the previous TB won't be used anymore until the state
|
|
|
|
matches the state recorded in the previous TB. For example, if the SS,
|
|
|
|
DS and ES segments have a zero base, then the translator does not even
|
|
|
|
generate an addition for the segment base.
|
2004-04-04 19:21:17 +04:00
|
|
|
|
|
|
|
[The FPU stack pointer register is not handled that way yet].
|
|
|
|
|
2006-05-01 01:58:41 +04:00
|
|
|
@node Translation cache
|
2004-04-04 19:21:17 +04:00
|
|
|
@section Translation cache
|
|
|
|
|
2011-11-04 21:14:44 +04:00
|
|
|
A 32 MByte cache holds the most recently used translations. For
|
2004-04-04 19:21:17 +04:00
|
|
|
simplicity, it is completely flushed when it is full. A translation unit
|
|
|
|
contains just a single basic block (a block of x86 instructions
|
|
|
|
terminated by a jump or by a virtual CPU state change which the
|
|
|
|
translator cannot deduce statically).
|
|
|
|
|
2006-05-01 01:58:41 +04:00
|
|
|
@node Direct block chaining
|
2004-04-04 19:21:17 +04:00
|
|
|
@section Direct block chaining
|
|
|
|
|
|
|
|
After each translated basic block is executed, QEMU uses the simulated
|
|
|
|
Program Counter (PC) and other cpu state informations (such as the CS
|
|
|
|
segment base value) to find the next basic block.
|
|
|
|
|
|
|
|
In order to accelerate the most common cases where the new simulated PC
|
|
|
|
is known, QEMU can patch a basic block so that it jumps directly to the
|
|
|
|
next one.
|
|
|
|
|
|
|
|
The most portable code uses an indirect jump. An indirect jump makes
|
|
|
|
it easier to make the jump target modification atomic. On some host
|
|
|
|
architectures (such as x86 or PowerPC), the @code{JUMP} opcode is
|
|
|
|
directly patched so that the block chaining has no overhead.
|
|
|
|
|
2006-05-01 01:58:41 +04:00
|
|
|
@node Self-modifying code and translated code invalidation
|
2004-04-04 19:21:17 +04:00
|
|
|
@section Self-modifying code and translated code invalidation
|
|
|
|
|
|
|
|
Self-modifying code is a special challenge in x86 emulation because no
|
|
|
|
instruction cache invalidation is signaled by the application when code
|
|
|
|
is modified.
|
|
|
|
|
|
|
|
When translated code is generated for a basic block, the corresponding
|
2008-10-09 22:52:04 +04:00
|
|
|
host page is write protected if it is not already read-only. Then, if
|
|
|
|
a write access is done to the page, Linux raises a SEGV signal. QEMU
|
|
|
|
then invalidates all the translated code in the page and enables write
|
|
|
|
accesses to the page.
|
2004-04-04 19:21:17 +04:00
|
|
|
|
|
|
|
Correct translated code invalidation is done efficiently by maintaining
|
|
|
|
a linked list of every translated block contained in a given page. Other
|
2007-09-17 01:08:06 +04:00
|
|
|
linked lists are also maintained to undo direct block chaining.
|
2004-04-04 19:21:17 +04:00
|
|
|
|
2008-10-09 22:52:04 +04:00
|
|
|
On RISC targets, correctly written software uses memory barriers and
|
|
|
|
cache flushes, so some of the protection above would not be
|
|
|
|
necessary. However, QEMU still requires that the generated code always
|
|
|
|
matches the target instructions in memory in order to handle
|
|
|
|
exceptions correctly.
|
2004-04-04 19:21:17 +04:00
|
|
|
|
2006-05-01 01:58:41 +04:00
|
|
|
@node Exception support
|
2004-04-04 19:21:17 +04:00
|
|
|
@section Exception support
|
|
|
|
|
|
|
|
longjmp() is used when an exception such as division by zero is
|
2007-09-17 01:08:06 +04:00
|
|
|
encountered.
|
2004-04-04 19:21:17 +04:00
|
|
|
|
|
|
|
The host SIGSEGV and SIGBUS signal handlers are used to get invalid
|
2008-10-09 22:52:04 +04:00
|
|
|
memory accesses. The simulated program counter is found by
|
|
|
|
retranslating the corresponding basic block and by looking where the
|
|
|
|
host program counter was at the exception point.
|
2004-04-04 19:21:17 +04:00
|
|
|
|
|
|
|
The virtual CPU cannot retrieve the exact @code{EFLAGS} register because
|
|
|
|
in some cases it is not computed because of condition code
|
|
|
|
optimisations. It is not a big concern because the emulated code can
|
|
|
|
still be restarted in any cases.
|
|
|
|
|
2006-05-01 01:58:41 +04:00
|
|
|
@node MMU emulation
|
2004-04-04 19:21:17 +04:00
|
|
|
@section MMU emulation
|
|
|
|
|
2008-10-09 22:52:04 +04:00
|
|
|
For system emulation QEMU supports a soft MMU. In that mode, the MMU
|
|
|
|
virtual to physical address translation is done at every memory
|
|
|
|
access. QEMU uses an address translation cache to speed up the
|
|
|
|
translation.
|
2004-04-04 19:21:17 +04:00
|
|
|
|
|
|
|
In order to avoid flushing the translated code each time the MMU
|
|
|
|
mappings change, QEMU uses a physically indexed translation cache. It
|
2007-09-17 01:08:06 +04:00
|
|
|
means that each basic block is indexed with its physical address.
|
2004-04-04 19:21:17 +04:00
|
|
|
|
|
|
|
When MMU mappings change, only the chaining of the basic blocks is
|
|
|
|
reset (i.e. a basic block can no longer jump directly to another one).
|
|
|
|
|
2008-10-09 22:52:04 +04:00
|
|
|
@node Device emulation
|
|
|
|
@section Device emulation
|
|
|
|
|
|
|
|
Systems emulated by QEMU are organized by boards. At initialization
|
|
|
|
phase, each board instantiates a number of CPUs, devices, RAM and
|
|
|
|
ROM. Each device in turn can assign I/O ports or memory areas (for
|
|
|
|
MMIO) to its handlers. When the emulation starts, an access to the
|
|
|
|
ports or MMIO memory areas assigned to the device causes the
|
|
|
|
corresponding handler to be called.
|
|
|
|
|
|
|
|
RAM and ROM are handled more optimally, only the offset to the host
|
|
|
|
memory needs to be added to the guest address.
|
|
|
|
|
|
|
|
The video RAM of VGA and other display cards is special: it can be
|
|
|
|
read or written directly like RAM, but write accesses cause the memory
|
|
|
|
to be marked with VGA_DIRTY flag as well.
|
|
|
|
|
|
|
|
QEMU supports some device classes like serial and parallel ports, USB,
|
|
|
|
drives and network devices, by providing APIs for easier connection to
|
|
|
|
the generic, higher level implementations. The API hides the
|
|
|
|
implementation details from the devices, like native device use or
|
|
|
|
advanced block device formats like QCOW.
|
|
|
|
|
|
|
|
Usually the devices implement a reset method and register support for
|
|
|
|
saving and loading of the device state. The devices can also use
|
|
|
|
timers, especially together with the use of bottom halves (BHs).
|
|
|
|
|
2006-05-01 01:58:41 +04:00
|
|
|
@node Hardware interrupts
|
2004-04-04 19:21:17 +04:00
|
|
|
@section Hardware interrupts
|
|
|
|
|
2012-07-17 01:37:07 +04:00
|
|
|
In order to be faster, QEMU does not check at every basic block if a
|
2011-01-07 23:31:39 +03:00
|
|
|
hardware interrupt is pending. Instead, the user must asynchronously
|
2004-04-04 19:21:17 +04:00
|
|
|
call a specific function to tell that an interrupt is pending. This
|
|
|
|
function resets the chaining of the currently executing basic
|
|
|
|
block. It ensures that the execution will return soon in the main loop
|
|
|
|
of the CPU emulator. Then the main loop can test if the interrupt is
|
|
|
|
pending and handle it.
|
|
|
|
|
2006-05-01 01:58:41 +04:00
|
|
|
@node User emulation specific details
|
2004-04-04 19:21:17 +04:00
|
|
|
@section User emulation specific details
|
|
|
|
|
|
|
|
@subsection Linux system call translation
|
|
|
|
|
|
|
|
QEMU includes a generic system call translator for Linux. It means that
|
|
|
|
the parameters of the system calls can be converted to fix the
|
|
|
|
endianness and 32/64 bit issues. The IOCTLs are converted with a generic
|
|
|
|
type description system (see @file{ioctls.h} and @file{thunk.c}).
|
|
|
|
|
|
|
|
QEMU supports host CPUs which have pages bigger than 4KB. It records all
|
|
|
|
the mappings the process does and try to emulated the @code{mmap()}
|
|
|
|
system calls in cases where the host @code{mmap()} call would fail
|
|
|
|
because of bad page alignment.
|
|
|
|
|
|
|
|
@subsection Linux signals
|
|
|
|
|
|
|
|
Normal and real-time signals are queued along with their information
|
|
|
|
(@code{siginfo_t}) as it is done in the Linux kernel. Then an interrupt
|
|
|
|
request is done to the virtual CPU. When it is interrupted, one queued
|
|
|
|
signal is handled by generating a stack frame in the virtual CPU as the
|
|
|
|
Linux kernel does. The @code{sigreturn()} system call is emulated to return
|
|
|
|
from the virtual signal handler.
|
|
|
|
|
|
|
|
Some signals (such as SIGALRM) directly come from the host. Other
|
2011-01-07 23:31:39 +03:00
|
|
|
signals are synthesized from the virtual CPU exceptions such as SIGFPE
|
2004-04-04 19:21:17 +04:00
|
|
|
when a division by zero is done (see @code{main.c:cpu_loop()}).
|
|
|
|
|
|
|
|
The blocked signal mask is still handled by the host Linux kernel so
|
|
|
|
that most signal system calls can be redirected directly to the host
|
|
|
|
Linux kernel. Only the @code{sigaction()} and @code{sigreturn()} system
|
|
|
|
calls need to be fully emulated (see @file{signal.c}).
|
|
|
|
|
|
|
|
@subsection clone() system call and threads
|
|
|
|
|
|
|
|
The Linux clone() system call is usually used to create a thread. QEMU
|
|
|
|
uses the host clone() system call so that real host threads are created
|
|
|
|
for each emulated thread. One virtual CPU instance is created for each
|
|
|
|
thread.
|
|
|
|
|
|
|
|
The virtual x86 CPU atomic operations are emulated with a global lock so
|
|
|
|
that their semantic is preserved.
|
|
|
|
|
|
|
|
Note that currently there are still some locking issues in QEMU. In
|
|
|
|
particular, the translated cache flush is not protected yet against
|
|
|
|
reentrancy.
|
|
|
|
|
|
|
|
@subsection Self-virtualization
|
|
|
|
|
|
|
|
QEMU was conceived so that ultimately it can emulate itself. Although
|
|
|
|
it is not very useful, it is an important test to show the power of the
|
|
|
|
emulator.
|
|
|
|
|
|
|
|
Achieving self-virtualization is not easy because there may be address
|
2008-10-09 22:52:04 +04:00
|
|
|
space conflicts. QEMU user emulators solve this problem by being an
|
|
|
|
executable ELF shared object as the ld-linux.so ELF interpreter. That
|
|
|
|
way, it can be relocated at load time.
|
2004-04-04 19:21:17 +04:00
|
|
|
|
2006-05-01 01:58:41 +04:00
|
|
|
@node Bibliography
|
2004-04-04 19:21:17 +04:00
|
|
|
@section Bibliography
|
|
|
|
|
|
|
|
@table @asis
|
|
|
|
|
2007-09-17 01:08:06 +04:00
|
|
|
@item [1]
|
2004-04-04 19:21:17 +04:00
|
|
|
@url{http://citeseer.nj.nec.com/piumarta98optimizing.html}, Optimizing
|
|
|
|
direct threaded code by selective inlining (1998) by Ian Piumarta, Fabio
|
|
|
|
Riccardi.
|
|
|
|
|
|
|
|
@item [2]
|
|
|
|
@url{http://developer.kde.org/~sewardj/}, Valgrind, an open-source
|
|
|
|
memory debugger for x86-GNU/Linux, by Julian Seward.
|
|
|
|
|
|
|
|
@item [3]
|
|
|
|
@url{http://bochs.sourceforge.net/}, the Bochs IA-32 Emulator Project,
|
|
|
|
by Kevin Lawton et al.
|
|
|
|
|
|
|
|
@item [4]
|
|
|
|
@url{http://www.cs.rose-hulman.edu/~donaldlf/em86/index.html}, the EM86
|
|
|
|
x86 emulator on Alpha-Linux.
|
|
|
|
|
|
|
|
@item [5]
|
2006-05-01 01:58:41 +04:00
|
|
|
@url{http://www.usenix.org/publications/library/proceedings/usenix-nt97/@/full_papers/chernoff/chernoff.pdf},
|
2004-04-04 19:21:17 +04:00
|
|
|
DIGITAL FX!32: Running 32-Bit x86 Applications on Alpha NT, by Anton
|
|
|
|
Chernoff and Ray Hookway.
|
|
|
|
|
|
|
|
@item [6]
|
|
|
|
@url{http://www.willows.com/}, Windows API library emulation from
|
|
|
|
Willows Software.
|
|
|
|
|
|
|
|
@item [7]
|
2007-09-17 01:08:06 +04:00
|
|
|
@url{http://user-mode-linux.sourceforge.net/},
|
2004-04-04 19:21:17 +04:00
|
|
|
The User-mode Linux Kernel.
|
|
|
|
|
|
|
|
@item [8]
|
2007-09-17 01:08:06 +04:00
|
|
|
@url{http://www.plex86.org/},
|
2004-04-04 19:21:17 +04:00
|
|
|
The new Plex86 project.
|
|
|
|
|
|
|
|
@item [9]
|
2007-09-17 01:08:06 +04:00
|
|
|
@url{http://www.vmware.com/},
|
2004-04-04 19:21:17 +04:00
|
|
|
The VMWare PC virtualizer.
|
|
|
|
|
|
|
|
@item [10]
|
2007-09-17 01:08:06 +04:00
|
|
|
@url{http://www.microsoft.com/windowsxp/virtualpc/},
|
2004-04-04 19:21:17 +04:00
|
|
|
The VirtualPC PC virtualizer.
|
|
|
|
|
|
|
|
@item [11]
|
2007-09-17 01:08:06 +04:00
|
|
|
@url{http://www.twoostwo.org/},
|
2004-04-04 19:21:17 +04:00
|
|
|
The TwoOStwo PC virtualizer.
|
|
|
|
|
2008-10-09 22:52:04 +04:00
|
|
|
@item [12]
|
|
|
|
@url{http://virtualbox.org/},
|
|
|
|
The VirtualBox PC virtualizer.
|
|
|
|
|
|
|
|
@item [13]
|
|
|
|
@url{http://www.xen.org/},
|
|
|
|
The Xen hypervisor.
|
|
|
|
|
|
|
|
@item [14]
|
|
|
|
@url{http://kvm.qumranet.com/kvmwiki/Front_Page},
|
|
|
|
Kernel Based Virtual Machine (KVM).
|
|
|
|
|
|
|
|
@item [15]
|
|
|
|
@url{http://www.greensocs.com/projects/QEMUSystemC},
|
|
|
|
QEMU-SystemC, a hardware co-simulator.
|
|
|
|
|
2004-04-04 19:21:17 +04:00
|
|
|
@end table
|
|
|
|
|
2006-05-01 01:58:41 +04:00
|
|
|
@node Regression Tests
|
2004-04-04 19:21:17 +04:00
|
|
|
@chapter Regression Tests
|
|
|
|
|
|
|
|
In the directory @file{tests/}, various interesting testing programs
|
2007-07-12 13:03:30 +04:00
|
|
|
are available. They are used for regression testing.
|
2004-04-04 19:21:17 +04:00
|
|
|
|
2006-05-01 01:58:41 +04:00
|
|
|
@menu
|
|
|
|
* test-i386::
|
|
|
|
* linux-test::
|
|
|
|
@end menu
|
|
|
|
|
|
|
|
@node test-i386
|
2004-04-04 19:21:17 +04:00
|
|
|
@section @file{test-i386}
|
|
|
|
|
|
|
|
This program executes most of the 16 bit and 32 bit x86 instructions and
|
|
|
|
generates a text output. It can be compared with the output obtained with
|
|
|
|
a real CPU or another emulator. The target @code{make test} runs this
|
|
|
|
program and a @code{diff} on the generated output.
|
|
|
|
|
|
|
|
The Linux system call @code{modify_ldt()} is used to create x86 selectors
|
|
|
|
to test some 16 bit addressing and 32 bit with segmentation cases.
|
|
|
|
|
|
|
|
The Linux system call @code{vm86()} is used to test vm86 emulation.
|
|
|
|
|
|
|
|
Various exceptions are raised to test most of the x86 user space
|
|
|
|
exception reporting.
|
|
|
|
|
2006-05-01 01:58:41 +04:00
|
|
|
@node linux-test
|
2004-04-04 19:21:17 +04:00
|
|
|
@section @file{linux-test}
|
|
|
|
|
|
|
|
This program tests various Linux system calls. It is used to verify
|
|
|
|
that the system call parameters are correctly converted between target
|
|
|
|
and host CPUs.
|
|
|
|
|
2006-05-01 01:58:41 +04:00
|
|
|
@node Index
|
|
|
|
@chapter Index
|
|
|
|
@printindex cp
|
|
|
|
|
|
|
|
@bye
|